Elasticsearch: Building Powerful Search for Your Application

Your users expect Google-quality search. SQL LIKE '%query%' won't cut it. Here's how to build fast, relevant, typo-tolerant search with Elasticsearch — from first index to production deployment.

Database & Data | February 5, 2026 | 14 min read

Search is the feature users notice most when it's bad and least when it's good. A search that returns irrelevant results, can't handle typos, or takes 2 seconds to respond will drive users away faster than almost any other UX problem. Elasticsearch solves this — but it requires understanding how full-text search actually works.

1. Why Elasticsearch (and When Not to Use It)

| Search Need | Best Solution | Why |
| --- | --- | --- |
| Simple keyword filter on < 100K rows | PostgreSQL full-text search | No extra infrastructure |
| Typo-tolerant search with ranking | Elasticsearch or Meilisearch | Purpose-built for relevance |
| E-commerce product search | Elasticsearch (or Algolia) | Facets, filters, boosting, analytics |
| Log search and analytics | Elasticsearch (ELK stack) | Designed for log ingestion and analysis |
| Semantic / vector search | Elasticsearch 8+ or Weaviate/Pinecone | kNN vector search built-in |
| Small site search (< 10K docs) | Meilisearch or Typesense | Simpler, faster to set up |

2. Core Concepts — Indices, Mappings, Analyzers

How Text Search Works (Inverted Index)

Document 1: "The quick brown fox jumps"
Document 2: "Quick brown dogs leap over fences"

Analyzer pipeline: lowercase → remove stopwords → stem

Inverted Index:
    "quick"  → [Doc 1, Doc 2]
    "brown"  → [Doc 1, Doc 2]
    "fox"    → [Doc 1]
    "jump"   → [Doc 1]         ← "jumps" stemmed to "jump"
    "dog"    → [Doc 2]
    "leap"   → [Doc 2]
    "over"   → [Doc 2]         ← kept: not in the default English stopword list
    "fenc"   → [Doc 2]         ← "fences" stemmed to "fenc"
                               ("the" in Doc 1 is a stopword and is dropped)

Search for "quick fox" → finds Doc 1 (matches both terms, higher score)
                       → also finds Doc 2 (matches "quick")
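The lookup above can be sketched in a few lines of JavaScript. This is a toy model, not how Lucene is implemented: the stopword list, the "strip a trailing s" stemmer (which yields "fence" where a Porter stemmer yields "fenc"), and the match-count ranking are all simplifying assumptions made for illustration.

```javascript
// Toy analyzer: lowercase -> drop stopwords -> strip a trailing "s".
// The stopword list and the stemmer are simplified assumptions, not Lucene's.
const STOPWORDS = new Set(['the', 'a', 'an', 'and', 'of', 'to', 'in']);

function analyze(text) {
    return text
        .toLowerCase()
        .split(/[^a-z0-9]+/)
        .filter(token => token && !STOPWORDS.has(token))
        .map(token => (token.length > 3 && token.endsWith('s') ? token.slice(0, -1) : token));
}

// Build a map of term -> Set of document ids
function buildInvertedIndex(docs) {
    const index = new Map();
    for (const [id, text] of Object.entries(docs)) {
        for (const term of analyze(text)) {
            if (!index.has(term)) index.set(term, new Set());
            index.get(term).add(Number(id));
        }
    }
    return index;
}

// Rank documents by how many query terms they contain
// (a stand-in for real relevance scoring such as BM25)
function search(index, query) {
    const hits = new Map();
    for (const term of analyze(query)) {
        for (const id of index.get(term) ?? []) {
            hits.set(id, (hits.get(id) ?? 0) + 1);
        }
    }
    return [...hits.entries()]
        .sort((a, b) => b[1] - a[1])
        .map(([id, matched]) => ({ id, matched }));
}

const docs = {
    1: 'The quick brown fox jumps',
    2: 'Quick brown dogs leap over fences'
};
const index = buildInvertedIndex(docs);
const results = search(index, 'quick fox');
console.log(results);
```

The key property is that a search never scans document text: it looks up each analyzed query term in the map and merges the resulting id sets.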

Index Mapping — Define Your Schema

PUT /products
{
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "analyzer": "standard",
                "fields": {
                    "keyword": { "type": "keyword" },     // For exact match & sorting
                    "autocomplete": {                       // For search-as-you-type
                        "type": "text",
                        "analyzer": "autocomplete_analyzer",  // Edge n-grams at index time
                        "search_analyzer": "standard"         // Keep the user's query un-grammed
                    }
                }
            },
            "description": { "type": "text", "analyzer": "english" },
            "category": { "type": "keyword" },             // Exact match only (facets)
            "price": { "type": "float" },
            "rating": { "type": "float" },
            "in_stock": { "type": "boolean" },
            "featured": { "type": "boolean" },             // Used as a boost in the search query
            "created_at": { "type": "date" },
            "location": { "type": "geo_point" }            // For geo search
        }
    },
    "settings": {
        "analysis": {
            "analyzer": {
                "autocomplete_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete_filter"]
                }
            },
            "filter": {
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 2,
                    "max_gram": 15
                }
            }
        }
    }
}
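To see what the edge_ngram filter in this mapping puts into the index, here is a minimal JavaScript sketch that emits the same kind of prefixes for a token. The function names are hypothetical, and the regex tokenizer is only a rough stand-in for the standard tokenizer plus lowercase filter.

```javascript
// Emit the prefixes an edge_ngram filter would produce for one token.
// Tokens shorter than minGram produce nothing.
function edgeNgrams(token, minGram = 2, maxGram = 15) {
    const grams = [];
    const limit = Math.min(token.length, maxGram);
    for (let len = minGram; len <= limit; len++) {
        grams.push(token.slice(0, len));
    }
    return grams;
}

// Simplified stand-in for the standard tokenizer + lowercase filter
function autocompleteTokens(text) {
    return text
        .toLowerCase()
        .split(/[^a-z0-9]+/)
        .filter(Boolean)
        .flatMap(token => edgeNgrams(token));
}

console.log(edgeNgrams('wireless'));
console.log(autocompleteTokens('USB Hub'));
```

Because every prefix is stored at index time, a query for "wire" becomes a plain term lookup. This is also why such a field is usually searched with a non-n-gram analyzer: if the query were expanded to "wi", "wir", "wire" as well, it would match any name that merely starts with "wi".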

3. Indexing Strategies

Bulk Indexing — The Right Way

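// Node.js: bulk index 10,000 products (@elastic/elasticsearch v8 client)
const { Client } = require('@elastic/elasticsearch');
const client = new Client({ node: 'http://localhost:9200' });

// WRONG: Individual index calls (slow: 1 HTTP request per document)
async function indexOneByOne(products) {
    for (const product of products) {
        await client.index({ index: 'products', id: product.id, document: product });  // 💥 N requests
    }
}

// RIGHT: Bulk API (fast: 1 HTTP request per batch)
async function bulkIndex(products, batchSize = 1000) {
    for (let i = 0; i < products.length; i += batchSize) {
        // Each document becomes two entries: an action line and the document itself
        const operations = products.slice(i, i + batchSize).flatMap(product => [
            { index: { _index: 'products', _id: product.id } },
            product
        ]);

        // The v8 client returns the response body directly
        // (the v7 client wraps it as { body } and takes `body` instead of `operations`)
        const bulkResponse = await client.bulk({ operations });

        // A bulk request can succeed overall while individual items fail
        if (bulkResponse.errors) {
            const erroredDocs = bulkResponse.items.filter(item => item.index.error);
            console.error('Failed documents:', erroredDocs);
        }
    }

    // Refresh once at the end so the documents become searchable,
    // rather than forcing a refresh on every bulk request
    await client.indices.refresh({ index: 'products' });
}

// Optimal batch size: 5-15 MB per bulk request (typically 1,000-5,000 docs)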
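Since the guideline is expressed in megabytes rather than document counts, one option is to batch by payload size. The sketch below is a hypothetical helper (`batchBySize` is not part of the Elasticsearch client) that groups documents so that each batch's serialized size stays under a byte limit. It assumes each document has an `id` property.

```javascript
// Split documents into bulk batches whose NDJSON payload stays under maxBytes.
// Returns an array of batches; each batch is a flat [action, doc, action, doc, ...] array.
function batchBySize(docs, indexName, maxBytes = 10 * 1024 * 1024) {
    const batches = [];
    let current = [];
    let currentBytes = 0;

    for (const doc of docs) {
        const action = { index: { _index: indexName, _id: doc.id } };
        // Each operation is two NDJSON lines (action + source), each newline-terminated
        const size =
            Buffer.byteLength(JSON.stringify(action)) + 1 +
            Buffer.byteLength(JSON.stringify(doc)) + 1;

        // Start a new batch if adding this document would exceed the limit
        if (current.length > 0 && currentBytes + size > maxBytes) {
            batches.push(current);
            current = [];
            currentBytes = 0;
        }
        current.push(action, doc);
        currentBytes += size;
    }
    if (current.length > 0) batches.push(current);
    return batches;
}

// Tiny demo with an artificially small limit so the split is visible
const sample = Array.from({ length: 10 }, (_, i) => ({ id: i + 1, name: `Product ${i + 1}` }));
const batches = batchBySize(sample, 'products', 200);
console.log(batches.map(batch => batch.length / 2));  // documents per batch
```

Each batch has the flat action/document shape that the bulk API expects. A single document larger than the limit still gets a batch of its own, so the limit is a target rather than a hard cap.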

| Sync Strategy | How It Works | Latency | Best For |
| --- | --- | --- | --- |
| Sync on write | App writes to DB + ES simultaneously | < 1 second | Simple apps, low write volume |
| Event-driven | DB write → event → consumer indexes ES | 1-5 seconds | Microservices, decoupled systems |
| CDC (Debezium) | DB transaction log → Kafka → ES connector | 1-10 seconds | No app changes, reliable sync |
| Periodic reindex | Cron job reads DB, bulk indexes | Minutes to hours | Infrequent changes, full consistency |
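For the event-driven and CDC rows, the consumer's core job is translating change events into bulk operations. The sketch below assumes a Debezium-style envelope (`op` of `c`, `u`, `d`, or `r`, with `before` and `after` row images) and an `id` primary key column. The `toBulkOperations` helper is hypothetical, and real events carry more metadata than shown here.

```javascript
// Map Debezium-style change events to Elasticsearch bulk operations.
// Assumed event shape: { op: 'c' | 'u' | 'd' | 'r', before: row | null, after: row | null }
function toBulkOperations(events, indexName) {
    const operations = [];
    for (const event of events) {
        if (event.op === 'd') {
            // Deletes carry the old row in `before`
            operations.push({ delete: { _index: indexName, _id: event.before.id } });
        } else if (['c', 'u', 'r'].includes(event.op)) {
            // Creates, updates, and snapshot reads all become a full re-index of the row
            operations.push({ index: { _index: indexName, _id: event.after.id } }, event.after);
        }
        // Any other event type is ignored in this sketch
    }
    return operations;
}

const events = [
    { op: 'c', before: null, after: { id: 1, name: 'Wireless Headphones' } },
    { op: 'u', before: { id: 1, name: 'Wireless Headphones' }, after: { id: 1, name: 'Wireless Headphones Pro' } },
    { op: 'd', before: { id: 2, name: 'Wired Headset' }, after: null }
];
const operations = toBulkOperations(events, 'products');
console.log(JSON.stringify(operations, null, 2));
```

Using the database primary key as the document `_id` makes replays harmless: re-processing the same event overwrites the same document instead of creating a duplicate.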

4. Query DSL — Search That Actually Works

E-Commerce Product Search — Complete Query

GET /products/_search
{
    "query": {
        "bool": {
            "must": [
                {
                    "multi_match": {
                        "query": "wireless headphones",
                        "fields": ["name^3", "description", "category^2"],
                        "type": "best_fields",
                        "fuzziness": "AUTO",
                        "prefix_length": 2
                    }
                }
            ],
            "filter": [
                { "term": { "in_stock": true } },
                { "range": { "price": { "gte": 50, "lte": 300 } } },
                { "terms": { "category": ["electronics", "audio"] } }
            ],
            "should": [
                { "range": { "rating": { "gte": 4.0, "boost": 2 } } },
                { "term": { "featured": { "value": true, "boost": 5 } } }
            ]
        }
    },
    "highlight": {
        "fields": {
            "name": {},
            "description": { "fragment_size": 150 }
        }
    },
    "aggs": {
        "categories": { "terms": { "field": "category", "size": 20 } },
        "price_ranges": {
            "range": {
                "field": "price",
                "ranges": [
                    { "to": 50 },
                    { "from": 50, "to": 100 },
                    { "from": 100, "to": 200 },
                    { "from": 200 }
                ]
            }
        },
        "avg_rating": { "avg": { "field": "rating" } }
    },
    "size": 20,
    "from": 0
}

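Key concepts in the query above:

- must: the multi_match clause has to match, and it contributes to the relevance score.
- filter: the in_stock, price, and category clauses narrow the results without affecting the score, and Elasticsearch can cache them.
- should: optional clauses that raise the score of matching documents (rating of 4.0 or higher, featured products) without excluding the rest.
- "name^3" and "category^2": field boosts that weight a match in the name three times, and a match in the category twice, as much as a match in the description.
- fuzziness "AUTO" with prefix_length 2: tolerates one edit for terms of 3-5 characters and two edits for longer terms, but requires the first two characters to match exactly.
- highlight: returns the matching fragments so the UI can show why a result matched.
- aggs: computes facet counts (categories, price ranges) and the average rating in the same request.
- size and from: pagination (20 results starting at offset 0).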
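In application code this request body is usually assembled from user input rather than written by hand. A minimal sketch follows, assuming a hypothetical `buildProductSearch` helper and parameter names of my own choosing. It reuses the same fields and boosts as the query above and only adds filters the user actually selected.

```javascript
// Build the search request body from user-supplied parameters.
function buildProductSearch({ text, categories = [], minPrice, maxPrice, page = 1, pageSize = 20 }) {
    // Filters never affect scoring, so optional ones can be added freely
    const filter = [{ term: { in_stock: true } }];

    if (categories.length > 0) {
        filter.push({ terms: { category: categories } });
    }
    if (minPrice !== undefined || maxPrice !== undefined) {
        const range = {};
        if (minPrice !== undefined) range.gte = minPrice;
        if (maxPrice !== undefined) range.lte = maxPrice;
        filter.push({ range: { price: range } });
    }

    return {
        query: {
            bool: {
                must: [{
                    multi_match: {
                        query: text,
                        fields: ['name^3', 'description', 'category^2'],
                        type: 'best_fields',
                        fuzziness: 'AUTO',
                        prefix_length: 2
                    }
                }],
                filter,
                should: [
                    { range: { rating: { gte: 4.0, boost: 2 } } },
                    { term: { featured: { value: true, boost: 5 } } }
                ]
            }
        },
        size: pageSize,
        from: (page - 1) * pageSize  // page 1 -> offset 0
    };
}

const request = buildProductSearch({
    text: 'wireless headphones',
    categories: ['electronics', 'audio'],
    minPrice: 50,
    maxPrice: 300,
    page: 2
});
console.log(JSON.stringify(request, null, 2));
```

The returned object can be passed as the body of a search request against the products index. Highlighting and aggregations are left out here to keep the sketch short.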

5. Relevance Tuning

Default Elasticsearch relevance (BM25) is good but not great. Here's how to make search results feel right.

| Technique | How | When to Use |
| --- | --- | --- |
| Field boosting | "name^3": weight title matches higher | Always, since titles are more relevant than body text |
| Function score | Boost by popularity, recency, or rating | When freshness or popularity matters |
| Synonyms | Custom synonym filter in analyzer | Domain-specific terms ("laptop" = "notebook") |
| Pinned queries | Force specific docs to top for queries | Promoted products, editorial picks |
| Decay functions | Lower score for older/farther results | News, events, location-based search |

Function Score — Boost by Popularity + Recency

GET /articles/_search
{
    "query": {
        "function_score": {
            "query": { "match": { "content": "database scaling" } },
            "functions": [
                {
                    "field_value_factor": {
                        "field": "view_count",
                        "modifier": "log1p",
                        "factor": 0.5
                    }
                },
                {
                    "gauss": {
                        "published_at": {
                            "origin": "now",
                            "scale": "30d",
                            "decay": 0.5
                        }
                    }
                }
            ],
            "score_mode": "multiply",
            "boost_mode": "multiply"
        }
    }
}
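To get a feel for what these two functions do to a score, the arithmetic can be reproduced outside Elasticsearch. The sketch below assumes the documented formulas: `log1p` is the base-10 logarithm of 1 plus the factored value, and the Gaussian decay equals `decay` at a distance of exactly `scale` from the origin. The function names are my own, and ages are given in days for simplicity.

```javascript
// field_value_factor with modifier "log1p": log10(1 + factor * value)
function fieldValueFactor(value, factor = 0.5) {
    return Math.log10(1 + factor * value);
}

// Gaussian decay: 1.0 at the origin, exactly `decay` at a distance of `scale`
function gaussDecay(distance, scale, decay = 0.5) {
    const sigmaSquared = -(scale * scale) / (2 * Math.log(decay));
    return Math.exp(-(distance * distance) / (2 * sigmaSquared));
}

// score_mode "multiply" combines the two functions;
// boost_mode "multiply" applies the result to the query score
function finalScore(queryScore, viewCount, ageDays) {
    return queryScore * fieldValueFactor(viewCount) * gaussDecay(ageDays, 30);
}

for (const [views, ageDays] of [[18, 0], [18, 30], [18, 60], [0, 0]]) {
    console.log({ views, ageDays, score: finalScore(2.0, views, ageDays) });
}
```

Two things stand out when running the numbers. The decay is steep: at twice the scale (60 days) the multiplier is already 0.0625, not 0.25. And because everything is multiplied, an article with zero views gets a score of 0 however well it matches, so it may be worth choosing a different modifier or a sum-based `score_mode` if new content should still rank.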

6. Common Patterns — Autocomplete, Facets, Geo

Search-as-You-Type (Autocomplete)

// Index with search_as_you_type field (ES 7.2+)
PUT /products
{
    "mappings": {
        "properties": {
            "name": {
                "type": "search_as_you_type"  // Adds subfields name._2gram, name._3gram, name._index_prefix
            }
        }
    }
}

// Query — matches partial words as user types
GET /products/_search
{
    "query": {
        "multi_match": {
            "query": "wire head",
            "type": "bool_prefix",
            "fields": ["name", "name._2gram", "name._3gram"]
        }
    }
}
// Matches: "Wireless Headphones", "Wired Headset", etc.
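The matching rule behind `bool_prefix` is worth spelling out: every term except the last is looked up as a whole term, only the last term is treated as a prefix, and the clauses are combined with OR by default. The sketch below models just that rule on the root field. It deliberately ignores the `_2gram` and `_3gram` shingle subfields, which add extra matches for adjacent words, and its function names and tokenizer are simplifications of my own.

```javascript
function tokenize(text) {
    return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Count how many bool_prefix clauses a name satisfies:
// every term but the last must equal a token; the last only needs to be a prefix of one.
function boolPrefixMatches(query, name) {
    const terms = tokenize(query);
    const tokens = tokenize(name);
    if (terms.length === 0) return 0;

    const last = terms[terms.length - 1];
    let matched = 0;
    for (const term of terms.slice(0, -1)) {
        if (tokens.includes(term)) matched++;
    }
    if (tokens.some(token => token.startsWith(last))) matched++;
    return matched;
}

for (const name of ['Wireless Headphones', 'Wired Headset', 'Wired Mouse']) {
    console.log(name, boolPrefixMatches('wire head', name));
}
```

Under this model "wire head" reaches "Wireless Headphones" and "Wired Headset" through the trailing prefix "head" alone, since "wire" is not a complete word in either name. Once the user has typed "wireless head", both clauses match and the result scores higher.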

Geo Search — Find Nearby

GET /stores/_search
{
    "query": {
        "bool": {
            "must": { "match": { "type": "restaurant" } },
            "filter": {
                "geo_distance": {
                    "distance": "5km",
                    "location": { "lat": 19.076, "lon": 72.877 }  // Mumbai
                }
            }
        }
    },
    "sort": [
        {
            "_geo_distance": {
                "location": { "lat": 19.076, "lon": 72.877 },
                "order": "asc",
                "unit": "km"
            }
        }
    ]
}
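What the `geo_distance` filter and `_geo_distance` sort compute can be approximated client-side with the haversine formula. This is a sketch for intuition and for unit-testing fixtures, not a replacement for the query: the store names and coordinates are made up, and a spherical Earth with a 6371 km radius is assumed.

```javascript
const EARTH_RADIUS_KM = 6371;

// Great-circle distance between two { lat, lon } points, in kilometers
function haversineKm(a, b) {
    const toRad = deg => (deg * Math.PI) / 180;
    const dLat = toRad(b.lat - a.lat);
    const dLon = toRad(b.lon - a.lon);
    const h =
        Math.sin(dLat / 2) ** 2 +
        Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
    return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Keep stores within radiusKm of origin, nearest first
function nearby(stores, origin, radiusKm) {
    return stores
        .map(store => ({ ...store, distanceKm: haversineKm(origin, store.location) }))
        .filter(store => store.distanceKm <= radiusKm)
        .sort((x, y) => x.distanceKm - y.distanceKm);
}

const origin = { lat: 19.076, lon: 72.877 };  // same point as the query above
const stores = [
    { name: 'A', location: { lat: 19.08, lon: 72.88 } },    // hypothetical, under 1 km away
    { name: 'B', location: { lat: 19.2, lon: 72.97 } },     // hypothetical, well over 5 km away
    { name: 'C', location: { lat: 19.076, lon: 72.877 } }   // hypothetical, at the origin
];
console.log(nearby(stores, origin, 5).map(store => store.name));
```

Elasticsearch does this against an index structure built for geo points, so it does not need to compute a distance for every document the way this linear scan does.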

7. Scaling and Operations

| Scale | Cluster Size | Key Settings | Monthly Cost |
| --- | --- | --- | --- |
| < 1M docs | 1 node (8GB RAM) | 1 primary, 0 replicas | $50-100 |
| 1-10M docs | 3 nodes (16GB each) | 5 shards, 1 replica | $300-600 |
| 10-100M docs | 5-10 nodes (32GB each) | Time-based indices, ILM | $1,000-3,000 |
| > 100M docs | 10+ nodes, dedicated masters | Hot-warm-cold architecture | $3,000+ |

What We've Learned Running Elasticsearch: The #1 operational issue is JVM heap pressure. Set heap to 50% of RAM but never exceed 31GB (compressed oops boundary). Monitor cluster health daily: yellow means replicas aren't allocated (missing nodes), red means primary shards are missing (data loss risk). Use Index Lifecycle Management (ILM) to automatically roll over and delete old indices.

Frequently Asked Questions

Elasticsearch vs Meilisearch vs Typesense?

Elasticsearch for complex search requirements (facets, aggregations, geo, analytics) and large-scale deployments. Meilisearch for simple, fast search with minimal configuration — great for small to medium apps. Typesense for type-ahead search with out-of-the-box relevance. Pick based on complexity needs, not hype.

Should I use Elasticsearch or OpenSearch?

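OpenSearch is a fork of Elasticsearch 7.10, started by AWS and now governed under the Linux Foundation. Both are capable. Use OpenSearch if you're on AWS (native Amazon OpenSearch Service). Use Elasticsearch if you want Elastic's latest features (vector search, ES|QL) or use Elastic Cloud. The core indexing and search APIs remain largely compatible, but the two projects have diverged since the fork, so test your client library and any newer features against the one you pick.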

How do I keep Elasticsearch in sync with my database?

For most apps: write to both DB and ES on every change (sync-on-write). For larger systems: use CDC with Debezium — it captures every database change and streams it to ES via Kafka. This is more reliable than app-level dual writes because it catches changes from any source (migrations, admin tools, other services).

Can Elasticsearch replace my database?

No. Elasticsearch is not a primary data store. It lacks ACID transactions, its search is near-real-time rather than immediately consistent (a write becomes searchable only after the next refresh, one second by default), and it can lose data during cluster issues. Always keep your source of truth in a proper database (PostgreSQL, MySQL, etc.) and use ES as a secondary search index.

How much RAM does Elasticsearch need?

Rule of thumb: 1GB heap per 20GB of index data. Give ES 50% of available RAM as JVM heap (max 31GB), and leave the other 50% for the filesystem cache (critical for search performance). A node with 16GB RAM should get 8GB heap and can comfortably handle ~160GB of index data.
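The sizing rule is simple enough to turn into a helper for capacity planning. The sketch below encodes only the rules of thumb stated above (half of RAM as heap, capped at 31GB, and 20GB of index data per 1GB of heap). The `sizeNode` name is hypothetical, and the output is a starting estimate, not a guarantee.

```javascript
const MAX_HEAP_GB = 31;          // stay below the compressed-oops boundary
const DATA_GB_PER_HEAP_GB = 20;  // rule of thumb from the text

// Estimate heap, filesystem cache, and index data capacity for one node
function sizeNode(ramGb) {
    const heapGb = Math.min(ramGb / 2, MAX_HEAP_GB);
    return {
        heapGb,
        fileCacheGb: ramGb - heapGb,
        indexDataGb: heapGb * DATA_GB_PER_HEAP_GB
    };
}

for (const ramGb of [8, 16, 64, 128]) {
    console.log(ramGb, sizeNode(ramGb));
}
```

Note the effect of the cap: going from 64GB to 128GB of RAM does not raise the heap or the estimated index capacity, it only adds filesystem cache. Past roughly 64GB per node, adding nodes tends to buy more than adding RAM.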

Pillai Infotech LLP

We implement search solutions — from simple site search to complex e-commerce search with facets, autocomplete, and relevance tuning. Let's build your search.
