Start using OpenSearch with NodeJS

Great search and aggregation features can make a big difference to your application. Read on to see how to use OpenSearch with your NodeJS application.

21 December 2021
Olena Kutsenko
Developer Advocate at Aiven

OpenSearch is here to add a high-quality, fully open source search engine to your application, and its NodeJS client makes that much easier. Time to take those two for a spin!

And what better example dataset than one full of gourmet meals and cooking instructions? We'll be using a dataset from Epicurious with over 20k different recipes to search and aggregate the data. We'll look for the most unexpected food combinations, find the least frequent items, create a date histogram, calculate a moving average and much more. We'll probably leave you very hungry, but also inspired!

Starting with the basics of creating an OpenSearch cluster, we'll take you step by step through the process of setting up your playground before we jump into writing search and aggregation queries.
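The query examples in this article rely on a shared `client` object. Here is a minimal sketch of how such a client might be created, assuming the `@opensearch-project/opensearch` package is installed and your service URI is stored in a `SERVICE_URI` environment variable. The helper names and the variable name are illustrative assumptions, not taken from the article's repository.

```javascript
// Hypothetical helper: builds the options object passed to the OpenSearch client.
function buildClientOptions(serviceUri) {
  if (!serviceUri) {
    throw new Error('A service URI is required to connect to OpenSearch');
  }
  // `node` is the client option that holds the cluster URL.
  return { node: serviceUri };
}

// Hypothetical helper: creates the client used by the search functions.
function createClient(serviceUri) {
  // Required lazily so this file can be loaded even before the package is installed.
  const { Client } = require('@opensearch-project/opensearch');
  return new Client(buildClientOptions(serviceUri));
}

// Usage (assumes SERVICE_URI is set in your environment):
// const client = createClient(process.env.SERVICE_URI);
```

Keeping the options in a small function of their own makes it easy to check the configuration without connecting to a cluster.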

Try different flavours of search queries

At the heart of OpenSearch lies the power to create flexible search queries, which fall into three main groups: term-level, full-text and boolean. Term-level queries are handy when we need to find exact matches without additional analysis. Full-text queries allow a smarter search for matches in analysed text fields and return results sorted by relevance. And last, but not least, boolean queries are useful for combining multiple queries.

For example, if you want to find a soup with tomatoes, garlic and dill, the match query comes in handy:

    /**
     * Finding matches sorted by relevance.
     * run-func search match title 'Tomato-garlic soup with dill'
     */
    module.exports.match = (field, query) => {
      console.log(`Searching for ${query} in the field ${field}`);
      const body = {
        query: {
          match: {
            [field]: {
              query,
            },
          },
        },
      };
      client.search(
        {
          index: indexName,
          body,
        },
        logTitles
      );
    };

Or maybe you want to account for spelling mistakes that your users might accidentally make. In that case the fuzziness property is helpful:

    /**
     * Specifying fuzziness to account for typos and misspelling.
     * run-func search fuzzy title pinapple 2
     */
    module.exports.fuzzy = (field, value, fuzziness) => {
      console.log(
        `Search for ${value} in the ${field} with fuzziness set to ${fuzziness}`
      );
      const query = {
        query: {
          fuzzy: {
            [field]: {
              value,
              fuzziness,
            },
          },
        },
      };
      client.search(
        {
          index: indexName,
          body: query,
        },
        logTitles
      );
    };
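Both functions pass a `logTitles` callback to `client.search`, and that callback is not shown here. A minimal sketch of what it could look like, assuming the callback receives `(error, result)`, the hits live under `result.body.hits.hits`, and each document has a `title` field in its `_source`. The `extractTitles` helper is hypothetical and added only for illustration.

```javascript
// Hypothetical helper: pulls recipe titles out of a search response.
function extractTitles(result) {
  const hits = result?.body?.hits?.hits ?? [];
  return hits.map((hit) => hit._source.title);
}

// Sketch of a callback that prints the titles of the returned recipes.
function logTitles(error, result) {
  if (error) {
    console.error(error);
    return;
  }
  const titles = extractTitles(result);
  console.log(`Number of returned results is ${titles.length}`);
  titles.forEach((title) => console.log(title));
}
```

Separating the extraction from the logging keeps the part that depends on the response shape easy to adapt if your client version returns results differently.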

If we got you intrigued, you can find other examples of search queries for matching values, ranges, fuzzy phrases, combinations of clauses and more in our article How to write search queries with OpenSearch and NodeJS.
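As a taste of combining clauses, here is a minimal sketch of a boolean query body. It assumes the recipes have `title`, `ingredients` and `calories` fields. The `ingredients` field name and the `buildBoolQuery` helper are assumptions made for illustration, so adjust them to your own mapping.

```javascript
// Hypothetical helper: builds a boolean query that combines three clauses.
function buildBoolQuery(mustText, excludedText, maxCalories) {
  return {
    query: {
      bool: {
        // Full-text clause that must match and contributes to the relevance score.
        must: [{ match: { title: mustText } }],
        // Documents matching this clause are excluded from the results.
        must_not: [{ match: { ingredients: excludedText } }],
        // Filter clause: narrows the results without affecting the score.
        filter: [{ range: { calories: { lte: maxCalories } } }],
      },
    },
  };
}

// Usage, with `client`, `indexName` and `logTitles` as in the examples above:
// client.search({ index: indexName, body: buildBoolQuery('soup', 'dill', 500) }, logTitles);
```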

Learn how to aggregate data

If you're familiar with search queries, then you're ready to embark on the next adventure and learn how to run aggregation requests and read the results. In particular, we'll look at three different aggregation types: metric, bucket and pipeline.

Metric aggregations are helpful for computations such as finding a minimum or maximum value, calculating an average or collecting statistics about field values. For example, this is how we can use a metric aggregation to calculate percentiles:

    /**
     * Get metric aggregations for the field
     * Examples: avg, min, max, stats, extended_stats, percentiles, terms
     * run-func aggregate metric percentiles calories
     */
    module.exports.metric = (metric, field) => {
      const body = {
        aggs: {
          [`aggs-for-${field}`]: { // aggregation name, which you choose
            [metric]: { // one of the supported aggregation types
              field,
            },
          },
        },
      };
      client.search(
        {
          index,
          body,
          size: 0, // ignore `hits`
        },
        logAggs.bind(this, `aggs-for-${field}`) // callback to log the aggregation output
      );
    };

Can you guess how calorie levels range across our recipes? Follow the article and run the command run-func aggregate metric percentiles calories to see for yourself. Spoiler alert: Chocolate Plum Cake is one of the most caloric dishes available (and perhaps one of the most delicious too!).
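The `logAggs` callback used above is bound to the aggregation name, so it is called with `(name, error, result)`. A minimal sketch of such a callback, assuming the aggregation output sits under `result.body.aggregations`. The `extractAggregation` helper is hypothetical and added only for illustration.

```javascript
// Hypothetical helper: returns the output of a single named aggregation.
function extractAggregation(name, result) {
  const aggregations = result?.body?.aggregations ?? {};
  return aggregations[name];
}

// Sketch of a callback that prints the output of the named aggregation.
function logAggs(name, error, result) {
  if (error) {
    console.error(error);
    return;
  }
  console.log(JSON.stringify(extractAggregation(name, result), null, 2));
}
```

For a percentiles aggregation the printed object contains a `values` map keyed by percentile, so you can read the distribution directly from it.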

Even though metric aggregations are great, there is another, no less exciting type: the bucket aggregation. It distributes documents over a set of buckets based on the provided criteria. Bucket aggregations can be used for a variety of things. For example, we can use them to find the rarest items in our dataset, such as the least frequently used categories:

    /**
     * Group recipes into buckets to find the most rare items
     * `run-func aggregate rareTerms categories.keyword 3`
     */
    module.exports.rareTerms = (field, max) => {
      const body = {
        aggs: {
          [`rare-terms-aggs-for-${field}`]: {
            rare_terms: {
              field,
              max_doc_count: max, // get buckets that contain no more than max items
            },
          },
        },
      };
      client.search(
        {
          index,
          body,
          size: 0,
        },
        logAggs.bind(this, `rare-terms-aggs-for-${field}`)
      );
    };

If you run the query, you'll see some definitely unexpected results!
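Bucket aggregations return a `buckets` array in which each entry has a `key` and a `doc_count`. A small sketch of turning that array into readable lines. The `formatBuckets` helper is a hypothetical name used only for illustration.

```javascript
// Hypothetical helper: formats each bucket as "<key>: <number of documents>".
function formatBuckets(aggregation) {
  const buckets = aggregation?.buckets ?? [];
  return buckets.map((bucket) => `${bucket.key}: ${bucket.doc_count}`);
}

// Usage inside a search callback, assuming the response body is under `result.body`:
// formatBuckets(result.body.aggregations['rare-terms-aggs-for-categories.keyword'])
//   .forEach((line) => console.log(line));
```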

And finally, pipeline aggregations combine several aggregations to build more complex flows. They can calculate moving averages and cumulative sums, and perform a variety of other mathematical calculations over the data in the documents. In particular, we'll look at how to calculate a moving average of the number of recipes added over the years.
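To give an idea of the shape of such a request, here is a minimal sketch of a body that nests a pipeline aggregation inside a date histogram. It assumes the recipes have a date field (called `date` in the usage comment) and that the `moving_avg` pipeline aggregation is available on your cluster. The aggregation names and the `buildMovingAverageBody` helper are illustrative assumptions, and the full article may take a different approach.

```javascript
// Hypothetical helper: builds a date histogram with a moving average over bucket counts.
function buildMovingAverageBody(field, interval, window) {
  return {
    aggs: {
      // Bucket aggregation: one bucket per calendar interval (for example, per year).
      'recipes-per-interval': {
        date_histogram: { field, calendar_interval: interval },
        aggs: {
          // Pipeline aggregation: averages the document counts of neighbouring buckets.
          'moving-average': {
            moving_avg: { buckets_path: '_count', window },
          },
        },
      },
    },
  };
}

// Usage, with `client` and `indexName` as in the search examples:
// client.search(
//   { index: indexName, body: buildMovingAverageBody('date', 'year', 5), size: 0 },
//   (error, result) => console.log(error || JSON.stringify(result.body.aggregations, null, 2))
// );
```

The special `_count` path refers to the number of documents in each histogram bucket, so no separate metric aggregation is needed to count recipes.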

You can find all these examples and many others in our article Use Aggregations with OpenSearch and NodeJS.

Examples and other resources

And, finally, sign up for a free trial to start using Aiven for OpenSearch and follow us on Twitter to stay up-to-date with product and feature-related news.


Start your free 30 day trial!

Build your platform, and throw in any data you want for 30 days, with no ifs, ands, or buts.


