Write search queries with Python and OpenSearch® to find delicious recipes
Read on to learn how to use OpenSearch® to perform both simple and advanced searches on semi-structured recipe data, and from that produce the perfect menu.
A Python-perfect dinner party with OpenSearch®
Introduction
When I plan a dinner party, I want my guests to have a great experience, and I definitely do not want anyone to go hungry. I need to check the ingredients and my guests' dietary restrictions and preferences. If you also find planning that special dinner challenging, you are in for a treat.
In this blog post, I'll show how to find delicious recipes the Pythonic way. All we need is data, Python, and the power of OpenSearch® to plan a perfect dinner party!
Our support material on this learning journey is a CLI application that lets you explore common types of OpenSearch queries and even run them yourself.
Getting started
Some may agree that our search results can only be as good as our dataset. But do not worry: we will be using a high-quality dataset from Epicurious, which contains over 20,000 recipes with ratings and nutrition information.
I'll be using Aiven's fully managed OpenSearch service to get our cluster up and running. I've prepared a demo that contains all the code to connect, send data and perform the search queries.
Everything is explained in the project README.rst, so you can just focus on understanding the queries.
Ingest data to the OpenSearch cluster
The first step is to load the data into our OpenSearch cluster, so we can start to query. Check out how to load the data to your cluster using the Python client.
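As a rough idea of what that loading step could look like, here is a minimal sketch assuming the opensearch-py client and its `helpers.bulk` function. The index name `recipes`, the file name `full_format_recipes.json`, and the `service_uri` argument are assumptions for illustration; the demo project may organise this differently.

```python
import json


def bulk_actions(recipes, index_name="recipes"):
    """Yield one bulk-index action per recipe document."""
    for recipe in recipes:
        # Skip empty records so they are not indexed as blank documents.
        if not recipe:
            continue
        yield {"_index": index_name, "_source": recipe}


def load_recipes(service_uri, path="full_format_recipes.json"):
    """Send every recipe in the JSON file to the cluster in bulk.

    The file name and the index name are assumptions, not taken from the demo.
    """
    # Imported here so bulk_actions() can be tried without the client installed.
    from opensearchpy import OpenSearch, helpers

    client = OpenSearch(service_uri, use_ssl=True)
    with open(path, encoding="utf-8") as handle:
        recipes = json.load(handle)
    # helpers.bulk returns a (success_count, errors) tuple.
    return helpers.bulk(client, bulk_actions(recipes))
```

Sending the documents in bulk rather than one request per recipe keeps the number of round trips to the cluster low, which matters with a dataset of this size.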
Now, we can start to play with data!
Finding recipes
Rumors are that my grandfather is coming to the dinner, so I know I should have at least one recipe that is low in sodium. Low sodium meals are recommended to reduce blood pressure, but in general, this is a good healthy option for everyone.
We can use the range query to find documents where the field value (in this case, sodium) is within a certain range.
Recipes under 140 mg of sodium per serving are considered low-sodium meals. So let's look for recipes with 100 to 140 mg of sodium, and build the query as:
{
  "query": {
    "range": {
      "sodium": {
        "gte": 100,
        "lte": 140
      }
    }
  }
}
We can use the demo program to see the range query in action by running:
python search.py "sodium" 100 140
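If you prefer to build the same query directly in Python, it could look like the sketch below. The helper name `range_query` and the index name `recipes` are hypothetical, and the commented-out call assumes an already connected opensearch-py client.

```python
def range_query(field, gte, lte):
    """Build a range query matching documents where gte <= field <= lte."""
    return {"query": {"range": {field: {"gte": gte, "lte": lte}}}}


query_body = range_query("sodium", 100, 140)

# With a connected opensearch-py client (not created here), the query
# could be sent like this:
#   response = client.search(index="recipes", body=query_body)
```

Both bounds are inclusive here because the query uses `gte` and `lte`; swapping in `gt` or `lt` would exclude the boundary values.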
I'm curious to see what kind of recipes we get under this condition, and here we go:
['Salsa Verde ', 'Green Bean and Red Onion Salad with Warm Cider Vinaigrette ', 'Toasted-Pecan Pie ', 'Provençal Chicken and Tomato Roast ', 'Sauteed Cod Provençale ', 'Roasted Potatoes and Asparagus with Parmesan ', 'Sweet-and-Sour Baby Carrots ', 'Ricotta Puddings with Glazed Rhubarb ', 'Butternut Squash and White Bean Soup ', 'Turkish Zucchini Pancakes ']
'Turkish Zucchini Pancakes' seems like a delicious recipe, so this would be my choice.
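A list of titles like the one above can be pulled out of a search response with a few lines of Python. This is a sketch, not the demo's actual code: the helper name is hypothetical and the sample response is hand-written and trimmed, but the `hits.hits[]._source` layout is the standard shape of an OpenSearch search response.

```python
def titles_from_response(response):
    """Return the title of every hit in an OpenSearch search response."""
    return [hit["_source"]["title"] for hit in response["hits"]["hits"]]


# Hand-written, trimmed-down response used only to show the layout.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"title": "Salsa Verde "}},
            {"_source": {"title": "Turkish Zucchini Pancakes "}},
        ]
    }
}

print(titles_from_response(sample_response))
# ['Salsa Verde ', 'Turkish Zucchini Pancakes ']
```

OpenSearch returns ten hits per search unless a different `size` is requested, which would explain why the lists in this post stop at ten titles.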
Also, I need to find a delicious salad recipe for the occasion 🥗. It's radish season and this vegetable goes really well in summer salads, so let's use match_phrase to look for a "title" containing "Salad with Radish".
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "Salad with radish"
      }
    }
  }
}
We can run this query using the demo program:
python search.py match-phrase "title" "Salad with Radish"
and here is our result:
['Green Bean and Red Onion Salad with Radish Dressing ']
We only got one match, and it seems like radish is only part of the dressing. The reason is that the order of words is important when you use match_phrase. In this case, the exact phrase "Salad with Radish" appears in only one title, hence our single result.
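To see why word order matters, here is a toy illustration in plain Python. It is a deliberate simplification and not how OpenSearch is implemented: it lowercases the text, splits it into words, and checks whether the phrase words appear next to each other in the same order. The function names are hypothetical.

```python
import re


def tokens(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def contains_phrase(title, phrase):
    """Return True if the phrase tokens appear consecutively and in order."""
    title_tokens, phrase_tokens = tokens(title), tokens(phrase)
    width = len(phrase_tokens)
    return any(
        title_tokens[i:i + width] == phrase_tokens
        for i in range(len(title_tokens) - width + 1)
    )


print(contains_phrase("Green Bean and Red Onion Salad with Radish Dressing ", "Salad with Radish"))
# True
print(contains_phrase("Avocado Radish Salad with Lime Dressing ", "Salad with Radish"))
# False
```

The second title contains all three words, but not side by side in the requested order, so a strict phrase match rejects it.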
We can fix that by adding some flexibility to our search. match_phrase has a powerful parameter called slop (default 0) that defines how far apart the search words may be from each other. So let's try again with slop set to 3.
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "Salad with radish",
        "slop": 3
      }
    }
  }
}
We can run this query using the demo program:
python search.py match-phrase "title" "Salad with Radish" --slop 3
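In Python, the phrase query with an optional slop could be assembled as in the sketch below. The helper name `match_phrase_query` is hypothetical, and the index name in the commented-out call is an assumption.

```python
def match_phrase_query(field, phrase, slop=0):
    """Build a match_phrase query, adding slop only when it is non-zero."""
    params = {"query": phrase}
    if slop:
        # slop relaxes how strictly the words must follow the phrase order.
        params["slop"] = slop
    return {"query": {"match_phrase": {field: params}}}


strict = match_phrase_query("title", "Salad with radish")
relaxed = match_phrase_query("title", "Salad with radish", slop=3)

# With a connected opensearch-py client (not created here):
#   response = client.search(index="recipes", body=relaxed)
```

Leaving slop out of the body when it is 0 produces exactly the first, strict query, since 0 is the default.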
Not surprisingly, we got more results this time:
['Green Bean and Red Onion Salad with Radish Dressing ', 'Winter Salad with Black Radish, Apple, and Escarole ', 'Avocado Radish Salad with Lime Dressing ', 'Chickpea Salad Sandwich With Creamy Carrot-Radish Slaw ', 'Mâche, Frisée, and Radish Salad with Mustard Vinaigrette ', 'Frisée and Radish Salad with Goat Cheese Croutons ', 'Endive, Mâche, and Radish Salad with Champagne Vinaigrette ', 'Butter Lettuce and Radish Salad with Fresh Spring Herbs ', 'Butter Lettuce and Radish Salad with Lemon-Garlic Vinaigrette ', 'Shaved Carrot and Radish Salad With Herbs and Pumpkin Seeds ']
Now our results match "Radish Salad with", "Salad with <something else> Radish", and so on.
We can pick one and move on to find a dessert.
Let's explore how the match query works by building a query to find "Chocolate Carrot Cake" in our "title".
{
  "query": {
    "match": {
      "title": {
        "query": "Chocolate Carrot Cake",
        "operator": "and"
      }
    }
  }
}
The match query returns results sorted by how closely they relate to "Chocolate Carrot Cake" 🥕. By default, match uses the "OR" operator, giving results for "Chocolate" or "Carrot" or "Cake". However, I want all of these terms to be included in the "title", so we can use the "AND" operator for that:
python search.py match "title" "Chocolate Carrot Cake" --operator "and"
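The sketch below builds the same match query in Python and adds a toy function that mimics the difference between the two operators. The toy is a simplification for illustration only (it ignores relevance scoring and uses a naive word split), and the names `match_query` and `toy_match` are hypothetical.

```python
import re


def match_query(field, text, operator="or"):
    """Build a match query with an explicit operator."""
    return {"query": {"match": {field: {"query": text, "operator": operator}}}}


def toy_match(title, text, operator="or"):
    """Mimic OR/AND: any query word vs. every query word present in the title."""
    title_words = set(re.findall(r"\w+", title.lower()))
    query_words = re.findall(r"\w+", text.lower())
    check = all if operator == "and" else any
    return check(word in title_words for word in query_words)


cake = "Chocolate Carrot Cake"
body = match_query("title", cake, operator="and")

print(toy_match("Chocolate-Orange Carrot Cake ", cake, "and"))
# True
print(toy_match("Chickpea Salad Sandwich With Creamy Carrot-Radish Slaw ", cake, "or"))
# True
print(toy_match("Chickpea Salad Sandwich With Creamy Carrot-Radish Slaw ", cake, "and"))
# False
```

With "OR", a title containing only "Carrot" already qualifies; with "AND", every word of the query must be present, though not necessarily next to each other.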
Here are our cake results.
['Chocolate-Orange Carrot Cake ', 'Milk Chocolate Semifreddo with Star Anise Carrot Cake ']
Everything looks delicious and we are ready for the party 🥳! And what's for dinner at your place? You can play around with writing your own search queries to find your own perfect dinner.
Happy meal, everyone!
Examples and other resources
- Find the code sample on the GitHub repository
- Continue the tutorial to learn how to query data
- Find more documentation resources for Aiven for OpenSearch
You can sign up for a free trial to start using Aiven for OpenSearch.