Whisk is harnessing the power of artificial intelligence (AI) to enable consumers to instantly convert a recipe into a smart shopping list. If that list is connected to a grocery store, you can go from recipe to shopping cart in just a few seconds. The magic of shoppable content is that it’s seamless for the end user. But the technology backend to make it happen is driven by machine learning and the ability to process and analyze massive amounts of data in real time.
You can think of all of the stuff that goes into a recipe as the DNA for whatever you want to create. In nature, all of that genetic information is called a genome. The Whisk Food Genome is an always evolving food ontology that enables machine learning algorithms to map the relationships between ingredients, products, and recipes.
The Whisk system facilitates the creation of personalized, data-driven content that connects food content with commerce. It’s a food technology platform that provides recipe content creators, brands, and grocers with the tools they need to support the entire consumer food journey.
Mapping the AI food universe
Whisk, which is part of the Samsung Next product team, created its online shopping platform to make any food content shoppable by connecting recipes, product pages, and even advertisements with grocers. The Whisk Food Genome reflects a complex data hierarchy that can be used to map a recipe into code, and to analyze what ingredients are needed, and in what quantity.
Think about all the variables involved in a recipe. There are the ingredients, of course. All those ingredients have to be apportioned just right, so there are measurements to consider. Sourcing those elements means a visit to a virtual or brick and mortar store. Nutrition is a factor, and you need to know calories, sugar content, and shelf life. Finally, you have to understand how to put it all together, and how to cook it.
The Food Genome dynamically learns as the Whisk ecosystem grows. Whisk bots are hard at work, 24/7, scraping recipe names, prep and cook times, yields, images, descriptions, ingredients, instructions and methods. With this information, the system can automatically enhance recipes already in the Food Genome database.
One of our goals, on the front end, is to be able to populate recipes with all the important information for the consumer, ranging from images and prep time to ingredients and instructions. On the backend, we annotate each recipe in code. Our developers use scripts and manual processes to tag and sort all the data so that it populates the appropriate fields in a recipe.
Harnessing the power of big data
Organization is the key ingredient when it comes to managing recipes. The Food Genome ontology, also known as a food graph, includes 23,000 products and 160,000 lexicals, which are synonyms for describing how an ingredient appears in different recipes. There are about 7.6 million unique ingredient lines in the ontology, and about 1.2 million lines have been manually annotated.
We use natural language processing technologies to understand unstructured food content and enrich it with detailed data. Our algorithms process more than 15 gigabytes of recipe data daily, and we map ingredients to more than 140,000 store items available from grocers in the U.S., UK, Germany, and Australia. This gives shoppers a streamlined path from recipe inspiration to food purchase.
These ingredient lines are connected to various recipes. For example, one tablespoon of vegetable oil can appear in more than 50 recipes, but in our database it is stored as a single entry.
Precise product matching is important because one ingredient can be written in many different ways. Products added to the ontology database have to align with how the data maps to recipes. If any information in the food graph is mis-categorized, it will affect related recipes.
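To illustrate why lexicals matter for product matching, here is a minimal sketch in Python. It is not Whisk's implementation: the lexical table, the product ids, and the match_product function are all hypothetical, and the naive substring lookup stands in for whatever natural language processing the real system uses.

```python
from typing import Optional

# Hypothetical lexical table: each lexical (one way an ingredient is written)
# points at a single canonical product id. Entries are illustrative only.
LEXICALS = {
    "vegetable oil": "vegetable_oil",
    "veg oil": "vegetable_oil",
    "veggie oil": "vegetable_oil",
    "scallions": "green_onion",
    "spring onions": "green_onion",
    "green onions": "green_onion",
}


def match_product(ingredient_line: str) -> Optional[str]:
    """Return the product id of the longest lexical found in the line, if any."""
    text = " ".join(ingredient_line.lower().split())  # normalize case and spacing
    best = None
    for lexical, product in LEXICALS.items():
        if lexical in text and (best is None or len(lexical) > len(best[0])):
            best = (lexical, product)
    return best[1] if best else None


print(match_product("1 tbsp Veg Oil"))           # vegetable_oil
print(match_product("2 spring onions, sliced"))  # green_onion
print(match_product("1 cup flour"))              # None
```

Because every spelling resolves to one product id, a mis-categorized entry in a table like this would propagate to every recipe that uses that spelling, which is the risk described above.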
The ontology also facilitates categorization and enables the system to determine whether a specific product is stocked by a partner grocer. We also have categorized data that enables us to define where an item should be stored, such as in the refrigerator, freezer, or pantry.
Our ontology database also includes a hierarchy of products. This is an important feature because end-users often shop by reviewing ingredients that, for instance, relate to dietary preferences. The ontology reflects a complex network of relationships. The end result is that if a user doesn’t, for example, want to see any recipes with pork included, we can filter based on that ingredient. Constraints can be inherited throughout the hierarchy or overridden for exceptions.
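The idea of constraints that are inherited through the hierarchy, with overrides for exceptions, can be sketched as follows. This is an illustration under assumed data, not the actual ontology: the product names, the PARENT and CONTAINS_PORK tables, and both functions are hypothetical.

```python
from typing import Optional

# Hypothetical product hierarchy: child -> parent. Names are illustrative.
PARENT = {
    "meat": None,
    "pork": "meat",
    "bacon": "pork",
    "turkey_bacon": "bacon",
}

# Flags set directly on a node. A node without an entry inherits from its parent.
CONTAINS_PORK = {
    "pork": True,
    "turkey_bacon": False,  # override: an exception to the inherited value
}


def contains_pork(product: str) -> bool:
    """Walk up the hierarchy until a node defines the flag; default to False."""
    node: Optional[str] = product
    while node is not None:
        if node in CONTAINS_PORK:
            return CONTAINS_PORK[node]
        node = PARENT.get(node)
    return False


def filter_recipes(recipes: dict) -> list:
    """Keep only recipes in which no ingredient resolves to pork."""
    return [name for name, products in recipes.items()
            if not any(contains_pork(p) for p in products)]


recipes = {"BLT": ["bacon"], "Turkey club": ["turkey_bacon"]}
print(filter_recipes(recipes))  # ['Turkey club']
```

Storing the flag once on "pork" and letting descendants inherit it keeps the data small; only the exceptions need their own entry.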
Mapping items from store to ontology products is what enables the Food Genome to automatically convert any recipe into shoppable content. We also have special algorithms that give priority to preferred store products, and that suggest alternatives when a specific retailer does not have a particular product.
In order to improve the functionality for suggesting alternatives, we try to map equivalents whenever possible. These are products that are technically the same as the original product, but have minor differences. For example, the equivalent for a vegetable stock cube would be 250ml of vegetable stock.
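A rough sketch of how preferred products and equivalents might combine when resolving a product to a store item is shown below. Everything here is assumed for illustration: the store item names, the PREFERRED and EQUIVALENTS tables, and the resolve function are hypothetical, and only the stock cube to 250ml equivalence comes from the text.

```python
# Hypothetical data; product ids and store item names are illustrative.
STORE_ITEMS = {
    "vegetable_stock": ["Store Brand Vegetable Stock 500ml"],
    "chicken_stock_cube": ["Basic Chicken Stock Cubes x8", "Acme Chicken Stock Cubes x8"],
}
PREFERRED = {"chicken_stock_cube": "Acme Chicken Stock Cubes x8"}

# product -> (equivalent product, quantity, unit) for one unit of the original.
EQUIVALENTS = {"vegetable_stock_cube": ("vegetable_stock", 250, "ml")}


def resolve(product):
    """Return (store item, note) for a product, or (None, reason) if unavailable."""
    items = STORE_ITEMS.get(product, [])
    if items:
        preferred = PREFERRED.get(product)
        chosen = preferred if preferred in items else items[0]
        return chosen, "direct match"
    if product in EQUIVALENTS:
        alt, qty, unit = EQUIVALENTS[product]
        alt_items = STORE_ITEMS.get(alt, [])
        if alt_items:
            return alt_items[0], f"equivalent: {qty}{unit} of {alt} per unit"
    return None, "not stocked"


print(resolve("chicken_stock_cube"))
print(resolve("vegetable_stock_cube"))
print(resolve("saffron"))
```

The lookup tries a direct match first, gives priority to a preferred item when one is stocked, and falls back to an equivalent only when the retailer has nothing for the original product.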
Users browsing recipes also can take advantage of Whisk’s ability to map whether a product is liquid or solid, and to easily toggle between metric and imperial measurements. In this recipe for Simple Bolognese, for example, clicking “convert units” lets the user choose how the recipe is displayed. Going from imperial to metric, a 28 ounce can of crushed tomatoes is displayed as 790 grams.
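The crushed tomatoes conversion can be reproduced with a few lines of Python. The rounding to the nearest 10 grams is an assumption made here to match the displayed value of 790; the text does not say how Whisk rounds, and the function name is hypothetical. The sketch covers weight ounces only, since a liquid measured in fluid ounces would need a volume conversion instead, which is why knowing whether a product is liquid or solid matters.

```python
GRAMS_PER_OUNCE = 28.349523125  # one avoirdupois (weight) ounce in grams


def ounces_to_display_grams(ounces: float, step: int = 10) -> int:
    """Convert weight ounces to grams, rounded to the nearest `step` grams."""
    grams = ounces * GRAMS_PER_OUNCE
    return int(round(grams / step) * step)


print(ounces_to_display_grams(28))  # 790 (28 oz is about 793.8 g before rounding)
```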
Another challenge of food mapping is labels. The Whisk Food Genome currently bases labeling on cooking technique, meal type, or type of cuisine. In addition to providing information to consumers on the front end, labeling facilitates recipe filtering. Some labels relate to how a food has been processed, while others are connected to the type of appliances used for cooking.
Automating labeling based on technique – such as simmering or sautéing – is especially difficult because it often requires parsing data from recipes that don’t use a standard way to describe how something should be cooked.
The food graph includes four different label sets. Some are configured manually. Others are parsed automatically from the recipe pages. Another set includes additional information beyond what is retrieved from a source. Finally, we have predictive labels that are created by our machine learning algorithms.
The Whisk Food Genome learns from both parsed information and manual data. As the amount of data grows, the machine learning models get better at automatically creating labels. On the backend, our goal is to reduce the amount of manual labor and coding needed to update the system.
Nutritional information for meal planning apps
The Whisk Food Genome automates nutrition calculations for recipes. End users rely on this data when choosing recipes. Our B2C partners also need trusted data they can share with consumers. The ontology enables the calculation of nutrition information, and provides for ingredients to be excluded based on a user’s dietary needs. Meal planners can see constraints on food items related to allergies, avoidances, and preferences. This functionality also helps users find recipes aligned with certain diets, such as vegan or low-carb.
Another innovative feature of our platform is the Whisk health score, calculated on a range from 2-10. Our complex algorithm factors in ratios between micro- and macronutrients, and recommended daily intake amounts. The score represents the overall nutritional balance of a recipe, ingredient, or dish.
As the algorithms we develop become more sophisticated, our automated nutritional values more closely match the data that our B2C partners calculate manually. Better data enables us to predict nutrition values more precisely, and to create accurate labels. For example, we currently classify apples of all sizes the same way. That means every apple is treated as 182 grams, regardless of size. But soon, we’ll have an algorithm that classifies small, medium, and large apples. One small apple is 149 grams and 77 calories compared with a large apple that is 182 grams and 95 calories. The difference seems small with apples, but it adds up.
Measures and density are also defined at the product level. For example, one slice of tomato is not the same as one slice of bread, and one cup of flour is not the same as one cup of milk. Density is also an important part of the unit transformations feature in Whisk. Different foods have different densities. One cup of flour, for instance, translates into 120 grams, while one cup of sugar equals 200 grams.
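As a small illustration of product-level density, the sketch below converts cups to grams using the two figures quoted above. The GRAMS_PER_CUP table and the cups_to_grams function are hypothetical names, and a real system would hold many more products and units.

```python
# Grams per cup, defined per product. The two values come from the text;
# the table itself is an illustrative stand-in for product-level density data.
GRAMS_PER_CUP = {
    "flour": 120,
    "sugar": 200,
}


def cups_to_grams(product: str, cups: float) -> float:
    """Convert a volume in cups to grams using the product's own density."""
    if product not in GRAMS_PER_CUP:
        raise ValueError(f"no density data for {product!r}")
    return cups * GRAMS_PER_CUP[product]


print(cups_to_grams("flour", 1))    # 120
print(cups_to_grams("sugar", 1.5))  # 300.0
```

The same volume yields different weights for different products, so a single cups-to-grams factor would be wrong for most ingredients.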
Another way we are improving the ability to automate nutrition information is by expanding our density data, which is the measure of mass per unit of volume. We normalize ingredients into grams, but that doesn’t necessarily reflect differences between cooked and uncooked ingredients.
For example, if we normalize for rice without regard to state, three cups uncooked and three cups cooked both yield 635.76 grams. But three cups of cooked rice should map to a more specific value: 211 grams. The difference in nutritional data is significant. Three cups of uncooked rice converts to 2317 calories compared to 710 calories for three cups cooked.
The reason for the difference is that when you put rice in water and cook it, the volume increases. This is also important information for a recipe because you need only one cup of uncooked rice to produce three cups of cooked rice.
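The rice figures can be checked with a short sketch that takes the cooked or uncooked state into account. It uses only numbers from the text (635.76 grams for three uncooked cups, and a one-to-three expansion when cooked); the function name and the state labels are hypothetical.

```python
# Figures from the text: three cups of uncooked rice normalize to 635.76 g,
# and one cup of uncooked rice produces three cups of cooked rice.
GRAMS_PER_CUP_UNCOOKED = 635.76 / 3
COOKED_CUPS_PER_UNCOOKED_CUP = 3


def rice_grams(cups: float, state: str) -> float:
    """Return the grams of dry rice represented by a cup measure in a given state."""
    if state == "uncooked":
        uncooked_cups = cups
    elif state == "cooked":
        uncooked_cups = cups / COOKED_CUPS_PER_UNCOOKED_CUP
    else:
        raise ValueError(f"unknown state: {state!r}")
    return uncooked_cups * GRAMS_PER_CUP_UNCOOKED


print(round(rice_grams(3, "uncooked"), 2))  # 635.76
print(round(rice_grams(3, "cooked"), 2))    # 211.92
```

Three cooked cups come out at roughly 211.9 grams, in line with the 211 grams quoted above, because they correspond to a single cup of dry rice.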
Finally, our algorithms have to account for yield when calculating nutritional data. Yield is the percentage of the product that is edible. For example, beef ribs have meat and bones, but only the meat is edible. So, one kilogram of beef ribs does not equal one kilogram of meat. For more accuracy, we need to calculate a nutritional value for only the 500 grams of meat, excluding the weight of the bone.
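A yield adjustment is simple arithmetic, sketched below. The 50 percent yield is inferred from the beef ribs example (one kilogram purchased, 500 grams of meat). The calorie density of 250 kcal per 100 grams is a made-up placeholder rather than real nutrition data, and both function names are hypothetical.

```python
def edible_grams(purchased_grams: float, yield_fraction: float) -> float:
    """Return the edible weight given the purchased weight and the yield (0 to 1)."""
    if not 0 <= yield_fraction <= 1:
        raise ValueError("yield_fraction must be between 0 and 1")
    return purchased_grams * yield_fraction


def calories_for_purchase(purchased_grams: float, yield_fraction: float,
                          kcal_per_100g_edible: float) -> float:
    """Compute calories from the edible portion only."""
    return edible_grams(purchased_grams, yield_fraction) / 100 * kcal_per_100g_edible


# One kilogram of beef ribs at an assumed 50% yield leaves 500 g of meat.
print(edible_grams(1000, 0.5))                # 500.0
# 250 kcal per 100 g is a placeholder value, not real data.
print(calories_for_purchase(1000, 0.5, 250))  # 1250.0
```

Without the yield step, the same calculation would count the bones as meat and double the calorie estimate in this example.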
Many flavors of analytics
As the Whisk Food Genome ontology grows in size, our ability to automate analytics improves. Plus, as we map more recipes, the number of product and ingredient relationships grows.
We also are expanding our analytics ability by mapping data in different languages, including English, French, German, Hindi, Italian, Korean, Portuguese, and Spanish. Different countries have very specific products that are often unique to their region and result in new data populating our tables. For example, to support German recipes, we’ve added new varieties of sausages. French recipes have required an expanded lexicon of cheese, and Korean recipes require classifications for different types of seaweed.
Today, the number of lexicals we have defined is five times greater than it was three years ago. Every time we add a new product, we have to add a lexical corresponding to that product. We also need to link existing lexicals to those products. Finally, we have to map all of these for each unique product in the database.
All of the data we collect is used as a training set for our machine learning models. At this point, Whisk can automate ingredient predictions for more than four million recipes. This automation gives us a real-time view of how well we are matching store items with recipes on B2B and B2C websites. Automation also enables us to make sure that we have mapped all of the products being sold by our grocer partners.
The artificial intelligence cooking platform
Our AI-powered platform connects the entire food ecosystem, from brands and recipe content creators to grocers and, of course, consumers. Our Web API and Mobile SDK give our partners the power to do more with recipes, shopping lists, and personal preferences. Whisk’s ontology and semantic analysis of recipes enable us to deliver relevant advertising based on a wide range of factors, such as recipe type, ingredient, taste, nutrition, and price.
The Whisk Food Genome relies on a comprehensive ontology of products that reflects a deep understanding of different food components and how they are connected. Machine learning enables us to improve the platform with a map of food, recipes, and products so that consumers can enjoy a true recipe-to-table eating experience.