Perfect Pizza: A Customer Research Tool Case Study

Think about the last time you ate a tasty pizza. What was it like? Did it have the ideal amount of toppings? Was the flavor intensity just right? Was it served at the perfect temperature—hot, but not too hot? Did it look appealing too?

Taste is, of course, paramount in any food-related business, directly influencing customer loyalty and retention, and therefore revenue. And, despite being a deeply personal, subjective experience, taste can be measured. When I was working as a product manager for a global pizza company, my team created an innovative digital solution to quantify and track taste using an e-commerce platform.

Together, we developed a feature that collected customer feedback for free. This provided a large volume of data that allowed the company to track quality issues, continually refine ingredients, and improve customer satisfaction at each of its more than 800 locations.

Accounting for Taste

Sensory testing is a hugely time-consuming and expensive process that companies in the food industry often outsource to specialized agencies. It can take several months for researchers to recruit participants, conduct in-person testing, and process results. The cost of researching a single product differs according to region and other factors, but can quickly reach tens of thousands of dollars. For an international restaurant chain getting feedback across an entire menu, those figures can balloon to the millions.

We’ve also seen the limitations of traditional sensory testing. A famous example in product lore is the change to Coca-Cola’s signature formula: New Coke, a sweeter version of the original, was released in 1985 to consumer furor. Despite solid customer research prior to launch, only 13% of actual consumers liked the new taste, and the change was reversed just 79 days later.

The larger the business, the higher the cost of such missteps: If you add too much sugar to a soft drink or too few toppings to a new pizza, those errors can measure in the millions of dollars.

Taking Taste Research Online

The aim of our new tool was to bring sensory research online and in-house. The traditional method asks participants to taste the product and answer a series of standardized questions evaluating quality. We took this style of questionnaire and built it directly into the company’s food delivery and in-restaurant ordering application.

After an order or delivery is completed, the app prompts customers to provide feedback about a specific pizza from their order. Once they accept, the app loads the taste evaluation survey, which asks users to evaluate the product across several characteristics, including appearance, overall taste, temperature, taste intensity, amount of toppings, and juiciness. The final question addresses a criterion specific to the type of pizza, such as spiciness for pepperoni. We found that between seven and nine questions were optimal, taking around 20 seconds total to complete; any longer and users may become disengaged.
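To make the structure concrete, here is a minimal Python sketch of how such a question set might be assembled. The six shared attributes come from the survey described above; the SurveyQuestion class, the build_survey function, and the prompt wording are hypothetical illustrations, not the company’s actual implementation.

```python
from dataclasses import dataclass

@dataclass
class SurveyQuestion:
    attribute: str  # e.g., "temperature"
    prompt: str     # text shown to the customer in the app

# Six attributes shared by every pizza, per the survey described above.
BASE_ATTRIBUTES = [
    "appearance", "overall taste", "temperature",
    "taste intensity", "amount of toppings", "juiciness",
]

def build_survey(product_specific: str) -> list[SurveyQuestion]:
    """Assemble the base questions plus one product-specific question."""
    attributes = BASE_ATTRIBUTES + [product_specific]
    return [SurveyQuestion(a, f"How was the {a}?") for a in attributes]

survey = build_survey("spiciness")  # seven questions for a pepperoni pizza
```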

To measure the answers, we used a just-about-right (JAR) scale, aiming for a JAR score (the share of respondents who answer “just about right”) between 70% and 80% to allow for personal taste differences. This animation illustrates a typical survey:

After receiving their order, customers are prompted to answer this series of questions about a product using a JAR scale.
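To show how a JAR score works in practice, here is a minimal sketch assuming a standard five-point JAR scale, where 1 means “far too little,” 3 means “just about right,” and 5 means “far too much.” The sample ratings are invented for demonstration.

```python
from collections import Counter

def jar_score(responses: list[int], jar_point: int = 3) -> float:
    """Share of respondents who chose the 'just about right' midpoint."""
    return Counter(responses)[jar_point] / len(responses)

# Invented ratings for one attribute: 72% chose "just about right,"
# which falls inside the 70%-80% target band.
ratings = [3] * 72 + [2] * 15 + [4] * 8 + [1] * 3 + [5] * 2
print(f"JAR score: {jar_score(ratings):.0%}")  # JAR score: 72%
```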

Sensory research agencies usually collect around 400 responses per product surveyed. In the first year after launching our in-house tool, we received more than 600,000 responses, a 3% conversion rate.

Analysis and Action: How We Used the Customer Research Data

Using the rich data the tool provided, the product team tracked taste weekly to monitor any problems, which is particularly important when using fresh ingredients. When a sharp dip occurred, as in the sample dashboard below, the company was able to immediately investigate. In one instance, it found that the blue cheese supplier had delivered a batch that was not up to its usual standard, which had dramatically affected the taste of the four-cheese pizza.

A graph titled Sample Dashboard: Overall Taste Score by Week shows simulated data. On the vertical axis is the taste metric, running from 6.2 to 6.7. On the horizontal axis are dates representing weeks, ranging from June 6, 2021, to September 19, 2021. The data points plotted are relatively consistent, between 6.4 and 6.6, with the exception of the week of July 4, for which there is a sharp dip in the taste metric plotted below 6.3.
This graph depicts an overall taste score by week using simulated data.
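Here is a minimal sketch of the kind of alerting logic such a dashboard could drive. The weekly figures mirror the simulated data above; the 0.15-point alert margin and the median baseline are assumptions for illustration, not the company’s actual rule.

```python
import statistics

# Illustrative weekly averages of the overall taste score (out of 7),
# mirroring the simulated dashboard above.
weekly_scores = {
    "2021-06-06": 6.52, "2021-06-13": 6.48, "2021-06-20": 6.55,
    "2021-06-27": 6.50, "2021-07-04": 6.27, "2021-07-11": 6.49,
}

baseline = statistics.median(weekly_scores.values())
THRESHOLD = 0.15  # assumed alerting margin for illustration

# Flag any week that falls well below the baseline, triggering an
# investigation (e.g., of a substandard ingredient batch).
for week, score in weekly_scores.items():
    if score < baseline - THRESHOLD:
        print(f"Investigate week of {week}: score {score} vs. baseline {baseline}")
```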

The survey indicators enabled us to create a taste profile for each product, which allowed us to make simple adjustments. For example, if a particular pizza scored low on juiciness, we introduced more tomato sauce. We were also able to identify popular tastes and use them in new recipes.

The tool enabled us to optimize recipes and drive value for the business. We tested removing quantities of certain ingredients, such as slices of pepperoni, and monitored changes in taste perception. If taste scores remained unaffected, we kept the altered recipes in place, yielding financial savings across the restaurant chain.
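As a simplified sketch of this kind of check: the scores, sample sizes, and 0.05-point tolerance below are illustrative, and a production analysis would apply a proper statistical test to thousands of responses rather than a naive comparison of means.

```python
import statistics

def recipe_change_is_safe(before: list[float], after: list[float],
                          margin: float = 0.05) -> bool:
    """Naive check: keep the cheaper recipe if the mean taste score
    dropped by less than `margin` points. A real analysis would use a
    significance test on far larger samples."""
    return statistics.mean(before) - statistics.mean(after) < margin

# Illustrative scores (out of 7) before and after removing pepperoni slices.
before = [6.4, 6.5, 6.3, 6.6, 6.4, 6.5]
after = [6.4, 6.4, 6.5, 6.3, 6.5, 6.4]
print(recipe_change_is_safe(before, after))  # True -> keep the cheaper recipe
```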

We took the same data-driven approach to new product launches, responding to customer preferences by changing or even withdrawing products based on feedback. When introducing a carbonara pizza, for example, we tracked metrics during the first week and saw that the average score for taste was 5.94 out of a possible 7. The average score across all pizzas was 6.3.

The other data points revealed the problem: Almost 48% of respondents thought the amount of toppings was insufficient. The company quickly changed the recipe, adding more bacon (which had the side effect of also increasing juiciness and taste intensity). The following week, the average taste score increased from 5.94 to 6.

An illustration titled Sample Metrics Comparison for Carbonara Pizza depicts two bar charts based on customer research data, representing week 1 and week 2, respectively. On the vertical axis is the percentage of respondents, running from 0 to 100. Each chart shows four JAR survey categories: taste intensity, juiciness, amount of toppings, and temperature. Each bar is split into five colors, representing the five possible JAR responses. In week 1, the amount of toppings was rated as insufficient by almost 48% of respondents. In week 2, this percentage had decreased significantly following the company’s addition of bacon; the bars for juiciness and taste intensity also show slightly improved scores.
A JAR score of 3 for “Amount of toppings” from 49% of respondents in Week 1 revealed the cause of the low overall taste score. Increasing the amount of toppings resulted in a JAR score of 3 from 69% of respondents in Week 2, as well as improvements to the JAR scores of some other indicators.
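This kind of diagnosis can be automated. Below is a sketch of a hypothetical worst_attribute helper that surfaces the attribute with the lowest “just about right” share; the distributions are illustrative approximations of the week 1 chart, not the actual survey data.

```python
def worst_attribute(survey: dict[str, dict[int, float]]) -> str:
    """Return the attribute with the lowest share of 'just about right'
    (point 3) responses -- the most likely cause of a low taste score."""
    return min(survey, key=lambda attr: survey[attr].get(3, 0.0))

# Illustrative week 1 carbonara distributions (shares per JAR point 1-5).
week1 = {
    "taste intensity":    {1: 0.05, 2: 0.20, 3: 0.65, 4: 0.08, 5: 0.02},
    "juiciness":          {1: 0.06, 2: 0.22, 3: 0.62, 4: 0.08, 5: 0.02},
    "amount of toppings": {1: 0.18, 2: 0.30, 3: 0.49, 4: 0.02, 5: 0.01},
    "temperature":        {1: 0.03, 2: 0.10, 3: 0.78, 4: 0.07, 5: 0.02},
}
print(worst_attribute(week1))  # amount of toppings
```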

We used the tool to explore numerous hypotheses, such as the correlation between temperature and taste. As expected, if a pizza was rated cold, the taste score plummeted to 4. The data points displayed in the following table illustrate that if the pizza strayed outside of “Just about right,” even on the hot side, the taste score also decreased. This told us that temperature is a crucial factor in customer satisfaction.

A table titled Sample Data: The Relationship Between Taste and Temperature lists five temperature ratings in its first column: too hot, slightly too hot, just about right, slightly too cold, and too cold. The remaining columns show the average number of orders in a single delivery, the time the pizza spent on the heating rack in minutes, the total cooking and delivery time in minutes, and the overall taste score out of 7. The data indicate that when a customer’s pizza was rated too hot, there were fewer orders in the delivery, it spent less time on the rack, the total cooking and delivery time was shorter, and the taste score was higher. Conversely, when the pizza was rated too cold, there were more orders in the delivery, it spent more time on the rack, the total time was longer, and the overall taste score was much lower.
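To show how a table like this can be derived from raw data, here is a minimal sketch that groups taste scores by temperature rating. The response pairs are invented for illustration and stand in for joined survey and delivery records.

```python
from collections import defaultdict
from statistics import mean

# Invented (temperature rating, taste score) pairs from survey responses.
responses = [
    ("too hot", 6.1), ("just about right", 6.6), ("just about right", 6.5),
    ("slightly too cold", 5.8), ("too cold", 4.1), ("too cold", 3.9),
]

# Group scores by rating, then average each group.
by_rating: dict[str, list[float]] = defaultdict(list)
for rating, score in responses:
    by_rating[rating].append(score)

for rating, scores in by_rating.items():
    print(f"{rating}: average taste {mean(scores):.2f}")
```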

The company had been considering purchasing thermal delivery bags with heating elements. Each bag cost around $3,000, and five to ten bags were needed for each pizzeria, a significant investment. But we now had a business case for the expense: Keeping pizzas warm would result in better taste scores.

Through our analysis, we also found a direct correlation between the visual appearance of a pizza and its perceived taste (i.e., the more attractive the pizza, the higher the taste score), as well as between taste and dryness. We extracted ERP system data showing how long a pizza had been on a hot shelf prior to customer consumption and correlated it with our survey data. The results confirmed that if a pizza is on the shelf longer than 60 minutes, it becomes too dry and receives a lower taste score. To remedy this, restaurants reduced the amount of pizza they prepared in advance of busy periods.
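Here is a minimal sketch of the shelf-time analysis, assuming survey responses can be joined to ERP timestamps by order ID. The records, the bucket boundary at 60 minutes drawn from the finding above, and the resulting averages are illustrative.

```python
from statistics import mean

# Illustrative joined records: (minutes on the hot shelf, taste score out
# of 7), merged from ERP timestamps and survey responses by order ID.
orders = [(15, 6.6), (25, 6.5), (40, 6.4), (55, 6.3), (70, 5.6), (90, 5.2)]

# Split at the 60-minute mark identified in the analysis above.
fresh = [score for minutes, score in orders if minutes <= 60]
stale = [score for minutes, score in orders if minutes > 60]
print(f"<=60 min on shelf: {mean(fresh):.2f}, >60 min: {mean(stale):.2f}")
```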

Within a franchised business, it is generally difficult to ensure that all branches follow brand guidelines. Previously, when a restaurant had a low rating, we had little insight into the reason: The pizzeria might not be following protocol, delivery might be slow, or perhaps customers in that region had a particular taste preference. The tool we developed facilitated a deeper analysis of the customer experience by pizzeria. We closely monitored each branch, investigated low ratings, and implemented fast, effective improvements.

By tracking the average temperature for each pizzeria, for example, we could see that if one scored particularly low, couriers were likely carrying too many orders in a single trip, and orders delivered later on the route arrived cold.

A Direct Line to Customer Satisfaction

By leveraging the go-to ordering method for customers, we were able to develop an in-app solution, conducting a massive amount of research in a way that was not intrusive for the user—and at no additional cost to the company.

The tool may seem simple, but the instant data it generated dramatically increased the level of insight the company had into the experiences and preferences of its customers and allowed it to take a responsive, evidence-based approach to operations. Through small adjustments to recipes and processes, we were able to deliver products that customers found tastier, improving their satisfaction and making them more likely to order again.

Our e-commerce-enabled solution worked like a magnifying glass, offering a granular view of quality at a large-scale franchise that made managing more than 800 locations around the world much easier and much more efficient.

Following the development of this customer feedback tool at the pizza franchisor, I went on to implement a similar platform at a large food retailer whose app has about 10 million users, and it garnered similar results.

Taste isn’t simple, but our tool showed that it can be broken down into a reproducible formula—like any successful recipe.

Special thanks to my former colleague Gleb Kotlyarov, a research specialist who developed the idea for this innovative tool.
