In the spring of 2001 Kara and I held a pizza contest in our new home. Here are the results…
- Which pizzas competed?
- The nine contestants were Papa’s Pizza, Papa John’s, Izzy’s, The Headliner, Pizza Pipeline, Woodstock’s, Pizza Hut, American Dream, and Cirello’s.
- What about my favorite pizza, Brand X?
- The basic rule for entry was that the pizza place had to both cook and deliver the pizza. This eliminated Papa Murphy’s, which doesn’t bake or deliver, and Little Caesar’s, which refused to deliver to our house. Figaro’s was eliminated by an executive decision.
- What toppings/sizes were ordered?
- The size of the pizza ordered was always the smallest that each individual establishment would deliver. For fairness and mostly universal appeal, each pizza was ordered with only olives and mushrooms.
- OK, but how much did each pizza cost?
- The two question marks are due to the vendor not knowing the size of their pizza in inches…

| Vendor | Size | Cost |
|---|---|---|
| American Dream | 14 inch | $13.35 |
| Cirello’s | 12 inch | $11.00 |
| Izzy’s | 10(?) inch | $12.99 |
| Papa John’s | 10 inch | $9.28 |
| Papa’s Pizza | 10 inch | $9.00 |
| Pizza Hut | 12 inch | $12.84 |
| Pizza Pipeline | 10 inch | $7.75 |
| The Headliner | 12(?) inch | $10.35 |
| Woodstock’s | 12 inch | $10.50 |
- How were the pizzas arranged on the testing tables?
- The pizzas were arranged in the order of delivery, plus a little randomness. Also, each pizza was ordered with a slightly different delivery time, and most were delivered at the incorrect time, which added further randomness (the worst offender was The Headliner, which delivered an hour early). Each pizza was stored in a warmed oven until testing time, at which point the pizzas were placed out on the tables, covered in foil to keep them warm.
- What were the judging criteria?
- Each pizza was judged on 9 criteria in 4 sub-categories and then given a separate “overall” score. Each criterion was assigned a score from 1 to 5, where 1 meant poor, insufficient, etc. and 5 meant excellent, appropriate, etc. The actual criteria were: Crust: Flavor, Structure; Sauce: Flavor, Quantity; Toppings: Freshness, Fixedness, Quantity; Cheese: Quantity, Quality; and Over-All.
- Who came up with those crazy criteria?
- Aliens. No, really. They beamed me into their spaceship and threatened me with vicious paper cuts. In a moment of bold defiance, I snatched the paper and fought my way out. When I returned home and looked at the paper, there were the judging criteria. Really.
- I want to have my own pizza taste test, can I get a copy of your judging sheet?
- Click here to see the official judging sheet.
- How were the results compiled?
- Basically, the scores were entered into a DB and then a simple average was taken of all the individual scores in each category. Then the category averages were averaged with equal weight. The highest score won. I also compared this average against the average of the “overall” category. But you can pick your own winner using the query tool below.
- I don’t agree with the results!
- Do you also have problems with authority? Honestly, I’m a little puzzled by the results as well. However, different people like different pizzas. At least one pizza was given a score of “5” in every category by one person and a score of “1” in every category by another person. I’m betting those two judges are married.
There is also the fact that pizzas are not consistent. One of my personal favorites, American Dream, had a terribly poor showing the evening of the test with an uncooked and soggy crust.
Another possible factor is that not every judge tested every pizza, and sometimes the judges didn’t fill out every score. I stayed faithful to the score sheets – if a score was left blank, I inserted a null into the DB, which does not alter the average. I made no attempt to interpret judges’ scores; if the chad was merely dimpled, then it wasn’t counted!
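For anyone curious how that arithmetic plays out, here is a rough Python sketch of the averaging described above, with blank scores kept as nulls (`None`) so they drop out of the average. It is not the code actually used for the contest: the function names, the category names, and the sample scores are all made up for illustration, and it assumes that “category” means each individually scored criterion.

```python
from statistics import mean

# Hypothetical category names; the real DB column names are not known.
CATEGORIES = [
    "crust_flavor", "crust_structure",
    "sauce_flavor", "sauce_quantity",
    "toppings_freshness", "toppings_fixedness", "toppings_quantity",
    "cheese_quantity", "cheese_quality",
]


def category_average(sheets, category):
    """Average one category over all score sheets, skipping blanks (None)."""
    scores = [s[category] for s in sheets if s.get(category) is not None]
    return mean(scores) if scores else None


def pizza_score(sheets, categories=CATEGORIES):
    """Equal-weight average of the per-category averages."""
    averages = [category_average(sheets, c) for c in categories]
    averages = [a for a in averages if a is not None]
    return mean(averages) if averages else None


# Made-up score sheets for one pizza (NOT contest data).
# The second judge left sauce_flavor blank; the third skipped it entirely.
sample_sheets = [
    {"crust_flavor": 4, "sauce_flavor": 2},
    {"crust_flavor": 2, "sauce_flavor": None},
    {"crust_flavor": 3},
]

# crust_flavor averages to 3 and sauce_flavor to 2, so the score is 2.5
print(pizza_score(sample_sheets, ["crust_flavor", "sauce_flavor"]))
```

Note that a blank is skipped rather than counted as zero, so a judge who left a box empty neither helps nor hurts the pizza.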
- Can I see the score sheets?
- Have a good time, they are all scanned in here:
- I want to do my own analysis of the data – can I get a copy?
- Sure, just send cash or a money order in the amount of $19.95 to… Just kidding, you can use the “Category Computator” at the end of this document.
And The Winner is…
|Avg of all Categories except Over_All||Category: Over_All|
Notice the interesting discrepancy between the results for the average of all category scores except Over_All and the average of the Over_All scores. This discrepancy could be a result of insufficient categories. For example, perhaps some pizzas just “looked” nasty, which would influence the Over_All score, but wouldn’t influence the category average since there was not an “appearance” category. Or perhaps this just reflects the inaccuracy of human subjectivity. You decide.