
Elliott Morss | July 24, 2014

Judgments of Paris, Princeton, and Lenox, Part 2

© Elliott R. Morss, Ph.D.

Introduction

It was striking in 1976 when Californian wines “sort of” beat French wines in the Paris tasting. That tasting was memorialized in George Taber’s great book Judgment of Paris. I say “sort of” because Orley Ashenfelter and Richard Quandt obtained the results from the Paris tasting and found that, while Californian wines got the highest ranking in both the red and white categories, the judges found very little difference between the French and US reds and whites overall.

Ashenfelter and Quandt went on to establish the “liquid asset” website, the “Rolls Royce” of all wine tasting sites. 

Taber, Ashenfelter, and Quandt arranged a re-enactment of the Paris tastings last summer at the annual meeting of the American Association of Wine Economists in Princeton. But in Princeton, French wines were compared to New Jersey wines rather than Californian. The New Jersey wines did quite well, but again, the judges’ rankings differed significantly.

The Lenox Wine Club

In November 2012, the Lenox Wine Club (LWC) was created. Consisting of 14 “veteran” wine drinkers, it decided to start with four tastings: “heavy whites”, “heavy reds”, “light whites”, and “light reds”.  All tastings address the following questions:

  1. Among comparably-priced wines, are the judgments of the veteran drinkers similar enough to identify a significant preference among the wines, and
  2. Does price matter?

Blind tastings were done at a restaurant with very light hors d’oeuvres. Tasters were asked to rate the wines on a scale of 1-5 with the best wine rated 5. Ties were given the average rating of the wines that tied. The tasters then had dinner together drinking the 5 wines with their food. The ratings were completed before dinner.  How the wines tasted with food was not incorporated in the ratings. This is admittedly a shortcoming inasmuch as most wine is consumed with food.
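The tie rule above can be sketched in a few lines of Python. This is only an illustration, not the club's actual scoring sheet: the function name `ratings_with_ties` and the raw scores are hypothetical, and the sketch assumes each taster's preferences can be written as one number per wine.

```python
def ratings_with_ties(raw_scores):
    """Convert one taster's raw scores into ratings 1..n (n = best).

    Tied wines share the average of the ratings they jointly occupy.
    """
    n = len(raw_scores)
    # Wine indices ordered from least liked to most liked.
    order = sorted(range(n), key=lambda i: raw_scores[i])
    ratings = [0.0] * n
    pos = 0
    while pos < n:
        # Extend `end` over the run of wines tied with the wine at `pos`.
        end = pos
        while end + 1 < n and raw_scores[order[end + 1]] == raw_scores[order[pos]]:
            end += 1
        # Positions pos..end hold ratings pos+1..end+1; share their average.
        shared = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ratings[order[k]] = shared
        pos = end + 1
    return ratings


# Hypothetical taster who liked the first and fourth wines equally best.
example = ratings_with_ties([9, 4, 6, 9, 2])
print(example)  # [4.5, 2.0, 3.0, 4.5, 1.0]
```

The two wines tied for first split ratings 4 and 5, so each gets 4.5, and every taster's ratings still sum to 15.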

The “Heavy Whites” Tasting

In November 2012, the “heavy white” tasting was held. The ranking results and prices are presented in Table 1. That’s right – the 3-Liter Box Set won the tasting.  

Table 1. – Lenox Wine Club Scores, “Heavy Whites” (5 – best, 1 – worst)

The Box Set price equivalent for a regular 750 ml wine bottle is only $4.47. The Drouhin, costing $84.63, was included because it got the highest score among whites at the Princeton tasting last summer. It came in next to last in the Lenox tasting.

These conclusions were reported and reflected on in a recent article.

The “Heavy Reds” Tasting

Earlier this month, the LWC held its second tasting. It was limited to wines with Cabernet Sauvignon as the dominant grape, meaning Bordeaux wines were not included. Aside from the Bota Box, all wines originally selected for the tasting were rated 90 or higher by Wine Spectator.

Once again, the overall winner was the boxed wine! Well okay, the statisticians in our group might again say that its final score is not significantly different from the Catena. They are very close: the Bota Box got 5 top votes and the Catena got 4. The Mulderbosch and Four Sisters are right in the middle while the Neyers (by far the most expensive) came in a definitive last.

Table 2. – Lenox Wine Club Scores, “Heavy Reds” (5 – best, 1 – worst)

In Table 3, the correlations between each taster’s ratings and the average rating are presented for both the “Heavy White” and “Heavy Red” tastings. A high positive number indicates a taster is close to the overall average. For example, KM’s correlation of 1.00 in the “Heavy Red” tasting means KM ranked the wines in the same order as the overall average. Low or negative numbers indicate the opposite.

 Table 3. – How Tasters’ Rankings Correlated to Average Ranking

It turns out that a taster’s conformity to the overall average in one tasting said very little about that taster’s conformity in the other: the correlation between the two sets of conformity scores was -0.08.
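For readers who want to run this kind of check on their own tasting, here is a minimal sketch of the calculation. It assumes a Pearson correlation, which may not be the exact measure used for Table 3, and the taster labels and ratings are hypothetical, not the Lenox data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical ratings (5 = best) that three tasters gave to five wines.
ratings = {
    "T1": [5, 4, 3, 2, 1],
    "T2": [4, 5, 3, 1, 2],
    "T3": [1, 2, 3, 4, 5],
}

n_wines = 5
# Average rating of each wine across all tasters.
average = [sum(r[i] for r in ratings.values()) / len(ratings) for i in range(n_wines)]

# Each taster's "conformity": correlation of their ratings with the average.
conformity = {taster: pearson(r, average) for taster, r in ratings.items()}
```

In this made-up data, T2 orders the wines exactly as the group average does and gets a correlation of 1, while T3 runs mostly against the group and gets a negative value. The same `pearson` function applied to two lists of conformity scores, one per tasting, gives the kind of cross-tasting figure quoted above.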

Table 4 is a correlation matrix showing how each taster’s ratings compare with those of the other tasters. For example, BG, TB, and SG appear to have similar tastes, while MS and LS gave opposite ratings.

Table 4. – Tasters’ Correlation Matrix

 

One other measure is worth mentioning. The Kendall W statistic indicates how much congruence there is among the ratings of the tasters, on a scale from 0 (no agreement) to 1 (complete agreement). The Kendall W for the “Heavy Reds” tasting was only 0.214, indicating there should be very little confidence in the judges’ overall ratings.
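As a rough illustration of what the statistic measures, the sketch below uses the standard textbook formula W = 12S / (m²(n³ − n)), where m is the number of tasters, n the number of wines, and S the sum of squared deviations of each wine's rating total from the mean total. It leaves out the correction for tied ratings, so it is an approximation whenever ties occur, and the example rankings are hypothetical rather than the Lenox scores.

```python
def kendalls_w(rankings):
    """Kendall's W for a list of rankings, one list per taster.

    Each inner list rates the same n wines from 1 to n.
    No tie correction is applied.
    """
    m = len(rankings)      # number of tasters
    n = len(rankings[0])   # number of wines
    # Total rating each wine received across all tasters.
    totals = [sum(taster[i] for taster in rankings) for i in range(n)]
    mean_total = sum(totals) / n
    # Squared deviations of the wine totals from their mean.
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Three hypothetical tasters in complete agreement.
agree = kendalls_w([[1, 2, 3, 4, 5]] * 3)
# Two hypothetical tasters with exactly opposite rankings.
disagree = kendalls_w([[1, 2, 3, 4, 5], [5, 4, 3, 2, 1]])
print(agree, disagree)  # 1.0 0.0
```

A value of 0.214 sits much closer to the second case than the first, which is why the overall ranking deserves little confidence.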
