Chapter 5 Results

In this chapter, we start by exploring the brewery data from a few angles. First, we look at how the breweries are distributed across the US. Second, we look at which beer styles are most popular in different states, because we think different states tend to brew or specialize in different styles. Last but not least, we investigate whether user rating behavior differs across states, because we suspect that people from different states might rate differently.

After exploring the brewery data, we turn to the beer data. The traits we are interested in include beer style, alcohol by volume (ABV), International Bitterness Units (IBU), and average user rating. The big question is: why do certain beer styles get more attention and better ratings than others? We aim to answer this question with a series of plots that examine different aspects of the beer data.

5.1 Where are the breweries located?

Let’s first take a look at a sample of the brewery data below. The dataset contains the brewery name, description, address, geographic coordinates, the average user rating of the brewery (calculated as the average of all of its beer ratings), the number of beers, the total number of ratings the brewery received, and so on.

When you click on a state, a popup window shows additional information such as the number of breweries, the number of beers made, and the average brewery rating for that state. The map makes it clear that most of the breweries are located in the Northeast, on the West Coast, and in Texas and Colorado. In particular, California has the most breweries (547 in total), about twice as many as the runner-up, Washington state (274 in total). If you want the full craft beer experience, California is probably the place to go!

We used tigris to get the spatial data for US state boundaries and tmap to generate an interactive map showing the number of breweries across the US.
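The per-state numbers shown in the popup (number of breweries, number of beers, average brewery rating) come down to a simple group-by on the brewery table. The following is a minimal sketch of that aggregation in plain Python, independent of the tigris and tmap code used for the map itself. The field names (`state`, `num_beers`, `rating`) and the three records are assumptions made up for illustration; they are not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Toy records, invented purely for illustration (not real dataset rows).
breweries = [
    {"name": "Brewery A", "state": "CA", "num_beers": 24, "rating": 3.9},
    {"name": "Brewery B", "state": "CA", "num_beers": 10, "rating": 3.7},
    {"name": "Brewery C", "state": "WA", "num_beers": 18, "rating": 3.8},
]


def summarize_by_state(rows):
    """Return {state: {"breweries": n, "beers": total, "avg_rating": mean}}."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["state"]].append(row)

    summary = {}
    for state, members in grouped.items():
        summary[state] = {
            "breweries": len(members),
            "beers": sum(m["num_beers"] for m in members),
            # Unweighted mean of brewery ratings, rounded for display.
            "avg_rating": round(mean(m["rating"] for m in members), 2),
        }
    return summary


state_summary = summarize_by_state(breweries)
```

Each entry of `state_summary` could then be joined to the state boundary polygons by state code before drawing the map.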

5.3 Do people rate beers/breweries differently across the US?

We have long suspected that people in different regions rate beers differently. This question was motivated by a conversation with our beer friends in the Netherlands, who complained that Americans always give beers high ratings whereas Dutch people are more objective. As a consequence, they argued, American beers tend to receive higher ratings than Dutch and European beers. It was quite an accusation and at the same time an interesting hypothesis, so the idea has stuck with us for a long time. Now it’s time to put it to the test! Unfortunately, the data we’ve collected is limited to the US, but the same idea can still be applied within it, so we decided to check whether people in different states rate beers slightly differently. Are people from certain states more generous or more strict with their ratings?

To answer this question, we go back to the brewery data, which contains the rating of every brewery (calculated as the average of all of its beer ratings) in every state. We can put all of the brewery ratings into a ridgeline plot by state to compare rating behavior across the country. To avoid overplotting, we only show the 20 states with the most breweries.
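The data preparation behind such a plot can be sketched as follows: count breweries per state, keep the states with the highest counts, and collect the brewery ratings of each kept state. This is only an illustration in plain Python. The function name, the field names, and the toy records are assumptions, and the plotting step is left out.

```python
from collections import Counter


def ratings_for_top_states(rows, n_states=20):
    """Keep only the n_states states with the most breweries.

    Returns {state: [brewery ratings]}, ordered from most to fewest breweries.
    """
    counts = Counter(row["state"] for row in rows)
    top = [state for state, _ in counts.most_common(n_states)]

    ratings = {state: [] for state in top}
    for row in rows:
        if row["state"] in ratings:
            ratings[row["state"]].append(row["rating"])
    return ratings


# Toy records, invented for illustration only.
rows = [
    {"state": "CA", "rating": 3.9},
    {"state": "CA", "rating": 3.6},
    {"state": "CA", "rating": 3.7},
    {"state": "WA", "rating": 3.8},
    {"state": "WA", "rating": 3.5},
    {"state": "TX", "rating": 3.4},
]
top_two = ratings_for_top_states(rows, n_states=2)
```

Each list in the result corresponds to one ridge in the plot, that is, one density curve per state.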

Most of the states follow a unimodal distribution, and their peaks are centered around the same rating, so we don’t observe any differences in rating behavior across states. This observation contrasts with our Dutch friend’s hypothesis, so perhaps he is wrong? After some serious pondering, we think our dear Dutch friend might be biased in his assessment. Let us explain the rationale. Our Dutch friend is a beer connoisseur on a mission to drink the best beers in the world; according to his Untappd profile, he has had more than 8000 unique beers at the time of writing. Our theory is that the local beer stores and beer ordering websites in the Netherlands only carry the best American beers. In addition, he most likely looks only for American beers with high ratings and ignores those with low ratings. As a consequence, his sample of American beers is probably biased towards the highly rated ones. To test this theory, we look at the same data again, but this time we simulate our friend’s purchasing behavior by keeping only the 50 most highly rated breweries in each state.
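The selection step in this simulation can be sketched in a few lines: sort the breweries of each state by rating and keep the first k. The code below is a hedged illustration in plain Python, not the code used for the plot. The field names and the toy ratings are assumptions, and k is set to 1 in the toy example only because the toy data is tiny.

```python
from collections import defaultdict
from statistics import mean


def top_k_per_state(rows, k=50):
    """Keep the k most highly rated breweries of every state."""
    by_state = defaultdict(list)
    for row in rows:
        by_state[row["state"]].append(row)

    selected = []
    for members in by_state.values():
        # Highest rating first, then cut the list at k entries.
        members.sort(key=lambda r: r["rating"], reverse=True)
        selected.extend(members[:k])
    return selected


# Toy records, invented for illustration only.
all_rows = [
    {"state": "CA", "rating": 4.4},
    {"state": "CA", "rating": 3.8},
    {"state": "CA", "rating": 3.2},
    {"state": "WA", "rating": 4.1},
    {"state": "WA", "rating": 3.6},
]
best_rows = top_k_per_state(all_rows, k=1)

mean_all = mean(r["rating"] for r in all_rows)
mean_best = mean(r["rating"] for r in best_rows)
```

By construction, the filtered sample has a mean at least as high as the full sample, which is the kind of selection effect we suspect in our friend's purchases.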

Now we can clearly see a difference in the distribution of brewery ratings across the top 20 states. This supports our hypothesis that our friend’s assessment of American beers is biased, and we do not think there is a regional difference in beer rating behavior.

5.5 What are the highly rated beer styles?

So far we have looked at the popular beer styles. However, as stated previously, a popular beer style doesn’t always receive a high rating; popularity only suggests that people drink that style a lot more than other styles. Now let’s find out which beer styles are the most highly rated, using a boxplot. Again, we limit this analysis to the 20 most popular beer styles.
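A boxplot of this kind rests on two steps: restrict the data to the most popular styles, then compare the rating distribution of each style. The sketch below computes only the median per style, which corresponds to the center line of each box. It assumes that popularity is measured by the number of beers per style in the data, which may differ from the definition used earlier in this chapter. The field names and the toy records are invented for illustration.

```python
from collections import Counter, defaultdict
from statistics import median


def median_rating_of_popular_styles(beers, n_styles=20):
    """Median rating of the n_styles most common styles, highest first."""
    # Assumption: "popular" means "most beers of that style in the data".
    counts = Counter(b["style"] for b in beers)
    popular = {style for style, _ in counts.most_common(n_styles)}

    by_style = defaultdict(list)
    for b in beers:
        if b["style"] in popular:
            by_style[b["style"]].append(b["rating"])

    medians = {style: median(r) for style, r in by_style.items()}
    return sorted(medians.items(), key=lambda kv: kv[1], reverse=True)


# Toy records, invented for illustration only.
beers = [
    {"style": "IPA - American", "rating": 3.6},
    {"style": "IPA - American", "rating": 3.7},
    {"style": "IPA - American", "rating": 3.5},
    {"style": "IPA - New England", "rating": 4.1},
    {"style": "IPA - New England", "rating": 4.3},
    {"style": "IPA - New England", "rating": 4.2},
    {"style": "Pilsner", "rating": 3.4},
]
ranking = median_rating_of_popular_styles(beers, n_styles=2)
```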

This is interesting! The most highly rated beer styles are New England IPA (including the Double version), Stout (including Imperial Stout and Milk Stout), Sour (including Sour - Fruited and Sour - Other), and Double IPA. The most popular beer style, American IPA, is not rated very favorably. On top of that, it has a lot of outliers on the left, indicating that there are a lot of badly rated American IPAs. Although the median rating of Farmhouse Ale is not very high, it has a lot of outliers on the right, suggesting that there are also many highly rated Farmhouse Ales.

5.5.1 What makes a beer more desirable?

We couldn’t help but wonder why some beers are rated higher than others. Perhaps the highly desired beers share the same characteristics? If so, what are those characteristics? We set out to answer this question by looking at beer IBU and beer ABV. Before we jump into the analysis, we want to clarify one thing: different beer styles have fundamentally different characteristics. For example, Imperial Stout is inherently more alcoholic and sweeter than American IPA, so treating all beer styles the same would inevitably introduce some noise. As seasoned Untappd users ourselves, we use different benchmarks when rating different beer styles. For this reason, we decided to focus on IPA styles only (defined as the beer styles whose name mentions IPA).

We used our beer experience to discretize beer IBU, beer ABV, and beer rating using the following rules, and then created a mosaic plot.

  • If beer ABV value

    1. < 6% then low

    2. >= 6% and < 8% then medium

    3. >= 8% then high

  • If beer IBU value

    1. < 30 then low

    2. >= 30 and < 50 then medium

    3. >= 50 then high

  • If beer rating value

    1. < 3.6 then low

    2. >= 3.6 and < 4 then medium

    3. >= 4 then high
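The rules above can be written as a single binning helper that is applied with different cut points to ABV, IBU, and rating. The sketch below also filters to IPA styles and counts the beers in each combination of levels, and these counts determine the tile sizes of a mosaic plot. The cut points come from the rules above. The function names, the field names, the toy records, and the label "medium" for the middle level are assumptions made for illustration.

```python
from collections import Counter


def bin_value(value, low_cut, high_cut):
    """Return 'low' below low_cut, 'high' at or above high_cut, else 'medium'."""
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "medium"
    return "high"


def is_ipa(style):
    """An IPA style is any style whose name mentions IPA."""
    return "IPA" in style


def level_counts(beers):
    """Count IPA beers per (ABV level, IBU level, rating level) combination."""
    cells = Counter()
    for b in beers:
        if not is_ipa(b["style"]):
            continue
        key = (
            bin_value(b["abv"], 6, 8),         # ABV in percent
            bin_value(b["ibu"], 30, 50),       # IBU
            bin_value(b["rating"], 3.6, 4.0),  # average user rating
        )
        cells[key] += 1
    return cells


# Toy records, invented for illustration only.
beers = [
    {"style": "IPA - New England", "abv": 8.2, "ibu": 25, "rating": 4.2},
    {"style": "IPA - American", "abv": 5.5, "ibu": 60, "rating": 3.5},
    {"style": "Stout - Imperial", "abv": 11.0, "ibu": 55, "rating": 4.3},
]
cells = level_counts(beers)
```

Values that fall exactly on a cut point go to the higher level, matching the `>=` in the rules.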

The mosaic plot shows that IPAs with higher ABVs tend to receive higher ratings. In addition, IPAs with low IBUs tend to get higher ratings as well. In other words, people tend to prefer a strong but not too bitter IPA (hopefully a very fruity one). This makes sense to us, because we all love to drink for both flavor and effect.

5.6 What are the best breweries and what beers are they making?

We have looked at the popular beer styles and their corresponding ratings. The next question is very relevant for us, because we really want to know what the best breweries are and what they are making; this will help us in our beer hunting adventure. Because there are so many breweries in the dataset, we limit this analysis to the 50 most highly rated breweries across the US.

At first glance, most of the top breweries are making a lot of IPAs. This corroborates what we’ve observed so far: IPAs are the most popular beer styles, and some IPA styles (New England or Double IPA) have the best ratings in the US. There are some all-around breweries, such as Cypress Creek Southern Ales, that brew all kinds of beer styles. On the other hand, more than half of the top breweries seem to specialize in certain beer styles, which is probably why they are so good at what they do. For example, Trillium, Treehouse, and Other Half focus only on IPAs, whereas breweries like Side Project and Ladd & Lass Brewing focus on sour and farmhouse styles. We do want to emphasize that we only scraped the top 24 beers for each brewery because of the limited information Untappd allows us to pull. However, we think 24 beers are a good proxy for a brewery’s full lineup, because beers from the same brewery do not deviate much from what the brewery has made in the past.