Chapter 4 Missing values

To find the missing values in our cleaned data sets, we used a missing_value_plot function that we created as part of our course assignment. The missing_value_plot function can be found at plot_missing.R
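As a numeric complement to the plot, the share of missing values per column can also be tabulated directly. The following is a minimal base R sketch, not the function in plot_missing.R; the `toy` data frame and its values are made up purely for illustration.

```r
# Hypothetical toy data standing in for one of the cleaned data sets
toy <- data.frame(
  beer_name = c("A", "B", "C", "D"),
  beer_abv  = c(5.2, NA, 6.1, NA),
  beer_ibu  = c(NA, NA, 40, NA),
  stringsAsFactors = FALSE
)

# is.na() gives a logical matrix; colMeans() turns it into the
# fraction of missing entries per column, which we scale to percent
missing_pct <- sort(colMeans(is.na(toy)) * 100, decreasing = TRUE)

print(missing_pct)
```

Sorting in decreasing order puts the columns with the most missing values first, which is the same ordering one would read off a missing value plot.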

4.1 Missing Values in Beer Ratings Dataset

The highest percentages of missing values in the cleaned_all_beers dataset are in the beer_ibu, beer_desc and beer_abv columns. We can ignore the beer_desc column, as it will not help in our analysis. For beer_abv and beer_ibu, we remove the rows with missing values whenever we use these variables in our analysis, since we cannot fill in the missing entries using any general imputation method. Both beer_ibu and beer_abv can tell a lot about a beer's taste, so we do not want to drop these columns either.
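Removing rows only for the variables used in a given analysis, rather than once for the whole dataset, keeps as many observations as possible. A minimal sketch in base R follows; the helper `drop_missing` and the `toy_beers` data are hypothetical and only illustrate the idea.

```r
# Hypothetical toy data using the column names from cleaned_all_beers
toy_beers <- data.frame(
  beer_name = c("A", "B", "C", "D"),
  beer_abv  = c(5.2, NA, 6.1, 4.8),
  beer_ibu  = c(35, 20, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only rows that are complete in `vars`,
# leaving missing values in all other columns untouched
drop_missing <- function(data, vars) {
  keep <- complete.cases(data[, vars, drop = FALSE])
  data[keep, , drop = FALSE]
}

# Analysis that uses beer_abv only: rows missing beer_ibu are retained
abv_only <- drop_missing(toy_beers, "beer_abv")

# Analysis that uses both variables: rows missing either one are removed
abv_ibu <- drop_missing(toy_beers, c("beer_abv", "beer_ibu"))
```

In this toy example the beer_abv analysis keeps three of the four rows, while the analysis that needs both variables keeps only one, which shows why filtering per analysis loses less data than filtering once up front.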

4.2 Missing Values in Brewery Ratings Dataset

The highest percentages of missing values in the cleaned_brewery_dataset are in the latitude and longitude columns. We ignore these columns because we have other columns, namely city and state, that we can use to build spatial maps.
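One way to work with city and state instead of coordinates is to drop the coordinate columns and aggregate by state. The sketch below assumes the columns are named `latitude`, `longitude`, `city` and `state`; the `toy_breweries` data and the `brewery_name` column are made up for illustration.

```r
# Hypothetical toy data standing in for cleaned_brewery_dataset
toy_breweries <- data.frame(
  brewery_name = c("X", "Y", "Z"),
  city         = c("Portland", "Portland", "Denver"),
  state        = c("OR", "OR", "CO"),
  latitude     = c(NA, 45.5, NA),
  longitude    = c(NA, -122.6, NA),
  stringsAsFactors = FALSE
)

# Drop the mostly-missing coordinate columns (names are an assumption)
keep_cols <- setdiff(names(toy_breweries), c("latitude", "longitude"))
breweries_by_place <- toy_breweries[, keep_cols]

# Count breweries per state; a table like this could feed a state-level map
state_counts <- as.data.frame(
  table(state = breweries_by_place$state),
  responseName = "n_breweries"
)
```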