Chapter 4 Missing values
To find the missing values in our cleaned data sets, we used a missing_value_plot function that we created as part of our course assignment. The missing_value_plot function can be found in plot_missing.R.
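The kind of summary such a plot is built on can be sketched in base R. This is only an illustration, not the contents of plot_missing.R: the helper name missing_pct and the toy data frame are assumptions made here.

```r
# Toy data frame standing in for a cleaned data set (values are made up).
toy <- data.frame(
  a = c(1, NA, 3, 4),
  b = c("x", "y", NA, NA),
  c = c(TRUE, FALSE, TRUE, TRUE)
)

# Hypothetical helper: percentage of missing values per column,
# sorted from the most to the least incomplete column.
missing_pct <- function(df) {
  sort(colMeans(is.na(df)) * 100, decreasing = TRUE)
}

pct <- missing_pct(toy)
print(pct)
```

Sorting the result puts the columns with the most missing values first, which is the ordering the sections below discuss.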
4.1 Missing Values in Beer Ratings Dataset
In the cleaned_all_beers dataset, most of the missing values are in the beer_ibu, beer_desc and beer_abv columns. We can ignore the beer_desc column, as it will not help in our analysis. For beer_abv and beer_ibu, we remove the rows with missing values whenever we use these variables in our analysis, because we cannot fill in the missing entries using any general imputation method. beer_ibu and beer_abv can tell a lot about a beer's taste, so we do not want to drop these columns either.
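Dropping rows only when a variable is actually used, rather than once for the whole dataset, can be sketched in base R as follows. The column names beer_abv, beer_ibu and beer_desc come from the text; the toy values, the beer_name column and the helper drop_missing_in are assumptions for illustration.

```r
# Toy stand-in for cleaned_all_beers; the values are made up.
beers <- data.frame(
  beer_name = c("A", "B", "C", "D"),
  beer_abv  = c(5.0, NA, 6.5, 4.2),
  beer_ibu  = c(40, 25, NA, 18),
  beer_desc = c(NA, "hoppy", NA, NA)
)

# Hypothetical helper: keep only the rows in which every column
# listed in `cols` is non-missing. Other columns may still contain NA.
drop_missing_in <- function(df, cols) {
  df[complete.cases(df[, cols, drop = FALSE]), , drop = FALSE]
}

# An analysis that uses only beer_abv keeps every row with a known ABV.
abv_rows <- drop_missing_in(beers, "beer_abv")

# An analysis that uses both variables keeps only rows complete in both.
abv_ibu_rows <- drop_missing_in(beers, c("beer_abv", "beer_ibu"))
```

Filtering per analysis keeps the original data frame intact, so a row that lacks beer_ibu can still contribute to a plot that needs only beer_abv.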
4.2 Missing Values in Brewery Ratings Dataset
In the cleaned_brewery_dataset, most of the missing values are in the latitude and Longitude columns. We ignore these columns because we have other columns, namely city and state, that we can use to build spatial maps.