Wine Quality Prediction

Aadil Aftab Shaik
Published in Analytics Vidhya
3 min read · Dec 19, 2020


This is my Naive Bayes project: an analysis of a wine quality dataset and a model that predicts wine quality from it. I used the Pandas, NumPy, Seaborn, and scikit-learn libraries to analyse the data and build the model.

I loaded the wine quality dataset into a pandas DataFrame named data.

Next comes cleaning. Fortunately, the data is already clean and ready for analysis, which I confirmed by checking the column types and counting the missing values in each column:

data.dtypes
data.isnull().sum()
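
To show what these two checks report when something is actually wrong, here is a minimal sketch. The rows are made up (the column names alcohol and pH are assumptions, and one pH value is left blank on purpose), and the dropna step is just one simple option, not something this project needed.

```python
import io

import pandas as pd

# Hypothetical stand-in for the wine file: a few made-up rows, one missing pH value.
csv_text = """alcohol,pH,Quality
9.4,3.51,5
9.8,,5
10.5,3.16,6
11.2,3.30,7
"""
data = pd.read_csv(io.StringIO(csv_text))

# Column types: a numeric column reported as "object" usually hides stray text.
print(data.dtypes)

# Missing values per column; all zeros means nothing needs imputing or dropping.
missing = data.isnull().sum()
print(missing)

# One simple option (an assumption, not a step from this project): drop incomplete rows.
clean = data.dropna()
print(len(data), "rows before,", len(clean), "rows after dropna")
```

On the real dataset every count from isnull().sum() was zero, so no such step was required.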

Next comes the analysis. I use the describe function to look for outliers and a pair plot to visualize the data.

data.describe().transpose()
sns.pairplot(data, diag_kind='kde', hue='Quality')
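
Reading outliers off the describe table by eye can be made a little more systematic. The sketch below applies the common 1.5 × IQR rule to the quartiles that describe already returns. It runs on synthetic data with one planted outlier; the column names and the choice of rule are my assumptions, not something taken from the project notebook.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic stand-in for two wine features (made-up values, not the real dataset).
data = pd.DataFrame({
    "alcohol": rng.normal(10.5, 1.0, 200),
    "pH": rng.normal(3.3, 0.15, 200),
})
data.loc[0, "alcohol"] = 25.0  # planted outlier so the check has something to find

# describe() gives count, mean, std, min, the quartiles, and max for each column.
summary = data.describe().transpose()

# 1.5 * IQR rule: flag values far outside the middle 50% of each column.
iqr = summary["75%"] - summary["25%"]
lower = summary["25%"] - 1.5 * iqr
upper = summary["75%"] + 1.5 * iqr

# Comparing a DataFrame with a Series aligns the Series index with the columns.
outlier_counts = ((data < lower) | (data > upper)).sum()
print(outlier_counts)
```

A large gap between the 75% and max columns in the describe output is the same signal, just judged by eye.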

Then comes building the model. I separated the data into the independent variables (x) and the target variable (y).

I further divided them into x_train, x_test, y_train, and y_test with the 'train_test_split' function from scikit-learn.
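
The two steps above can be sketched roughly as follows. The data is synthetic, and the feature names, the 25% test size, the random_state, and the stratify option are all assumptions chosen for illustration; only the Quality column name and the x/y naming come from this project.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in for the wine data (made-up values, binary Quality label).
data = pd.DataFrame({
    "alcohol": rng.normal(10.5, 1.0, 100),
    "pH": rng.normal(3.3, 0.15, 100),
    "Quality": rng.integers(0, 2, 100),
})

# Independent variables: every column except the target.
x = data.drop(columns="Quality")
# Target variable: the column the model should predict.
y = data["Quality"]

# Hold out 25% of the rows for testing; stratify keeps the class mix similar
# in both parts, and random_state makes the split repeatable.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=1, stratify=y
)
print(x_train.shape, x_test.shape)
```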

I built a model on x_train and y_train using 'GaussianNB' from scikit-learn. The 'score' function and the confusion matrix show that the model's prediction rate is pretty good.

y_pred = model.predict(x_test)
model.score(x_test, y_test)
Confusion Matrix
metrics.classification_report(y_test,y_pred)
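
Put together, the fit and evaluation steps look roughly like the sketch below. It uses two synthetic, well-separated classes rather than the wine data, so the numbers it prints say nothing about the real model; the split settings are assumptions as well.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

# Two synthetic classes whose feature means are far apart (not the wine data).
x = np.vstack([
    rng.normal(0.0, 1.0, (100, 3)),
    rng.normal(3.0, 1.0, (100, 3)),
])
y = np.array([0] * 100 + [1] * 100)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=1, stratify=y
)

# GaussianNB models each feature as a per-class normal distribution.
model = GaussianNB()
model.fit(x_train, y_train)

# Predictions on the held-out rows feed the confusion matrix and the report.
y_pred = model.predict(x_test)

# score() is plain accuracy: the fraction of test rows predicted correctly.
accuracy = model.score(x_test, y_test)
print("accuracy:", accuracy)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Precision, recall, and F1 for each class.
print(classification_report(y_test, y_pred))
```

The accuracy from score equals the diagonal of the confusion matrix divided by the number of test rows, which is a quick way to cross-check the two outputs.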

CONCLUSION

I did all this to practice the Naive Bayes classifier, and these are the things I learned while doing this project:

Learned how and when to use the Naive Bayes classifier.

Learned the advantages of the Naive Bayes classifier over other algorithms.

Project GitHub: Wine Quality Prediction
