Red Wine Classification (with Python)



Can we use the physicochemical characteristics of a wine to predict its quality?

This post continues with the wine dataset from the last post. Here, I apply machine learning techniques to classify the wines by quality. As a reminder, there are three categories: low, medium and high. I used five algorithms from the scikit-learn package: KNeighbors, Random Forest, Gaussian NB, ExtraTrees and DecisionTree. Here is the accuracy on the training set.
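For readers who want to reproduce this kind of comparison, here is a minimal sketch, not the exact code behind the numbers in this post. It assumes default hyperparameters and 5-fold cross-validation on the training data, since the post does not say how the accuracy was measured. The helper name `compare_classifiers` is made up for illustration, and the data is a synthetic placeholder from `make_classification` with 11 features and 3 classes, standing in for the wine table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X_train, y_train, cv=5, random_state=0):
    """Return the mean cross-validated accuracy of five classifiers.

    Hypothetical helper: default hyperparameters and k-fold
    cross-validation are assumptions, not the post's settings.
    """
    models = {
        "KNeighbors": KNeighborsClassifier(),
        "RandomForest": RandomForestClassifier(random_state=random_state),
        "GaussianNB": GaussianNB(),
        "ExtraTrees": ExtraTreesClassifier(random_state=random_state),
        "DecisionTree": DecisionTreeClassifier(random_state=random_state),
    }
    scores = {}
    for name, model in models.items():
        # cross_val_score fits a fresh clone of the model on each fold
        fold_scores = cross_val_score(model, X_train, y_train,
                                      cv=cv, scoring="accuracy")
        scores[name] = float(fold_scores.mean())
    return scores


# Placeholder data only: 11 features, 3 classes (low / medium / high).
X, y = make_classification(n_samples=300, n_features=11, n_informative=5,
                           n_classes=3, random_state=0)

scores = compare_classifiers(X, y)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

The accuracies printed for the placeholder data say nothing about wine. To use the sketch on the real dataset, pass the physicochemical columns as `X_train` and the low/medium/high labels as `y_train`.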

The random forest model gives the highest accuracy, around 88% on the training set, and it also lets us identify the most important predictors.

What about the weight of the 5 most important features?


The most important features for red wine classification are: alcohol, volatile acidity, sulphates, density and total sulfur dioxide. Indeed, these are important parameters for evaluating the quality of a wine.
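A ranking like this can be read from the `feature_importances_` attribute of a fitted random forest. The sketch below shows one way to do it, assuming scikit-learn's default impurity-based importances. The helper `top_features`, the placeholder column names `feature_0` to `feature_10` and the synthetic data are all made up for illustration and are not the wine data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def top_features(X, y, feature_names, n=5, random_state=0):
    """Fit a random forest and return the n highest-weighted features.

    Hypothetical helper: returns (name, weight) pairs sorted from the
    most to the least important feature.
    """
    model = RandomForestClassifier(n_estimators=100, random_state=random_state)
    model.fit(X, y)
    weights = model.feature_importances_  # impurity-based, sums to 1
    order = np.argsort(weights)[::-1][:n]  # indices of the n largest weights
    return [(feature_names[i], float(weights[i])) for i in order]


# Placeholder data only: 11 unnamed features and 3 classes.
X, y = make_classification(n_samples=300, n_features=11, n_informative=5,
                           n_classes=3, random_state=0)
names = [f"feature_{i}" for i in range(X.shape[1])]

top5 = top_features(X, y, names, n=5)
for name, weight in top5:
    print(f"{name}: {weight:.3f}")
```

Impurity-based importances are quick to compute but can be biased toward features with many distinct values. `sklearn.inspection.permutation_importance` is a common cross-check.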