Disease Predictor
This is my deep learning project: data analysis and prediction of 41 diseases based on the symptoms given in the data. I used the Pandas, NumPy, Seaborn, TensorFlow, Keras, and scikit-learn libraries to analyze the data and build the model.
I imported the CSV into a variable named ‘data’ with the pandas library and analyzed it.
The analysis shows that the features after ‘symptom_5’ are mostly NaN, so I removed those features and replaced the remaining NaN values with 0.
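The cleaning step could look roughly like the sketch below. The small DataFrame stands in for the real CSV: the ‘disease’ column name, the symptom values, and the number of symptom columns are assumptions made for illustration, not the actual dataset.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the real CSV. Column names follow the 'symptom_N' pattern,
# the 'disease' label column and all values are made up for illustration.
# dtype=object lets the columns hold both strings and the 0 fill value.
data = pd.DataFrame({
    "disease": ["flu", "cold", "flu"],
    "symptom_1": ["fever", "cough", "fever"],
    "symptom_2": ["cough", "sneezing", "headache"],
    "symptom_3": ["headache", np.nan, "chills"],
    "symptom_4": ["chills", np.nan, np.nan],
    "symptom_5": ["fatigue", np.nan, np.nan],
    "symptom_6": [np.nan, np.nan, np.nan],
    "symptom_7": [np.nan, np.nan, "rash"],
}, dtype=object)

# Share of missing values per column, to see where the NaNs concentrate.
nan_share = data.isna().mean()
print(nan_share)

# Keep the label plus symptom_1..symptom_5 and drop the later, mostly empty columns.
keep = ["disease"] + [f"symptom_{i}" for i in range(1, 6)]
data = data[keep]

# Replace the NaN values that remain with 0.
data = data.fillna(0)
```

Checking the per-column NaN share first makes the cut-off after ‘symptom_5’ a decision based on the data rather than a guess.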
Next comes the feature engineering: I replaced the symptoms with their value counts and replaced the diseases with numbers (0 to 40).
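A minimal sketch of this encoding follows. It assumes the value counts are pooled across all symptom columns and that disease numbers are assigned in alphabetical order; the project may have counted per column or numbered the diseases differently. The toy data and the ‘disease’ column name are illustrative.

```python
import pandas as pd

# Toy data after cleaning; 0 marks a missing symptom.
data = pd.DataFrame({
    "disease": ["flu", "cold", "flu", "allergy"],
    "symptom_1": ["fever", "cough", "fever", "sneezing"],
    "symptom_2": ["cough", "sneezing", "headache", 0],
}, dtype=object)

symptom_cols = [c for c in data.columns if c.startswith("symptom_")]

# Count how often each symptom appears across all symptom columns (assumption: pooled).
counts = pd.Series(data[symptom_cols].to_numpy().ravel()).value_counts()
mapping = counts.to_dict()
mapping[0] = 0  # keep the missing-value placeholder at 0 instead of its count

# Replace every symptom with its frequency.
x = data[symptom_cols].apply(lambda col: col.map(mapping)).astype(int)

# Replace every disease with an integer id (0 .. number of diseases - 1).
disease_to_id = {name: i for i, name in enumerate(sorted(data["disease"].unique()))}
y = data["disease"].map(disease_to_id).astype(int)
```

Keeping ‘disease_to_id’ around is useful later, because it lets the numeric predictions be translated back into disease names.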
I divided the data into x and y variables and split them into training and test sets, i.e., ‘x_train’, ‘x_test’, ‘y_train’, ‘y_test’. I scaled the data to normalize it before sending it through the artificial neural network, and I one-hot encoded ‘y_train’ and ‘y_test’.
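The split, scaling, and one-hot steps can be sketched as follows. The random arrays replace the real features and labels, and the 80/20 split, the choice of MinMaxScaler, and the NumPy-based one-hot encoding are assumptions; a different scaler or Keras's own one-hot utility would fit the description equally well.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-ins: 200 samples, 5 symptom features, 41 disease ids.
rng = np.random.default_rng(0)
x = rng.integers(0, 100, size=(200, 5)).astype(float)
y = rng.integers(0, 41, size=200)

# Hold out 20% of the rows for testing (the ratio is an assumption).
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training data only, then apply it to both sets.
scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

# One-hot encode the labels: row i has a single 1 in column y[i].
num_classes = 41
y_train = np.eye(num_classes)[y_train]
y_test = np.eye(num_classes)[y_test]
```

Fitting the scaler on the training set alone keeps information from the test set out of the preprocessing.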
I built an artificial neural network to classify these diseases and generated a classification report for it. The model reached 91% accuracy on average across the 41 diseases.
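As a rough illustration of this step, the sketch below builds a small Keras classifier and prints a scikit-learn classification report. The layer sizes, optimizer, epoch count, and random input data are all assumptions, so it shows the workflow only and will not reproduce the 91% figure.

```python
import numpy as np
from sklearn.metrics import classification_report
from tensorflow import keras

num_features, num_classes = 5, 41

# Random stand-in data; the real project uses the scaled symptom features.
rng = np.random.default_rng(0)
x_train = rng.random((410, num_features))
labels_train = np.arange(410) % num_classes
y_train = np.eye(num_classes)[labels_train]  # one-hot targets
x_test = rng.random((82, num_features))
labels_test = np.arange(82) % num_classes

# Hypothetical architecture: two hidden layers and a softmax output per disease.
model = keras.Sequential([
    keras.Input(shape=(num_features,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=32, verbose=0)

# Turn the predicted probabilities into class ids and summarize per-class metrics.
probs = model.predict(x_test, verbose=0)
pred = probs.argmax(axis=1)
report = classification_report(
    labels_test, pred, labels=np.arange(num_classes), zero_division=0
)
print(report)
```

The report lists precision, recall, and F1 for each of the 41 classes, which shows which diseases the model confuses even when the overall accuracy looks good.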
CONCLUSION
I did all this to practice building artificial neural networks, and the following are the things I learned while doing this project:
Learned how and when to use an artificial neural network.
Learned the advantages and disadvantages of deep learning compared with other basic machine learning algorithms.
Project GitHub: Disease Prediction