Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study

Project Code: TCMAPY668

Objective

The primary goal of this project is to predict whether a case involves a foodborne disease. To do this, we use Random Forest, Decision Tree, Gradient Boosting, and AdaBoost classifiers.

Abstract

Foodborne diseases have a high global incidence and therefore place a heavy burden on public health and the social economy. Foodborne pathogens, as the main cause of foodborne diseases, play an important role in their treatment and prevention; however, diseases caused by different pathogens lack specificity in their clinical features, and the proportion of cases in which the pathogen is actually detected clinically is low. We aimed to analyze foodborne disease case data, select appropriate features based on the analysis results, and use machine learning methods to predict the pathogen for cases where it is not known or was not tested. We extracted features such as space, time, and exposed food from foodborne disease case data and analyzed the relationships between these features and the pathogens. We then compared the results of four classification models to obtain the pathogen prediction model with the highest accuracy. The gradient boosting decision tree model obtained the highest accuracy, approaching 69%, in identifying 4 pathogens: Salmonella, Norovirus, Escherichia coli, and Vibrio parahaemolyticus. By evaluating feature importance, we found that features such as time of illness, geographical longitude and latitude, and diarrhea frequency play important roles in classifying foodborne disease pathogens. Data analysis can reflect the distribution of some features of foodborne diseases and the relationships among the features. The classification of pathogens based on the analysis results and machine learning methods can provide beneficial support for clinical auxiliary diagnosis and treatment of foodborne diseases.
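
The model comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn is available, uses synthetic data from make_classification as a stand-in for the real case data (4 classes standing in for the 4 pathogens, 8 numeric columns standing in for features such as time, longitude, and latitude), and the helper name compare_classifiers and all hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y, seed=0):
    """Fit four classifiers on one train/test split and return test accuracies.

    Hypothetical helper for illustration; hyperparameters are arbitrary defaults.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    models = {
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=50, random_state=seed),
        "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=seed),
        "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores, models


# Synthetic stand-in data (assumption): 4 classes, 8 numeric features.
X, y = make_classification(
    n_samples=600,
    n_features=8,
    n_informative=5,
    n_redundant=0,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

scores, models = compare_classifiers(X, y)
best = max(scores, key=scores.get)
for name, acc in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {acc:.3f}")
print("Highest test accuracy:", best)

# Impurity-based feature importances of the gradient boosting model,
# analogous to the feature-importance evaluation mentioned in the abstract.
importances = models["Gradient Boosting"].feature_importances_
print("Feature importances:", importances.round(3))
```

On real case data, categorical features such as exposed food would need to be encoded numerically first, and the accuracies printed here say nothing about the 69% figure reported above, since the data is synthetic.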

Keywords: Foodborne Disease, Random Forest, Decision Tree, Gradient Boosting, AdaBoost Classifier.

NOTE: Please do not submit this to the college without the consent of our team. This abstract varies based on student requirements.

Block Diagram

Specifications

Hardware:

  • Operating system: Windows 7 or later
  • RAM: 8 GB
  • Hard disk or SSD: More than 500 GB
  • Processor: Intel 3rd generation or higher, or Ryzen

Software:

  • Software: Python 3.6 or higher
  • IDE: PyCharm
  • Framework: Flask

Learning Outcomes

  • Classification in machine learning.
  • Preprocessing techniques.
  • Random Forest Classifier.
  • Decision Tree Classifier.
  • Knowledge of the PyCharm editor.
