A Novel Ensemble Learning Paradigm for Medical Diagnosis With Imbalanced Data

Project Code: TCPGPY365

Objective

The aim of this study is to propose a novel ensemble learning paradigm for medical diagnosis that performs as well as or better than state-of-the-art comparison methods.

Abstract

With the help of machine learning (ML) techniques, errors made by pathologists and physicians, such as those caused by inexperience, fatigue, or stress, can be avoided, and medical data can be examined more quickly and in greater detail. However, although conventional ML techniques such as classification achieve excellent overall accuracy in medical diagnosis, they have a serious shortcoming: they perform poorly on imbalanced datasets, especially when detecting the minority category.
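The gap between overall accuracy and minority-class detection can be illustrated with a small worked example. The sketch below uses hypothetical numbers (950 healthy and 50 diseased patients, not taken from any dataset in this project) and a trivial classifier that always predicts the majority class; the function names are illustrative only.

```python
# Hypothetical screening data: 950 healthy (0) and 50 diseased (1) patients.
y_true = [0] * 950 + [1] * 50

# A degenerate "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of all samples labelled correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positive samples that were detected."""
    true_pos = sum(t == positive and p == positive
                   for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy        = {acc:.2f}")  # accuracy        = 0.95
print(f"minority recall = {rec:.2f}")  # minority recall = 0.00
```

The classifier scores 95% accuracy while missing every diseased patient, which is why accuracy alone is a misleading measure on imbalanced medical data.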

To tackle the shortcomings of conventional classification approaches, this study proposes a novel ensemble learning paradigm for medical diagnosis with imbalanced data, which consists of three phases: data pre-processing, base classifier training, and final ensemble. In the data pre-processing phase, we introduce an extension of the Synthetic Minority Oversampling Technique (SMOTE) that integrates it with the cross-validated committees filter (CVCF) technique. This extension not only synthesizes minority samples to balance the input instances, but also filters out noisy examples so that classification performs well. In the classification phase, we introduce the ensemble support vector machine (ESVM) classification technique, which is constructed from multiple SVM classifiers with diverse structures and thus offers strong generalization performance and classification precision.
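The pre-processing phase described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: the helper names (`smote`, `cvcf_filter`, `centroids`, `predict`) are hypothetical, the toy data is invented, the committee members are simple nearest-centroid classifiers standing in for whatever base learner the project uses, and a sample is removed when a majority of committee members mislabel it. The ESVM stage is not shown.

```python
import math
import random


def euclid(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(0) if rng is None else rng
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: euclid(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


def centroids(X, y):
    """Stand-in base learner (an assumption): the mean vector of each class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        total = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            total[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(model, x):
    """Label of the class centroid closest to x."""
    return min(model, key=lambda c: euclid(model[c], x))


def cvcf_filter(X, y, n_folds=5, rng=None):
    """Train one committee member per fold (on the other folds) and drop
    every sample that a majority of the members mislabel."""
    rng = random.Random(0) if rng is None else rng
    idx = list(range(len(X)))
    rng.shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    committee = []
    for held_out in folds:
        skip = set(held_out)
        train = [i for i in idx if i not in skip]
        committee.append(centroids([X[i] for i in train],
                                   [y[i] for i in train]))
    keep = [i for i in range(len(X))
            if sum(predict(m, X[i]) != y[i] for m in committee) <= n_folds / 2]
    return [X[i] for i in keep], [y[i] for i in keep]


# Invented toy data: 20 clean majority points near the origin, one
# deliberately mislabelled majority point, and 5 minority points near (3, 3).
majority = [(i * 0.2, j * 0.2) for i in range(5) for j in range(4)]
majority.append((3.1, 3.1))  # label noise: sits inside the minority cluster
minority = [(3.0, 3.0), (3.2, 3.1), (2.9, 3.3), (3.1, 2.8), (3.3, 3.2)]

rng = random.Random(42)
synthetic = smote(minority, len(majority) - len(minority), rng=rng)
X = majority + minority + synthetic
y = [0] * len(majority) + [1] * (len(minority) + len(synthetic))
X_clean, y_clean = cvcf_filter(X, y, rng=rng)

print("before      :", len(majority), "vs", len(minority))       # 21 vs 5
print("after SMOTE :", y.count(0), "vs", y.count(1))              # 21 vs 21
print("after filter:", y_clean.count(0), "vs", y_clean.count(1))  # 20 vs 21
```

Oversampling balances the two classes, and the filter then removes the single mislabelled majority point while keeping every clean and synthetic sample. In a full pipeline, the cleaned data would be passed on to the ensemble of SVM classifiers.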

Keywords: Support Vector Machine; Imbalanced Data; Ensemble Learning; Medical Diagnosis.

NOTE: Please do not submit this to the college without our team's consent. This abstract varies based on student requirements.

Block Diagram

Specifications

HARDWARE SPECIFICATIONS:

  • Processor: i3/Intel Processor
  • RAM: 4 GB (min)
  • Hard Disk: 128 GB
  • Keyboard: Standard Windows Keyboard
  • Mouse: Two or Three Button Mouse
  • Monitor: Any

SOFTWARE SPECIFICATIONS:

  • Operating System: Windows 7+
  • Technology: Python 3.6+
  • IDE: PyCharm IDE
  • Libraries Used: Pandas, NumPy, scikit-learn, Matplotlib

Learning Outcomes


  • Importance of the PyCharm IDE.
  • How ensemble models work.
  • The process of debugging code.
  • The problem with imbalanced datasets.
  • Benefits of the SMOTE technique.
  • Input and output modules.
  • How to test the project with user inputs and observe the output.
  • Project development skills:
    • Problem analysis skills.
    • Problem-solving skills.
    • Creativity and imagination skills.
    • Programming skills.
    • Deployment.
    • Testing skills.
    • Debugging skills.
    • Project presentation skills.
    • Thesis writing skills.

