DNA Classification Using Machine Learning

Project Code: TCMAPY990

Objective

The objective of this project is to categorize DNA sequences into specific classes or groups using machine learning. Such classification supports the understanding of genetic structures, functions, and relationships, with applications in personalized medicine, disease prediction, evolutionary studies, and other areas of biology.

Abstract

This project aims to classify DNA sequences from humans, dogs, and chimpanzees using several machine learning algorithms, together with an early classification-based approach for detecting classification faults. The primary objective is to classify the DNA accurately using Decision Tree (DT), Random Forest (RF), and Logistic Regression (LR) techniques. To accomplish this, a dataset of DNA sequences from humans, dogs, and chimpanzees is used. The sequences are preprocessed to extract relevant features and eliminate noise, and the extracted features are then used as inputs to the machine learning algorithms. The Decision Tree algorithm classifies the sequences with a tree-based model, while Random Forest builds an ensemble of decision trees to improve classification accuracy. Logistic Regression uses a logistic function to predict the probability that a DNA sequence belongs to a particular class. The early classification-based approach identifies faults during the classification process, so that misclassified DNA sequences can be found and corrected early, improving overall accuracy. Experimental results and comparative analysis demonstrate the effectiveness of the Decision Tree, Random Forest, and Logistic Regression techniques for DNA classification. The findings contribute to the advancement of machine learning algorithms for DNA classification and provide insights into the classification of genetic information across species.

Keywords: Decision Tree, Random Forest, Logistic Regression.
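
The abstract does not say how features are extracted from the DNA sequences. One common choice, shown here only as an illustrative sketch and not as the project's actual method, is to count overlapping k-mers (substrings of length k) and use their relative frequencies as a fixed-length feature vector. The function names `kmers` and `kmer_vector`, the default `k=3`, and the rule of dropping k-mers that contain symbols other than A, C, G, T (one possible reading of "eliminate noise") are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

DNA_ALPHABET = "ACGT"


def kmers(sequence, k=3):
    """Return the overlapping k-mers of a DNA sequence.

    Assumption for this sketch: k-mers containing any symbol other
    than A, C, G, T (for example the ambiguity code N) are dropped.
    """
    seq = sequence.upper()
    valid = set(DNA_ALPHABET)
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return [w for w in windows if set(w) <= valid]


def kmer_vector(sequence, k=3):
    """Map a sequence to a vector of relative k-mer frequencies.

    The vector has 4**k entries, one per possible k-mer, in
    lexicographic order (AAA, AAC, AAG, ...).
    """
    vocabulary = ["".join(p) for p in product(DNA_ALPHABET, repeat=k)]
    counts = Counter(kmers(sequence, k))
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts[word] / total for word in vocabulary]


if __name__ == "__main__":
    example = "ATGCATGCA"  # made-up sequence, not from the project dataset
    print(kmers(example, k=3))
    print(len(kmer_vector(example, k=3)))
```

Vectors produced this way all have the same length regardless of sequence length, so they could serve as inputs to Decision Tree, Random Forest, or Logistic Regression classifiers from a library of your choice.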

NOTE: Please do not submit this to your college without our team's consent. This abstract varies based on student requirements.

Block Diagram

Specifications


S/W Configuration:

Operating System: Windows 7/8/10.

Server-side Script: HTML, CSS & JS.

IDE: PyCharm.

Libraries Used: NumPy, io, os, Flask, Keras, TensorFlow

Technology: Python 3.6+.

H/W Configuration:

RAM: 8GB

Processor: Intel Core i3 or other Intel processor

Hard Disk: 160GB
