Characterization and Prediction of Popular Projects on GitHub

Project Code: TCMAPY1480

Objective

This project aims to analyze and predict the popularity of GitHub repositories, measured by star count, from features such as forks, issues, programming language, description, and contributor count. The system collects data through the GitHub API, processes it, and trains machine learning models to predict the number of stars a repository may receive. A Flask web app lets users interact with the predictions.
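As a rough illustration of the data collection step, the sketch below reads one repository record from the GitHub REST API and keeps only the attributes named in the objective. The function names fetch_repo and extract_features and the output keys are hypothetical and not taken from the project. Contributor count is not part of the repository record and would need a separate request, so it is left out here.

```python
import json
import urllib.request
from typing import Optional

API_ROOT = "https://api.github.com"


def fetch_repo(owner: str, repo: str, token: Optional[str] = None) -> dict:
    """Download one repository record from the GitHub REST API."""
    request = urllib.request.Request(
        f"{API_ROOT}/repos/{owner}/{repo}",
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:  # an optional access token; unauthenticated calls are rate limited
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def extract_features(record: dict) -> dict:
    """Reduce the API record to the fields used for analysis (hypothetical keys)."""
    return {
        "stars": record.get("stargazers_count", 0),
        "forks": record.get("forks_count", 0),
        # GitHub counts open pull requests in this figure as well
        "open_issues": record.get("open_issues_count", 0),
        # language and description can be null in the API response
        "language": record.get("language") or "Unknown",
        "description": record.get("description") or "",
    }
```

A call such as extract_features(fetch_repo("owner", "repo")) would then yield one row of raw data per repository.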

Abstract

GitHub is a widely used platform for hosting open-source and private repositories, where the popularity of a repository is often measured by the number of stars it receives. This project, Characterization and Prediction of Popular Projects on GitHub, aims to analyze repository features and predict their popularity using machine learning techniques. The system extracts data from GitHub using its API, including key attributes such as stars, forks, watchers, issues, contributors, tags, programming language, and description. The collected data undergoes preprocessing and feature engineering, and is then used to train multiple regression models: Linear Regression, Random Forest, SGDRegressor, Ridge Regression, Lasso Regression, and Elastic Net Regression.
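The abstract does not spell out the preprocessing and feature engineering steps, so the following is only a minimal sketch under stated assumptions: count features and the star target are log-transformed, the description is reduced to a word count, and the programming language is one-hot encoded. The function name build_dataset, the input keys, and the choice of transformations are all hypothetical.

```python
import math


def build_dataset(repos):
    """Turn raw repository dicts into column names, numeric rows X, and target y."""
    # One indicator column per language seen in the data (assumed encoding)
    languages = sorted({r["language"] for r in repos})
    columns = ["log_forks", "log_open_issues", "description_words"]
    columns += [f"lang_{name}" for name in languages]

    X, y = [], []
    for r in repos:
        row = [
            math.log1p(r["forks"]),          # log(1 + count) compresses large values
            math.log1p(r["open_issues"]),
            float(len(r["description"].split())),
        ]
        row += [1.0 if r["language"] == name else 0.0 for name in languages]
        X.append(row)
        y.append(math.log1p(r["stars"]))     # target: log(1 + stars)
    return columns, X, y
```

Rows in this form could be passed to any of the regressors listed above. Predictions made on the log scale would be mapped back to star counts with math.expm1.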

A Flask web application is developed to allow users to input a GitHub username and repository name to get real-time predictions of repository popularity. The application also includes user authentication and a database for storing results. This system provides valuable insights for developers, researchers, and organizations to understand the factors contributing to GitHub repository success and improve project visibility. Future enhancements include deep learning integration and real-time trend analysis.
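To make the result storage concrete, here is a small sketch of saving and reading back predictions. It uses sqlite3 from the Python standard library as a stand-in for the MySQL database the project lists, so that it runs without a server. The table name predictions, its columns, and the function names are hypothetical.

```python
import sqlite3


def open_results_db(path=":memory:"):
    """Open the database and create the results table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS predictions (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            owner TEXT NOT NULL,
            repo TEXT NOT NULL,
            predicted_stars REAL NOT NULL,
            created_at TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    return conn


def save_prediction(conn, owner, repo, predicted_stars):
    """Insert one prediction using bound parameters rather than string formatting."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO predictions (owner, repo, predicted_stars) VALUES (?, ?, ?)",
            (owner, repo, predicted_stars),
        )


def history_for(conn, owner, repo):
    """Return the stored predictions for one repository, oldest first."""
    rows = conn.execute(
        "SELECT predicted_stars FROM predictions "
        "WHERE owner = ? AND repo = ? ORDER BY id",
        (owner, repo),
    )
    return [value for (value,) in rows]
```

With mysql.connector the same pattern applies, but the parameter placeholder is %s instead of ?.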

NOTE: Please do not submit this abstract to your college without the consent of our team. The abstract varies based on student requirements.

Block Diagram

Specifications

REQUIREMENT ANALYSIS

Hardware Requirements:

Processor   - Intel Core i3
Hard Disk   - 160 GB
Keyboard    - Standard Windows Keyboard
Mouse       - Two or Three Button Mouse
Monitor     - SVGA
RAM         - 8 GB

Software Requirements:

Operating System      :  Windows 7/8/10
Front End             :  HTML, CSS, Bootstrap & JS
Programming Language  :  Python
Libraries             :  Flask, Pandas, mysql.connector, NumPy
IDE/Workbench         :  VS Code
Technology            :  Python 3.10.8
Server Deployment     :  XAMPP Server
