This project analyzes GitHub repositories and predicts their popularity, measured by the number of stars, using features such as forks, issues, programming language, description, and contributor count. The system collects data through the GitHub API, processes it, and trains machine learning models to predict the number of stars a repository may receive. A Flask web app provides the user interface.
Characterization and Prediction of Popular Projects on GitHub
ABSTRACT
GitHub is a widely used platform for hosting open-source and private repositories, where the popularity of a repository is often determined by the number of stars it receives. This project, Characterization and Prediction of Popular Projects on GitHub, aims to analyze repository features and predict their popularity using machine learning techniques. The system extracts data from GitHub using its API, including key attributes such as stars, forks, watchers, issues, contributors, tags, programming language, and description. The collected data undergoes preprocessing, feature engineering, and model training using multiple regression algorithms, including Linear Regression, Random Forest, SGDRegressor, Ridge Regression, Lasso Regression, and Elastic Net Regression.
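The data-collection step described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the helper names fetch_repo and extract_features are hypothetical, the feature set is an illustrative subset, and only the endpoint and JSON field names come from the public GitHub REST API.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def fetch_repo(owner, repo, token=None):
    """Fetch repository metadata from the GitHub REST API (needs network access)."""
    request = urllib.request.Request(
        f"{API_ROOT}/repos/{owner}/{repo}",
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "popularity-sketch",  # GitHub rejects requests without a User-Agent
        },
    )
    if token:
        # Optional personal access token; raises the API rate limit.
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def extract_features(payload):
    """Turn the raw API payload into a flat record (hypothetical feature set)."""
    description = payload.get("description") or ""
    return {
        "stars": payload.get("stargazers_count", 0),  # prediction target
        "forks": payload.get("forks_count", 0),
        # In the REST API, watchers_count mirrors the star count;
        # subscribers_count is the number of users actually watching.
        "watchers": payload.get("subscribers_count", 0),
        "open_issues": payload.get("open_issues_count", 0),
        "language": payload.get("language") or "unknown",
        "description_length": len(description),
    }
```

Calling extract_features(fetch_repo("owner", "repo")) would yield one row of the training table. Contributor and tag counts are served by separate endpoints and are left out of this sketch.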
A Flask web application is developed to allow users to input a GitHub username and repository name to get real-time predictions of repository popularity. The application also includes user authentication and a database for storing results. This system provides valuable insights for developers, researchers, and organizations to understand the factors contributing to GitHub repository success and improve project visibility. Future enhancements include deep learning integration and real-time trend analysis.
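As an illustration of the prediction flow described above, a Flask endpoint might be sketched as follows. The route name /predict, the form field names, and the predict_stars placeholder are assumptions made for this example; user authentication and database storage are omitted.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_stars(username, repo):
    """Hypothetical placeholder for the trained model.

    A real implementation would fetch the repository's features and pass
    them to a fitted regressor; this stub returns a fixed value instead.
    """
    return 0.0


@app.route("/predict", methods=["POST"])
def predict():
    # Read the two form inputs the abstract mentions.
    username = request.form.get("username", "").strip()
    repo = request.form.get("repo", "").strip()
    if not username or not repo:
        return jsonify(error="username and repo are required"), 400
    return jsonify(
        username=username,
        repo=repo,
        predicted_stars=predict_stars(username, repo),
    )
```

Running app.run() starts a local development server; posting a form with username and repo fields to /predict then returns the prediction as JSON.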
NOTE: Please do not submit this to the college without our team's consent. The abstract may vary based on student requirements.

REQUIREMENT ANALYSIS
Hardware Requirements:
Processor - Intel Core i3
Hard Disk - 160GB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
RAM - 8GB
Software Requirements:
Operating System : Windows 7/8/10
Front End : HTML, CSS, Bootstrap & JS
Programming Language : Python
Libraries : Flask, pandas, mysql.connector, NumPy
IDE/Workbench : VS Code
Technology : Python 3.10.8
Server Deployment : XAMPP Server