File De-Duplication

Project Code : TCMAPY841

Objective

The primary goal of this project is to identify duplicate data that has been entered into the database.
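As a rough illustration of this objective, the sketch below flags records whose content has already been seen by comparing SHA-256 digests. The function name `find_duplicates` and the sample records are hypothetical and not part of the project; a real deployment would presumably keep the digests in the database rather than in an in-memory dictionary.

```python
import hashlib


def find_duplicates(records):
    """Return (index, first_index) pairs for records whose content repeats."""
    seen = {}          # digest -> index of the first record with that content
    duplicates = []
    for index, data in enumerate(records):
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            duplicates.append((index, seen[digest]))
        else:
            seen[digest] = index
    return duplicates


# Hypothetical uploads: the third record repeats the first.
uploads = [b"report v1", b"invoice", b"report v1"]
print(find_duplicates(uploads))  # [(2, 0)]
```

Comparing digests instead of raw bytes keeps the lookup table small, at the cost of hashing every record once.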

Abstract

At present, data de-duplication faces performance bottlenecks in metadata management and read/write rate. The traditional way to achieve a higher de-duplication ratio is to expand the range of data examined, but this makes metadata fields longer and increases the number of metadata entries. When detecting redundant data, metadata must be constantly moved in and out of memory, which creates an access bottleneck. It is therefore necessary to detect similar documents first, so that only data likely to contain duplicates is selected for de-duplication. In this paper, we propose a new method of block-level data de-duplication combined with similar file detection. While maintaining the de-duplication ratio, we narrow the range of data examined to reduce the metadata and eliminate performance bottlenecks. We present a detailed evaluation of our method against existing data de-duplication methods and show that it meets its design goals: it improves the de-duplication ratio while reducing overhead costs.
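The idea of combining block-level de-duplication with similar file detection can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method evaluated in the abstract: the fixed block size, the sketch of smallest fingerprints, the Jaccard threshold, and the `DedupStore` class are all hypothetical choices made for the example.

```python
import hashlib

BLOCK_SIZE = 4096   # assumed fixed block size
SKETCH_SIZE = 8     # assumed number of fingerprints kept per file signature
THRESHOLD = 0.5     # assumed Jaccard similarity cut-off


def split_blocks(data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def fingerprint(block):
    return hashlib.sha256(block).hexdigest()


def sketch(hashes, size=SKETCH_SIZE):
    """Keep the smallest distinct fingerprints as a compact file signature."""
    return set(sorted(set(hashes))[:size])


def jaccard(a, b):
    return len(a & b) / len(a | b) if (a or b) else 0.0


class DedupStore:
    def __init__(self):
        self.blocks = {}   # fingerprint -> block bytes, stored once
        self.files = {}    # name -> (signature, ordered fingerprints)

    def add(self, name, data):
        """Store a file; return how many blocks matched a similar file."""
        blocks = split_blocks(data)
        hashes = [fingerprint(b) for b in blocks]
        signature = sketch(hashes)
        # Similar file detection: only the block metadata of files with a
        # close signature is consulted, which narrows the lookup range.
        candidates = set()
        for other_signature, other_hashes in self.files.values():
            if jaccard(signature, other_signature) >= THRESHOLD:
                candidates.update(other_hashes)
        matched = 0
        for h, block in zip(hashes, blocks):
            if h in candidates:
                matched += 1                    # block already stored
            else:
                self.blocks.setdefault(h, block)
        self.files[name] = (signature, hashes)
        return matched

    def read(self, name):
        """Rebuild a file from its ordered block fingerprints."""
        return b"".join(self.blocks[h] for h in self.files[name][1])


store = DedupStore()
v1 = b"A" * 4096 + b"B" * 4096 + b"C" * 4096 + b"D" * 4096
v2 = b"A" * 4096 + b"B" * 4096 + b"C" * 4096 + b"E" * 4096
store.add("v1.txt", v1)
print(store.add("v2.txt", v2))        # 3 blocks shared with v1.txt
print(store.read("v2.txt") == v2)     # True
```

In this toy version every block lives in one in-memory dictionary, so the narrowing only limits which fingerprints are compared. In a real system the saving would come from loading only the metadata of the similar files into memory.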

NOTE: Please do not submit this to the college without the consent of our team. This abstract varies based on student requirements.

Block Diagram

Specifications

H/W CONFIGURATION:

Processor - Intel i3

Hard Disk - 160GB

Keyboard - Standard Windows Keyboard

Mouse - Two or Three Button Mouse

Monitor - SVGA

RAM - 8GB



S/W CONFIGURATION:

Operating System :  Windows 7/8/10

Front End :  HTML, CSS, Bootstrap & JS

Programming Language :  Python

Libraries :  Flask, pandas, mysql.connector, os, smtplib, NumPy

IDE/Workbench :  PyCharm

Technology :  Python 3.6+

Server Deployment :  XAMPP Server

