This project classifies ambulance, firetruck, and traffic sounds from 5?second, 16 kHz audio clips using log Mel spectrograms. Two architectures are trained: VAS Compass Net (ensemble of EfficientNet?B0, MobileNetV2, GhostNet with learnable fusion) and AudioMamba (state?space model). Both achieve high accuracy and F1 score. A Flask web app with authentication provides upload, prediction, saliency heatmaps, and automated email alerts for emergency vehicles.
Colon polyp detection plays a crucial role in early cancer diagnosis and prevention. This research proposes a novel framework for accurate colon polyp segmentation using a Pyramid Vision Transformer (PVT) and CNN Decoder. The system integrates multi-scale feature fusion with the transformer model to capture fine-grained features at various scales. A CNN decoder is then applied to refine the segmentation output. The framework is trained and evaluated on the PolypDB dataset, which contains colonoscopy images and annotated polyps. The Pyramid Vision Transformer excels in handling diverse image scales, while the CNN Decoder enhances the final segmentation by incorporating high-resolution details. This approach aims to improve segmentation accuracy and provide a reliable solution for automated polyp detection, reducing the workload for medical professionals and increasing the chances of early detection. Experimental results show promising segmentation performance with improved accuracy compared to conventional methods. The proposed system is an essential tool for aiding in the diagnosis of colon polyps, facilitating timely medical interventions.
NOTE: Without the concern of our team, please don't submit to the college. This Abstract varies based on student requirements.