Kidney Tumor Detection in CT Scans Combining 3D CNN and Transformer Networks

Project Code: TCMAPY1673

Objective

The primary objective of this project is to develop a robust and efficient hybrid deep learning model for accurate detection and classification of kidney tumors in CT scan volumes. By integrating 3D Convolutional Neural Networks (CNNs) with transformer-based attention mechanisms, the system aims to capture both fine-grained spatial features and long-range inter-slice dependencies across multi-slice CT inputs. The goal is to enhance diagnostic accuracy, particularly in identifying malignant lesions, while ensuring computational efficiency for real-time clinical deployment. The project also seeks to provide interpretable insights through attention visualization, supporting radiologists in making informed diagnostic decisions.

Abstract

This paper introduces a hybrid deep learning architecture for kidney tumor detection in CT scans, integrating 3D convolutional operations with transformer-based sequence modeling. Our volumetric processing framework, designed specifically for multi-slice CT analysis, combines the local feature extraction capabilities of 3D CNNs with the global contextual understanding of transformers. The system processes 16-slice CT volumes through a ConvNeXt-inspired 3D backbone that maintains spatial relationships across imaging planes, followed by a transformer encoder that captures long-range dependencies between slices through learned attention patterns.
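The pipeline described above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the project's actual model: the class name `HybridKidneyNet`, the layer sizes, the 64x64 in-plane resolution, and the four output classes (normal, cyst, tumor, stone) are all hypothetical choices made for the example, and the small convolutional stem only stands in for a ConvNeXt-inspired backbone.

```python
import torch
import torch.nn as nn


class HybridKidneyNet(nn.Module):
    """Illustrative 3D-CNN + transformer classifier (hypothetical sizes)."""

    def __init__(self, num_classes=4, num_slices=16, embed_dim=64,
                 num_heads=4, num_layers=2):
        super().__init__()
        # 3D convolutional stem. Strides act on height/width only, so the
        # depth axis keeps one feature map per input slice.
        self.backbone = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=(3, 4, 4), stride=(1, 4, 4),
                      padding=(1, 0, 0)),
            nn.GELU(),
            nn.Conv3d(32, embed_dim, kernel_size=(3, 2, 2), stride=(1, 2, 2),
                      padding=(1, 0, 0)),
            nn.GELU(),
        )
        # Learned per-slice positional embedding (assumes a fixed slice count).
        self.pos = nn.Parameter(torch.zeros(1, num_slices, embed_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, dim_feedforward=128,
            batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, volume):
        # volume: (batch, 1, depth, height, width)
        feats = self.backbone(volume)        # (B, C, D, H', W')
        tokens = feats.mean(dim=(3, 4))      # average in-plane: (B, C, D)
        tokens = tokens.transpose(1, 2)      # one token per slice: (B, D, C)
        tokens = self.encoder(tokens + self.pos)
        return self.head(tokens.mean(dim=1)) # class logits: (B, num_classes)


model = HybridKidneyNet()
model.eval()
with torch.no_grad():
    logits = model(torch.randn(2, 1, 16, 64, 64))  # two random 16-slice volumes
```

Each slice becomes one transformer token here, which keeps the sequence short (16 tokens) and makes the attention weights directly readable as slice-to-slice relationships.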

The architecture employs several key innovations: (1) a depth-aware patch embedding layer that preserves inter-slice relationships when transitioning from 3D convolutions to 2D attention maps, (2) adaptive positional encodings that account for variable slice spacing in medical CT acquisitions, and (3) a multi-scale feature fusion module that combines hierarchical representations from both pathways. We demonstrate the system's clinical utility through interpretable attention visualizations that highlight diagnostically relevant regions across slice sequences.
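One possible reading of innovation (2) is a sinusoidal encoding computed from each slice's physical position in millimetres instead of its index, so that acquisitions with different slice spacing receive different encodings. The sketch below illustrates that idea only; the function name `spacing_aware_encoding`, the encoding dimension, and the `max_period_mm` constant are assumptions for the example, not details taken from the project.

```python
import math


def spacing_aware_encoding(z_positions_mm, dim=8, max_period_mm=1000.0):
    """Sinusoidal encoding of each slice's physical z-position in mm.

    Hypothetical helper: using millimetres rather than slice indices means
    thin-slice and thick-slice scans are encoded differently.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    encodings = []
    for z in z_positions_mm:
        row = []
        for i in range(dim // 2):
            # Geometric progression of frequencies, as in standard
            # sinusoidal positional encodings.
            freq = 1.0 / (max_period_mm ** (2 * i / dim))
            row.append(math.sin(z * freq))
            row.append(math.cos(z * freq))
        encodings.append(row)
    return encodings


# Two 4-slice acquisitions: 1 mm spacing versus 5 mm spacing.
thin = spacing_aware_encoding([0.0, 1.0, 2.0, 3.0])
thick = spacing_aware_encoding([0.0, 5.0, 10.0, 15.0])
```

The first slice of both scans sits at 0 mm and gets the same encoding, while the second slices (1 mm versus 5 mm) differ, which an index-based encoding could not express.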

Trained on a diverse dataset of kidney CT volumes encompassing normal anatomy and pathological findings (cysts, tumors, and stones), our approach shows particular sensitivity in detecting malignant lesions while maintaining specificity across all classes. The model's computational efficiency enables practical deployment, with inference times suitable for clinical workflows. Comparative evaluations against conventional 3D CNNs and pure transformer architectures reveal significant improvements in handling slice-to-slice variations while preserving spatial detail. This work advances volumetric medical image analysis by demonstrating how hybrid architectures can leverage both local texture patterns and global contextual information in multi-slice diagnostic imaging.
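Sensitivity and specificity per class, as referred to above, can be computed one-vs-rest from true and predicted labels. The sketch below is a plain-Python illustration; the function name is hypothetical and the six labels are toy values invented to exercise the code, not results from this project.

```python
def per_class_sensitivity_specificity(y_true, y_pred, classes):
    """Return {class: (sensitivity, specificity)} using one-vs-rest counts."""
    results = {}
    pairs = list(zip(y_true, y_pred))
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        tn = sum(1 for t, p in pairs if t != c and p != c)
        # Guard against classes with no positive or no negative samples.
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        results[c] = (sensitivity, specificity)
    return results


classes = ["normal", "cyst", "tumor", "stone"]
# Toy labels for illustration only.
y_true = ["tumor", "tumor", "cyst", "normal", "stone", "tumor"]
y_pred = ["tumor", "cyst", "cyst", "normal", "stone", "tumor"]
metrics = per_class_sensitivity_specificity(y_true, y_pred, classes)
```

In this toy example one tumor is misread as a cyst, so tumor sensitivity drops to 2/3 while cyst specificity drops to 4/5; the other values stay at 1.0.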


Keywords: kidney tumor detection, 3D CNN, vision transformer, CT scan classification, volumetric medical imaging, deep learning in radiology, hybrid neural networks, multi-slice analysis, ConvNeXt-Transformer fusion, diagnostic AI.

NOTE: Please do not submit this document to the college without our team's consent. The abstract may vary based on student requirements.

Block Diagram

Specifications

SOFTWARE REQUIREMENTS

Operating System          :  Windows 7/8/10

Front-End Technologies    :  HTML, CSS, Bootstrap & JavaScript

Programming Language      :  Python

Libraries                 :  Flask, Pandas, PyTorch, Keras, scikit-learn, NumPy, Seaborn

IDE/Workbench             :  VS Code

Server Deployment         :  XAMPP Server

Database                  :  MySQL

HARDWARE REQUIREMENTS

Processor                            - Intel Core i3

RAM                                      - 8GB (min)

Hard Disk                             - 128 GB

Keyboard                             - Standard Windows Keyboard

Mouse                                  - Two or Three Button Mouse

Monitor                                - Any

Demo Video