The objective of this project is to develop a real-time system for recognizing unsafe behavior in factories using deep learning, specifically YOLOv8-CBAM and RT-DETR-Swin. The project focuses on detecting smoking in images captured by factory surveillance cameras. The system combines YOLOv8-CBAM for object detection, RT-DETR-Swin for improved multi-object tracking, and Adaptive Fusion for enhanced localization, so that smoking behavior is identified automatically as it occurs. The CBAM attention module also supports visual interpretability: its attention maps highlight the image regions that influence the model's decisions. The goal is an efficient, automated solution that enables rapid response to unsafe behavior and strengthens factory safety management.
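
As an illustration of how the outputs of the two detectors might be combined, the sketch below fuses overlapping boxes by confidence-weighted averaging. This is only one plausible reading of "Adaptive Fusion", which the text above does not define; the names `Detection`, `iou`, `fuse_detections`, and the `iou_thr` parameter are hypothetical and do not come from YOLOv8, RT-DETR, or any library.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple    # (x1, y1, x2, y2) in pixels
    score: float  # detector confidence in [0, 1]


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(dets_a, dets_b, iou_thr=0.5):
    """Hypothetical fusion rule (an assumption, not the project's method):
    pair each box from detector A with its best-overlapping unused box
    from detector B, then average coordinates weighted by confidence.
    Unpaired boxes from either detector are kept unchanged."""
    fused, used_b = [], set()
    for da in dets_a:
        best_j, best_iou = None, iou_thr
        for j, db in enumerate(dets_b):
            if j in used_b:
                continue
            overlap = iou(da.box, db.box)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        total = da.score + dets_b[best_j].score if best_j is not None else 0.0
        if best_j is None or total <= 0:
            fused.append(da)  # no usable partner: keep A's box as-is
            continue
        db = dets_b[best_j]
        used_b.add(best_j)
        box = tuple(
            (da.score * p + db.score * q) / total
            for p, q in zip(da.box, db.box)
        )
        fused.append(Detection(box, max(da.score, db.score)))
    # Keep boxes that only detector B found.
    fused.extend(db for j, db in enumerate(dets_b) if j not in used_b)
    return fused


# Example with made-up boxes: one overlapping pair, one B-only box.
yolo_dets = [Detection((0, 0, 10, 10), 0.75)]
detr_dets = [Detection((2, 0, 12, 10), 0.25), Detection((50, 50, 60, 60), 0.9)]
fused = fuse_detections(yolo_dets, detr_dets)
```

In this sketch the more confident detector pulls the fused box toward its own estimate, while detections seen by only one model are still reported. A real system would also need to handle class labels and tune the overlap threshold on validation data.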