New Deep Learning Model ShadowFormer++ Advances Intelligent Image Shadow Removal

Images captured in real-world environments often contain shadows caused by sunlight, buildings, trees, or artificial lighting. These shadows can hide important details and reduce the clarity of images. While humans can usually distinguish objects under shadows, computer systems often struggle to interpret such images correctly. This creates challenges for technologies such as self-driving vehicles, intelligent surveillance systems, robotics, and medical imaging systems that rely on accurate visual information.

To address this, Dr Rajiv Senapati and Dr Sanjay Kumar, Assistant Professors in the Department of Computer Science and Engineering, SRM AP, along with Ms Stutee Mohanty, a PhD scholar, have developed a framework described in the paper “ShadowFormer++: Multi-Scale Shadow Priors and Diffusion-Guided Refinement for High-Fidelity Shadow Removal”, published in the Q1 journal ‘Scientific Reports’, which has an impact score of 3.9. The research focuses on an artificial intelligence system that automatically removes shadows from images while preserving the natural appearance of the scene. The proposed system, called ShadowFormer++, analyses the relationship between shadowed and illuminated regions in an image and reconstructs the hidden details in a visually realistic manner. Unlike conventional approaches, the framework not only removes shadows but also preserves textures, colors, object boundaries, and structural details. It combines transformer-based learning techniques with diffusion-inspired image refinement strategies to produce cleaner and more natural-looking images. The framework is also computationally efficient, enabling potential deployment in real-time applications such as autonomous navigation systems, smart surveillance cameras, and robotic vision platforms.

Abstract

This research presents ShadowFormer++, a novel deep learning framework for high-fidelity shadow removal in digital images. Shadows often degrade image quality and negatively impact computer vision tasks such as object detection, semantic segmentation, robotics, surveillance, and autonomous navigation. Existing shadow removal techniques frequently suffer from inaccurate shadow localization, loss of fine-grained textures, illumination inconsistencies, and high computational complexity. To address these limitations, the proposed framework integrates transformer-based global contextual learning with diffusion-inspired refinement mechanisms. ShadowFormer++ consists of three key modules: the Multi-Scale Local Shadow Perception Module (MS-LSPM) for extracting shadow priors across multiple receptive fields, the Shadow-Aware Transformer Encoder (SATE) for preserving global structural consistency, and the Diffusion-Inspired Refinement Module (DIRM) for progressive restoration of fine image details. The proposed model was evaluated on benchmark datasets including ISTD, ISTD+, and SRD, where it demonstrated superior quantitative and qualitative performance compared to existing state-of-the-art approaches. Experimental results show improvements in PSNR, SSIM, and MAE metrics while maintaining computational efficiency suitable for real-time applications.
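For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how PSNR and MAE are commonly computed between a shadow-removed image and its shadow-free reference. It is not the authors' evaluation code: the function names `psnr` and `mae`, the flat list representation of pixels, and the 8-bit peak value of 255 are assumptions made for illustration, and the paper's exact protocol (for example, colour space or shadow-region masking) is not described in this article. SSIM is omitted because it requires windowed local statistics.

```python
import math
from typing import Sequence


def _check(restored: Sequence[float], reference: Sequence[float]) -> None:
    # Both inputs are flattened pixel intensities of the same image size.
    if len(restored) != len(reference) or len(restored) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def mae(restored: Sequence[float], reference: Sequence[float]) -> float:
    """Mean absolute error; lower means closer to the reference."""
    _check(restored, reference)
    total = sum(abs(a - b) for a, b in zip(restored, reference))
    return total / len(restored)


def psnr(restored: Sequence[float], reference: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    _check(restored, reference)
    mse = sum((a - b) ** 2 for a, b in zip(restored, reference)) / len(restored)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel example (values assumed, not taken from the paper).
reference_pixels = [100, 150, 200, 250]
restored_pixels = [102, 148, 200, 246]
print(f"MAE:  {mae(restored_pixels, reference_pixels):.2f}")
print(f"PSNR: {psnr(restored_pixels, reference_pixels):.2f} dB")
```

An improvement in these metrics therefore means a lower MAE and a higher PSNR relative to the shadow-free ground truth.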

Practical Implications

The proposed research has significant practical applications across several domains of artificial intelligence, computer vision, and intelligent imaging systems. In autonomous vehicles and robotics, shadowed environments often create difficulties in object detection and scene understanding. By improving image clarity and illumination consistency, ShadowFormer++ can enhance the reliability and safety of autonomous navigation systems. In surveillance and security applications, the proposed framework can improve the visibility of shadow-affected regions in CCTV footage, enabling more accurate monitoring and threat detection. The research also has applications in remote sensing and aerial imaging, where shadows from buildings, terrain, or clouds can interfere with the interpretation of satellite and drone imagery. The proposed framework can improve the quality of such images, supporting urban planning, environmental monitoring, agriculture, and disaster management.

Furthermore, in medical and scientific imaging, improved image restoration techniques can make images easier to interpret in scenarios where illumination inconsistencies obscure important visual information. The framework also demonstrates how advanced deep learning architectures can be designed to balance performance with computational efficiency, making them suitable for real-world deployment on edge devices and in resource-constrained systems.

Future Plans

Our future research plans focus on developing more robust and intelligent image security and enhancement frameworks for real-world applications. We aim to extend this work by integrating deep learning, federated learning, and zero-watermarking techniques for secure image transmission and copyright protection. We also plan to explore hyperspectral and thermal image watermarking for defence and remote sensing applications, with emphasis on robustness against AI-based attacks, privacy preservation, and real-time implementation in large-scale practical systems.