This paper presents a pipeline that integrates state-of-the-art techniques to achieve high-quality cartoon style transfer for educational images and videos. The proposed approach combines the Inversion-based Style Transfer (InST) framework for image and video stylization, the Pre-Trained Image Processing Transformer (IPT) for denoising as a post-processing step, and the Domain-Calibrated Translation Network (DCT-Net) for more temporally consistent video style transfer. By fine-tuning InST on specific cartoon styles, applying IPT to reduce artifacts, and using DCT-Net to improve temporal consistency, the pipeline generates stylized content that is visually appealing and suited to educational use. Experiments on a scenery and monuments dataset show that the proposed approach outperforms the baseline method, AdaAttN, in style transfer accuracy, content preservation, and visual quality. CLIP similarity scores further support the effectiveness of InST in capturing style attributes while preserving semantic content. The proposed pipeline streamlines the creation of engaging educational content, enabling educators and content creators to produce visually appealing and informative materials efficiently.
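As an illustration of the CLIP similarity evaluation mentioned above, the following minimal sketch assumes that the score is the cosine similarity between two embedding vectors, which is the common convention for CLIP-based metrics; the exact protocol used in the experiments is not specified here. The function name `cosine_similarity` and the three-dimensional toy vectors are hypothetical stand-ins for real CLIP image features, which would come from a CLIP image encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (assumption: real ones are CLIP image features).
stylized = [0.2, 0.7, 0.1]       # stylized output
style_ref = [0.25, 0.65, 0.05]   # style reference image
content_ref = [0.9, 0.1, 0.3]    # original content image

style_score = cosine_similarity(stylized, style_ref)
content_score = cosine_similarity(stylized, content_ref)
print(f"style similarity:   {style_score:.3f}")
print(f"content similarity: {content_score:.3f}")
```

Under this assumption, a higher style score indicates that the output is closer to the target style in embedding space, and a higher content score indicates better preservation of the source image's semantics.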