COMPARATIVE ANALYSIS OF CONVOLUTIONAL NEURAL NETWORKS AND VISION TRANSFORMERS FOR SATELLITE IMAGE SEGMENTATION

Authors

  • T.B. Khachatryan, National Polytechnic University of Armenia, "SYNOPSYS ARMENIA" CJSC

Keywords:

satellite image segmentation, Vision Transformer, Convolutional Neural Networks, deep learning, land cover classification

Abstract

The rapid advancement of deep learning has transformed satellite image analysis, with both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) showing promising results. This study presents a comparative analysis of these architectures for land cover segmentation in satellite imagery, addressing the need to understand performance-efficiency trade-offs in real-world deployment scenarios. Four state-of-the-art models were fine-tuned and evaluated: ResNet50 and EfficientNet-B0 representing CNNs, and ViT-B/16 and Swin Transformer (Swin-T) representing the transformer family. Using the DeepGlobe Land Cover Classification dataset, which contains 803 high-resolution satellite images with seven land cover classes, model performance was assessed with mean Intersection over Union (mIoU), F1-score, and pixel accuracy. In addition, computational requirements (inference speed, model size, and memory consumption) were measured on an NVIDIA RTX 4070 GPU to simulate practical deployment constraints. The results show that Vision Transformers achieve higher segmentation accuracy, with Swin-T reaching 74.2% mIoU compared to 71.8% for EfficientNet-B0, while CNNs retain clear advantages in inference speed and memory efficiency: EfficientNet-B0 processes images 2.3 times faster than ViT-B/16 and uses 40% less GPU memory. Class-wise analysis reveals that transformers perform particularly well in complex scenes such as urban areas and forest boundaries, while all models achieve over 89% IoU for water body segmentation. These findings provide practical guidance for selecting an appropriate architecture under specific deployment constraints, and they highlight the trade-off between accuracy and computational efficiency in satellite image analysis for environmental monitoring, urban planning, and disaster response.
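As an illustration of the three accuracy metrics named in the abstract, the following is a minimal sketch of how pixel accuracy, mIoU, and F1-score can be derived from a single confusion matrix. It is not the authors' evaluation code. The function names `confusion_matrix` and `segmentation_metrics` are hypothetical, the F1-score is assumed to be macro-averaged over classes, and classes that appear in neither the labels nor the predictions are assumed to be skipped; the abstract does not specify either choice.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels per (ground-truth class, predicted class) pair.

    Rows index the ground-truth class, columns the predicted class.
    Inputs are flattened label maps of equal length.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Pixel accuracy, mean IoU, and macro F1 from a confusion matrix.

    Assumption (not stated in the abstract): classes with no ground-truth
    and no predicted pixels are excluded from the class averages.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    if total == 0:
        raise ValueError("confusion matrix contains no pixels")

    correct = sum(cm[c][c] for c in range(n))
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                          # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp     # predicted c, true other
        if tp + fp + fn == 0:
            continue  # class absent everywhere: skip it
        ious.append(tp / (tp + fp + fn))              # IoU = TP / (TP + FP + FN)
        f1s.append(2 * tp / (2 * tp + fp + fn))       # F1 = 2TP / (2TP + FP + FN)

    return {
        "pixel_accuracy": correct / total,
        "miou": sum(ious) / len(ious),
        "macro_f1": sum(f1s) / len(f1s),
    }


if __name__ == "__main__":
    # Toy example with three classes and six pixels (made-up labels).
    truth = [0, 0, 1, 1, 2, 2]
    preds = [0, 1, 1, 1, 2, 0]
    print(segmentation_metrics(confusion_matrix(truth, preds, num_classes=3)))
```

In practice the confusion matrix would be accumulated over every pixel of every test image before the metrics are computed, so that large and small images contribute in proportion to their pixel counts.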

Published

21.02.2026

Section

Articles