SmartSpatial: Enhancing 3D Spatial Awareness in Stable Diffusion with a Novel Evaluation Framework
Mao Xun Huang, Brian J Chan, Hen-Hsen Huang
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
AI, Arts & Creativity. Pages 10099-10107.
https://doi.org/10.24963/ijcai.2025/1122
Stable Diffusion models have made remarkable strides in generating photorealistic images from text prompts, but they often falter when asked to represent complex spatial arrangements accurately, particularly those involving intricate 3D relationships.
To address this limitation, we introduce SmartSpatial, an innovative approach that not only enhances the spatial arrangement capabilities of Stable Diffusion but also fosters AI-assisted creative workflows through 3D-aware conditioning and attention-guided mechanisms.
SmartSpatial incorporates depth information injection and cross-attention control to ensure precise object placement, delivering notable improvements in spatial accuracy metrics.
In conjunction with SmartSpatial, we present SmartSpatialEval, a comprehensive evaluation framework that bridges computational spatial accuracy with qualitative artistic assessments.
Experimental results show that SmartSpatial significantly outperforms existing methods, setting new benchmarks for spatial fidelity in AI-driven art and creativity.
Keywords:
Application domains: Images, movies and visual arts
Theory and philosophy of arts and creativity in AI systems: Computational paradigms, architectures and models for creativity
Theory and philosophy of arts and creativity in AI systems: Evaluation and curation of artistic or creative artefacts
Methods and resources: Machine learning, deep learning, neural models, reinforcement learning
