MOVIS: Modular Visual Intelligence System for Human Profiling & Contextual Analysis
Authors:
Swayam Gajanan Kamble (Vishwakarma Institute of Information Technology, Pune)
Vedant Shingade (Vishwakarma Institute of Information Technology, Pune)
Samiya Patel (Vishwakarma Institute of Information Technology, Pune)
Dr. Priya Shelke (Vishwakarma Institute of Information Technology, Pune)
Abstract

We introduce MOVIS, a comprehensive and modular visual intelligence framework aimed at detailed human profiling and contextual scene analysis. The system incorporates deep learning-based components for facial identification, demographic inference, image description generation, and context augmentation using external knowledge sources such as Wikipedia. Designed for real-time operation, MOVIS features adaptive learning mechanisms that allow it to incorporate user feedback for recognizing unfamiliar individuals. Linking detected faces with biographical insights from online encyclopedic sources enhances the clarity and relevance of its outputs. MOVIS demonstrates its utility in domains like surveillance, AI-driven personal assistants, and interactive systems that are aware of individual identities. Performance is assessed both at the module level and as a unified pipeline, showing strong results in accuracy and contextual interpretation across diverse visual scenarios.
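
The identify, enroll, and enrich flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `FaceGallery`, `profile`, `lookup_bio`, and `ask_user`, the cosine-similarity matching, and the 0.8 threshold are all assumptions made for illustration. Face embeddings are taken as given (a real system would obtain them from a face-recognition model), and the biography lookup is a plain callable standing in for a query to an encyclopedic source.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class FaceGallery:
    """Hypothetical store of known identities and their embeddings."""
    threshold: float = 0.8  # assumed similarity cutoff, not from the paper
    known: Dict[str, List[float]] = field(default_factory=dict)

    def identify(self, embedding: List[float]) -> Optional[str]:
        """Return the best-matching known name, or None if below threshold."""
        best_name, best_score = None, self.threshold
        for name, ref in self.known.items():
            score = cosine(embedding, ref)
            if score >= best_score:
                best_name, best_score = name, score
        return best_name

    def enroll(self, name: str, embedding: List[float]) -> None:
        """Add a new identity supplied through user feedback."""
        self.known[name] = embedding


def profile(
    embedding: List[float],
    gallery: FaceGallery,
    lookup_bio: Callable[[str], Optional[str]],
    ask_user: Optional[Callable[[], Optional[str]]] = None,
) -> Dict[str, Optional[str]]:
    """Identify a face, enroll it from feedback if unknown, then attach a bio."""
    name = gallery.identify(embedding)
    if name is None and ask_user is not None:
        name = ask_user()  # user feedback for an unfamiliar individual
        if name:
            gallery.enroll(name, embedding)
    bio = lookup_bio(name) if name else None
    return {"name": name, "bio": bio}
```

In this sketch, an unfamiliar face is labeled once through `ask_user`, after which sufficiently similar embeddings resolve to the same name without further feedback.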

Published in: NCAIDT 2025 Proceedings
DOI: 10.63169/NCAIDT2025.p7
Paper ID: NCAIDT2025-0416