Enhancing the Socioeconomic Understanding of Foundation Models with Urban Mobility
2026-06-01 • Social and Information Networks
AI summary
The authors studied how big computer models, called foundation models, predict things about cities like income or crime. Usually, these models look at places as if they were standing still and don't consider how people move between them. The authors created MobFusion, which adds information about how people travel in cities to help these models understand connections between places better. They tested this in three cities with real movement data and found that including human mobility made the predictions more accurate.
foundation models, urban socioeconomic prediction, mobility networks, large language models (LLM), multimodal learning, geospatial data, satellite imagery, point of interest (POI), graph embeddings, zero-shot learning
Authors
Baoshen Guo, Donghang Li, Zhiqing Hong, Kailai Sun, Heye Huang, Alok Prakash, Shenhao Wang
Abstract
Foundation models have recently been applied to urban socioeconomic prediction using POI text, satellite imagery, and geospatial descriptions. However, these models mostly rely on static attributes of individual places and ignore the mobility patterns that reveal how places are functionally connected. To address this gap, we explore whether mobility networks can elicit the geospatial capabilities of foundation models by explicitly encoding connectivity among urban entities. We propose MobFusion, a modular mobility-enhanced foundation model fusion paradigm, and instantiate it through three complementary designs that use mobility networks (i) as contexts for zero-shot LLM prompting, (ii) as graph connectors for fusing geospatial visual embeddings with textual embeddings, and (iii) as structured tokens for multimodal LLM reasoning. Using anonymized large-scale mobility datasets from three U.S. metropolitan areas, we find that MobFusion improves urban prediction tasks (e.g., median household income, population density, and crime prediction) across all three instantiations, demonstrating that incorporating human mobility can effectively improve the socioeconomic understanding of foundation models.
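To make design (i) concrete, here is a minimal sketch of how a mobility network might be serialized into context for a zero-shot LLM prompt. It is an illustration under stated assumptions, not the authors' implementation: the origin-destination dictionary `flows`, the helpers `top_destinations` and `build_prompt`, the region labels, and the prompt wording are all hypothetical, and the abstract does not specify how MobFusion formats its prompts.

```python
# Illustrative sketch only: the data layout, function names, and prompt
# wording are assumptions, not taken from the MobFusion paper.

def top_destinations(flows, origin, k=3):
    """Return the k destinations with the largest outbound flow from origin."""
    outbound = [
        (dst, trips)
        for (src, dst), trips in flows.items()
        if src == origin and dst != origin
    ]
    # Sort by trip count (descending), breaking ties by destination name.
    outbound.sort(key=lambda item: (-item[1], item[0]))
    return outbound[:k]


def build_prompt(region, description, flows, k=3):
    """Combine a static place description with mobility context."""
    lines = [f"Region {region}: {description}"]
    total = sum(
        trips
        for (src, dst), trips in flows.items()
        if src == region and dst != region
    )
    top = top_destinations(flows, region, k)
    if top:
        lines.append("Most frequent trip destinations from this region:")
        for dst, trips in top:
            share = 100 * trips / total
            lines.append(f"- {dst}: {trips} trips ({share:.0f}% of outbound trips)")
    lines.append("Estimate the median household income of this region in USD.")
    return "\n".join(lines)


# Toy origin-destination trip counts between hypothetical regions.
flows = {("A", "B"): 120, ("A", "C"): 60, ("A", "D"): 20, ("B", "A"): 90}
prompt = build_prompt("A", "dense residential area with many cafes", flows, k=2)
print(prompt)
```

The resulting string could be sent to any LLM as a zero-shot query. The point of the sketch is only that the static description is augmented with the strongest mobility links, so the model sees how the region connects to others rather than the region in isolation.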