Talk Abstract:
This talk presents frontier research on controllable multimodal generation and knowledge discovery in Earth system big data. We explain how neural architecture designs, particularly residual relation attention mechanisms and cross-modal alignment frameworks, improve a model's ability to interpret ambiguous scientific instructions and generate domain-consistent outputs.
First, we introduce our work on AIGC-assisted creative design: a cheongsam generation platform that outperforms mainstream text-to-image foundation models (e.g., Midjourney) on domain-specific evaluation metrics, demonstrating the effectiveness of fine-tuned multimodal alignment for specialized tasks. Second, we share BotDetect (https://botdetection.aminer.cn/robotmain), a widely adopted social bot detection platform that ranks first in Google search results; its adoption supports our graph neural network approaches for large-scale heterogeneous network analysis.
Critically, we propose a methodological transfer framework that adapts social network AI techniques, including anomalous subgraph detection, multimodal fusion, and interpretable representation learning, to Earth system science challenges. Potential applications include: (1) multimodal foundation models for integrating remote sensing and climate data; (2) spatiotemporal graph networks for modeling atmosphere-land-ocean coupling processes; (3) self-supervised pretraining strategies for label-scarce geoscience scenarios.
This lecture offers theoretical perspectives and empirical case studies for researchers advancing cross-media intelligent processing in Earth system dynamics, with implications for carbon neutrality monitoring, extreme event attribution, and human-environment coupling analysis.
Speaker Bio:
Dr. Ming Zhou holds a Ph.D. in Computer Science from Tsinghua University, where he was advised by Prof. Jie Tang. His research focuses on large-scale model training and inference, multimodal data mining, and graph neural networks. Prior to his doctoral studies, he worked at Baidu’s Speech Product Innovation Lab, contributing to the human-computer interaction system for the Xiaodu Smart Speaker. He led the development of a far-field intelligent voice system and proposed a Quality-of-Service (QoS)-aware Forward Error Correction (FEC) algorithm, which has been successfully deployed in Baidu’s smart voice ecosystem.
Since joining Tsinghua CS in 2019, Dr. Zhou has published extensively in top-tier venues such as IEEE TKDE, ACM CIKM, and ECML PKDD. As first author, he received the Best Student Paper Award at ECML PKDD 2023. Drawing on his expertise in foundation models and heterogeneous data analysis, he has developed and deployed two notable systems: (1) an AIGC-assisted Cheongsam Design System, which in experimental validation outperforms leading text-to-image foundation models (e.g., Midjourney) in both accuracy and diversity for domain-specific generation; and (2) BotDetect, a social bot detection platform currently ranked #1 in Google search results and successfully integrated into China’s National Key R&D Program.