Jie Cai

PhD Student in Computer Science | University of Southern California

Researching Large Vision-Language Models, Multimodal Learning, and Mechanistic Interpretability

About Me

I'm a second-year PhD student at the GLAMOR Lab in the Thomas Lord Computer Science Department at the University of Southern California. I am fortunate to be advised by Professor Jesse Thomason.

My research interests span Large Vision-Language Models (VLMs), multimodal learning, and interpretability. I am particularly interested in how VLMs perceive and reason in real-world 3D environments, and in developing interpretability methods that uncover why some VLMs exhibit stronger spatial reasoning abilities while others struggle with specific spatial tasks. I'm also interested in how Multimodal Large Language Models integrate diverse modalities to form coherent representations that support complex reasoning.

Research Interests

  • Large Vision-Language Models
  • Multimodal Learning
  • Interpretability & XAI

Education

PhD in Computer Science (In Progress)

University of Southern California

Advisor: Prof. Jesse Thomason | Focus: Large Vision-Language Models and Multimodal Learning

Master's Degree in Computer Science

Tsinghua University

Advisors: Prof. Wenwu Zhu, Prof. Xin Wang

Bachelor's Degree in Mathematics and Applied Mathematics

South China University of Technology

Publications

Wen Ye, Wei Yang, Defu Cao, Yizhou Zhang, Lumingyuan Tang, Jie Cai, Yan Liu
TMLR, 2026
Jie Cai, Xin Wang, Haoyang Li, Ziwei Zhang, Wenwu Zhu
AAAI, 2024
Ziwei Zhang, Yijian Qin, Zeyang Zhang, Chaoyu Guan, Jie Cai, Heng Chang, Jiyan Jiang, Haoyang Li, Zixin Sun, Beini Xie, Yang Yao, Yipeng Zhang, Xin Wang, Wenwu Zhu
arXiv, 2024
Heng Chang, Jie Cai, Jia Li
WWW, 2023
Jie Cai, Xin Wang, Chaoyu Guan, Yateng Tang, Jin Xu, Bin Zhong, Wenwu Zhu
WWW, 2022

Get In Touch

I'm always happy to discuss research opportunities and collaborations, or just to talk about AI and LLMs!

Feel free to reach out via email or connect with me on the platforms below.