Sara Ghazanfari

Sara Ghazanfari is a Ph.D. candidate in the Department of Electrical and Computer Engineering at the NYU Tandon School of Engineering, where she is advised by Professors Siddharth Garg and Farshad Khorrami. Since beginning her doctoral studies in January 2023, her research has focused on advancing the visual understanding capabilities of multimodal large language models (MLLMs), with an emphasis on enhancing spatio-temporal reasoning in video LLMs and improving visual alignment through more effective multimodal representation adaptation. Broadly, her goal is to make MLLMs more capable, interpretable, and robust across diverse visual understanding and reasoning tasks. In the summer of 2025, Sara served as a Research Scientist at Adobe, where she worked on strengthening the visual understanding capabilities of generative models for image synthesis. More recently, her work has extended toward unified large multimodal models that integrate both visual understanding and generation within a single framework to better assess and advance their holistic visual intelligence. Through this line of research, she aims to contribute to the broader goal of building more general and reliable visual reasoning systems. Her work has been published at top-tier venues, including the International Conference on Machine Learning, the International Conference on Learning Representations, the Computer Vision and Pattern Recognition Conference, and the Conference on Neural Information Processing Systems.

Ph.D. Candidate
