Hi, I am Jehanzeb Mirza. I am a Postdoctoral Researcher at MIT CSAIL in the Spoken Language Systems Group, led by Dr. James Glass. I received my PhD in Computer Science (Computer Vision) from TU Graz, Austria, where I was advised by Professor Horst Bischof; Professor Serge Belongie served as an external referee. I completed my Master's in Electrical Engineering and Information Technology (ETIT) at KIT, Germany, and received my Bachelor's in Electrical Engineering (EE) from NUST, Pakistan.
I am particularly interested in self-supervised learning for uni-modal models and multi-modal learning for vision-language models, with a focus on improving fine-grained understanding. I am actively looking for collaborators in the area of multi-modal learning. Please do not hesitate to write me an email, even if you just want an opinion on your work! :)
jmirza [at] mit.edu
Boston, USA.
Ph.D. in Computer Vision (2021 - 2024)
TU Graz, Austria.
MS in ETIT (2017 - 2020)
KIT, Germany.
BS in EE (2013 - 2017)
NUST, Pakistan.
Teaching VLMs to Localize Specific Objects from In-context Examples
arXiv 2024
[ Paper | Project Page | Code ]
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models
arXiv 2024
[ Paper | Project Page | Code ]
ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs
*Irene Huang, *Wei Lin, *M. Jehanzeb Mirza, Jacob Hansen, Sivan Doveh, Victor Ion Butoi, Roei Herzig, Assaf Arbelle, Hilde Kuehne, Trevor Darrell, Chuang Gan, Aude Oliva, Rogerio Feris, Leonid Karlinsky (*Equal Contribution)
NeurIPS 2024
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs
ECCV 2024
[ Paper | Project Page | Code ]
Are Vision Language Models Texture or Shape Biased and Can We Steer Them?
arXiv Preprint 2024
[ Paper ]
LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections
NeurIPS 2023
[ Paper | Project Page | Code ]
MATE: Masked Autoencoders are Online 3D Test-Time Learners
ICCV 2023
[ Paper | Project Page | Code ]
ActMAD: Activation Matching to Align Distributions for Test-Time-Training
CVPR 2023
[ Paper | Project Page | Code ]
An Efficient Domain-Incremental Learning Approach to Drive in All Weather Conditions
CVPRW 2022
[ Paper ]