Hi, I am Jehanzeb Mirza. I am a Postdoctoral Researcher at MIT CSAIL in the Spoken Language Systems Group, led by Dr. James Glass. I received my PhD in Computer Science (Computer Vision) from TU Graz, Austria, where I was advised by Professor Horst Bischof; Professor Serge Belongie served as an external referee. I completed my Master's in Electrical Engineering and Information Technology (ETIT) at KIT, Germany, and received my Bachelor's in Electrical Engineering (EE) from NUST, Pakistan.
My PhD research focused mainly on learning from unlabeled data. During my PhD, I worked with various data modalities, including images, point clouds, videos, radar signals, and, most recently, natural language. I am particularly interested in self-supervised learning for uni-modal models and in multi-modal learning for vision-language models.
jmirza [at] mit.edu
Boston, USA.
Ph.D. in Computer Vision (2021 - 2024)
TU Graz, Austria.
MS in ETIT (2017 - 2020)
KIT, Germany.
BS in EE (2013 - 2017)
NUST, Pakistan.
ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs
*Irene Huang, *Wei Lin, *M. Jehanzeb Mirza, Jacob Hansen, Sivan Doveh, Victor Ion Butoi, Roei Herzig, Assaf Arbelle
NeurIPS 2024
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs
ECCV 2024
[ Paper | Project Page | Code ]
Are Vision Language Models Texture or Shape Biased and Can We Steer Them?
ArXiv Preprint 2024
[ Paper ]
TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models
ArXiv Preprint 2024
[ Paper ]
LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections
NeurIPS 2023
[ Paper | Project Page | Code ]
MATE: Masked Autoencoders are Online 3D Test-Time Learners
ICCV 2023
[ Paper | Project Page | Code ]
ActMAD: Activation Matching to Align Distributions for Test-Time-Training
CVPR 2023
[ Paper | Project Page | Code ]
An Efficient Domain-Incremental Learning Approach to Drive in All Weather Conditions
CVPRW 2022
[ Paper ]