Test-Time Training (TTT) is an approach to cope with out-of-distribution (OOD) data by adapting a trained model to distribution shifts occurring at test time. We propose to perform this adaptation via Activation Matching (ActMAD): We analyze activations of the model and align activation statistics of the OOD test data to those of the training data. In contrast to existing methods, which model the distribution of entire channels in the ultimate layer of the feature extractor, we model the distribution of each feature in multiple layers across the network. This results in more fine-grained supervision and makes ActMAD attain state-of-the-art performance on CIFAR-100C and ImageNet-C. ActMAD is also architecture- and task-agnostic, which lets us go beyond image classification and score a 15.4% improvement over previous approaches when evaluating a KITTI-trained object detector on KITTI-Fog. Our experiments highlight that ActMAD can be applied to online adaptation in realistic scenarios, requiring little data to attain its full performance.
Given a pre-trained model and statistics of the clean activations from the training data, ActMAD aligns the activation responses of the shifted test data to the clean activations at test time. We model the activation distributions in terms of the mean and variance of each individual activation, so the statistics have the same shape as the feature maps. The statistics of the training activations are pre-computed on the training set, or computed on unlabelled data without distribution shift.
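The alignment described above can be sketched as a simple discrepancy loss. This is a minimal illustration, not the paper's implementation: it assumes the test-time objective is an L1 distance between the per-location batch statistics of the test activations and the pre-computed clean statistics, summed over the selected layers; the function name and tensor shapes are illustrative assumptions.

```python
import numpy as np

def actmad_loss(test_acts, clean_means, clean_vars):
    """Hypothetical sketch of the ActMAD alignment objective.

    test_acts   : list of activation tensors, one per aligned layer,
                  each of shape (batch, C, H, W)
    clean_means : per-layer clean means of shape (C, H, W)
    clean_vars  : per-layer clean variances of shape (C, H, W)
    """
    loss = 0.0
    for a, mu, var in zip(test_acts, clean_means, clean_vars):
        # Statistics are taken over the batch dimension only, so they
        # keep the (C, H, W) feature-map shape, giving per-activation
        # (rather than per-channel) supervision.
        loss += np.abs(a.mean(axis=0) - mu).mean()
        loss += np.abs(a.var(axis=0) - var).mean()
    return loss
```

In an adaptation loop, this loss would be minimized with respect to the model parameters on each incoming test batch; here the NumPy version only demonstrates the statistic matching itself.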
@InProceedings{mirza2023actmad,
author = {Mirza, M. Jehanzeb and Soneira, Pol Jane and Lin, Wei and Kozinski, Mateusz and Possegger, Horst and Bischof, Horst},
title = {ActMAD: Activation Matching to Align Distributions for Test-Time Training},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2023}
}