This repository contains the PyTorch implementation for the CVPR 2024 paper *JoAPR: Cleaning the Lens of Prompt Learning for Vision-Language Models*.
This code is built on top of CoOp. Please follow its instructions to configure the runtime environment. Many thanks to the authors for their contributions!
Follow CoOp's instructions to install the datasets. Note that the Food101N dataset must be downloaded separately; it uses the same test set as Food101.
We provide the running scripts in `scripts/joapr`. You can adjust the hyperparameters in the config file `configs/trainer/rn50.yaml`. Below we provide examples of how to run JoAPR on Caltech101.
JoAPR (Caltech101, Symflip):

- 4 FP: `bash scripts/joapr/main.sh caltech101 4 symflip`
- 8 FP: `bash scripts/joapr/main.sh caltech101 8 symflip`
- 12 FP: `bash scripts/joapr/main.sh caltech101 12 symflip`

JoAPR (Caltech101, Pairflip):

- 4 FP: `bash scripts/joapr/main.sh caltech101 4 pairflip`
- 8 FP: `bash scripts/joapr/main.sh caltech101 8 pairflip`
- 12 FP: `bash scripts/joapr/main.sh caltech101 12 pairflip`
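In each case the script takes three positional arguments: the dataset name, the number of noisy (FP) labels, and the noise type; with 16 shots, 4/8/12 FP appear to correspond to the 25%/50%/75% noise ratios in the table below. If you want to sweep several settings in one go, a minimal sketch (assuming the script interface stays exactly as in the examples above) could look like:

```python
# Hypothetical sweep helper (not part of the repo): launches main.sh for
# every combination of dataset, FP count, and noise type. Assumes the
# interface is exactly `main.sh <dataset> <#FP> <noise_type>`.
import itertools
import subprocess

datasets = ["caltech101"]
fp_counts = [4, 8, 12]                 # noisy labels per class
noise_types = ["symflip", "pairflip"]

for ds, fp, noise in itertools.product(datasets, fp_counts, noise_types):
    subprocess.run(
        ["bash", "scripts/joapr/main.sh", ds, str(fp), noise],
        check=True,  # stop the sweep if a run fails
    )
```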
To calculate the average results for the folder `rn50_16shots_4FP_symflip/nctx16_cscFalse_ctpend/`, you can run:

`python parse_test_res.py output/caltech101/rn50_16shots_4FP_symflip/nctx16_cscFalse_ctpend`
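`parse_test_res.py` aggregates the results of all runs under that folder. If you want to compute the average yourself, the sketch below shows the idea; it assumes CoOp-style output (one `seed*/log.txt` per run whose last accuracy line holds the final test result), so adjust the glob and regex if your logs differ:

```python
# Minimal aggregation sketch; parse_test_res.py is the authoritative tool.
import re
from pathlib import Path
from statistics import mean

def average_accuracy(run_dir: str) -> float:
    accs = []
    for log in sorted(Path(run_dir).glob("seed*/log.txt")):
        matches = re.findall(r"accuracy:\s*([\d.]+)%", log.read_text())
        if matches:
            accs.append(float(matches[-1]))  # last match = final test result
    if not accs:
        raise FileNotFoundError(f"no accuracy lines found under {run_dir}")
    return mean(accs)

print(average_accuracy(
    "output/caltech101/rn50_16shots_4FP_symflip/nctx16_cscFalse_ctpend"))
```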
Hyperparameter settings (warmup epochs, α₁, α₂) for each dataset under every noise ratio are listed below, split by noise type.

Symflip:

| Dataset | Config | 12.5% | 25.0% | 37.5% | 50.0% | 62.5% | 75.0% |
|---|---|---|---|---|---|---|---|
| ImageNet | Warmup epochs | 1 | 1 | 1 | 1 | 1 | 1 |
| | α₁ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| SUN397 | Warmup epochs | 1 | 1 | 1 | 1 | 1 | 1 |
| | α₁ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| Caltech101 | Warmup epochs | 1 | 1 | 1 | 1 | 1 | 1 |
| | α₁ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| | α₂ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| Flowers102 | Warmup epochs | 20 | 20 | 20 | 20 | 20 | 1 |
| | α₁ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| StanfordCars | Warmup epochs | 1 | 1 | 1 | 1 | 1 | 1 |
| | α₁ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| FGVCAircraft | Warmup epochs | 150 | 150 | 150 | 150 | 150 | 150 |
| | α₁ | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.5 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| OxfordPets | Warmup epochs | 1 | 1 | 1 | 1 | 1 | 1 |
| | α₁ | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| UCF101 | Warmup epochs | 20 | 20 | 20 | 20 | 20 | 1 |
| | α₁ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| EuroSAT | Warmup epochs | 10 | 10 | 10 | 1 | 1 | 1 |
| | α₁ | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.1 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| DTD | Warmup epochs | 1 | 1 | 1 | 1 | 1 | 1 |
| | α₁ | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |

Pairflip:

| Dataset | Config | 12.5% | 25.0% | 37.5% | 50.0% | 62.5% | 75.0% |
|---|---|---|---|---|---|---|---|
| ImageNet | Warmup epochs | 1 | 1 | 1 | 1 | 1 | 1 |
| | α₁ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| SUN397 | Warmup epochs | 1 | 1 | 1 | 1 | 1 | 1 |
| | α₁ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| Caltech101 | Warmup epochs | 1 | 1 | 1 | 1 | 1 | 1 |
| | α₁ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| | α₂ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| Flowers102 | Warmup epochs | 20 | 20 | 20 | 1 | 1 | 1 |
| | α₁ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| StanfordCars | Warmup epochs | 1 | 1 | 1 | 1 | 1 | 1 |
| | α₁ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| FGVCAircraft | Warmup epochs | 150 | 150 | 150 | 150 | 150 | 150 |
| | α₁ | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.5 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| OxfordPets | Warmup epochs | 1 | 1 | 1 | 1 | 1 | 1 |
| | α₁ | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| UCF101 | Warmup epochs | 20 | 20 | 1 | 1 | 1 | 1 |
| | α₁ | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| EuroSAT | Warmup epochs | 10 | 10 | 10 | 10 | 1 | 1 |
| | α₁ | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.1 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| DTD | Warmup epochs | 1 | 1 | 1 | 1 | 1 | 1 |
| | α₁ | 0.2 | 0.2 | 0.2 | 0.2 | 0.2 | 0.1 |
| | α₂ | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
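If you prefer to apply a row of this table programmatically rather than editing `configs/trainer/rn50.yaml` by hand, a sketch along the following lines could work. The key names `WARMUP_EPOCH`, `ALPHA1`, and `ALPHA2` are placeholders, not necessarily the repo's actual config keys; check the YAML file for the real names:

```python
# Hypothetical sketch: write one row of the hyperparameter table into the
# trainer config before launching a run. Key names are assumptions.
import yaml

path = "configs/trainer/rn50.yaml"
with open(path) as f:
    cfg = yaml.safe_load(f) or {}

# Example row: EuroSAT, symflip, 50.0% noise ratio.
cfg.setdefault("TRAINER", {}).setdefault("JOAPR", {}).update(
    {"WARMUP_EPOCH": 1, "ALPHA1": 0.2, "ALPHA2": 1.0}
)

with open(path, "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)
```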
If you use this code in your research, please cite the following paper:
```bibtex
@inproceedings{guo2024joapr,
  title={JoAPR: Cleaning the Lens of Prompt Learning for Vision-Language Models},
  author={Guo, Yuncheng and Gu, Xiaodong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={28695--28705},
  year={2024}
}
```