This project provides a benchmark framework for evaluating Vision-Language Models (VLMs) on emotion understanding, emotion reasoning and emotion-guided content generation tasks. It is designed for standardized evaluation across multiple datasets and task formulations.
Install the minimal runtime environment:
```bash
# Install in editable mode (recommended for CLI use)
pip install -e .

# Or install the dependencies the traditional way
pip install -r requirements.txt
```
To contribute or extend this project, follow the development setup below:
```bash
# 1. Create and activate a virtual environment (recommended)
conda create -n aica-vlm python
conda activate aica-vlm

# 2. Install core and dev dependencies
pip install -r requirements.txt -r requirements-dev.txt

# 3. Set up pre-commit hooks
pre-commit install
```
Run pre-commit on all files:
```bash
pre-commit run --all-files
```
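pre-commit reads its hook configuration from .pre-commit-config.yaml at the repository root. The sketch below is purely illustrative (the hooks and revisions shown are assumptions, not necessarily what this repository pins); the committed config file is authoritative:

```yaml
# Illustrative .pre-commit-config.yaml -- hypothetical hooks; the repo's real config may differ.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4
    hooks:
      - id: ruff
```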
Once installed, use the aica-vlm CLI to run dataset construction, instruction generation, and benchmarking.
```bash
aica-vlm build-dataset run benchmark_datasets/example.yaml --mode random
```
- --mode: random (default) or balanced sampling; a sketch of the YAML config is shown below.
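The YAML file passed to build-dataset describes the dataset to assemble. The sketch below is only a hypothetical illustration (the keys dataset, source_dir, labels, num_samples, and output_dir are assumptions, not the project's actual schema); consult benchmark_datasets/example.yaml for the real format:

```yaml
# Hypothetical dataset config -- field names are illustrative only,
# not the schema actually used by aica-vlm; see benchmark_datasets/example.yaml.
dataset: emotion_understanding_demo
source_dir: data/raw_images
labels: [happy, sad, angry, surprised, neutral]
num_samples: 500        # total images to sample
output_dir: benchmark_datasets/emotion_understanding_demo
```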
```bash
# For base instruction generation
aica-vlm build-instruction run benchmark_datasets/example.yaml

# For Chain-of-Thought (CoT) instruction generation
aica-vlm build-instruction run-cot benchmark_datasets/example_CoT.yaml
```
Finally, run the benchmark evaluation:
```bash
aica-vlm benchmark benchmark_datasets/example.yaml
```
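Putting the pieces together, a full run chains the three commands above on the example configs shipped with the repository:

```bash
# End-to-end example: build the dataset, generate instructions, then benchmark
aica-vlm build-dataset run benchmark_datasets/example.yaml --mode balanced
aica-vlm build-instruction run benchmark_datasets/example.yaml
aica-vlm benchmark benchmark_datasets/example.yaml
```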