Latest commit

b1238b9 · Jan 24, 2025

InternImage for Semantic Segmentation

This folder contains the implementation of the InternImage for semantic segmentation.

Our segmentation code is developed on top of MMSegmentation v0.27.0.

Installation

  • Clone this repository:
git clone https://github.com/OpenGVLab/InternImage.git
cd InternImage
  • Create a conda virtual environment and activate it:
conda create -n internimage python=3.9
conda activate internimage

For example, to install torch==1.11 with CUDA==11.3:

pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113  -f https://download.pytorch.org/whl/torch_stable.html
  • Install other requirements:

    Note: installing OpenCV through conda can break torchvision's GPU support, so install OpenCV with pip instead.

conda install -c conda-forge termcolor yacs pyyaml scipy pip -y
pip install opencv-python
  • Install timm, mmcv-full, mmsegmentation, and mmdet:
pip install -U openmim
mim install mmcv-full==1.5.0
mim install mmsegmentation==0.27.0
pip install timm==0.6.11 mmdet==2.28.1
  • Install the remaining requirements and pin numpy and pydantic:
pip install opencv-python termcolor yacs pyyaml scipy
# Please use a version of numpy lower than 2.0
pip install numpy==1.26.4
pip install pydantic==1.10.13
  • Compile CUDA operators

Before compiling, please use the nvcc -V command to check whether your nvcc version matches the CUDA version of PyTorch.

cd ./ops_dcnv3
sh ./make.sh
# unit test (every check should report True)
python test.py
  • Alternatively, install the operator from the precompiled .whl files: DCNv3-1.0-whl
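
The nvcc check recommended before compiling can be scripted. The following is a minimal sketch, not part of this repository: the helper names `nvcc_release` and `cuda_versions_match` are hypothetical, and it assumes `nvcc -V` prints a line containing `release <major>.<minor>` and that PyTorch exposes its CUDA version as `torch.version.cuda`.

```shell
# Hypothetical helper: extract "major.minor" from `nvcc -V` output on stdin.
nvcc_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed only when both versions are non-empty and equal.
cuda_versions_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Run the comparison only when both tools are available.
if command -v nvcc >/dev/null 2>&1 && command -v python >/dev/null 2>&1; then
  nvcc_ver=$(nvcc -V | nvcc_release)
  # Empty string if torch is missing or was built without CUDA support.
  torch_ver=$(python -c "import torch; print(torch.version.cuda or '')" 2>/dev/null) || torch_ver=""
  if cuda_versions_match "$nvcc_ver" "$torch_ver"; then
    echo "OK: nvcc and PyTorch both use CUDA $nvcc_ver"
  else
    echo "Mismatch: nvcc=$nvcc_ver, PyTorch=$torch_ver"
  fi
fi
```

If the script reports a mismatch, it is probably worth fixing the toolkit or the PyTorch build before running make.sh.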

Data Preparation

Prepare datasets according to the guidelines in MMSegmentation.

Released Models

Dataset: ADE20K
method       backbone        resolution  mIoU (ss/ms)  #params  FLOPs  Config  Download
UperNet      InternImage-T   512x512     47.9 / 48.1   59M      944G   config  ckpt | log
UperNet      InternImage-S   512x512     50.1 / 50.9   80M      1017G  config  ckpt | log
UperNet      InternImage-B   512x512     50.8 / 51.3   128M     1185G  config  ckpt | log
UperNet      InternImage-L   640x640     53.9 / 54.1   256M     2526G  config  ckpt | log
UperNet      InternImage-XL  640x640     55.0 / 55.3   368M     3142G  config  ckpt | log
UperNet      InternImage-H   896x896     59.9 / 60.3   1.12B    3566G  config  ckpt | log
Mask2Former  InternImage-H   896x896     62.6 / 62.9   1.31B    4635G  config  ckpt | log
Dataset: Cityscapes
method        backbone        resolution  mIoU (ss/ms)   #params  FLOPs  Config  Download
UperNet       InternImage-T   512x1024    82.58 / 83.40  59M      1889G  config  ckpt | log
UperNet       InternImage-S   512x1024    82.74 / 83.45  80M      2035G  config  ckpt | log
UperNet       InternImage-B   512x1024    83.18 / 83.97  128M     2369G  config  ckpt | log
UperNet       InternImage-L   512x1024    83.68 / 84.41  256M     3234G  config  ckpt | log
UperNet*      InternImage-L   512x1024    85.94 / 86.22  256M     3234G  config  ckpt | log
UperNet       InternImage-XL  512x1024    83.62 / 84.28  368M     4022G  config  ckpt | log
UperNet*      InternImage-XL  512x1024    86.20 / 86.42  368M     4022G  config  ckpt | log
SegFormer*    InternImage-L   512x1024    85.16 / 85.67  220M     1580G  config  ckpt | log
SegFormer*    InternImage-XL  512x1024    85.41 / 85.93  330M     2364G  config  ckpt | log
Mask2Former*  InternImage-H   1024x1024   86.37 / 86.96  1094M    7878G  config  ckpt | log

* denotes a model trained with the extra Mapillary dataset.

Dataset: COCO-Stuff-164K
method       backbone       resolution  mIoU (ss/ms)  #params  FLOPs  Config  Download
Mask2Former  InternImage-H  896x896     52.6 / 52.8   1.31B    4635G  config  ckpt | log

Dataset: COCO-Stuff-10K
method       backbone       resolution  mIoU (ss/ms)  #params  FLOPs  Config  Download
Mask2Former  InternImage-H  512x512     59.2 / 59.6   1.28B    1528G  config  ckpt | log

Dataset: Pascal-Context-59
method       backbone       resolution  mIoU (ss/ms)  #params  FLOPs  Config  Download
Mask2Former  InternImage-H  480x480     69.7 / 70.3   1.07B    867G   config  ckpt | log

Dataset: NYU-Depth-V2
method       backbone       resolution  mIoU (ss/ms)  #params  FLOPs  Config  Download
Mask2Former  InternImage-H  480x480     67.1 / 68.1   1.07B    867G   config  ckpt | log

Dataset: Mapillary
method       backbone        resolution  #params  FLOPs  Config  Download
UperNet      InternImage-L   512x1024    256M     3234G  config  ckpt
UperNet      InternImage-XL  512x1024    368M     4022G  config  ckpt
SegFormer    InternImage-L   512x1024    220M     1580G  config  ckpt
SegFormer    InternImage-XL  512x1024    330M     2364G  config  ckpt
Mask2Former  InternImage-H   896x896     1094M    7878G  config  ckpt

Evaluation

To evaluate our InternImage on ADE20K val, run:

sh dist_test.sh <config-file> <checkpoint> <gpu-num> --eval mIoU

For example, to evaluate the InternImage-T with a single GPU:

python test.py configs/ade20k/upernet_internimage_t_512_160k_ade20k.py pretrained/upernet_internimage_t_512_160k_ade20k.pth --eval mIoU

For example, to evaluate InternImage-B on a single node with 8 GPUs:

sh dist_test.sh configs/ade20k/upernet_internimage_b_512_160k_ade20k.py pretrained/upernet_internimage_b_512_160k_ade20k.pth 8 --eval mIoU
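
To evaluate several models in a row, the command can be wrapped in a loop. The sketch below is illustrative only: `ckpt_for_config` is a hypothetical helper, and it assumes, as in the two examples above, that each checkpoint sits in pretrained/ and shares its config's base name. It prints the commands instead of running them, so the paths can be checked first.

```shell
# Hypothetical helper: map a config path to its checkpoint path.
# Assumption: checkpoints live in pretrained/ and share the config's base name.
ckpt_for_config() {
  printf 'pretrained/%s.pth\n' "$(basename "$1" .py)"
}

# Print (do not run) the 8-GPU evaluation command for each config.
for cfg in \
  configs/ade20k/upernet_internimage_t_512_160k_ade20k.py \
  configs/ade20k/upernet_internimage_b_512_160k_ade20k.py
do
  echo "sh dist_test.sh $cfg $(ckpt_for_config "$cfg") 8 --eval mIoU"
done
```

Replacing `echo "..."` with the command itself would launch the evaluations one after another.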

Training

To train an InternImage on ADE20K, run:

sh dist_train.sh <config-file> <gpu-num>

For example, to train InternImage-T with 8 GPUs on 1 node (total batch size 16), run:

sh dist_train.sh configs/ade20k/upernet_internimage_t_512_160k_ade20k.py 8
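
The total batch size quoted above is the number of GPUs multiplied by the per-GPU batch size. A small sketch of that arithmetic follows; `total_batch_size` is a hypothetical helper, and the per-GPU value of 2 is inferred from 16 / 8 rather than read from the config (in MMSegmentation 0.x configs this is normally the `samples_per_gpu` field).

```shell
# Hypothetical helper: total batch size = <gpu-num> * <samples per GPU>.
total_batch_size() {
  echo $(( $1 * $2 ))
}

total_batch_size 8 2   # prints 16
```

When training with a different number of GPUs, the same product shows how far the total batch size drifts from 16.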

Manage Jobs with Slurm

For example, to train InternImage-XL with 8 GPUs on 1 node (total batch size 16), run:

GPUS=8 sh slurm_train.sh <partition> <job-name> configs/ade20k/upernet_internimage_xl_640_160k_ade20k.py

Image Demo

To run inference on one or more images, use the command below. If you pass a directory instead of a single image, every image in that directory is processed.

CUDA_VISIBLE_DEVICES=0 python image_demo.py \
  data/ade/ADEChallengeData2016/images/validation/ADE_val_00000591.jpg \
  configs/ade20k/upernet_internimage_t_512_160k_ade20k.py  \
  checkpoint_dir/seg/upernet_internimage_t_512_160k_ade20k.pth  \
  --palette ade20k

Export

First, install mmdeploy:

pip install mmdeploy==0.14.0

To export a segmentation model from PyTorch to TensorRT, run:

MODEL="model_name"
CKPT_PATH="/path/to/model/ckpt.pth"

python deploy.py \
    "./deploy/configs/mmseg/segmentation_tensorrt_static-512x512.py" \
    "./configs/ade20k/${MODEL}.py" \
    "${CKPT_PATH}" \
    "./deploy/demo.png" \
    --work-dir "./work_dirs/mmseg/${MODEL}" \
    --device cuda \
    --dump-info

For example, to export upernet_internimage_t_512_160k_ade20k from PyTorch to TensorRT, run:

MODEL="upernet_internimage_t_512_160k_ade20k"
CKPT_PATH="/path/to/model/ckpt/upernet_internimage_t_512_160k_ade20k.pth"

python deploy.py \
    "./deploy/configs/mmseg/segmentation_tensorrt_static-512x512.py" \
    "./configs/ade20k/${MODEL}.py" \
    "${CKPT_PATH}" \
    "./deploy/demo.png" \
    --work-dir "./work_dirs/mmseg/${MODEL}" \
    --device cuda \
    --dump-info
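
Because the config path is derived from MODEL, a typo in either variable only surfaces once deploy.py starts. The sketch below checks both inputs first. `export_inputs_ok` is a hypothetical helper, not part of this repository, and it assumes the `./configs/ade20k/${MODEL}.py` layout used in the commands above.

```shell
# Hypothetical helper: succeed only if the config and the checkpoint both exist.
# $1 = model name, $2 = checkpoint path.
export_inputs_ok() {
  model=$1
  ckpt=$2
  cfg="./configs/ade20k/${model}.py"
  [ -f "$cfg" ] || { echo "missing config: $cfg" >&2; return 1; }
  [ -f "$ckpt" ] || { echo "missing checkpoint: $ckpt" >&2; return 1; }
}

MODEL="upernet_internimage_t_512_160k_ade20k"
CKPT_PATH="/path/to/model/ckpt/upernet_internimage_t_512_160k_ade20k.pth"

if export_inputs_ok "$MODEL" "$CKPT_PATH" 2>/dev/null; then
  echo "inputs found; ready to run deploy.py"
else
  echo "fix MODEL or CKPT_PATH before exporting"
fi
```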