[BUG] AttributeError: 'Accelerator' object has no attribute 'deepspeed_config'

**Describe the bug**
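Training fails at `on_train_end` with `AttributeError: 'Accelerator' object has no attribute 'deepspeed_config'`. According to the traceback below, the error is raised in `accelerate`'s `get_state_dict`, which is reached through `Trainer.save_model` from a callback in `transformers/integrations.py`.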

**To Reproduce**
None


**Expected behavior**
Training should finish and `on_train_end` should save the model without raising an error.

**ds_report output**
[2023-08-14 18:02:42,266] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
      runtime if needed. Op compatibility means that your system
      meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
 [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.
 [WARNING]  async_io: please install the libaio-dev package with apt
 [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
async_io ............... [NO] ....... [NO]
cpu_adagrad ............ [NO] ....... [OKAY]
cpu_adam ............... [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
random_ltd ............. [NO] ....... [OKAY]
 [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.0
 [WARNING]  using untested triton version (2.0.0), only 1.0.0 is known to be compatible
sparse_attn ............ [NO] ....... [NO]
spatial_inference ...... [NO] ....... [OKAY]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
--------------------------------------------------
DeepSpeed general environment info:
torch install path ............... ['/home/maojianguo/anaconda3/envs/mjg_torch2.0.1/lib/python3.8/site-packages/torch']
torch version .................... 2.0.1+cu117
deepspeed install path ........... ['/home/maojianguo/anaconda3/envs/mjg_torch2.0.1/lib/python3.8/site-packages/deepspeed']
deepspeed info ................... 0.10.0, unknown, unknown
torch cuda version ............... 11.7
torch hip version ................ None
nvcc version ..................... 11.3
deepspeed wheel compiled w. ...... torch 2.0, cuda 11.7

**Screenshots**
Traceback (most recent call last):
  File "main.py", line 430, in <module>
    main()
  File "/home/maojianguo/anaconda3/envs/mjg_torch2.0.1/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "main.py", line 374, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/maojianguo/anaconda3/envs/mjg_torch2.0.1/lib/python3.8/site-packages/transformers/trainer.py", line 1539, in train
    return inner_training_loop(
  File "/home/maojianguo/anaconda3/envs/mjg_torch2.0.1/lib/python3.8/site-packages/transformers/trainer.py", line 1971, in _inner_training_loop
    self.control = self.callback_handler.on_train_end(args, self.state, self.control)
  File "/home/maojianguo/anaconda3/envs/mjg_torch2.0.1/lib/python3.8/site-packages/transformers/trainer_callback.py", line 356, in on_train_end
    return self.call_event("on_train_end", args, state, control)
  File "/home/maojianguo/anaconda3/envs/mjg_torch2.0.1/lib/python3.8/site-packages/transformers/trainer_callback.py", line 397, in call_event
    result = getattr(callback, event)(
  File "/home/maojianguo/anaconda3/envs/mjg_torch2.0.1/lib/python3.8/site-packages/transformers/integrations.py", line 770, in on_train_end
    fake_trainer.save_model(temp_dir)
  File "/home/maojianguo/anaconda3/envs/mjg_torch2.0.1/lib/python3.8/site-packages/transformers/trainer.py", line 2758, in save_model
    state_dict = self.accelerator.get_state_dict(self.deepspeed)
  File "/home/maojianguo/anaconda3/envs/mjg_torch2.0.1/lib/python3.8/site-packages/accelerate/accelerator.py", line 2829, in get_state_dict
    if self.deepspeed_config["zero_optimization"]["stage"] == 3:
AttributeError: 'Accelerator' object has no attribute 'deepspeed_config'
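
The last frame reads `self.deepspeed_config` unconditionally, so the call fails whenever that attribute was never set on the object. The sketch below only mimics that failure mode and one possible defensive check. `FakeAccelerator` and both method names are hypothetical stand-ins for illustration and are not part of the `accelerate` API; the real fix may look different.

```python
class FakeAccelerator:
    """Hypothetical stand-in; NOT the real accelerate.Accelerator."""

    def is_zero3_unguarded(self):
        # Mirrors the failing line: the attribute is read unconditionally,
        # so this raises AttributeError if it was never assigned.
        return self.deepspeed_config["zero_optimization"]["stage"] == 3

    def is_zero3_guarded(self):
        # Defensive variant (an assumption, not the library's behavior):
        # treat a missing attribute as "not ZeRO stage 3".
        config = getattr(self, "deepspeed_config", None)
        if config is None:
            return False
        return config.get("zero_optimization", {}).get("stage") == 3


acc = FakeAccelerator()

message = ""
try:
    acc.is_zero3_unguarded()
except AttributeError as err:
    # Same kind of error as in the traceback above.
    message = str(err)

is_zero3 = acc.is_zero3_guarded()
print(message)
print(is_zero3)
```

The guarded variant uses `getattr` with a default so that a missing attribute is handled explicitly instead of surfacing as an exception at the end of training.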


**System info (please complete the following information):**
 - OS: Ubuntu 18.04
 - GPU count and types: one machine with 8x A800
 - Python version: 3.8
 - transformers: 4.31.0
 - deepspeed: 0.10.0
 - accelerator: 2023.7.18.dev1


**Launcher context**
{
  "train_micro_batch_size_per_gpu": "auto",
  "zero_allow_untested_optimizer": true,
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "initial_scale_power": 16,
    "loss_scale_window": 1000,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "zero_optimization": {
    "stage": 1,
    "allgather_partitions": true,
    "allgather_bucket_size": 5e8,
    "overlap_comm": false,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "contiguous_gradients" : true,
    "offload_optimizer": {
        "device": "cpu",
        "pin_memory": true,
        "buffer_count": 4,
        "fast_init": false
    }
  },
  "gradient_accumulation_steps": "auto",
  "steps_per_print": "auto",
  "bf16": {
    "enabled": "auto"
  }
}
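
Since the failing check in the traceback compares the ZeRO stage against 3, it may help to confirm which stage this config actually selects. The snippet below is a minimal sketch that parses a trimmed copy of the config above with the standard `json` module; the variable names are only illustrative.

```python
import json

# Trimmed copy of the DeepSpeed config from this report.
config_text = """
{
  "zero_optimization": {
    "stage": 1,
    "allgather_bucket_size": 5e8,
    "offload_optimizer": {"device": "cpu", "pin_memory": true}
  }
}
"""

config = json.loads(config_text)

# Fall back to an empty section / stage 0 if the keys are absent.
zero = config.get("zero_optimization", {})
stage = zero.get("stage", 0)

# The check in the traceback only takes its special branch for stage 3.
needs_zero3_gather = stage == 3

print(f"ZeRO stage: {stage}, stage-3 branch needed: {needs_zero3_gather}")
```

With this config the stage is 1, so the stage 3 branch would not apply here even if the attribute lookup had succeeded.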




[BUG] AttributeError: 'Accelerator' object has no attribute 'deepspeed_config' #4143
