Running a recent HLT menu in CMSSW_15_0_0_pre2, I see the runtime error in [1].
Some facts and circumstantial evidence:
- The issue is not fully reproducible (so, right now I don't really have a reproducer).
- The issue happens pretty frequently in the HLT workflow I'm testing. The latter consists of running 8 jobs in parallel, each with 32 threads and 24 concurrent events, on a machine with the same hardware as a "2022 HLT node", e.g. hilton-c2b02-44-01 (2 AMD Milan CPUs + 2 NVIDIA GPUs). FWIW, a readme + example of what I'm running is in [2] and [3] (the recipe assumes the use of one of the HLT/GPU nodes in the CMS online network; the instructions could be adapted to lxplus if needed).
- I have seen the issue with and without offloading to GPUs ("without" meaning options.accelerators = ["cpu"]; see the sketch after this list).
- I have seen the issue starting with CMSSW_15_0_0_pre2, and I see it also in more recent 15_0_X IBs.
- I ran the same workflow more than once in CMSSW_15_0_0_pre1, and I have not seen this runtime error in that pre-release so far.
- Just for the record, I saw two PRs ([DAQ] Input source improvements and initial Phase-2 DTH format support #47068, and Improve behavior of streamer system #47073) that entered 15_0_0_pre2 and seemed to me loosely related to output modules; I locally reverted both PRs on top of 15_0_0_pre2, and I still see the same runtime error as in [1].
- So far, I have failed to reproduce the problem with simpler configurations (as opposed to a full-blown HLT menu running in multiple jobs using all threads).
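As a reference for the settings mentioned in the list above, this is a minimal sketch (an illustration, not a verbatim excerpt of the configurations I run) of how the CPU-only mode and the per-job thread/stream counts can be expressed in a cmsRun configuration:
import FWCore.ParameterSet.Config as cms
# minimal sketch (assumption: appended to an existing HLT menu dump that defines 'process')
process.options.accelerators = cms.untracked.vstring('cpu')  # "cpu" = no offloading to GPUs
process.options.numberOfThreads = cms.untracked.uint32(32)   # threads per job
process.options.numberOfStreams = cms.untracked.uint32(24)   # concurrent events (streams) per job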
From discussions with @fwyzard and @makortel, the issue looks compatible with a race condition.
One suggestion by @makortel was to check whether the error occurs at the very beginning of the job or later. I managed to reproduce the error after enabling more MessageLogger output, and it happened ~60 events into the job (using 32 threads and 24 streams in the job), so early in the job but not at the very beginning.
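For reference, a minimal sketch (an assumed customisation, not necessarily the exact one I used) of how the MessageLogger output can be made verbose enough to locate where the job fails:
# minimal sketch (assumption): report every event instead of the default sparse reporting,
# so the last FwkReport line indicates roughly how far the job got before the error
process.MessageLogger.cerr.FwkReport.reportEvery = 1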
FYI: @cms-sw/hlt-l2
Edit-1 (Feb-09): a script used to reproduce the error on lxplus was added in #47287 (comment).
[1]
----- Begin Fatal Exception 26-Jan-2025 10:02:31 CET-----------------------
An exception of category 'FatalRootError' occurred while
[0] Processing Event run: 386593 lumi: 94 event: 213402124 stream: 6
[1] Running path '@finalPath'
[2] Calling method for module GlobalEvFOutputModule/'hltOutputParkingSingleMuon8'
Additional Info:
[a] Fatal Root Error: @SUB=TClass::StreamerDefault
fStreamerImpl not properly initialized (0)
----- End Fatal Exception -------------------------------------------------
[2]
# working-directory name and CMSSW release to use
dirName=MY_TEST_DIR
cmsswRel=CMSSW_15_0_0_pre2

# CMSSW environment and online site configuration
export SCRAM_ARCH=el8_amd64_gcc12
source /cvmfs/cms.cern.ch/cmsset_default.sh
export SITECONFIG_PATH="/opt/offline/SITECONF/local"

# ssh agent, Kerberos ticket, and a SOCKS proxy through cmsusr.cms
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
kinit $(logname)@CERN.CH
ssh -f -N -D18081 $(logname)@cmsusr.cms

# set up the CMSSW release area
mkdir -p /fff/user/"${USER}"/"${dirName}"
cd /fff/user/"${USER}"/"${dirName}"
cmsrel "${cmsswRel}"
cd "${cmsswRel}"/src
cmsenv
scram b
cd "${OLDPWD}"

# scripts used to run the HLT throughput measurements
git clone git@github.com:missirol/hltThroughputUtils.git -o missirol -b master
cd hltThroughputUtils
git clone git@github.com:missirol/patatrack-scripts.git -o missirol -b master_old
[3]
#!/bin/bash -e
[ $# -eq 1 ] || exit 1
jobLabel="${1}"
runNumber=386593
outDir=/fff/user/"${USER}"/output/hltThroughputUtils
run() {
  # arguments: ${1} = config file, ${2} = patatrack-scripts directory, ${3} = output directory,
  #            ${4} = number of jobs, ${5} = threads per job, ${6} = streams per job
  [ ! -d "${3}" ] || exit 1
  mkdir -p $(dirname "${3}")
  foo=$(printf "%125s") && echo ${foo// /-} && unset foo
  printf " %s\n" "${3}"
  foo=$(printf "%125s") && echo ${foo// /-} && unset foo
  rm -rf run"${runNumber}"
  ${2}/benchmark "${1}" -E cmsRun -r 4 -j "${4}" -t "${5}" -s "${6}" -e 40100 -g 1 -n --no-cpu-affinity -l "${3}" -k resources.json --tmpdir "${outDir}"/tmp |& tee "${3}".log
  ./merge_resources_json.py "${3}"/step*/pid*/resources.json > "${3}".json
  mv "${3}".log "${3}".json "${3}"
  cp "${1}" "${3}"
}
https_proxy=http://cmsproxy.cms:3128/ \
hltConfigFromDB --configName /dev/CMSSW_14_2_0/GRun/V11 > tmp.py
cp /gpu_data/store/data/Run2024*/EphemeralHLTPhysics/FED/run"${runNumber}"_cff.py .
# ensure MPS is disabled at the start
./stop-mps-daemon.sh
### Intermediate configuration file
cat <<@EOF >> tmp.py
process.load('run${runNumber}_cff')
from HLTrigger.Configuration.customizeHLTforCMSSW import customizeHLTforCMSSW
process = customizeHLTforCMSSW(process)
process.GlobalTag.globaltag = '150X_dataRun3_HLT_v1'
process.PrescaleService.lvl1DefaultLabel = '2p0E34'
process.PrescaleService.forceDefault = True
process.hltPixelTracksSoA.CAThetaCutBarrel = 0.00111685053
process.hltPixelTracksSoA.CAThetaCutForward = 0.00249872683
process.hltPixelTracksSoA.hardCurvCut = 0.695091509
process.hltPixelTracksSoA.dcaCutInnerTriplet = 0.0419242041
process.hltPixelTracksSoA.dcaCutOuterTriplet = 0.293522194
process.hltPixelTracksSoA.phiCuts = [
832, 379, 481, 765, 1136,
706, 656, 407, 1212, 404,
699, 470, 652, 621, 1017,
616, 450, 555, 572
]
# remove check on timestamp of online-beamspot payloads
process.hltOnlineBeamSpotESProducer.timeThreshold = int(1e6)
# same source settings as used online
process.source.eventChunkSize = 200
process.source.eventChunkBlock = 200
process.source.numBuffers = 4
process.source.maxBufferedFiles = 2
# taken from hltDAQPatch.py
process.options.numberOfConcurrentLuminosityBlocks = 2
# write a JSON file with the timing information
if hasattr(process, 'FastTimerService'):
    process.FastTimerService.writeJSONSummary = True
# remove HLTAnalyzerEndpath if present
if hasattr(process, 'HLTAnalyzerEndpath'):
    del process.HLTAnalyzerEndpath
@EOF
### Final configuration file (dump)
edmConfigDump tmp.py > "${jobLabel}"_dump.py
rm -rf tmp.py
### Throughput measurements (benchmark)
jobDirPrefix="${jobLabel}"-"${CMSSW_VERSION}"
## GPU MPS
unset CUDA_VISIBLE_DEVICES
./start-mps-daemon.sh
sleep 1
run "${jobLabel}"_dump.py ./patatrack-scripts "${outDir}"/"${jobDirPrefix}"-gpu_mps 8 32 24
./stop-mps-daemon.sh
sleep 1
rm -rf "${jobLabel}"*{cfg,dump}.py
rm -rf run"${runNumber}"
rm -rf run"${runNumber}"_cff.py
rm -rf __pycache__ tmp
rm -rf tmp.py
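For completeness: script [3] expects a single argument, the job label. Assuming it is saved as a (hypothetical) runBenchmark.sh inside the hltThroughputUtils working directory set up in [2], it would be invoked e.g. as ./runBenchmark.sh test_pre2.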