Restructure and comment the tau embedding method #48408

winterchristian · 2025-06-25T14:35:18Z

PR description:

This PR restructures and cleans up the tau embedding method (TauAnalysis/MCEmbeddingTools) so that Run 3 tau embedding samples can be produced with the CMS submission workflows.
One goal was to remove the need for --customise or --customise_commands in the tau embedding cmsDriver.py commands.
This was made possible by process modifiers and by the option to pass Python config fragments to some cmsDriver.py steps.
Only the HLT step required an additional modification of the cmsDriver.py config builder.
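
As a minimal sketch of what "no --customise" means in practice, the snippet below assembles a cmsDriver.py argument list from a step string and a set of options, then checks that no customise flag is present. The helpers `build_cmsdriver_command` and `needs_customise` are hypothetical and not part of CMSSW; the option values are copied from the Merging command in this description, and the elided --filein/--fileout values are deliberately left out.

```python
import shlex


def build_cmsdriver_command(step, options, flags=()):
    """Return a cmsDriver.py argument list (hypothetical helper, not part of CMSSW)."""
    args = ["cmsDriver.py", "--step", step]
    # Value-less switches such as --data or --mc
    for flag in flags:
        args.append(f"--{flag}")
    # Options that take a value
    for name, value in options.items():
        args += [f"--{name}", value]
    return args


def needs_customise(args):
    """True if the command relies on --customise or --customise_commands."""
    return any(arg.startswith("--customise") for arg in args)


merge_args = build_cmsdriver_command(
    step="USER:TauAnalysis/MCEmbeddingTools/Merging_USER_cff.merge_step,PAT",
    flags=["data"],
    options={
        "processName": "MERGE",
        "scenario": "pp",
        "conditions": "auto:run3_data",
        "era": "Run3_2024",
        "eventcontent": "TauEmbeddingMerge",
        "datatier": "USER",
        "inputCommands": "keep *_*_*_*",
    },
)

# shlex.join quotes arguments that contain spaces or wildcards
print(shlex.join(merge_args))
print("needs customise:", needs_customise(merge_args))
```

The config fragment is referenced directly in the --step string, so the same check could be applied to any of the commands listed in this description.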

The other goal of this PR is to make the code easier to understand and better commented, and therefore more maintainable.

Tau embedding `cmsDriver.py` commands

With the changes introduced in this PR, tau embedding samples can be produced using the following cmsDriver.py commands.

Selection

cmsDriver.py \
--step RAW2DIGI,L1Reco,RECO,PAT,FILTER:TauAnalysis/MCEmbeddingTools/Selection_FILTER_cff.makePatMuonsZmumuSelection \
--processName SELECT \
--data \
--scenario pp \
--conditions auto:run3_data \
--era Run3_2024 \
--eventcontent TauEmbeddingSelection \
--datatier RAWRECO \
--filein "root://cmsdcache-kit-disk.gridka.de:1094//store/data/Run2024C/Muon0/RAW/v1/000/380/115/00000/00979445-916c-42e2-8038-428d7bd4f176.root" \
--fileout ...

LHE and Cleaning

cmsDriver.py \
--step USER:TauAnalysis/MCEmbeddingTools/LHE_USER_cff.embeddingLHEProducerTask,RAW2DIGI,RECO:TauAnalysis/MCEmbeddingTools/Cleaning_RECO_cff.reconstruction \
--processName LHEembeddingCLEAN \
--data \
--scenario pp \
--conditions auto:run3_data \
--era Run3_2024 \
--eventcontent TauEmbeddingCleaning \
--datatier RAWRECO \
--procModifiers tau_embedding_mu_to_mu \
--filein ... \
--fileout ...

Simulation Gen

cmsDriver.py TauAnalysis/MCEmbeddingTools/python/Simulation_GEN_cfi.py \
--step GEN,SIM,DIGI,L1,DIGI2RAW \
--processName SIMembeddingpreHLT \
--mc \
--beamspot DBrealistic \
--geometry DB:Extended \
--era Run3_2024 \
--conditions auto:phase1_2024_realistic \
--eventcontent TauEmbeddingSimGen \
--datatier RAWSIM \
--procModifiers tau_embedding_mu_to_mu \
--filein ... \
--fileout ...

Simulation HLT

cmsDriver.py \
--step HLT:TauAnalysis/MCEmbeddingTools/Simulation_HLT_customiser_cff.embeddingHLTCustomiser.Fake2 \
--processName SIMembeddingHLT \
--mc \
--beamspot DBrealistic \
--geometry DB:Extended \
--era Run3_2024 \
--conditions auto:phase1_2024_realistic \
--eventcontent TauEmbeddingSimHLT \
--datatier RAWSIM \
--filein ... \
--fileout ...

Simulation Reco

cmsDriver.py \
--step RAW2DIGI,L1Reco,RECO:TauAnalysis/MCEmbeddingTools/Simulation_RECO_cff.reconstruction,RECOSIM \
--processName SIMembedding \
--mc \
--beamspot DBrealistic \
--geometry DB:Extended \
--era Run3_2024 \
--conditions auto:phase1_2024_realistic \
--eventcontent TauEmbeddingSimReco \
--datatier RAW-RECO-SIM \
--filein ... \
--fileout ...

Merging

cmsDriver.py \
--step USER:TauAnalysis/MCEmbeddingTools/Merging_USER_cff.merge_step,PAT \
--processName MERGE \
--data \
--scenario pp \
--conditions auto:run3_data \
--era Run3_2024 \
--eventcontent TauEmbeddingMerge \
--datatier USER \
--inputCommands 'keep *_*_*_*' \
--filein ... \
--fileout ...

NanoAOD

cmsDriver.py \
--step NANO:TauAnalysis/MCEmbeddingTools/Nano_cff.embedding_nanoAOD_seq \
--data \
--scenario pp \
--conditions auto:run3_data \
--era Run3_2024 \
--eventcontent TauEmbeddingNANOAOD \
--datatier NANOAODSIM \
--filein ... \
--fileout ...
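
The commands above are meant to run in sequence. The following sketch shows one way to plan the file hand-over between them, under the assumption that each step reads the output of the step listed before it. The process names, event contents and data tiers are copied from the commands; the file names and the `chain_files` helper are hypothetical and only serve the illustration.

```python
# (section name, processName, eventcontent, datatier), copied from the commands above.
# The NanoAOD command does not set --processName, hence None.
STEPS = [
    ("Selection", "SELECT", "TauEmbeddingSelection", "RAWRECO"),
    ("LHE and Cleaning", "LHEembeddingCLEAN", "TauEmbeddingCleaning", "RAWRECO"),
    ("Simulation Gen", "SIMembeddingpreHLT", "TauEmbeddingSimGen", "RAWSIM"),
    ("Simulation HLT", "SIMembeddingHLT", "TauEmbeddingSimHLT", "RAWSIM"),
    ("Simulation Reco", "SIMembedding", "TauEmbeddingSimReco", "RAW-RECO-SIM"),
    ("Merging", "MERGE", "TauEmbeddingMerge", "USER"),
    ("NanoAOD", None, "TauEmbeddingNANOAOD", "NANOAODSIM"),
]


def chain_files(first_input, steps=STEPS):
    """Return (step name, input file, output file) for each step.

    Assumption: every step consumes the output of the previous one.
    The generated file names are hypothetical placeholders.
    """
    plan = []
    current = first_input
    for index, (name, _process, _content, _tier) in enumerate(steps, start=1):
        output = f"step{index}_{name.lower().replace(' ', '_')}.root"
        plan.append((name, current, output))
        current = output
    return plan


for name, filein, fileout in chain_files("input_RAW.root"):
    print(f"{name:17s} --filein file:{filein} --fileout file:{fileout}")
```

Keeping the step metadata in one table makes it easy to see that the data tier changes from RAWRECO to RAWSIM when the chain switches from the --data steps to the --mc steps.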

PR validation:

I had to disable the tests because I first need to retrieve RAW samples from tape to create a small set of usable input samples. The current test .root files only work with the old structure.
I will deliver tests in the next pull request.

UPDATE
I reactivated the unit tests after realizing that I can use the RAW samples from the release validation tests.
Both the unit tests and the release validation tests ran successfully locally.

cmsbuild · 2025-06-25T14:35:52Z

cms-bot internal usage

cmsbuild · 2025-06-25T14:37:53Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-48408/45306

Found files with invalid states:
- TauAnalysis/MCEmbeddingTools/python/Merging_cff.py:
  - Added: 4276cf0
  - Deleted: 0928b23
There are other open Pull requests which might conflict with changes you have proposed:
- File Configuration/Applications/python/ConfigBuilder.py modified in PR(s): Use RNTuple output module for NanoAOD #48162

cmsbuild · 2025-06-25T14:37:57Z

A new Pull Request was created by @winterchristian for master.

It involves the following packages:

Configuration/Applications (operations)
Configuration/ProcessModifiers (operations)
TauAnalysis/MCEmbeddingTools (simulation)

@antoniovilela, @civanch, @cmsbuild, @davidlange6, @fabiocos, @kpedro88, @mandrenguyen, @mdhildreth, @rappoccio can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @fabiocos, @makortel, @missirol, @mmusich this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

kpedro88 · 2025-06-25T15:00:20Z

TauAnalysis/MCEmbeddingTools/docs/cmsDriver_cmds_DATA_RUN3.sh

@@ -0,0 +1,167 @@
+#!/bin/bash


Is there a reason this is in the docs folder rather than the test folder?

winterchristian · 2025-06-25

I will add it to the test folder once I have prepared a 2024 RAW file containing events that are not filtered out, which can then be used as input. I thought it was still a good idea to provide a script that runs the tau embedding code.

kpedro88 · 2025-06-25T15:07:23Z

@smuzaffar will RAW files used as input for unit tests in TauAnalysis/MCEmbeddingTools/test automatically get cached?

kpedro88 · 2025-07-01T17:06:25Z

please test

cmsbuild · 2025-07-01T19:01:41Z

-1

Failed Tests: UnitTests RelVals-INPUT
Size: This PR adds an extra 48KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-825b5c/47036/summary.html
COMMIT: 3f549be
CMSSW: CMSSW_15_1_X_2025-07-01-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/48408/47036/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 1 errors in the following unit tests:

---> test test-das-selected-lumis had ERRORS

RelVals-INPUT

- 136.902: 136.902_RunDoubleMuon2016H/step2_RunDoubleMuon2016H.log
- 136.903: 136.903_RunDoubleMuon2017B/step2_RunDoubleMuon2017B.log
- 136.901: 136.901_RunDoubleMuon2016C/step2_RunDoubleMuon2016C.log
- 136.904

Comparison Summary

Summary:

You potentially removed 3 lines from the logs
Reco comparison results: 4 differences found in the comparisons
DQMHistoTests: Total files compared: 50
DQMHistoTests: Total histograms compared: 4067195
DQMHistoTests: Total failures: 8170
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 4059005
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
Checked 215 log files, 184 edm output root files, 50 DQM output files
TriggerResults: found differences in 2 / 48 workflows

cmsbuild · 2025-07-02T14:33:21Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-48408/45387

Found files with invalid states:
- TauAnalysis/MCEmbeddingTools/python/Merging_cff.py:
  - Added: 4276cf0
  - Deleted: 0928b23
There are other open Pull requests which might conflict with changes you have proposed:
- File Configuration/Applications/python/ConfigBuilder.py modified in PR(s): Use RNTuple output module for NanoAOD #48162
- File Configuration/EventContent/python/EventContent_cff.py modified in PR(s): TICL-barrel: run CLUE in the barrel calorimeters and first workflows #47859

cmsbuild · 2025-07-02T14:33:44Z

Pull request #48408 was updated. @AdrianoDee, @Moanwar, @antoniovilela, @civanch, @cmsbuild, @davidlange6, @DickyChant, @fabiocos, @kpedro88, @mandrenguyen, @mdhildreth, @miquork, @rappoccio, @srimanob, @subirsarkar can you please check and sign again.

winterchristian · 2025-07-02T14:34:32Z

I forgot to adapt the release validation tests, sorry. I updated them in cee48c2, and they should now work (at least they work locally).
With 2deb8be I also reactivated the unit tests, as I realized I can use the RAW datasets from the release validation as input for the unit tests.

Please run the tests again.

civanch · 2025-07-03T12:33:42Z

please test

cmsbuild · 2025-07-03T14:39:50Z

+1

Size: This PR adds an extra 116KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-825b5c/47068/summary.html
COMMIT: 2deb8be
CMSSW: CMSSW_15_1_X_2025-07-03-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/48408/47068/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

You potentially added 3 lines to the logs
Reco comparison results: 4 differences found in the comparisons
DQMHistoTests: Total files compared: 50
DQMHistoTests: Total histograms compared: 4067195
DQMHistoTests: Total failures: 66
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 4067109
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
Checked 215 log files, 184 edm output root files, 50 DQM output files
TriggerResults: no differences found
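
The two comparison summaries in this thread can be cross-checked with a few lines of arithmetic. The sketch below assumes that the DQMHistoTests categories (failures, nulls, successes, skipped) are mutually exclusive and add up to the total number of compared histograms, which the reported numbers are consistent with. The helper `failure_fraction` is hypothetical and not part of cms-bot.

```python
def failure_fraction(total, failures, nulls, successes, skipped):
    """Check that the categories add up to the total and return failures / total."""
    if failures + nulls + successes + skipped != total:
        raise ValueError("categories do not add up to the total")
    return failures / total


# Numbers copied from the cmsbuild summaries of 2025-07-01 and 2025-07-03
first_run = failure_fraction(
    total=4067195, failures=8170, nulls=0, successes=4059005, skipped=20
)
second_run = failure_fraction(
    total=4067195, failures=66, nulls=0, successes=4067109, skipped=20
)

print(f"failure fraction, first test run:  {first_run:.6f}")
print(f"failure fraction, second test run: {second_run:.6f}")
```

Both summaries compare the same number of histograms, so the drop from 8170 to 66 failures is directly comparable between the two runs.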

Christian Winter added 13 commits April 11, 2025 15:15

disable InitialRecoCorrection

bb3c257

add tau embedding process modifiers

d444d56

first version without customize, but not HLT step

4276cf0

rewrite tau embedding HLT step, by adding customise functionality in cmsDriver HLT step

049052d

add config fragment for embedding NanoAOD step with special NanoAOD embedding table

b181b12

remove not needed python config fragments and switch to a consistent naming

0928b23

remove not needed functionalities, which were used for some tests in the past

6806f33

add comments to filter python config fragment

07ae6b0

add comments to tau embedding python config fragments

e85256f

move tau embedding header files to interface folder

93b19a7

add some documentation to the tau embedding method

fdd6c24

format code

c3d3198

change tests to new structure

0be347b

cmsbuild added this to the CMSSW_15_1_X milestone Jun 25, 2025

cmsbuild added simulation-pending operations-pending pending-signatures tests-pending orp-pending code-checks-pending labels Jun 25, 2025

cmsbuild added code-checks-approved and removed code-checks-pending labels Jun 25, 2025

kpedro88 reviewed Jun 25, 2025

fix typos

ee8f3b4

cmsbuild added code-checks-pending and removed code-checks-approved labels Jun 25, 2025

cmsbuild added tests-started and removed tests-pending labels Jul 1, 2025

cmsbuild added tests-rejected and removed tests-started labels Jul 1, 2025

Christian Winter added 2 commits July 2, 2025 13:55

fix spelling

ed3dfa7

adapt cmsDriver commands for release validation and add tests for 2022

cee48c2

cmsbuild mentioned this pull request Jul 2, 2025

TICL-barrel: run CLUE in the barrel calorimeters and first workflows #47859

Open

switch to IBEos relval datasets in unittests and reactivate them

2deb8be

cmsbuild added tests-pending pdmv-pending upgrade-pending code-checks-pending and removed tests-rejected code-checks-approved labels Jul 2, 2025

cmsbuild added code-checks-approved and removed code-checks-pending labels Jul 2, 2025

cmsbuild added tests-started and removed tests-pending labels Jul 3, 2025

cmsbuild added operations-approved tests-approved and removed operations-pending tests-started labels Jul 3, 2025

cmsbuild mentioned this pull request Jul 4, 2025

Moving 2024 Data Wfs HLT Key to hltKey2024 #48479

Open
