Restructure and comment the tau embedding method #48408

Open: wants to merge 20 commits into base: master

Contributor

@winterchristian winterchristian commented Jun 25, 2025

PR description:

This PR restructures and cleans up the tau embedding method (TauAnalysis/MCEmbeddingTools) so that Run 3 tau embedding samples can be produced with the CMS submission workflows.
One goal was to remove the need for --customise or --customise_commands in the tau embedding cmsDriver.py commands.
This is achieved with process modifiers and by passing Python config fragments to some cmsDriver.py steps.
Only the HLT step required an additional modification of the cmsDriver.py config builder.

The other goal of this PR is more understandable, better-commented, and therefore more maintainable code.
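
To illustrate why a process modifier can stand in for a --customise function, here is a minimal toy sketch in plain Python. It is not the FWCore implementation: `ToyModifier`, `build_config`, and the `hypothetical_setting` parameter are assumptions made up for illustration, and only the modifier name `tau_embedding_mu_to_mu` comes from the commands in this description. The idea is that the config fragment itself declares the change, and the change only takes effect when the modifier is requested on the command line.

```python
# Toy stand-in for the process-modifier idea; NOT the FWCore implementation.
class ToyModifier:
    """Applies parameter changes only when the modifier has been switched on."""

    def __init__(self, name):
        self.name = name
        self.active = False

    def to_modify(self, module, **params):
        # The change is applied if the modifier is active and ignored otherwise.
        if self.active:
            module.update(params)


def build_config(active_modifiers):
    """Build a toy module configuration for the given list of modifier names."""
    modifier = ToyModifier("tau_embedding_mu_to_mu")
    modifier.active = modifier.name in active_modifiers

    # Hypothetical module parameters, invented for this illustration only.
    module = {"hypothetical_setting": "default"}
    modifier.to_modify(module, hypothetical_setting="modified")
    return module


default_cfg = build_config(active_modifiers=[])
modified_cfg = build_config(active_modifiers=["tau_embedding_mu_to_mu"])
print(default_cfg, modified_cfg)
```

Because the conditional change lives inside the fragment, the command line only has to name the modifier, which is what --procModifiers does in the commands of this description.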

Tau embedding cmsDriver.py commands

With the changes introduced in this PR, tau embedding samples can be produced using the following cmsDriver.py commands.

Selection
cmsDriver.py \
--step RAW2DIGI,L1Reco,RECO,PAT,FILTER:TauAnalysis/MCEmbeddingTools/Selection_FILTER_cff.makePatMuonsZmumuSelection \
--processName SELECT \
--data \
--scenario pp \
--conditions auto:run3_data \
--era Run3_2024 \
--eventcontent TauEmbeddingSelection \
--datatier RAWRECO \
--filein "root://cmsdcache-kit-disk.gridka.de:1094//store/data/Run2024C/Muon0/RAW/v1/000/380/115/00000/00979445-916c-42e2-8038-428d7bd4f176.root" \
--fileout ...
LHE and Cleaning
cmsDriver.py \
--step USER:TauAnalysis/MCEmbeddingTools/LHE_USER_cff.embeddingLHEProducerTask,RAW2DIGI,RECO:TauAnalysis/MCEmbeddingTools/Cleaning_RECO_cff.reconstruction \
--processName LHEembeddingCLEAN \
--data \
--scenario pp \
--conditions auto:run3_data \
--era Run3_2024 \
--eventcontent TauEmbeddingCleaning \
--datatier RAWRECO \
--procModifiers tau_embedding_mu_to_mu \
--filein ... \
--fileout ...
Simulation Gen
cmsDriver.py TauAnalysis/MCEmbeddingTools/python/Simulation_GEN_cfi.py \
--step GEN,SIM,DIGI,L1,DIGI2RAW \
--processName SIMembeddingpreHLT \
--mc \
--beamspot DBrealistic \
--geometry DB:Extended \
--era Run3_2024 \
--conditions auto:phase1_2024_realistic \
--eventcontent TauEmbeddingSimGen \
--datatier RAWSIM \
--procModifiers tau_embedding_mu_to_mu \
--filein ... \
--fileout ...
Simulation HLT
cmsDriver.py \
--step HLT:TauAnalysis/MCEmbeddingTools/Simulation_HLT_customiser_cff.embeddingHLTCustomiser.Fake2 \
--processName SIMembeddingHLT \
--mc \
--beamspot DBrealistic \
--geometry DB:Extended \
--era Run3_2024 \
--conditions auto:phase1_2024_realistic \
--eventcontent TauEmbeddingSimHLT \
--datatier RAWSIM \
--filein ... \
--fileout ...
Simulation Reco
cmsDriver.py \
--step RAW2DIGI,L1Reco,RECO:TauAnalysis/MCEmbeddingTools/Simulation_RECO_cff.reconstruction,RECOSIM \
--processName SIMembedding \
--mc \
--beamspot DBrealistic \
--geometry DB:Extended \
--era Run3_2024 \
--conditions auto:phase1_2024_realistic \
--eventcontent TauEmbeddingSimReco \
--datatier RAW-RECO-SIM \
--filein ... \
--fileout ...
Merging
cmsDriver.py \
--step USER:TauAnalysis/MCEmbeddingTools/Merging_USER_cff.merge_step,PAT \
--processName MERGE \
--data \
--scenario pp \
--conditions auto:run3_data \
--era Run3_2024 \
--eventcontent TauEmbeddingMerge \
--datatier USER \
--inputCommands 'keep *_*_*_*' \
--filein ... \
--fileout ...
NanoAOD
cmsDriver.py \
--step NANO:TauAnalysis/MCEmbeddingTools/Nano_cff.embedding_nanoAOD_seq \
--data \
--scenario pp \
--conditions auto:run3_data \
--era Run3_2024 \
--eventcontent TauEmbeddingNANOAOD \
--datatier NANOAODSIM \
--filein ... \
--fileout ...
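
The --step arguments above follow a common pattern: a comma-separated list of step names, each optionally followed by a colon and a config fragment path with an attribute name. The sketch below splits such a string into its parts to make that structure explicit. It is an illustration only and not the parser used by cmsDriver.py; `StepSpec` and `parse_steps` are hypothetical names, and everything after the first dot is kept as one opaque attribute string.

```python
from typing import List, NamedTuple, Optional


class StepSpec(NamedTuple):
    step: str                  # step name, e.g. "RECO"
    fragment: Optional[str]    # fragment path, e.g. ".../Cleaning_RECO_cff"
    attribute: Optional[str]   # name inside the fragment, e.g. "reconstruction"


def parse_steps(step_arg: str) -> List[StepSpec]:
    """Split a --step string into (step, fragment, attribute) entries."""
    specs = []
    for entry in step_arg.split(","):
        step, _, target = entry.partition(":")
        if not target:
            # Plain step without a custom fragment, e.g. "RAW2DIGI".
            specs.append(StepSpec(step, None, None))
            continue
        # Split at the first dot; the remainder is kept intact.
        fragment, _, attribute = target.partition(".")
        specs.append(StepSpec(step, fragment, attribute or None))
    return specs


steps = parse_steps(
    "USER:TauAnalysis/MCEmbeddingTools/LHE_USER_cff.embeddingLHEProducerTask,"
    "RAW2DIGI,"
    "RECO:TauAnalysis/MCEmbeddingTools/Cleaning_RECO_cff.reconstruction"
)
for spec in steps:
    print(spec)
```

Applied to the "LHE and Cleaning" command, this yields three entries: a USER step and a RECO step that each point at a fragment in TauAnalysis/MCEmbeddingTools, and a plain RAW2DIGI step in between.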

PR validation:

I had to disable the tests because I first need to retrieve RAW samples from tape to create a small set of input samples. The current test .root files are only usable with the old structure.
I will deliver tests in the next pull request.

UPDATE
I reactivated the unit tests as I realized I can use the RAW samples from the release validation tests.
Both the unit tests and the release validation tests passed locally.

@cmsbuild
Contributor

cmsbuild commented Jun 25, 2025

cms-bot internal usage

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-48408/45306

  • Found files with invalid states:

    • TauAnalysis/MCEmbeddingTools/python/Merging_cff.py:
  • There are other open Pull requests which might conflict with changes you have proposed:

@cmsbuild
Contributor

A new Pull Request was created by @winterchristian for master.

It involves the following packages:

  • Configuration/Applications (operations)
  • Configuration/ProcessModifiers (operations)
  • TauAnalysis/MCEmbeddingTools (simulation)

@antoniovilela, @civanch, @cmsbuild, @davidlange6, @fabiocos, @kpedro88, @mandrenguyen, @mdhildreth, @rappoccio can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @fabiocos, @makortel, @missirol, @mmusich this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@@ -0,0 +1,167 @@
#!/bin/bash
Contributor

is there a reason this is in the docs folder rather than the test folder?

Contributor Author

I will add it to the test folder once I have prepared a 2024 RAW file containing events that are not filtered out, which can then be used as input. I still thought it was a good idea to provide a script that runs the tau embedding code.

@kpedro88
Contributor

@smuzaffar will RAW files used as input for unit tests in TauAnalysis/MCEmbeddingTools/test automatically get cached?

@kpedro88
Contributor

kpedro88 commented Jul 1, 2025

please test

@cmsbuild
Contributor

cmsbuild commented Jul 1, 2025

-1

Failed Tests: UnitTests RelVals-INPUT
Size: This PR adds an extra 48KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-825b5c/47036/summary.html
COMMIT: 3f549be
CMSSW: CMSSW_15_1_X_2025-07-01-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/48408/47036/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found 1 errors in the following unit tests:

---> test test-das-selected-lumis had ERRORS

RelVals-INPUT

  • 136.902: 136.902_RunDoubleMuon2016H/step2_RunDoubleMuon2016H.log
  • 136.903: 136.903_RunDoubleMuon2017B/step2_RunDoubleMuon2017B.log
  • 136.901: 136.901_RunDoubleMuon2016C/step2_RunDoubleMuon2016C.log
Expand to see more relval errors ...

Comparison Summary

Summary:

  • You potentially removed 3 lines from the logs
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 4067195
  • DQMHistoTests: Total failures: 8170
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4059005
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 215 log files, 184 edm output root files, 50 DQM output files
  • TriggerResults: found differences in 2 / 48 workflows

@cmsbuild
Contributor

cmsbuild commented Jul 2, 2025

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-48408/45387

@cmsbuild
Contributor

cmsbuild commented Jul 2, 2025

Pull request #48408 was updated. @AdrianoDee, @Moanwar, @antoniovilela, @civanch, @cmsbuild, @davidlange6, @DickyChant, @fabiocos, @kpedro88, @mandrenguyen, @mdhildreth, @miquork, @rappoccio, @srimanob, @subirsarkar can you please check and sign again.

@winterchristian
Contributor Author

I forgot to adapt the release validation tests, sorry. I updated them in cee48c2, and they should now work (at least they work locally).
With 2deb8be I also reactivated the unit tests, as I realized I can use the RAW datasets from the release validation as input for the unit tests.

Please run the tests again.

@civanch
Contributor

civanch commented Jul 3, 2025

please test

@cmsbuild
Contributor

cmsbuild commented Jul 3, 2025

+1

Size: This PR adds an extra 116KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-825b5c/47068/summary.html
COMMIT: 2deb8be
CMSSW: CMSSW_15_1_X_2025-07-03-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/48408/47068/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 3 lines to the logs
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 4067195
  • DQMHistoTests: Total failures: 66
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4067109
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 215 log files, 184 edm output root files, 50 DQM output files
  • TriggerResults: no differences found
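
For readers comparing the two DQMHistoTests summaries in this thread (Jul 1 and Jul 3), the counts can be cross-checked with a few lines of Python. This is a small sketch that assumes failures, successes, and skipped histograms together make up the reported total; `failure_fraction` is a hypothetical helper, not part of cms-bot.

```python
def failure_fraction(total, failures, successes, skipped):
    """Return the fraction of failed comparisons after a consistency check."""
    # Assumption: these three categories account for every compared histogram.
    if failures + successes + skipped != total:
        raise ValueError("counts do not add up to the reported total")
    return failures / total


# Numbers copied from the two comparison summaries in this thread.
first = failure_fraction(total=4067195, failures=8170, successes=4059005, skipped=20)
second = failure_fraction(total=4067195, failures=66, successes=4067109, skipped=20)

print(f"Jul 1 run: {first:.4%} of histograms failed")
print(f"Jul 3 run: {second:.4%} of histograms failed")
```

Both summaries pass the consistency check, and the number of failing histograms drops from 8170 to 66 between the two test runs.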
