T0 to produce skimmed RAW data through Repack Workflow #12298

Open: 5 commits to be merged into base: master

LinaresToine

Fixes #12297

Status

Ready

Description

In order to support the new raw skim datasets, we need to allow two things:

  • The moduleLabel in Repack.py cannot contain a dash (-), so we introduce the parameter parentDataset in Outputs. This lets us build a legal moduleLabel:
output['moduleLabel'] = "write_%s_RawSkim_%s_%s" % (output['parentDataset'],
                                                      output['rawSkim'],
                                                      output['dataTier'])

We then delete the new attribute similarly to what is done in setupProcessingTask.
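
A minimal sketch of the naming step described above, not the actual Repack.py code. The helper name `build_skim_module_label` and the sample values are hypothetical. Only the format string and the keys `parentDataset`, `rawSkim` and `dataTier` come from the description.

```python
def build_skim_module_label(output):
    """Return a copy of `output` with moduleLabel set (hypothetical helper)."""
    result = dict(output)
    result['moduleLabel'] = "write_%s_RawSkim_%s_%s" % (result['parentDataset'],
                                                          result['rawSkim'],
                                                          result['dataTier'])
    # parentDataset is only needed to build the label, so drop it afterwards
    del result['parentDataset']
    return result


# Sample values are made up for illustration
example = build_skim_module_label({'parentDataset': 'ParentPD',
                                   'rawSkim': 'SkimA',
                                   'dataTier': 'RAW'})
print(example['moduleLabel'])
```

The label comes out dash-free only if the three input values contain no dash themselves.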

  • The repack workflow needs to pass a global tag to CMSSW in order to use the desired trigger paths.

Is it backward compatible (if not, which system it affects?)

YES

Related PRs

T0 PR: dmwm/T0#5041
CMSSW PR:

External dependencies / deployment changes

NO

@dmwm-bot

dmwm-bot commented Mar 7, 2025

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 5 comments to review
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/449/artifact/artifacts/PullRequestReport.html

@dmwm-bot

dmwm-bot commented Mar 7, 2025

Jenkins results:

  • Python3 Unit tests: failed
    • 3 new failures
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 5 comments to review
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/450/artifact/artifacts/PullRequestReport.html

@dmwm-bot

dmwm-bot commented Mar 7, 2025

Jenkins results:

  • Python3 Unit tests: succeeded
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 5 comments to review
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/451/artifact/artifacts/PullRequestReport.html

@LinaresToine changed the title from "Raw skim" to "T0 to produce skimmed RAW data through Repack Workflow" on Mar 7, 2025
@dmwm-bot

dmwm-bot commented Mar 7, 2025

Jenkins results:

  • Python3 Unit tests: succeeded
    • 4 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 4 comments to review
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/452/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 3 new failures
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 4 comments to review
  • Pycodestyle check: succeeded
    • 2 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/470/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 4 comments to review
  • Pycodestyle check: succeeded
    • 2 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/474/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 4 comments to review
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/475/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 2 new failures
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 10 warnings and errors that must be fixed
    • 2 warnings
    • 93 comments to review
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/479/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 144 new failures
    • 13 changes in unstable tests
  • Python3 Pylint check: failed
    • 14 warnings and errors that must be fixed
    • 2 warnings
    • 94 comments to review
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/488/artifact/artifacts/PullRequestReport.html

@LinaresToine force-pushed the raw-skim branch 2 times, most recently from 98804d2 to bf3e6cf, on March 13, 2025 22:40
@dmwm-bot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 11 warnings and errors that must be fixed
    • 2 warnings
    • 94 comments to review
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/489/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 11 warnings and errors that must be fixed
    • 2 warnings
    • 94 comments to review
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/490/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 11 warnings and errors that must be fixed
    • 2 warnings
    • 94 comments to review
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/497/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 11 warnings and errors that must be fixed
    • 2 warnings
    • 94 comments to review
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/498/artifact/artifacts/PullRequestReport.html

@dmwm-bot

dmwm-bot commented Apr 7, 2025

Jenkins results:

  • Python3 Unit tests: succeeded
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 11 warnings and errors that must be fixed
    • 2 warnings
    • 94 comments to review
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/550/artifact/artifacts/PullRequestReport.html

@dmwm-bot

dmwm-bot commented Apr 7, 2025

Jenkins results:

  • Python3 Unit tests: succeeded
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 10 warnings and errors that must be fixed
    • 2 warnings
    • 93 comments to review
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/551/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 12 warnings and errors that must be fixed
    • 2 warnings
    • 126 comments to review
  • Pycodestyle check: succeeded
    • 5 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/554/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 2 new failures
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 12 warnings and errors that must be fixed
    • 2 warnings
    • 125 comments to review
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/556/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 12 warnings and errors that must be fixed
    • 2 warnings
    • 125 comments to review
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/557/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 12 warnings and errors that must be fixed
    • 2 warnings
    • 126 comments to review
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/558/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: failed
    • 1 new failures
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 12 warnings and errors that must be fixed
    • 2 warnings
    • 127 comments to review
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/559/artifact/artifacts/PullRequestReport.html

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 3 changes in unstable tests
  • Python3 Pylint check: failed
    • 12 warnings and errors that must be fixed
    • 2 warnings
    • 127 comments to review
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/560/artifact/artifacts/PullRequestReport.html

@LinaresToine
Author

Hello @anpicci @todor-ivanov, I have modified the unit tests for the repack workflow, and it seems I got those to work. However, I see many other tests that failed, and I don't understand how they relate to the changes I made:

https://cmssdt.cern.ch/dmwm-jenkins/job/WMCore-PR-Report/560/#showFailuresLink

There I don't see any of the StdBase.py or Repack.py tests failing. Could you please have a look and let me know what I can do to improve those tests?

Contributor

@amaltaro amaltaro left a comment

@LinaresToine the 2 failing unit tests are unstable, so there is nothing to worry about there.
For the Repack unit test, if there was no complaint, that means the status of the unit test did not change (but to be on the safe side, perhaps we should ensure that it is succeeding in Jenkins).

output['dataTier'])
moduleLabel = "write_%s_%s" % (output['primaryDataset'],
                               output['dataTier'])
output['moduleLabel'] = moduleLabel.replace("-", "_") # For T0 Raw Skims, PDs will contain a "-", so here we replace for "_" for the moduleLabel
Contributor

Antonio, can you please move this comment to the line above instead of inline?

Being really honest, I fear that this change can cause us "hidden" problems in the future. Is there any strong reason not to create a PD named with underscore instead of a dash?

Contributor

In addition, given that it is a skim dataset, why change the primary dataset and not the processing string (what goes between primary dataset and datatier)?

Author

@LinaresToine LinaresToine Apr 15, 2025

Thank you for your comments @amaltaro. It all comes down to creating RAW data. The value of the Raw Skim datasets is that we are producing RAW data that can be Prompt Reconstructed if desired. This means that this feature takes effect in the Repack workflow. In short, we are creating two RAW outputs from the same Repack workflow, and we must distinguish between them.

Let's say we used the processing string instead, giving:

/PD/Era-v1/RAW
/PD/Era-RawSkim-v1/RAW

which has a few inconveniences:
a) It is far from trivial for T0 to give the new output an independent Prompt Reco configuration. Both sets of RAW data would also be sent to the same destinations, which is not wanted. Being able to configure all these details for the skimmed RAW alone is what makes this project possible, and treating it as a primary dataset gives us this freedom.
b) It does not save us from the module label problem, since those two outputs would have the same module label by definition.

May I please ask what your concern with the dash is?
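
Point (b) above can be illustrated with a short sketch. It assumes the label is built from the primary dataset and data tier only, as in the diff under review. The function name `module_label` and the skim PD name `PD-RawSkim` are hypothetical, chosen only for illustration.

```python
def module_label(primary_dataset, data_tier):
    # Same format string as in the diff under review
    return "write_%s_%s" % (primary_dataset, data_tier)


# /PD/Era-v1/RAW and /PD/Era-RawSkim-v1/RAW share primary dataset and tier,
# so the processing string never enters the label
full_raw = module_label("PD", "RAW")
skim_raw = module_label("PD", "RAW")
collides = full_raw == skim_raw

# Treating the skim as its own primary dataset (hypothetical name) gives a
# distinct label once the dash is replaced
skim_as_pd = module_label("PD-RawSkim", "RAW").replace("-", "_")

print(collides, full_raw, skim_as_pd)
```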

Contributor

I would like to add that we do produce PDs with a dash (the error PDs), and they are processed without a problem through the system (Tier-0/WMCore/Rucio/DBS). We don't think accepting dashes in the skimmed PDs would cause a problem.

Contributor

I see. I guess the tape families are a good argument to make this change at the PD level instead of the PS.

My only concern is that this moduleLabel could be used downstream and no longer be consistent with the output module. However, as that naming conversion only happens at the Repack factory, the risk is much smaller.

Thank you for the follow up, it looks good to me.
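
One way to read the consistency concern in this thread is that the dash replacement is lossy, so a label can no longer be mapped back to a unique dataset name. The sketch below assumes that reading. The function name `sanitized_label` and the dataset names are hypothetical.

```python
def sanitized_label(primary_dataset, data_tier):
    # Build the label, then replace dashes as in the diff under review
    label = "write_%s_%s" % (primary_dataset, data_tier)
    return label.replace("-", "_")


dashed = sanitized_label("Some-PD", "RAW")
underscored = sanitized_label("Some_PD", "RAW")

# Two different dataset names end up with the same label
labels_match = dashed == underscored
# The original dashed name no longer appears verbatim in the label
name_in_label = "Some-PD" in dashed

print(dashed, labels_match, name_in_label)
```

Any downstream code that tried to recover the dataset name from the label would therefore need the original name passed alongside it.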

@LinaresToine
Author

LinaresToine commented Apr 15, 2025

The repack tests were failing before my changes in the commit 0b888e3

Please see

However, it was not clear to me if my development required modification or addition of unit tests for the StdBase module. Existing tests were successful.

@dmwm-bot

Jenkins results:

  • Python3 Unit tests: succeeded
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 12 warnings and errors that must be fixed
    • 2 warnings
    • 127 comments to review
  • Pycodestyle check: succeeded
    • 1 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/WMCore-PR-Report/572/artifact/artifacts/PullRequestReport.html

@amaltaro
Contributor

These changes are looking good to me. Can you please squash these commits? See some information on this in "Step 10" at https://github.com/dmwm/WMCore/blob/master/CONTRIBUTING.rst#contributing

Development

Successfully merging this pull request may close these issues.

T0 to produce skimmed RAW data through Repack Workflow
4 participants