If spanet_dijets_eval.py was modified, create a new docker image; otherwise, select one of the following images:
- gitlab-registry.cern.ch/jbossios/docker-images/atlas-spanet-jona-dijets-eval: first working image (old inputs)
- gitlab-registry.cern.ch/jbossios/docker-images/atlas-spanet-jona-eval-all: second image (first time supporting signal inputs, but crashing)
- gitlab-registry.cern.ch/jbossios/docker-images/atlas-spanet-jona-eval-all-210622: third image (first time supporting signal inputs)
- gitlab-registry.cern.ch/jbossios/docker-images/atlas-spanet-jona-eval-all-230622: fourth image (crashes due to print of missing variable)
- gitlab-registry.cern.ch/jbossios/docker-images/atlas-spanet-jona-eval-all-240622: fifth image (fixed print of missing variable but wrong output file name)
- gitlab-registry.cern.ch/jbossios/docker-images/atlas-spanet-jona-eval-all-250622: sixth image (fixed output file name, crashes on single-event inputs)
- gitlab-registry.cern.ch/jbossios/docker-images/atlas-spanet-jona-eval-all-010722: latest image (spanet fixed to work on inputs w/ 1 event only)
Run the following to create a new docker image (change CUSTOM accordingly):
sudo docker login gitlab-registry.cern.ch
sudo docker build . -f Dockerfile -t gitlab-registry.cern.ch/jbossios/docker-images/CUSTOM
sudo docker push gitlab-registry.cern.ch/jbossios/docker-images/CUSTOM
Repository with docker images: https://gitlab.cern.ch/jbossios/docker-images
Get Python3.8+:
source Setup.sh
Set the following in create_yaml.py:
- path: path to dijet H5 files
- n_steps_per_file: number of steps/jobs per yaml file
- versions: list of SPANet versions
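As a rough illustration of how these three settings interact, the sketch below splits a list of input files into groups of n_steps_per_file, one group per yaml file, and repeats that for each SPANet version. The function name, the file names, and the version labels are hypothetical, and the sketch assumes one step/job per H5 file; the actual logic in create_yaml.py may differ.

```python
from typing import Dict, List


def plan_yaml_files(
    h5_files: List[str],
    n_steps_per_file: int,
    versions: List[str],
) -> Dict[str, List[List[str]]]:
    """Group input files into chunks of at most n_steps_per_file per version.

    Assumption: one step/job processes one H5 file, so each inner list holds
    the inputs that would end up in a single yaml file.
    """
    if n_steps_per_file < 1:
        raise ValueError("n_steps_per_file must be at least 1")
    chunks = [
        h5_files[i:i + n_steps_per_file]
        for i in range(0, len(h5_files), n_steps_per_file)
    ]
    # Every version gets its own copy of the same grouping
    return {version: [list(chunk) for chunk in chunks] for version in versions}


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only
    files = [f"dijets_{i}.h5" for i in range(5)]
    plan = plan_yaml_files(files, n_steps_per_file=2, versions=["v1", "v2"])
    for version, groups in plan.items():
        print(version, len(groups), "yaml files")
```

With five input files and n_steps_per_file set to 2, each version ends up with three yaml files, the last one holding a single step.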
Run script:
python create_yaml.py
On a notebook terminal run the following*:
kubectl delete secret krb-secret
kinit USERNAME
kubectl create secret generic krb-secret --from-file=/tmp/krb5cc_1000
cp -r /eos/atlas/atlascerngroupdisk/phys-susy/RPV_mutlijets_ANA-SUSY-2019-24/spanet_jona/SPANET_package_backup_notebook/SPANet .
cd SPANet/
- To open a jupyter notebook on Kubeflow, follow these steps:
- ssh -D 8090 lxplus.cern.ch
- google-chrome --proxy-server=socks5://127.0.0.1:8090
- Go to https://ml.cern.ch
- Create a notebook using 1 GPU and the following image: gitlab-registry.cern.ch/ai-ml/kubeflow_images/atlas-pytorch-gpu:0183442cdb7ad58434d6626b2ac6ff2befffa9a9
- Go to https://ml.cern.ch/_/pipeline/?
- Follow the '+ Upload pipeline' link.
- Define 'Pipeline Name' (must be unique)
- Set 'Pipeline Description' to namespace: jonathan-bossio
- Upload yaml file and click 'create'
- After new page is loaded, click '+ Create run'
- Choose experiment and click 'Start'
- Monitor pipeline under Pipelines > Experiments
The submit_pipelines.py script should already be present (if it is outdated, clone this repo or copy the new version here).
Set the following in submit_pipelines.py:
- date: should match the date of the yaml files
- n_yaml_files_per_version: should match the number of yaml files created per SPANet version
- sample_type: signal or dijets
- versions: SPANet versions to use
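To make the role of these settings concrete, here is a minimal sketch of how they could determine which yaml files get submitted. The file naming pattern and both function names are hypothetical (they are not taken from submit_pipelines.py), and the actual submission call is passed in as a callable so that the sketch runs without a Kubeflow connection.

```python
from typing import Callable, List


def expected_yaml_files(
    date: str,
    sample_type: str,
    versions: List[str],
    n_yaml_files_per_version: int,
) -> List[str]:
    """Return the yaml file names implied by the settings.

    The naming pattern below is hypothetical; adapt it to the names
    that create_yaml.py actually writes.
    """
    if sample_type not in ("signal", "dijets"):
        raise ValueError("sample_type must be 'signal' or 'dijets'")
    return [
        f"spanet_{sample_type}_eval_{version}_{date}_{index}.yaml"
        for version in versions
        for index in range(n_yaml_files_per_version)
    ]


def submit_all(yaml_files: List[str], submit: Callable[[str], None]) -> int:
    """Call submit() once per yaml file and return the number submitted."""
    for yaml_file in yaml_files:
        submit(yaml_file)
    return len(yaml_files)


if __name__ == "__main__":
    # Hypothetical date and version labels; print stands in for the real submission
    files = expected_yaml_files("23022022", "dijets", ["v1", "v2"], 2)
    submit_all(files, submit=print)
```

If date or n_yaml_files_per_version does not match what create_yaml.py produced, a scheme like this would look for files that do not exist, which is why both settings have to agree with the yaml files on disk.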
Submit pipelines with the following (inside a kubeflow jupyter notebook):
python3 submit_pipelines.py
Pipelines can be monitored on https://ml.cern.ch (Pipelines > Experiments)
import kfp
client = kfp.Client()
print([pipeline.name for pipeline in client.list_pipelines().pipelines])
Delete the pipeline of your choice with the following:
client.delete_pipeline(client.get_pipeline_id("PIPELINE_NAME"))
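The two snippets above can be combined into a small helper that only deletes a pipeline if it is actually listed. This is a sketch, not part of the repository: the helper names are made up, and it relies only on the three client methods already used above (list_pipelines, get_pipeline_id, delete_pipeline).

```python
from typing import Any, List


def list_pipeline_names(client: Any) -> List[str]:
    """Return the names of the pipelines returned by client.list_pipelines()."""
    # The pipelines attribute can be None when nothing is uploaded
    pipelines = client.list_pipelines().pipelines or []
    return [pipeline.name for pipeline in pipelines]


def delete_pipeline_by_name(client: Any, name: str) -> bool:
    """Delete the pipeline called `name`; return False if it is not listed."""
    if name not in list_pipeline_names(client):
        return False
    client.delete_pipeline(client.get_pipeline_id(name))
    return True


# Usage inside a Kubeflow notebook (requires the kfp package):
#   import kfp
#   delete_pipeline_by_name(kfp.Client(), "PIPELINE_NAME")
```

Note that the helper only sees the pipelines returned by a single list_pipelines() call, so a pipeline beyond the first page of results would be reported as not listed.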
- Get list of workflows
kubectl -n jonathan-bossio get workflows
- List pods from a workflow (example for spanet-dijets-eval-23022022-69hzrc5)
kubectl -n jonathan-bossio get pods | grep spanet-dijets-eval-23022022-69hzrc5
- Get log for a given pod (example for pod spanet-dijets-eval-23022022-69hzrc5-918158780)
kubectl -n jonathan-bossio logs spanet-dijets-eval-23022022-69hzrc5-918158780 main
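When many pods need to be checked, the kubectl steps above can be scripted. The sketch below builds the same commands from Python and filters pod names by workflow, replacing the grep step. The function names are hypothetical, and it uses `kubectl get pods -o name` (which prints names as pod/NAME) instead of the table output shown above.

```python
import subprocess
from typing import List

NAMESPACE = "jonathan-bossio"  # namespace used in the commands above


def pods_command(namespace: str = NAMESPACE) -> List[str]:
    """Command that lists pod names, one per line, as pod/NAME."""
    return ["kubectl", "-n", namespace, "get", "pods", "-o", "name"]


def logs_command(pod: str, namespace: str = NAMESPACE) -> List[str]:
    """Command that fetches the log of the main container of a pod."""
    return ["kubectl", "-n", namespace, "logs", pod, "main"]


def filter_pods(kubectl_output: str, workflow: str) -> List[str]:
    """Keep the pods whose name starts with the workflow name."""
    names = [
        line.strip().split("/", 1)[-1]  # drop the leading "pod/"
        for line in kubectl_output.splitlines()
        if line.strip()
    ]
    return [name for name in names if name.startswith(workflow)]


def run(command: List[str]) -> str:
    """Run a command and return its standard output (raises on failure)."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


# Usage on a machine where kubectl is configured:
#   workflow = "spanet-dijets-eval-23022022-69hzrc5"
#   for pod in filter_pods(run(pods_command()), workflow):
#       print(run(logs_command(pod)))
```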