ECCE:HOWTO Run a simulation campaign on the OSG

From eicug

These are instructions for running an ECCE simulation campaign on the OSG.

Submitting from JLab

Only certain people should run a full production campaign, meaning a campaign of more than 1M events that uses a significant portion of the compute and storage resources allocated to EIC/ECCE. This is restricted to a small group to ensure coordination with the simulation WG, so that work is not repeated at multiple sites and these resource-intensive campaigns stay aligned with EIC/ECCE goals, since only a limited number of them can be run.

The following example uses a directory on the /work/eic2 disk.

Pre-stage input files (i.e. generated events)

The input files should be copied to an appropriate directory at JLab before the campaign starts. Assuming they are not too big, the preferred area is /work/osgpool/eic, since files there are available via xrootd and therefore accessible from anywhere (e.g. the OSG).

For example: /work/osgpool/eic/ECCE/ProductionInputs/Electroweak/ep-10x100/Djangoh/erhic-nc-yesradiation_ep_10_100_q2_10_evt.root
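Before copying, it can help to check how much data is about to be staged. The following is a minimal Python sketch, assuming the generated events are .root files under a single local directory. The helper names are illustrative and are not part of the production scripts, and no particular size limit is implied.

```python
from pathlib import Path


def summarize_inputs(directory, pattern="*.root"):
    """Return (file_count, total_bytes) for files matching pattern under directory."""
    # rglob searches the directory and all of its subdirectories.
    files = [p for p in Path(directory).rglob(pattern) if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes


def human_size(nbytes):
    """Format a byte count using binary units, e.g. 1536 -> '1.5 KiB'."""
    size = float(nbytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

Calling summarize_inputs on the directory holding the generator output and passing the second element of the result to human_size gives a quick figure to compare against the space available in the staging area.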

Checkout production scripts

To start, create a working directory for the campaign. All top-level production campaigns should be placed under the directory /work/eic2/ECCE/PRODUCTION. Because the campaign directory is used in several places later, set MYDIR to its full path.

   setenv MYDIR /work/eic2/ECCE/PRODUCTION/2021.07.21.Electroweak_Djangoh_ep-10x100nc-q2-10
   mkdir -p $MYDIR
   cd $MYDIR
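The directory name in the example appears to combine a date with a campaign tag. The sketch below builds such a path in Python. The pattern (YYYY.MM.DD followed by a dot and the tag) is inferred from this single example, not from a documented convention, and campaign_dir is a hypothetical helper.

```python
from datetime import date
from pathlib import PurePosixPath

# Top-level area for production campaigns, as stated in the text above.
PRODUCTION_ROOT = PurePosixPath("/work/eic2/ECCE/PRODUCTION")


def campaign_dir(start, tag):
    """Build <PRODUCTION_ROOT>/<YYYY.MM.DD>.<tag> for a campaign starting on `start`."""
    return str(PRODUCTION_ROOT / f"{start:%Y.%m.%d}.{tag}")


# Reproduces the MYDIR value used in the example.
print(campaign_dir(date(2021, 7, 21), "Electroweak_Djangoh_ep-10x100nc-q2-10"))
```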

Clone the productions repository.

   git clone

Run top-level script to generate all submission scripts

Run the script. It takes two arguments: (1) the site for which the job submission scripts should be generated (in this case "OSG") and (2) a config file that specifies the parameters of the job. The script requires PyROOT in order to open the input files and get the number of events in each, so your environment must be set up with an appropriate version of ROOT. Run it from within the productions directory.

   cd productions
   source /apps/root/6.18.04/setroot_CUE.csh
   python3 ./ OSG productionSetups/run_Electroweak_Djangoh_ep-10x100nc-q2-10.txt

The script will automatically clone the macros repository and checkout the correct branch based on the configuration file. It will then call the appropriate site-specific script for generating submission scripts for each job.
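The per-file event counts matter because each input file presumably has to be divided among several jobs. The actual splitting logic of the production scripts is not shown on this page, so the following is only a hypothetical illustration of the arithmetic involved; split_into_jobs and events_per_job are assumed names.

```python
def split_into_jobs(n_events, events_per_job):
    """Return a list of (first_event, n_events_in_job) tuples covering n_events."""
    if events_per_job <= 0:
        raise ValueError("events_per_job must be positive")
    # The last job receives whatever remains after the full-size jobs.
    return [
        (first, min(events_per_job, n_events - first))
        for first in range(0, n_events, events_per_job)
    ]
```

For a file with 2500 events and 1000 events per job, this yields three jobs starting at events 0, 1000, and 2000, with the last one processing the remaining 500 events.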

Submit all jobs

A master top-level script will also be created, which can be used to submit all of the jobs with one command. All submission scripts are placed in a directory tree starting with submissionFiles, which allows you to use a common productions and macros directory for all jobs. To submit to the OSG you must be logged in to scosg16 or scosg20.

   ssh scosg16
   setenv MYDIR /work/eic2/ECCE/PRODUCTION/2021.07.21.Electroweak_Djangoh_ep-10x100nc-q2-10
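Since submission only works from the two submit nodes, a wrapper could check the host name before doing anything else. This is a minimal sketch rather than part of the production scripts; is_submit_host is a hypothetical helper, and it assumes the short host name matches one of the two names given above.

```python
import socket

# Submit nodes named in the text above.
SUBMIT_HOSTS = {"scosg16", "scosg20"}


def is_submit_host(hostname=None):
    """True if hostname (default: this machine) is one of the OSG submit nodes."""
    if hostname is None:
        hostname = socket.gethostname()
    # Compare only the short name so a fully qualified name also matches.
    short_name = hostname.split(".")[0].lower()
    return short_name in SUBMIT_HOSTS
```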

Check output

First, note that the standard location for the production and submission scripts is the /work/eic2/ECCE/PRODUCTION tree, while the standard location for the DSTs, evaluator files, and logs is the /work/eic2/ECCE/MC tree.


There are a few ways to check the progress of the campaign. The first is to run condor_q on the OSG submit node (e.g. scosg16).

   ssh scosg16
   condor_q
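To track progress over time, the summary line that condor_q prints at the end of its output can be turned into numbers. The sketch below assumes a line of the form "Total for query: N jobs; ... idle, ... running, ..."; the exact wording may differ between HTCondor versions, so treat the format as an assumption and check it against the real output on the submit node.

```python
import re


def parse_condor_totals(line):
    """Parse a condor_q summary line into a dict such as {'jobs': 3, 'idle': 1}."""
    # Each "<number> <word>" pair becomes one entry keyed by the word.
    return {state: int(count) for count, state in re.findall(r"(\d+) (\w+)", line)}


# Assumed example of a summary line, not captured from a real campaign.
sample = "Total for query: 3 jobs; 0 completed, 0 removed, 1 idle, 2 running, 0 held, 0 suspended"
totals = parse_condor_totals(sample)
print(totals["running"], "running,", totals["idle"], "idle")
```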

The second is to use the extras/ script to generate a pair of plots showing the number of simultaneously running jobs vs. time and the total time taken to run each job.

   cd $MYDIR/productions
   root -l Njobs_vs_time.C

The third is to generate lists of jobs grouped by failure mode or success. Run this at the end of the campaign to get the overall status. The script will create a directory named StatusReports where all of the files will be placed.

   cd $MYDIR/productions
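The idea behind such a status report can be sketched as follows. This is not the production script: the failure signatures, the category names, and the classify_jobs helper are all hypothetical, and treating a log with no known signature as a success is a simplifying assumption.

```python
from collections import defaultdict

# Hypothetical failure signatures; replace with strings seen in real job logs.
FAILURE_PATTERNS = {
    "segfault": "Segmentation fault",
    "missing_input": "No such file or directory",
}


def classify_jobs(logs):
    """Group job names by the first matching failure pattern, else 'success'.

    `logs` maps a job name to the text of its log file.
    """
    report = defaultdict(list)
    for job, text in sorted(logs.items()):
        for category, needle in FAILURE_PATTERNS.items():
            if needle in text:
                report[category].append(job)
                break
        else:
            # No known failure signature was found in this log.
            report["success"].append(job)
    return dict(report)
```

Each key of the returned dictionary corresponds to one list of jobs, which could then be written to its own file in a report directory.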

Publishing results

The campaign output files are not placed on the xrootd server or the S3 server automatically. This lets you confirm that the campaign was successful and had no major issues before posting the files. To publish the resulting evaluator files to both the JLab xrootd and BNL S3 servers in the standard locations, just run the script. It reads the campaign configuration from the submitParameters.dat file created when the script was run; this file lives in the same directory as the submission scripts. Using this file, the script forms the correct directory names on the xrootd and S3 servers.

   cd $MYDIR/productions