
The EIC task force has a large number of simulation tools available for investigating different types of physics processes. Unless noted otherwise, these can be accessed from /afs/

Event Generators

The following event generators are currently available:

  • ep
    • DJANGOH: (un)polarised DIS generator with QED and QCD radiative effects for NC and CC events.
    • MILOU: A generator for deeply virtual Compton scattering (DVCS), the Bethe-Heitler process and their interference.
    • PYTHIA: A general-purpose high energy physics event generator.
    • PEPSI: A generator for polarised leptoproduction.
    • RAPGAP: A generator for deeply inelastic scattering (DIS) and diffractive e + p events.
  • eA
    • BeAGLE: Benchmark eA Generator for LEptoproduction - a generator to simulate ep/eA DIS events including nuclear shadowing effects (based on DPMJetHybrid)
    • eSTARlight: Monte Carlo that simulates coherent vector meson photo- and electro-production in electron-ion collisions.

These generators are not part of the current distribution but can be installed upon request:

  • ep
    • gmc_trans: A generator for semi-inclusive DIS with transverse-spin- and transverse-momentum-dependent distributions.
    • LEPTO: A leptoproduction generator - used as a basis for PEPSI and DJANGOH
    • LEPTO-PHI: A version of LEPTO with "Cahn effect" (azimuthal asymmetry) implemented
  • eA
    • DPMJet: a generator for very low Q2/real photon physics in eA
    • DPMJetHybrid: an obsolete buggy beta version of BeAGLE: a generator to simulate ep/eA DIS events by employing PYTHIA in DPMJet. Not supported in any way. No physics should be published with it.
    • Sartre: an event generator for exclusive diffractive vector meson production and DVCS in ep and eA collisions, based on the dipole model.

There is code provided to convert the output from most of these generators into a ROOT format. It is distributed as part of eic-smear, the Monte Carlo smearing package.

Detector simulations

The following programs are available for simulating detector geometry and response:

  • eic-smear: A package to apply very fast detector smearing to Monte Carlo events.

More details on detector simulations can be found here.


See the pages of the programmes listed above for their documentation. Other useful references are:

  • BASES/SPRING v1 and v5.1: Cross section integration and Monte Carlo event generation. Used in Rapgap and MILOU.

Helpful/Important Links

The following pages provide useful general information for Monte Carlo simulations:

  • MC programs:
    • A list of Monte Carlo programmes
    • HepForge, high-energy physics development environment, which includes many Monte Carlo generators.
    • Lecture slides from a course on QCD and Monte Carlos
  • Radiative Correction Codes:
  • Parton Distribution Function Interfaces:
    • LHAPDF, the Les Houches Accord PDF Interface.
      • By default, programs are linked to version 5.9.1, pointed to by $LHAPDF5.
      • You can adapt your LD_LIBRARY_PATH to point to $LHAPDF6 instead. This is a more modern version but lacks support for some EIC-relevant PDFs.
      • The PDF grids can be accessed via /gpfs01/eic/data/LHAPDF59SHARE/lhapdf/PDFsets, or more generally via /cvmfs/. They should be found automatically if using the system-default version of LHAPDF.
    • The users' manual of the CERN PDFLIB

MC Analysis Techniques

How to get a cross section (for non-BeAGLE MC)

To normalize your counts to a cross section you need two pieces of information:

  • the total number of trials; this is printed to the screen/log file when the MC finishes
  • the total integrated cross section (in general in microbarn); this is also printed to the screen/log file when the MC finishes

Counts = Luminosity x Cross Section

so the cross section corresponding to your counts is

==> counts * total integrated cross section / total number of trials

and the corresponding MC luminosity is

==> total number of trials / total integrated cross section

There are some handy ROOT functions available to get the total number of trials, the total integrated MC cross section, and the total number of events in the tree.
These work on the PYTHIA, PEPSI, DJANGOH, and MILOU event-wise ROOT trees:

  • total number of trials:
TObjString* nEventsString(NULL);
file.GetObject("nEvents", nEventsString);
// the numerical value: nEventsString->GetString().Atof()
  • total integrated MC cross section:
TObjString* crossSectionString(NULL);
file.GetObject("crossSection", crossSectionString);
// the numerical value: crossSectionString->GetString().Atof()
  • total number of events in the tree:
TTree* tree(NULL);
file.GetObject("EICTree", tree);
// the number of events: tree->GetEntries()

How to get a cross-section for BeAGLE

Note: This refers to standalone BeAGLE using internal e+N event generation (PYTHIA, or optionally RAPGAP when available). For BeAGLE+GCF see below.

When BeAGLE simulates electron-nucleus (eA) collisions, it tags each event as scattering either from a proton or from a neutron in the nucleus. The stdout (which can be redirected to a log file) contains the total electron-proton cross section, σep, and the total electron-neutron cross section, σen, for the beam energies and kinematic ranges over which we choose to generate events. As long as we simulate enough events so that the cross sections can be effectively sampled (using sequential mode), these two values (σep and σen) will be independent of the number of events generated.

It should be noted that as long as σep and σen are equal to the desired precision, which is often the case at low x (high energy), then the simpler "non-BeAGLE" cross-section method above can be used with the exception that "number of events" should be used instead of "number of trials". For the most precise approach, we must correct for the fact that BeAGLE assumes equal ep and en cross-sections when selecting whether the struck nucleon is a proton or neutron, which is not quite right.

For a given electron-nucleon integrated luminosity, lumeic, the predicted number of events is

numeic = [ (Z×σep + N×σen) / A ] × lumeic

Here, for the nucleus being simulated, A (= Z + N), Z, and N are the numbers of nucleons, protons, and neutrons, respectively. From the predicted total number of events, numeic, and the individual cross sections, we can calculate the ratio of the number originating from an electron-proton scattering to those originating from an electron-neutron scattering:

numeic = numprot + numneut = [ (Z×σep + N×σen) / A ] × lumeic = (Z/A)×σep×lumeic + (N/A)×σen×lumeic

This gives

numprot = (Z/A)×σep×lumeic
numneut = (N/A)×σen×lumeic

From these we calculate the following ratio:

numprot / numneut = (σep / σen) × (Z/N)

This means that if we simulate a certain number of events, numsim, the ratio of the simulated events which should come from a proton scattering to those which should come from a neutron scattering is given by the above equation. However, since the cross-section ratio depends strongly on the kinematic range we choose to generate over, using the true ratio is impractical. In practice, BeAGLE simulates proton and neutron scattering events according to the ratio of protons to neutrons in the nucleus under consideration.

So if we generate a total number of events, numsim, and then apply cuts (and/or binning), we end up with a final number of events, numsim,cut. Then we can ask, for a given electron-nucleon integrated luminosity, lumeic, how many events are predicted with that cut, numeic,cut. In order to calculate this accurately, we first define the following:

numsim = numsimprot + numsimneut
lumsimprot = numsimprot / σep
lumsimneut = numsimneut / σen

In the above equations, numsimprot and numsimneut are the total number of proton and neutron scattering events generated, respectively. These numbers can be found in the output log file, or in the output ROOT file by seeing how many events are tagged as coming from a nucleon==2212 (proton) or a nucleon==2112 (neutron).

Next, after applying the cuts (and/or binning) mentioned above, we can write

numsim,cut = numsim,cutprot + numsim,cutneut

Again, using the output ROOT file, we can determine the number of proton-tagged events and neutron-tagged events which pass the cut. Then we can calculate the expected number of events:

numeic,cut = lumeic × ( [ Z×(numsim,cutprot/lumsimprot) + N×(numsim,cutneut/lumsimneut) ] / A )

As a specific example, we can consider the case where the 'cut' is simply a range in x and Q2 (i.e. an x-Q2 bin). We use the above result to calculate the inclusive (per-nucleon) differential cross section as

d2σ/dxdQ2 = numeic,cut / ( lumeic × Δx × ΔQ2) =  ( [ Z×(numsim,cutprot/lumsimprot) + N×(numsim,cutneut/lumsimneut) ] / A ) / (Δx × ΔQ2)

In the above equation, Δx × ΔQ2 gives the area of the x-Q2 bin.

How to get a cross-section for BeAGLE+GCF

GCF (Generalized Contact Formalism) is a package which calculates cross-sections in e+A collisions with Short-Range Correlations (SRCs) for a variety of specific processes: quasi-elastic scattering, vector-meson diffraction, and DIS. In many cases, BeAGLE can read these events in and handle the nuclear response. GCF generates events which are flat or Gaussian in a variety of variables and then reports a weight which can be used to convert this to a cross-section.

The weight is carried in the "RAevt" variable and has units of nb (some older versions used cm2).

How to scale the MC luminosity to the luminosity we want for the measurement

Very often it is impossible to generate enough events for the MC luminosity to correspond to, say, one month of EIC running.
In that case we generate enough MC events that all distributions are smooth, and scale the uncertainties.
The factor needed for the scaling is the ratio lumi-scale-factor = EIC-luminosity / generated MC luminosity. Given this factor, there are two ways to scale:

  • scaling the counts in the histogram by the lumi-scale-factor (e.g. with the ROOT function Scale)

This scales the number of counts in each bin of the histogram to what you would get for the EIC luminosity;
statistical uncertainties can then be calculated simply as sqrt(counts).

  • scaling the statistical uncertainties only, by dividing them by sqrt(lumi-scale-factor)

Example: reduced cross section

This example shows how to calculate the reduced cross section needed to extract F_2 and F_L, and how to scale the statistical uncertainties to a certain integrated luminosity.

sigma_reduced = prefactor * dsigma/dx/dQ2 with prefactor = Q^4 * x / (2*pi*alpha_em^2*(1+(1-y)^2))

This cross section would have units of barn * GeV^2; to make it dimensionless you need a conversion factor from barn to 1/GeV^2 (hbar^2 c^2 / GeV^2 = 0.3894 millibarn).

sigma_reduced = counts(x,Q^2) * prefactor * total integrated MC cross section /total number of trials/ conversion-barn-to-GeV /x-binsize/Q2-binsize 

If the ROOT function Scale was used, the statistical uncertainty is

delta sigma_reduced = sqrt(counts(x,Q^2)) * prefactor * total integrated MC cross section /total number of trials/ conversion-barn-to-GeV /x-binsize/Q2-binsize

If only the uncertainties are scaled, it is

delta sigma_reduced = sqrt(counts(x,Q^2)) * prefactor * total integrated MC cross section /total number of trials/ conversion-barn-to-GeV /x-binsize/Q2-binsize/ sqrt(lumi-scale-factor)

Attention: all luminosities and cross sections must be in the same units (pb or fb or ...).

High-Statistics BeAGLE Simulation

Multiple BeAGLE simulation jobs can be submitted to the BNL batch farm using the condor utility. A simple submission file to run 10 jobs simultaneously is shown below.

Universe       = vanilla
Notification   = Never
Executable     =
GetEnv         = True
Input	       = /dev/null
Arguments      = "$(Process)"
Output         = jobout/eAujob.out.$(Process)
Error          = jobout/eAujob.err.$(Process)
Log            = jobout/eAujob.log.$(Process)
Queue 10

This now works fine: random number seeds are taken from /dev/urandom inside BeAGLE automatically. If that is unavailable, a more complicated calculation is done which includes both the time and the process id. There is no longer any problem with identical seeds on parallel farm jobs starting at the same time.


If using the older standard BeAGLE random number seeding, however, many of the seeds can be identical if the jobs start during the same second. An example is given below. Running this command on the output log files:

grep -i "SEED = " eAu_*.log

displays the random number seeds:

eAu_0.log: SEED =         2411
eAu_1.log: SEED =         2421
eAu_2.log: SEED =         2296
eAu_3.log: SEED =         2296
eAu_4.log: SEED =         2421
eAu_5.log: SEED =         2136
eAu_6.log: SEED =         2321
eAu_7.log: SEED =         2136
eAu_8.log: SEED =         2391
eAu_9.log: SEED =         2391

As can be seen, several of the jobs have the same seed, and therefore the output of the simulation will be identical. A simple workaround is to use the "/dev/urandom" pseudo-random number generator in a shell script to set the seed directly in the BeAGLE input file. This is done in the following shell script, which is the executable in the condor submission file.

#!/usr/bin/env bash

if [ -z "$1" ]; then
	echo "No job number set."
	echo "Please run as ./ jobnumber"
	echo "Exiting..."
	exit 1
fi
echo "-----------------------------------"
echo "Running BeAGLE Simulation for eAu Collider!!!"
echo "-----------------------------------"
echo "Performing Job $1"
echo "..."
echo ""
# Make per-job copies of the Pythia and BeAGLE input cards
sed "${PythiaCard}" S3ALL002 > InpAu_$1
sed "${BeAGLECard}" ./inputFiles/eAu.inp > ./inputFiles/eAu_$1.inp
# Draw a 16-bit unsigned integer from /dev/urandom to use as the seed
SEED=`od -vAn -N2 -tu2 < /dev/urandom`
echo "The Random SEED is ${SEED// /}"
echo ""
# Substitute the seed for the placeholder value in the Pythia card
sed -i "s/1234567/${SEED// /}/g" InpAu_$1
echo "Running BeAGLE..."
$BEAGLESYS/BeAGLE < inputFiles/eAu_$1.inp > logs/eAu_$1.log
echo "Completed Simulation!!!"
echo ""
echo "Making Output ROOT File..."
root -l -b -q "make_tree.C(\"eAu_$1.txt\")"
echo "Done!!!"
echo ""
echo "Cleaning up..."
rm -vf ./outForPythiaMode/eAu_$1.txt
rm -vf InpAu_$1
rm -vf inputFiles/eAu_$1.inp
echo "Done!!!"
echo ""

This script makes use of this line in the Pythia input card that is called by BeAGLE:

FSEED 1234567

Using the above script, the seeds for the simultaneously run jobs should all differ. An example of the seeds set for 10 jobs is shown below.

eAu_0.log: SEED =        14473
eAu_1.log: SEED =        30550
eAu_2.log: SEED =        55545
eAu_3.log: SEED =         4022
eAu_4.log: SEED =        28238
eAu_5.log: SEED =        36662
eAu_6.log: SEED =        12502
eAu_7.log: SEED =        57292
eAu_8.log: SEED =        29721
eAu_9.log: SEED =        37451

As can be seen, all the random seeds are different. A larger test was performed where 100 jobs were submitted simultaneously, each job simulating 100k events (10 million events in total). The jobs were all found to have different seeds, and all the jobs completed within 5 hours of submission.