# Difference between revisions of "Roman Pot Sim"

This page gives details of the Roman Pot simulations, which includes acceptance studies, momentum reconstruction studies, and the move towards a full mock analysis of DVCS. The process includes generation of physics events, importing physics events through EicRoot for GEANT tracking of particles through magnetic fields to the detectors, digitizing the raw hit information, and implementing a reconstruction algorithm to determine the pT of the proton from the hits in the Roman Pot. All of the steps have been implemented into a single script that can handle the workflow, which can be found at /direct/eic+u/rmpetti/workarea/roman_pots/condor/run_pot_sim_milou_jobs.csh. The basic steps in the simulation process are described in the sections below.

The steps are summarized in the below workflow diagram and each block has a dedicated section for explanation of details.

## MILOU sim

The first step in the simulation study is to generate DVCS physics events. Details of the MILOU Monte Carlo program can be found elsewhere. An example steering file used for this study can be found at /direct/eic+data/rmpetti/milou_sim/dvcs.steer. A quick note about this step. If one desires to run many jobs in parallel with Condor, it is necessary to change the seed for the random number generator that is specified in the steering file (denoted by the SEED keyword). A script (/direct/eic+u/rmpetti/workarea/roman_pots/condor/mod_steering_script.pl) has been written that will take as one of its arguments the desired seed (the script takes four arguments, seed, minimum t, maximum t, and number of events) that will regenerate the steering file automatically replacing the original value of the seed with the one used as an argument in running the script. As it is currently setup when running in Condor, the seed is passed by the process number that is automatically assigned to the Condor job through the scheduler.

Another note if running within Condor. MILOU will generate an integration table that is used for its calculations with BASES, with the table stored in the generated bases.data file. This is time consuming. To increase efficiency, the same bases.data file is used for all the jobs farmed out. In order to achieve this, MILOU first needs to be run once outside of Condor. The point here is just to generate the bases.data file, so it is sufficient to generate just one event. Within the steering file, the keyword IGEN must be set to 0 to generate the table from scratch. Then when running in Condor, the IGEN flag can be set to 4 and the existing bases.data table will be used. This is already setup if running the mod_steering_script.pl script referenced above.

## Divergence smearing

One of the effects that needs to be studied is the impact of the smearing of the angle of the outgoing protons due to the angular divergence of the beams at the IP. In principle, this effect will influence the momentum reconstruction resolution. This is an initial state effect in that the electrons and protons in the beam will collide with some angle compared with the nominal trajectories. This effect is approximated by adding a smearing only of the outgoing proton. This means that the MILOU events are generated with the electron and proton colliding head on, but then an extra rotation is applied, determined by the magnitude of the divergence. This should be sufficient for evaluating the effect of the momentum resolution, but may need to be handled in the initial collision for more detailed studies.

The divergence is calculated from the beam parameters obtained by the machine designers, specifically the (un-normalized) beam emittance, ε, and β*, via the equation $\displaystyle{ \sigma_\theta = \sqrt{\varepsilon/\beta^*} }$. This formula is used to generate the angle which the outgoing proton will be rotated. The angle is chosen randomly from a Gaussian distribution with a width equal to the calculated εθ. The script /direct/eic+u/rmpetti/workarea/roman_pots/MCafterburners/addDivergence.C will take in as input, the ASCII file generated by running MILOU in the previous step and will output a modified event record with the proton rotated by the specified angle. The script also takes in the (un-normalized) emittance and β* in both the x and y direction to calculate the magnitude of the divergence in each direction separately.

## Roman Pot setup

The next described step is the construction of the Roman Pot systems. This can be done by running the script /direct/eic+u/rmpetti/workarea/roman_pots/detectorSetup/tracker.C. This takes many arguments as detailed in the defined function in the script. The script allows for the construction of the Roman Pot through all the arguments supplied, including the position along the lattice in x and z, the rotation of the pot so that the beam comes in perpendicular to the sensors, the denoting of how many layers of sensors should be included in the station and the size of the rectangular "hole" that represents the gap in acceptance due to how close the pots can be placed to the beam (currently assumed to be 10σ of the beam width from the core of the beam and determined from the emittance and β function). Each station is also given an index through the arguments. This script needs to be run inside ROOT. Some more general information of constructing detector geometries inside EicRoot can be found here.

## EicRoot simulation

The next step is to run the GEANT tracking of the events through the magnet setup and the Roman Pots within the EicRoot package. An example script to handle this is found at /direct/eic+u/rmpetti/workarea/roman_pots/sim/simulation.C. This script will use the events generated above in step 1, as well as the detector constructs from step 3. Additionally, the magnetic setup in the IR needs to be defined. Some general information on the importing of magnetic fields into EicRoot can be found here or here.

If one wants to study the effect of the angular divergence, then this step needs to be performed twice. The simulation.C macro will be run once on the event record without divergence and run again on the event record with divergence. The output of simulation.C WITHOUT divergence is simply used to gain access to the original, un-smeared, proton pT and to pass this value along the simulation chain. The actual detector hits that will be analyzed later for acceptance and momentum reconstruction studies will utilize the output of simulation.C WITH divergence applied.

In both cases, the output of simulation.C is a ROOT file containing the standardized EICTree format used by the BNL EIC Task Force group.

## Digitizing hits

The next step is performed to add an additional layer of reality to the simulation. Rather than use the raw exact hits of the protons in the sensors, a digitized hit is used, which takes into account the pixelation of the detector sensor. This step is applied by running the macro /direct/eic+u/rmpetti/workarea/roman_pots/sim/digitization.C. The pixel size in x and y can be adjusted within this script. This macro takes in the output of the previous simulation step and outputs another ROOT file containing TTrees with the hit information in each Roman Pot station. If it is desired to study the effect of divergence, this macro needs to be run with the simulated file which includes the divergence. For more general information on the digitization step in EicRoot, please consult this page.

## Pt reconstruction

This step gets a little complicated. Given a defined magnet lattice, a proton exiting the IP with a certain angle and having a certain momentum will hit the Roman Pot in a particular location and at some angle through the sensors. This logic can be reversed, so that if the angle of the proton through the Roman Pots, as well as the hit locations is measured and the magnetic field path is known, then the pT of the proton can be reconstructed. For ease of coding, it was decided to use a machine learning algorithm to automatically generate this association. This is the underlying logic of this reconstruction step.

This step starts by running the macro /direct/eic+u/rmpetti/workarea/roman_pots/reco/makeRecoMap_extendedStations.C within ROOT. This macro is designed to read in the output from simulation.C (step 4) run over the event record WITHOUT divergence applied (simply as a method to propagate the original, un-smeared proton pT) along with the output of the digitization.C macro (step 5) run over the event record WITH divergence applied. The output of this macro is an ASCII file detailing for each DVCS event, the hit locations of the proton in the first sensor of the Roman Pot station in x, y, z and the 3D angle of the proton going through the Roman Pot station. The angles are determined by the hits in the layers of sensors within the Roman Pot station by obtaining the best fit line from the hits.

This ASCII file is then input into the reconstruction model. It is important to note here the difference in approach if one is submitting jobs to Condor or if one is simply running in interactive mode (i.e. single job). If using the thus developed scripts for Condor, it is assumed that a model for the pT reconstruction has already been trained and saved to file. As part of the master Condor script, this model is read in file file and used. This ensures that one single model is used for all events (and also allows one to train a model with relatively large statistics). The script /direct/eic+u/rmpetti/workarea/roman_pots/reco/produceRecoModel.py will produce this initial model and save to disk (using Python's Pickle library). In the original work flow, the simulation was split in two in an effort to ensure enough statistics at higher |t|. Thus produceRecoModel.py takes in two files, one for low |t| events, one for high |t| events. The files are from the output of makeRecoMap_extendedStations.C from the above paragraph. The model used for the momentum reconstruction uses the functionality of scikit-learn in Python. Many different models were tried. The best performance in terms of momentum resolution was found using the Support Vector Machine regression model. Currently the study has focusing on using just two Roman Pot stations. Thus the model takes as input the x, y, z hit position of the proton in the front face of the Roman Pot, along with the 3D angle of the proton going through the 4 sensors within the station, for each station. The model outputs the pT. This is a supervised learning problem, so the true pT is used in the training of the model.

Once the model has been trained and saved, it can be used later to make predictions, thus giving the reconstructed pT. Following the master Condor script, the macro /direct/eic+u/rmpetti/workarea/roman_pots/reco/determine_pt.py will perform this function. The macro takes in the text output of makeRecoMap_extendedStations.C and writes out a new ASCII file of the format with each line representing an event, and each line containing the true pT and the reconstructed pT.

## Evaluation

Finally, it is desirable to produce some method to evaluate the simulation performance. The focus here is on acceptance and pT resolution/bin purity. Several macros perform this function and are part of the master Condor script. First the macro /direct/eic+u/rmpetti/workarea/roman_pots/reco/makeRecoHistos.C will help to evaluate the momentum reconstruction. It takes in the text output from determine_pt.py described above in the reconstruction step, which simply details the true pT and the reconstructed pT values from the simulation. The output of makeRecoHistos.C is the input needed for the full DVCS measurement and consists of histograms of the true pT and reconstructed pT distributions (h_true_pt and h_reco_pt respectively), as well as the same for the |t| distributions (h_true_t and h_reco_t respectively) saved within a ROOT file. The macro also outputs a TGraph (g_res) that represents the momentum resolution as well as outputs histograms needed to calculate the purity quantifying bin migrations due to resolution smearing (h_gen, h_in, h_out, h_purity). If running the simulation in Condor batch mode, the histograms from all the output files need to be added, and then the collective purity can be calculated by adding and dividing h_in, h_out, and h_gen in the right combination (purity = (Ngen - Nout)/(Ngen - Nout + Nin)). h_purity has the result tabulated already, but only for the single job.

The macro /direct/eic+u/rmpetti/workarea/roman_pots/acc/calculateAcc_wDivergence_singleFile.C will produce histograms used to calculate the acceptance and they will be stored in the same ROOT file referenced directly above. If just interested in running a single job, the saved c1 canvas will output the acceptance in each Roman Pot station separately to a nice plot. If collecting many events, the histograms need to be summed to get the full acceptance for all events. The h_thrown histogram is a histogram of all the generated protons. h_in_x is a histogram of all the protons that actually hit Roman Pot station x. One can calculate the acceptance for each individual station by simply dividing the histograms h_in_x and h_thrown.

## Miscellaneous scripts

This section lists some useful scripts that can be used for diagnosing problems or simply cross-checking magnet implementations and such. These macros and scripts can be found in the directory /direct/eic+u/rmpetti/workarea/roman_pots/macros.

Before any simulations are launched in any detail, it is important to ensure that the placement of the beam line and IR magnets are in the proper place, as problems of this nature have been observed in the past. To test a specific magnet setup, launch the simulation (/direct/eic+u/rmpetti/workarea/roman_pots/sim/simulation.C) for one single event. Also, change the input argument denoted "box" in the function simulation to "true". This will generate nine tracks used as benchmarks, one proton at beam energy (this may need to be adjusted inside the script depending on what energy is being studied), one proton at beam energy, but emitted at an angle of 4 mrad, another at -4 mrad, one with 20% less than beam energy, two with 20% less than beam energy and at an angle of +/- 4mrad, one photon at zero degrees (for the neutral line) and two photon tracks at +/- 4 mrad indicating the nominal neutron cone needed. Then run verifyParticleTracking_rr.C in the macro directory over the output of the simulation to produce a plot showing in the x-z plane the magnet apertures and the paths of each track. The macro may have to be compiled first (run a the command line as root verifyParticleTracking_rr.C++).

Some code is available to troubleshoot the pT reconstruction and investigate the correlations between emission angle of the proton to the hits/angles in the Roman Pots. The macro /direct/eic+u/rmpetti/workarea/roman_pots/reco/mapOutHits.C investigates the relation between the angle of emission in φ to the hit positions. The macro divides the φ space into 8 equal regions and produces plots of the hit map on the surface of the detector (in the local coordinates of the detector, meaning that the beam passes through (0,0). A similar set of plots can be obtained, but investigating the relationship between the θ angle of emission and the hit position via the macro mapOutHitsTheta.C in the same directory. Both macros look at one station at a time, and so the station index needs to be added as an argument to the function call.