PID algorithms

From EIC

Lepton-Hadron Separation

PID values are used to separate hadrons from leptons in an event. Several detectors (ECal, HCal, RICH, ...) can contribute to the separation and therefore to the PID value.
For an individual detector, the PID value is the logarithm of the ratio of the probability that the track is a lepton to the probability that it is a hadron:
$PID = \log\left(\frac{P_{lep}(E, p)}{P_{had}(E, p)}\right)$
When more than one detector is used, the PID values of the individual detectors (the logarithms, not the probabilities themselves) can simply be added:

PID = PID_RICH + PID_ECal
PID = PID_RICH + PID_ECal + PID_HCal
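
The sum of per-detector PID values can be sketched in a few lines of Python. This is only an illustration: the function names, the detector labels used as dictionary keys, and the probability values are hypothetical, and a base-10 logarithm is assumed because the text only says "log".

```python
import math


def detector_pid(p_lepton, p_hadron):
    """Log-ratio of lepton to hadron probability for one detector.

    Base 10 is an assumption; the text does not specify the base.
    """
    return math.log10(p_lepton / p_hadron)


def combined_pid(likelihoods):
    """Add the PID values of all detectors.

    `likelihoods` maps a detector name to a (P_lep, P_had) pair.
    """
    return sum(detector_pid(p_lep, p_had) for p_lep, p_had in likelihoods.values())


# Made-up probabilities for a single track, for illustration only.
track = {"RICH": (0.9, 0.09), "ECal": (0.5, 0.005)}
pid = combined_pid(track)  # PID_RICH + PID_ECal
print(f"PID = {pid:.3f}")
```

Adding a third entry such as "HCal" to the dictionary gives the three-detector sum in the same way.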

These probabilities are best determined from test beam data; in rare cases, MC can fill in kinematic regions not reached in the test beam.
Such a PID scheme is based on Bayes' theorem, so it is also important to include the relative particle fluxes, which are at minimum a function of momentum and scattering angle:
$FluxFactor = \log\left(\frac{\phi_h}{\phi_l}\right)$
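
As a short worked step, the role of the flux factor can be made explicit. The following assumes only Bayes' theorem with the fluxes $\phi_l$ and $\phi_h$ acting as prior probabilities; the symbol $x$ for the detector response is introduced here for illustration and is not part of the original notation.

```latex
\[
\log\frac{P(l \mid x)}{P(h \mid x)}
  = \log\frac{P_{lep}(x)\,\phi_l}{P_{had}(x)\,\phi_h}
  = \log\frac{P_{lep}(x)}{P_{had}(x)} - \log\frac{\phi_h}{\phi_l}
  = PID - FluxFactor
\]
```

Under this assumption, a positive value of PID - FluxFactor means that the lepton hypothesis is the more probable one once the relative fluxes are taken into account.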

The typical way to use PID in an experiment is to make a cut, such as

   0 < PID-FluxFactor < 100  for leptons
-100 < PID-FluxFactor < 0    for hadrons
PID distribution for positive particles at different momenta. Left plot: only PID is used (PID3+PID5 combines different detectors); Right plot: PID with a flux correction is used.

From this plot it is clear that a 99% lepton purity can be reached by cutting on a higher value of PID-FluxFactor, at the cost of some loss of efficiency.
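
A minimal sketch of such a cut is given here, assuming a base-10 logarithm and a hypothetical `lepton_cut` parameter that can be raised above 0 to trade efficiency for purity. The function names and the "undecided" label are illustrative and do not come from any HERMES or EIC library.

```python
import math


def flux_factor(phi_hadron, phi_lepton):
    """FluxFactor = log(phi_h / phi_l); base 10 is an assumption."""
    return math.log10(phi_hadron / phi_lepton)


def classify(pid, flux, lepton_cut=0.0):
    """Apply the cut on PID - FluxFactor.

    lepton_cut = 0 reproduces the cut quoted in the text; a larger
    value tightens the lepton selection (hypothetical parameter).
    """
    value = pid - flux
    if lepton_cut < value < 100:
        return "lepton"
    if -100 < value < 0:
        return "hadron"
    return "undecided"


# Illustrative flux ratio: 100 hadrons per lepton in this kinematic bin.
ff = flux_factor(phi_hadron=100.0, phi_lepton=1.0)
print(classify(pid=3.0, flux=ff))                   # nominal cut
print(classify(pid=3.0, flux=ff, lepton_cut=2.0))   # tighter lepton cut
```

Tracks that fall between 0 and a raised `lepton_cut` end up in neither sample in this sketch, which is where the loss of efficiency comes from.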

More Info

For a detailed description of the PID formalism as used in HERMES see Juergen's PIDLIB paper

Hadron (π, K, p) separation

The performance of the RICH detector is described by the P- and Q-matrices. The P-matrix contains the probabilities $P^i_t$ that a particle of true type t is identified as type i, while the Q-matrix contains the probabilities $Q^i_t$ that a particle that was identified as type i is truly of type t. They are connected by the relation

$Q^i_t = \frac{P^i_t \phi_t}{\sum_s P^i_s \phi_s}$

where $\phi_t$ are the relative particle fluxes. This is basically a version of Bayes' theorem. If $\vec{I} = (I_{\pi}, I_K, I_p, I_X)$ are the numbers of identified hadrons (X: not identified), and $\vec{N} = (N_{\pi}, N_K, N_p)$ are the true numbers of hadrons, these two vectors are connected by the matrices P and Q:

$\vec{I} = P \vec{N}$

$\vec{N} = Q \vec{I}$

The diagonal elements of the P-matrix are the identification efficiencies; the diagonal elements of the Q-matrix are the purities of the identified samples.
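
The relation between the P- and Q-matrices can be checked numerically. The sketch below uses NumPy and entirely made-up matrix elements and fluxes; the array layout (rows of P are identified types π, K, p, X and columns are true types π, K, p) is an assumption chosen so that I = P N is an ordinary matrix product.

```python
import numpy as np


def q_matrix(P, phi):
    """Return Q with Q[t, i] = P[i, t] * phi[t] / sum_s P[i, s] * phi[s]."""
    P = np.asarray(P, dtype=float)
    phi = np.asarray(phi, dtype=float)
    joint = P * phi                          # joint[i, t] = P^i_t * phi_t
    norm = joint.sum(axis=1, keepdims=True)  # sum over true types s
    return (joint / norm).T                  # transpose so that N = Q @ I


# Illustrative 4x3 P-matrix (not HERMES values); each column sums to 1.
P = np.array([
    [0.90, 0.10, 0.05],   # identified as pi
    [0.05, 0.80, 0.10],   # identified as K
    [0.03, 0.05, 0.80],   # identified as p
    [0.02, 0.05, 0.05],   # not identified (X)
])
phi = np.array([0.7, 0.2, 0.1])   # relative fluxes of true pi, K, p

Q = q_matrix(P, phi)
N_true = 1000.0 * phi             # true hadron numbers
I = P @ N_true                    # identified numbers, I = P N
N_back = Q @ I                    # recovers N_true, N = Q I

print("efficiencies:", np.diag(P))
print("purities:    ", np.diag(Q))
```

Changing `phi` changes Q but not P, which is the flux dependence of the purities described in the text.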

The P-matrix elements are in principle detector properties, and for single tracks or tracks with rings that do not overlap they are independent of the particle sample. They are independent of the particle charge, but they depend on the event topology. For samples with overlapping rings the topology dependence of the P-matrix elements leads to a dependence on the composition of the particle sample, i.e. the flux factors. However, the inclusion of the overlap parameter would increase the necessary amount of Monte Carlo and experimental data by another order of magnitude. Hence, the topology dependence of the P-matrix elements is included through the number of tracks per detector half only.

The Q-matrix elements are necessarily different for different data samples. From the relation between the Q- and P-matrices given above, it can be seen that the Q-matrix elements depend directly on the relative particle fluxes. The relative particle fluxes in turn are different for positive and negative particles.

As a result, the P-matrix elements are given as a function of momentum and topology (track multiplicity), and the Q-matrix elements additionally as a function of the particle charge. Once enough data are available, it would be desirable to also parameterize them in terms of the overlap parameter for double track (per detector half) events.

The P-matrix elements can be either extracted from Monte Carlo data or from experimental data for decaying particles. However, due to the nature of the decaying particle data, experimental values are not available for all momentum and topology bins, e.g. not for single track kaons.

Principle

'Unfolding' stands for the extraction of the true hadron momentum spectra or asymmetries from the measured ones. This can be done by using the inverse of the truncated P-matrix, i.e. the inverse of the P-matrix that results from eliminating the fourth row, which corresponds to the probabilities for unidentified particles. This matrix is always non-singular in the HERMES regime, and consequently has a unique inverse, which can be used to solve the equation

$\vec{N} = P_{trunc}^{-1} \vec{I}_{trunc}$

to obtain the true particle fluxes, $\vec{N}$. Here $\vec{I}_{trunc}$ is the identified particle vector with the last member, corresponding to unidentified particles, removed. This solution can then be used in the relation between the Q- and P-matrices to obtain the Q-matrix.
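
The unfolding step can be sketched with NumPy as follows. The matrix elements and hadron numbers are made up for illustration, the helper name `unfold` is hypothetical, and the layout (rows are identified types π, K, p, X; columns are true types π, K, p) is an assumption. The linear system is solved directly rather than forming the inverse explicitly, which gives the same result for a non-singular matrix.

```python
import numpy as np


def unfold(P, I):
    """Solve N = P_trunc^{-1} I_trunc for the true hadron numbers."""
    P_trunc = np.asarray(P, dtype=float)[:3, :]   # drop the row for X
    I_trunc = np.asarray(I, dtype=float)[:3]      # drop the X entry
    return np.linalg.solve(P_trunc, I_trunc)


# Illustrative 4x3 P-matrix (not HERMES values).
P = np.array([
    [0.90, 0.10, 0.05],
    [0.05, 0.80, 0.10],
    [0.03, 0.05, 0.80],
    [0.02, 0.05, 0.05],
])
N_true = np.array([700.0, 200.0, 100.0])   # true pi, K, p
I = P @ N_true                             # what the RICH would report

N_unfolded = unfold(P, I)
print("identified:", I[:3])
print("unfolded:  ", N_unfolded)
```

In this toy example the identified kaon and proton numbers differ visibly from the true ones, while the unfolded vector reproduces `N_true`.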

The unfolding plays a similar role to the correction for the hadron contamination in the electron/positron sample in inclusive measurements. However, in that case there are only two 'particle types', electron and hadron, while the RICH detector has to deal with three hadron types and unidentified particles.

The differences between the measured and the true distributions may be large, for some momentum and topology bins up to 50% of the true values. Hence, they must be corrected.

Unfolding Procedures

In the discussion that follows, two procedures for extraction of the true hadron fluxes or spectra are described. The first is an event weighting procedure, which has the advantage that all kinematic dependences are, in principle, conserved, since the unfolding is performed prior to any averaging inherent in the analysis procedure for a particular measurement. It has proven to be the more frequently employed approach. The second method has the advantage of simplicity, but at the expense of kinematic detail.

Unfolding by event weighting

A simple and direct procedure for extracting true particle fluxes is event weighting with the elements of the inverse P-matrix, $P^{-1}$. It is based on the equation $\vec{N} = P_{trunc}^{-1} \vec{I}_{trunc}$. Although the elements of $P^{-1}$ are not probabilities, when applied in the correct manner to the identified hadron fluxes, as defined in $\vec{N} = P_{trunc}^{-1} \vec{I}_{trunc}$, they yield the fluxes of true hadrons. Note that some of the off-diagonal elements of $P^{-1}$ are negative, reflecting the presence of misidentified particles in the identified fluxes.

In the event weighting procedure, each identified pion, kaon, and nucleon track is assigned a weight given by the appropriate elements of $P^{-1}$. For example, a track identified as a pion is to be weighted with $P^{-1}_{\pi_t\pi_i}$ in the true pion sample, with $P^{-1}_{K_t\pi_i}$ in the true kaon sample, and with $P^{-1}_{n_t\pi_i}$ in the true nucleon sample. The number of true hadrons can then be computed as the sum of all weights,

$N_h^t = \sum_j (P^{-1})_{h_t (h_i)_j}, \quad h_t = \pi, K, n$

where the sum runs over all hadron tracks and $(h_i)_j$ labels the identified hadron type of track j.
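
The weighting sum can be sketched per track as follows. This is a simplified illustration: a single truncated P-matrix with made-up elements is used for all tracks, whereas in practice the matrix would be chosen per track according to its momentum and topology bin. The type labels and the function name are hypothetical.

```python
import numpy as np

TYPES = ("pi", "K", "p")   # "p" stands for the nucleon sample here


def unfold_by_weighting(P_trunc, identified):
    """Sum the weights (P^-1)[h_t, h_i] over all identified tracks."""
    P_inv = np.linalg.inv(np.asarray(P_trunc, dtype=float))
    column = {name: k for k, name in enumerate(TYPES)}
    totals = np.zeros(len(TYPES))
    for label in identified:
        if label not in column:
            continue   # unidentified tracks (X) are not weighted
        # One identified track contributes to every true-type sample.
        totals += P_inv[:, column[label]]
    return dict(zip(TYPES, totals))


# Illustrative truncated P-matrix (rows: identified, columns: true).
P_TRUNC = np.array([
    [0.90, 0.10, 0.05],
    [0.05, 0.80, 0.10],
    [0.03, 0.05, 0.80],
])
tracks = ["pi", "pi", "pi", "K", "K", "p", "X"]

weights = unfold_by_weighting(P_TRUNC, tracks)
print(weights)
```

Because the weights are added track by track, the same loop can fill histograms in any kinematic variable, which is why the kinematic dependences are preserved.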

P-Matrix Examples

The P-matrices depend on several characteristics of the events, for example how many tracks are in the RICH volume, their momenta, their angles, and the background conditions (i.e. noise in the photon matrix). The figure below shows, as an example, two P-matrices from the HERMES RICH for different numbers of tracks in the RICH detector volume.

Pmatrix.1.png
Pmatrix.3.png

More Info

For a detailed description of the RICH-PID formalism used in HERMES see RICH PIDLIB paper