IC40 Cascade Analysis
Stephanie Hickford
University of Canterbury
Last updated:


Level 5
Level 5 cuts are done in ROOT, and are run by me using local computers at Canterbury.

Cut variables
Level 5 uses the Toolkit for Multivariate Analysis (TMVA), a machine learning package that ships with ROOT. Three precuts are applied before a boosted decision tree (BDT) is trained at this level:
  • Z vertex position: -450 metres < CredoFit4_Pos_Z < 450 metres
  • String containment: → This precut is needed because of the IC40 geometry: many background events (especially corner clippers) survive outside the detector volume.
  • DOM charge containment: → The DOM with the largest charge must not be on an outer string.
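
The three precuts can be sketched as a simple event filter. This is only an illustration, not the ROOT code used for the analysis: the field names `vertex_string` and `max_charge_string` are hypothetical, the reading of string containment as "the string nearest the reconstructed vertex is not an outer string" is an assumption, and the outer-string numbers are placeholders rather than the real IC40 layout.

```python
# Z window in metres, taken from the first precut in the list above.
Z_MIN, Z_MAX = -450.0, 450.0


def passes_precuts(event, outer_strings):
    """Return True if the event survives all three precuts.

    `vertex_string` and `max_charge_string` are hypothetical field names:
    the string nearest the reconstructed vertex (an assumed definition of
    string containment) and the string holding the DOM with the largest charge.
    """
    z_ok = Z_MIN < event["CredoFit4_Pos_Z"] < Z_MAX
    string_ok = event["vertex_string"] not in outer_strings
    dom_ok = event["max_charge_string"] not in outer_strings
    return z_ok and string_ok and dom_ok


# Placeholder string numbers, NOT the real IC40 outer strings.
outer = {1, 2, 3}

events = [
    {"CredoFit4_Pos_Z": 120.0, "vertex_string": 10, "max_charge_string": 11},
    {"CredoFit4_Pos_Z": -480.0, "vertex_string": 10, "max_charge_string": 11},
    {"CredoFit4_Pos_Z": 0.0, "vertex_string": 2, "max_charge_string": 11},
    {"CredoFit4_Pos_Z": 0.0, "vertex_string": 10, "max_charge_string": 3},
]

kept = [ev for ev in events if passes_precuts(ev, outer)]
print(len(kept), "of", len(events), "events pass the precuts")
```

Passing the outer-string set in as an argument keeps the geometry separate from the cut logic, so the same function could be reused with the real string list.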

Development of cuts
Figure 1 shows the first precut for TMVA, the reconstructed z vertex position. Figure 2 shows an aerial view of the reconstructed x and y vertex positions, which illustrates the effect of the two containment precuts. After these precuts, the TMVA training and testing are run. The BDT output is cut on at the next level.

Figure 1: Reconstructed z vertex position using the 4-iteration Credo reconstruction. The cuts at CredoFit4_Pos_Z > -450 metres and CredoFit4_Pos_Z < 450 metres are shown in black.
Figure 2: Reconstructed x and y vertex positions using the 4-iteration Credo reconstruction. The cut on the outer strings is shown in black. a) Before precuts. b) After string containment. c) After DOM charge containment.


The passing rates for level 5 are shown in Table 1.

                  | Trigger Rate (Hz) | Level 2 Rate (Hz)    | Level 3 Rate (Hz)    | Level 4 Rate (Hz)    | Level 5 Rate (Hz)
------------------|-------------------|----------------------|----------------------|----------------------|----------------------
Experimental data | 1500              | 16.3 (1.1%)          | 1.75 (10.7%)         | 2.54 × 10⁻² (1.5%)   | 2.09 × 10⁻³ (8.21%)
Monte Carlo       | 1270              | 12.5 (1.0%)          | 0.92 (7.4%)          | 3.30 × 10⁻² (3.6%)   | 2.49 × 10⁻³ (7.54%)
E⁻² signal        | 2.55 × 10⁻⁴       | 1.48 × 10⁻⁴ (58.0%)  | 1.15 × 10⁻⁴ (77.9%)  | 5.55 × 10⁻⁵ (48.2%)  | 1.83 × 10⁻⁵ (39.97%)
Table 1: Passing rates for level 5.
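
The percentages in Table 1 appear to be each level's rate divided by the rate at the previous level. That reading is an assumption, not something stated on this page. A minimal sketch of the arithmetic, using the experimental data row:

```python
def passing_fractions(rates):
    """Percentage of events kept at each level relative to the previous one."""
    return [100.0 * after / before for before, after in zip(rates, rates[1:])]


# Experimental data rates in Hz, copied from Table 1
# (trigger, level 2, level 3, level 4, level 5).
data_rates = [1500, 16.3, 1.75, 2.54e-2, 2.09e-3]

for level, frac in zip(range(2, 6), passing_fractions(data_rates)):
    print(f"Level {level}: {frac:.2f}% of the previous level")
```

Because the rates in the table are already rounded, the recomputed fractions can differ from the quoted ones in the last digit.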


The cut variables used by TMVA (run after the precuts) are as follows:
  • Z Vertex Position: CredoFit4_Pos_Z
  • Zenith Track Direction: SPEFit32_Zenith
  • Track Reduced Log Likelihood: SPEFit32_rlogl
  • Linefit Velocity: LineFit_LFVel
  • Eigenvalue Ratio: PoleToI_evalratio
  • Fill Ratio: SDM1_FillRatioFromMeanPlusRMS
  • Time Vertex Split: SplitSPECascadeLlhVertex2_Time - SplitSPECascadeLlhVertex1_Time
  • Split Containment: √[(SplitSPECascadeLlhVertex1_Pos_X)² + (SplitSPECascadeLlhVertex1_Pos_Y)² + (SplitSPECascadeLlhVertex1_Pos_Z)²]
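
The last two variables in the list are derived from other quantities rather than read directly. A minimal sketch of how they could be computed, assuming each event is available as a dictionary keyed by the variable names above (the dictionary layout and the example values are illustrative only):

```python
import math


def time_vertex_split(event):
    """Time difference between the second and first split-cascade vertices."""
    return (event["SplitSPECascadeLlhVertex2_Time"]
            - event["SplitSPECascadeLlhVertex1_Time"])


def split_containment(event):
    """Distance of the first split-cascade vertex from the coordinate origin."""
    x = event["SplitSPECascadeLlhVertex1_Pos_X"]
    y = event["SplitSPECascadeLlhVertex1_Pos_Y"]
    z = event["SplitSPECascadeLlhVertex1_Pos_Z"]
    return math.sqrt(x * x + y * y + z * z)


# Illustrative values only, not taken from real data.
example = {
    "SplitSPECascadeLlhVertex1_Time": 10000.0,
    "SplitSPECascadeLlhVertex2_Time": 10250.0,
    "SplitSPECascadeLlhVertex1_Pos_X": 3.0,
    "SplitSPECascadeLlhVertex1_Pos_Y": 4.0,
    "SplitSPECascadeLlhVertex1_Pos_Z": 12.0,
}

print(time_vertex_split(example), split_containment(example))
```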

All available standard CORSIKA (excluding two-component) is used in the training, testing and evaluation stages of TMVA. 2,000 files of electron neutrino E⁻¹ signal (datasets 2182 and 2510) are used for training and testing; the remaining 6,000 files (dataset 3221) are used in the evaluation. Below are the plots from TMVA, including input variables, correlation matrices, overtraining checks and cut efficiencies.

Figure 3: Variables used in TMVA.

Figure 4: Variables used in TMVA (PosZ, Zenith, rlogl, LFVel, evalratio, Fillratio, TimeSplit, SplitContainment), each in both log and linear (normalised to one) scale. These distributions are shown after the precuts.

Figure 5: Correlation Matrices for TMVA. a) Signal. b) Background.

Figure 6: Other TMVA plots. a) Overtraining check. b) Cut efficiencies. c) ROC curve.
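
The quantities behind the plots in Figure 6 can be sketched in a few lines. This is a plain-Python illustration, not TMVA's own code: it assumes lists of BDT scores for signal and background, and it uses a two-sample Kolmogorov-Smirnov statistic as one common way to compare training and test distributions in an overtraining check. All scores below are toy values.

```python
import bisect


def efficiency(scores, cut):
    """Fraction of events with a BDT score above the cut."""
    return sum(s > cut for s in scores) / len(scores)


def roc_points(signal, background, cuts):
    """(signal efficiency, background rejection) for each cut value."""
    return [(efficiency(signal, c), 1.0 - efficiency(background, c))
            for c in cuts]


def ks_statistic(sample_a, sample_b):
    """Largest distance between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect.bisect_right(a, x) / len(a)
            - bisect.bisect_right(b, x) / len(b))
        for x in set(a) | set(b)
    )


# Toy BDT scores, for illustration only.
sig_scores = [0.2, 0.4, 0.6, 0.8]
bkg_scores = [-0.8, -0.4, 0.0, 0.3]

print(roc_points(sig_scores, bkg_scores, cuts=[0.1, 0.5]))
print(ks_statistic(sig_scores, bkg_scores))
```

A small KS statistic between the training and test scores of the same class would suggest the two distributions agree; a large one hints at overtraining.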


Supervisors: Dr. Jenni Adams and Dr. Suruj Seunarine
email: Stephanie Hickford