Level 5

IC40 Cascade Analysis
Stephanie Hickford
University of Canterbury


Level 5
Level 5 cuts are done in ROOT, and are run by me using local computers at Canterbury.

Cut variables
Level 5 uses multivariate analysis with TMVA, the machine learning toolkit distributed with ROOT. Three precuts are applied at this level, followed by the training of a boosted decision tree (BDT):

Z vertex position: -450 metres < CredoFit4_Pos_Z < 450 metres
String containment: → this precut is needed because of the IC40 geometry; many background events (especially corner clippers) survive outside the detector volume
DOM charge containment: → the DOM with the largest charge must not be on an outer string
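
The three precuts can be sketched as a single selection function. This is only an illustration, not the ROOT code used in the analysis. The `Event` fields other than the z vertex are hypothetical names, the outer-string numbers in the usage example are placeholders rather than the real IC40 geometry, and because the string containment criterion is not spelled out above, it is assumed here to be a test on one string associated with the event.

```python
from dataclasses import dataclass

# Z vertex precut boundaries in metres, as quoted in the list above.
Z_MIN_M = -450.0
Z_MAX_M = 450.0


@dataclass
class Event:
    credo_fit4_pos_z: float   # CredoFit4_Pos_Z in metres
    containment_string: int   # hypothetical: string tested for string containment
    max_charge_string: int    # hypothetical: string holding the DOM with the largest charge


def passes_precuts(event, outer_strings):
    """Return True if the event survives all three level 5 precuts."""
    z_ok = Z_MIN_M < event.credo_fit4_pos_z < Z_MAX_M
    # Assumed form of the string containment precut (not defined in the text).
    string_ok = event.containment_string not in outer_strings
    # The DOM with the largest charge must not be on an outer string.
    charge_ok = event.max_charge_string not in outer_strings
    return z_ok and string_ok and charge_ok


# Placeholder outer strings, NOT the real IC40 layout.
example_outer_strings = {1, 2, 3}
print(passes_precuts(Event(120.0, 10, 11), example_outer_strings))
```

Passing the set of outer strings in as an argument keeps the geometry separate from the cut logic, so the same function works with whichever string list the analysis actually uses.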

Development of cuts
Figure 1 shows the first precut for TMVA, on the reconstructed z vertex position. Figure 2 shows an aerial view of the reconstructed x and y vertex positions, which illustrates the effect of the two containment precuts. After these precuts, the TMVA training and testing are run. The output of the BDT is cut on at the next level.

**Figure 1:** Reconstructed z vertex position using the 4 iteration credo reconstruction. The cuts at *CredoFit4_Pos_Z* > -450 metres and *CredoFit4_Pos_Z* < 450 metres are shown in black.

**Figure 2:** Reconstructed x and y vertex positions using the 4 iteration credo reconstruction. The cut is on the outer strings, which are shown in black. a) Before precuts. b) After string containment. c) After DOM charge containment.

The passing rates for level 5 are shown in Table 1.

**Table 1:** Passing rates for level 5. The percentage in parentheses is the fraction of events passing relative to the previous level.

| | Trigger Rate (Hz) | Level 2 Rate (Hz) | Level 3 Rate (Hz) | Level 4 Rate (Hz) | Level 5 Rate (Hz) |
|---|---|---|---|---|---|
| Experimental data | 1500 | 16.3 (1.1%) | 1.75 (10.7%) | 2.54 × 10^-2 (1.5%) | 2.09 × 10^-3 (8.21%) |
| Monte Carlo | 1270 | 12.5 (1.0%) | 0.92 (7.4%) | 3.30 × 10^-2 (3.6%) | 2.49 × 10^-3 (7.54%) |
| E^-2 signal | 2.55 × 10^-4 | 1.48 × 10^-4 (58.0%) | 1.15 × 10^-4 (77.9%) | 5.55 × 10^-5 (48.2%) | 1.83 × 10^-5 (39.97%) |
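
As a rough cross-check, the percentages for the experimental data row of Table 1 can be recomputed as the ratio of each rate to the rate at the previous level. The function name below is made up for this sketch. Because the rates in the table are already rounded, the recomputed values agree with the quoted ones only to about one decimal place.

```python
def passing_fractions(rates_hz):
    """Percentage of events kept at each level relative to the previous level."""
    return [100.0 * after / before for before, after in zip(rates_hz, rates_hz[1:])]


# Experimental data row of Table 1: trigger, level 2, level 3, level 4, level 5 (Hz).
data_rates_hz = [1500, 16.3, 1.75, 2.54e-2, 2.09e-3]

fractions = passing_fractions(data_rates_hz)
print([round(f, 1) for f in fractions])  # [1.1, 10.7, 1.5, 8.2]
```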

The cut variables used by TMVA (run after the precuts) are as follows:

Z Vertex Position: CredoFit4_Pos_Z
Zenith Track Direction: SPEFit32_Zenith
Track Reduced Log Likelihood: SPEFit32_rlogl
Linefit Velocity: LineFit_LFVel
Eigenvalue Ratio: PoleToI_evalratio
Fill Ratio: SDM1_FillRatioFromMeanPlusRMS
Time Vertex Split: SplitSPECascadeLlhVertex2_Time-SplitSPECascadeLlhVertex1_Time
Split Containment: √[(SplitSPECascadeLlhVertex1_Pos_X)² + (SplitSPECascadeLlhVertex1_Pos_Y)² + (SplitSPECascadeLlhVertex1_Pos_Z)²]
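
The last two variables in the list are derived from the split reconstruction rather than read off directly. A minimal sketch of how they could be computed is given below, assuming each event is available as a dictionary keyed by the variable names above. That storage format and the function names are assumptions for illustration, and the numbers in the example event are invented.

```python
import math


def time_vertex_split(event):
    """Time difference between the second and first split cascade vertices."""
    return (event["SplitSPECascadeLlhVertex2_Time"]
            - event["SplitSPECascadeLlhVertex1_Time"])


def split_containment(event):
    """Distance of the first split cascade vertex from the coordinate origin."""
    x = event["SplitSPECascadeLlhVertex1_Pos_X"]
    y = event["SplitSPECascadeLlhVertex1_Pos_Y"]
    z = event["SplitSPECascadeLlhVertex1_Pos_Z"]
    return math.sqrt(x ** 2 + y ** 2 + z ** 2)


# Invented values, used only to exercise the two functions.
example_event = {
    "SplitSPECascadeLlhVertex1_Time": 100.0,
    "SplitSPECascadeLlhVertex2_Time": 150.0,
    "SplitSPECascadeLlhVertex1_Pos_X": 3.0,
    "SplitSPECascadeLlhVertex1_Pos_Y": 4.0,
    "SplitSPECascadeLlhVertex1_Pos_Z": 12.0,
}
print(time_vertex_split(example_event), split_containment(example_event))  # 50.0 13.0
```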

All available standard CORSIKA (excluding two-component) is used in the training, testing and evaluation stages of TMVA. For signal, 2,000 files of electron neutrino E^-1 simulation (datasets 2182 and 2510) are used for training and testing, and the remaining 6,000 files (dataset 3221) are used in the evaluation. The plots from TMVA are shown below, including the input variables, correlation matrices, overtraining check and cut efficiencies.
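
The division of the signal datasets between the TMVA stages can be written down as a small lookup, which makes it easy to check that no dataset used for training and testing is reused in the evaluation. The dictionary layout and function names are invented for this sketch. Only the dataset numbers and file counts come from the text above.

```python
# Signal dataset roles as described in the text (E^-1 electron neutrino simulation).
SIGNAL_DATASETS = {
    "train_test": {"datasets": (2182, 2510), "n_files": 2000},
    "evaluation": {"datasets": (3221,), "n_files": 6000},
}


def role_of(dataset_id):
    """Return the TMVA stage a signal dataset is assigned to."""
    for role, group in SIGNAL_DATASETS.items():
        if dataset_id in group["datasets"]:
            return role
    raise KeyError(f"dataset {dataset_id} is not part of the signal sample")


def roles_are_disjoint():
    """True if no dataset is assigned to more than one stage."""
    all_ids = [d for group in SIGNAL_DATASETS.values() for d in group["datasets"]]
    return len(all_ids) == len(set(all_ids))


print(role_of(2510), role_of(3221), roles_are_disjoint())
```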

**Figure 3:** Variables used in TMVA.

**Figure 4:** Variables used in TMVA, in both log and linear (normalised to one) scale. The distributions are shown after the precuts.

**Figure 5:** Correlation Matrices for TMVA. a) Signal. b) Background.

**Figure 6:** Other TMVA plots. a) Overtraining check. b) Cut efficiencies. c) ROC curve.

Supervisors: Dr. Jenni Adams and Dr. Suruj Seunarine
email: Stephanie Hickford