14.3 Event Selection

The primary physics objects considered in this analysis are large-radius AK8 jets representing the two Higgs bosons. AK4 jets are also used in the online triggers and to identify nonresonant VBF HH production. As we do not expect any isolated leptons in our signal, events containing any isolated electrons or muons are vetoed. The online trigger selections are described in Section 14.3.1, and the offline selections for the nonresonant and resonant searches in Sections 14.3.2 and 14.3.3, respectively.

14.3.1 Triggers

No dedicated online trigger algorithms were available in Run 2 for boosted Higgs classification. Instead, a combination of high-level triggers (HLTs) is considered, which require high hadronic activity and/or AK8 jets with high transverse momentum, as well as jet mass and/or b-tagging requirements. The efficiencies of these triggers as a function of AK8 jet pT, soft-drop mass [394], and bb¯-tagging score are measured in data in an unbiased semi-leptonic tt¯ region, defined using single-muon triggers and offline selections on the muon and an AK8 jet. This measurement is shown in Figure 14.2 for the 2018 dataset. The triggers are generally fully efficient for jet pT > 500 GeV, while for pT < 400 GeV the efficiency is 10%. This is a significant limitation of the analysis, and of boosted Higgs searches in Run 2 generally, which is addressed in Run 3 by the introduction of dedicated triggers for boosted Higgs searches [397].

Figure 14.2. Trigger efficiencies for the 2018 dataset measured in bins of the AK8 jet pT, soft drop mass (MassSD) and TXbb score.
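
The binned efficiency measurement described above can be illustrated with a minimal sketch. It assumes a set of events already selected by the reference single-muon triggers, and computes the fraction that also fire the hadronic triggers in each jet pT bin. The function name `binned_efficiency`, the bin edges, and the toy inputs are illustrative assumptions, not the analysis code or data.

```python
import numpy as np

def binned_efficiency(values, passed, bin_edges):
    """Fraction of reference-selected events that pass the trigger, per bin."""
    values = np.asarray(values, dtype=float)
    passed = np.asarray(passed, dtype=bool)
    total, _ = np.histogram(values, bins=bin_edges)
    n_pass, _ = np.histogram(values[passed], bins=bin_edges)
    # Empty bins have no defined efficiency: leave them as NaN.
    eff = np.full(len(total), np.nan)
    np.divide(n_pass, total, out=eff, where=total > 0)
    return eff

# Toy inputs (hypothetical): AK8 jet pT in GeV for events in the muon-triggered
# reference region, and whether each event also fired the hadronic triggers.
jet_pt = np.array([320.0, 350.0, 380.0, 450.0, 470.0, 520.0, 600.0, 650.0])
fired = np.array([False, False, True, True, False, True, True, True])
edges = np.array([300.0, 400.0, 500.0, 700.0])

eff = binned_efficiency(jet_pt, fired, edges)
print(eff)
```

The same function can be applied along the soft-drop mass or TXbb axes; a full measurement would bin in all three variables simultaneously and propagate statistical uncertainties, which this sketch omits.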

14.3.2 Nonresonant offline selection

In the nonresonant analysis, both the H → bb¯ and H → VV decays are targeted through an offline selection for two highly boosted AK8 jets with a minimum pT of 300 GeV and |η| < 2.4. ParticleNet is used to isolate the signal H → bb¯ jets against background QCD jets, using the TXbb discriminant derived from its outputs (Eq. 13.1.1), while our new GloParT model is leveraged to identify the H → VV → 4q jet. Both networks have been decorrelated from the jet mass by enforcing uniform distributions in jet mass and pT in the training samples [164], which aids their calibration. Additionally, as the jet mass resolution is crucial to the sensitivity of the search, we optimize the mass reconstruction for all AK8 jets using the ParticleNet-based regression algorithm, the output of which we refer to as mreg. The jet with the higher (lower) TXbb score is considered the bb¯- (VV-) candidate jet.

The VBF process produces two jets, likely forward, with a large invariant mass and pseudorapidity separation. To identify this mode, we select up to two AK4 jets per event, required to have pT > 25 GeV, |η| < 4.7, and a ΔR separation of at least 1.2 and 0.8 from the bb¯- and VV-candidate AK8 jets, respectively. The pseudorapidity separation between, and the invariant mass of, the two highest-pT jets passing these requirements are used as input variables to a boosted decision tree (BDT) that discriminates against QCD and other backgrounds. Other input variables include outputs from the GloParT tagger and the kinematics of the two selected AK8 jets. The variables are optimized to provide the highest BDT performance while remaining decorrelated from the bb¯-candidate jet’s mass.
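
As a minimal sketch of the VBF jet selection described above, the following applies the pT, |η|, and ΔR requirements to a list of AK4 jets and computes the two BDT inputs, the pseudorapidity separation and the dijet invariant mass. The dictionary-based jet representation, the function names, and the choice to return `None` when fewer than two jets survive are assumptions made for illustration; the ΔR requirements are interpreted here as minimum separations.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with delta-phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def four_vector(jet):
    """(E, px, py, pz) from pt, eta, phi, mass."""
    px = jet["pt"] * math.cos(jet["phi"])
    py = jet["pt"] * math.sin(jet["phi"])
    pz = jet["pt"] * math.sinh(jet["eta"])
    energy = math.sqrt(px**2 + py**2 + pz**2 + jet["mass"] ** 2)
    return energy, px, py, pz

def vbf_jet_variables(ak4_jets, bb_jet, vv_jet):
    """Return (|delta eta|, m_jj) of the two leading selected AK4 jets."""
    selected = [
        j for j in ak4_jets
        if j["pt"] > 25.0 and abs(j["eta"]) < 4.7
        and delta_r(j["eta"], j["phi"], bb_jet["eta"], bb_jet["phi"]) > 1.2
        and delta_r(j["eta"], j["phi"], vv_jet["eta"], vv_jet["phi"]) > 0.8
    ]
    if len(selected) < 2:
        return None  # assumption: no VBF variables without two jets
    j1, j2 = sorted(selected, key=lambda j: j["pt"], reverse=True)[:2]
    e, px, py, pz = (a + b for a, b in zip(four_vector(j1), four_vector(j2)))
    m_jj = math.sqrt(max(e**2 - px**2 - py**2 - pz**2, 0.0))
    return abs(j1["eta"] - j2["eta"]), m_jj

# Toy event (hypothetical values): two forward jets plus one jet overlapping
# the bb-candidate AK8 jet and one jet below the pT threshold.
bb_jet = {"eta": 0.0, "phi": math.pi / 2}
vv_jet = {"eta": 0.0, "phi": -math.pi / 2}
ak4_jets = [
    {"pt": 100.0, "eta": 0.1, "phi": math.pi / 2, "mass": 0.0},
    {"pt": 50.0, "eta": 2.0, "phi": 0.0, "mass": 0.0},
    {"pt": 50.0, "eta": -2.0, "phi": math.pi, "mass": 0.0},
    {"pt": 20.0, "eta": 3.0, "phi": 1.0, "mass": 0.0},
]
result = vbf_jet_variables(ak4_jets, bb_jet, vv_jet)
print(result)
```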

The BDT is optimized simultaneously for both the SM ggF and BSM VBF κ2V = 0 signals, and separate “ggF” and “VBF” signal regions are defined using the BDT probabilities for the respective processes, referred to as BDTggF and BDTVBF. Concretely, the VBF region is defined by selections on the TXbb and BDTVBF discriminants, corresponding to VBF signal (background) efficiencies of 40% (0.1%) and 20% (0.003%), respectively, chosen to optimize the expected exclusion limit on the VBF signal. The ggF region is defined by a veto on events passing the VBF selections plus selections on the TXbb and BDTggF discriminants, corresponding to ggF signal (background) efficiencies of 60% (0.3%) and 7% (0.01%), respectively, similarly chosen to optimize the limit on the ggF signal. These selections are henceforth referred to as the ggF and VBF TXbb and BDT working points (WPs). The TXbb discriminant’s signal efficiencies are calibrated using boosted gluon splitting to bottom quark (g → bb¯) jets in data and simulations [164], with pT-dependent scale factors and uncertainties applied to the HH signals. The uncertainty on the BDT signal efficiency is dominated by that of the GloParT tagger, which is calibrated using a new technique based on the ratio of the primary Lund jet plane [61] densities of each individual quark subjet, described in Section 13.3.

The search is performed by constructing a likelihood in the signal, or “pass”, regions as a function of the H → bb¯-candidate jet’s regressed mass (mregbb). The QCD multijet background contribution in the pass regions is estimated from data in a “fail” region, defined using the same baseline selections on the two AK8 jets but with the TXbb selection inverted, as described in Section 14.4 below. A summary of all offline selections is provided in Table 14.1, and the signal and fail region selections in terms of the bb¯-candidate jet’s TXbb score and the BDT scores are illustrated in Figure 14.3.

Table 14.1. Offline selection criteria for the signal and fail nonresonant analysis regions.
------------------------------|------------------------------|------------------------------
          VBF Region          |          ggF Region          |         Fail Region
--------------------------------------------------------------------------------------------
                                    No electrons or muons
                                        ≥ 2 AK8 jets
                                   pT > 300 GeV (all jets)
                                    |η| < 2.4 (all jets)
                               50 < mreg < 250 GeV (all jets)
                                TXbb > 0.8 (at least one jet)
                                       Jet assignment:
                                H → bb¯: highest TXbb score
                    H → VV: out of remaining jets, highest GloParT score
------------------------------|------------------------------|------------------------------
                              |  Not passing VBF selections  |
 bb¯-jet TXbb ≥ VBF TXbb WP   | bb¯-jet TXbb ≥ ggF TXbb WP   | bb¯-jet TXbb < ggF TXbb WP
 BDTVBF ≥ VBF BDT WP          | BDTggF ≥ ggF BDT WP          |
--------------------------------------------------------------------------------------------

Figure 14.3. Illustration of the signal and fail nonresonant analysis region selections in terms of the bb¯-candidate jet’s TXbb score and the BDT scores.
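
The region definitions in Table 14.1 can be summarized as a short categorization function. This is a sketch only: the numerical working points in `EXAMPLE_WPS` are hypothetical placeholders (the text quotes efficiencies, not thresholds), and the function and key names are introduced here for illustration. Checking the VBF region first implements the ggF region's veto on events passing the VBF selections; applying the same ordering to the fail region is an assumption that keeps the three regions orthogonal.

```python
# Hypothetical placeholder thresholds, NOT the analysis working points.
EXAMPLE_WPS = {
    "vbf_txbb": 0.99,
    "vbf_bdt": 0.97,
    "ggf_txbb": 0.98,
    "ggf_bdt": 0.99,
}

def nonresonant_region(txbb_bb, bdt_vbf, bdt_ggf, wps=EXAMPLE_WPS):
    """Assign an event passing the baseline selections to a region.

    txbb_bb is the TXbb score of the bb-candidate jet; bdt_vbf and bdt_ggf
    are the BDT probabilities for the VBF and ggF processes.
    Returns "VBF", "ggF", "fail", or None if the event is in no region.
    """
    # VBF region is checked first, so later regions implicitly veto it.
    if txbb_bb >= wps["vbf_txbb"] and bdt_vbf >= wps["vbf_bdt"]:
        return "VBF"
    if txbb_bb >= wps["ggf_txbb"] and bdt_ggf >= wps["ggf_bdt"]:
        return "ggF"
    # Fail region: inverted TXbb selection, no BDT requirement.
    if txbb_bb < wps["ggf_txbb"]:
        return "fail"
    return None

# Toy events: (bb-jet TXbb, BDT_VBF, BDT_ggF)
events = [
    (0.995, 0.98, 0.50),
    (0.985, 0.50, 0.995),
    (0.900, 0.50, 0.995),
    (0.985, 0.50, 0.50),
    (0.995, 0.98, 0.995),
]
regions = [nonresonant_region(*e) for e in events]
print(regions)
```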

14.3.3 Resonant offline selection

The resonant analysis similarly selects two large-radius jets representing the H → bb¯ and Y → VV decays. Specifically, we select two boosted AK8 jets with pT > 350 GeV, at least one of which has pT > 400 GeV, and pseudorapidity |η| < 2.4. Out of all AK8 jets in the event passing these requirements, the one with the highest TXbb discriminant score is considered our H → bb¯ candidate jet, and is required to pass the high-purity (HP) WP and have a jet mass close to the SM Higgs mass: 110 ≤ mreg < 145 GeV. As in the nonresonant case, the jet mass resolution is crucial to the sensitivity of the search, and hence we use the ParticleNet-based regression algorithm to reconstruct the jet mass, mreg, here as well.

The mass-decorrelated GloParT tagger is again used to identify the Y → VV → 4q jet, using the discriminant THVV targeting the VV → 4q final state derived from its outputs (Eq. 13.2.1). The AK8 jet passing the above pT and η kinematic selections with the highest THVV score is considered the Y → VV candidate jet,¹ and is required to have a THVV score > 0.6, corresponding to a 60% (1%) signal (background) efficiency. The signal efficiency is calibrated based on the Lund jet plane, as described in Section 13.3. All the pT and tagger selections were jointly optimized for the lowest expected exclusion limits over a range of (mX, mY) points.

The search is performed in events passing these selections, referred to as the signal or “pass” region, in the 2D plane of the VV-candidate jet regressed mass (mregVV) and the invariant mass of the bb¯- and VV-candidate jets (mjj), representing the potential Y and X boson masses, respectively. An orthogonal control, or “fail”, region is defined by inverting the two tagger selections for both jets to estimate the QCD background in the pass region, as detailed in Section 14.4. Finally, separate “validation” pass and fail regions using the H → bb¯ candidate jet’s mass sidebands are used to validate the background estimation technique before unblinding the analysis. A summary of the offline selections is provided in Table 14.2.

Table 14.2. Offline selection criteria for analysis regions for the fully-merged Y topology.
-----------------------------------------|-------------------------------------------------------
              Signal Region              |                   Validation Region
-------------------------------------------------------------------------------------------------
                                          ≥ 2 AK8 jets
                                     pT > 350 GeV (all jets)
                                      |η| < 2.4 (all jets)
                                pT > 400 GeV (jet leading in pT)
                                         Jet assignment:
                                  H → bb¯: highest TXbb score
                       Y → VV: out of remaining jets, highest THVV score
-----------------------------------------|-------------------------------------------------------
        110 ≤ mregbb < 145 GeV           |  92.5 ≤ mregbb < 110 GeV or 145 ≤ mregbb < 162.5 GeV
-------------------------------------------------------------------------------------------------
        Pass        |        Fail        |           Pass            |           Fail
--------------------|--------------------|---------------------------|---------------------------
 TXbb ≥ HP WP       | TXbb < HP WP       | TXbb ≥ HP WP              | TXbb < HP WP
 THVV ≥ 0.6         | THVV < 0.6         | THVV ≥ 0.6                | THVV < 0.6
-------------------------------------------------------------------------------------------------
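
The resonant jet assignment and region definitions can be sketched as follows. The jet dictionaries, function names, and the `hp_wp` value used in the example are hypothetical; the HP working point threshold is not quoted in the text. Choosing the VV candidate only among the jets other than the bb¯ candidate reproduces the tie-break in which a jet with both the highest TXbb and THVV scores is taken as the H → bb¯ candidate. Events with exactly one tagger selection inverted are treated here as belonging to no region, which is an assumption based on the table.

```python
def assign_candidates(jets):
    """Return (bb_index, vv_index) into `jets`, or None if the event fails
    the kinematic selections (pT > 350 GeV, |eta| < 2.4, leading pT > 400)."""
    good = [i for i, j in enumerate(jets)
            if j["pt"] > 350.0 and abs(j["eta"]) < 2.4]
    if len(good) < 2 or max(jets[i]["pt"] for i in good) <= 400.0:
        return None
    bb = max(good, key=lambda i: jets[i]["txbb"])
    # VV candidate: highest THVV among the remaining jets.
    vv = max((i for i in good if i != bb), key=lambda i: jets[i]["thvv"])
    return bb, vv

def resonant_region(m_reg_bb, txbb, thvv, hp_wp, thvv_wp=0.6):
    """Return e.g. "signal pass" or "validation fail", or None."""
    if 110.0 <= m_reg_bb < 145.0:
        mass_window = "signal"
    elif 92.5 <= m_reg_bb < 110.0 or 145.0 <= m_reg_bb < 162.5:
        mass_window = "validation"
    else:
        return None
    if txbb >= hp_wp and thvv >= thvv_wp:
        return mass_window + " pass"
    if txbb < hp_wp and thvv < thvv_wp:
        return mass_window + " fail"
    return None  # assumption: mixed tagger outcomes enter no region

# Toy event (hypothetical values): jet 0 has the highest TXbb and THVV scores.
jets = [
    {"pt": 450.0, "eta": 0.3, "txbb": 0.99, "thvv": 0.9},
    {"pt": 380.0, "eta": -1.0, "txbb": 0.20, "thvv": 0.7},
    {"pt": 360.0, "eta": 2.0, "txbb": 0.10, "thvv": 0.3},
]
bb, vv = assign_candidates(jets)
region = resonant_region(125.0, jets[bb]["txbb"], jets[vv]["thvv"], hp_wp=0.98)
print(bb, vv, region)
```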

¹ In the rare (< 0.1% of signal events) case where the same jet has the highest TXbb and THVV scores, that jet is considered the H → bb¯ candidate, and the jet with the second-highest THVV score is the Y → VV candidate.