14.2 Overview of analysis strategy
Searching for all-hadronic decays is challenging because of the large QCD multijet background. Targeting boosted production, where the decay products of each Higgs boson merge into a single wide-radius jet, strongly reduces this background. Furthermore, in this regime, deep learning techniques can be extremely effective in identifying these distinctive merged wide-radius signal jets using low-level reconstructed particle and vertex data; indeed, key components of this search are the development and application of such techniques for identifying boosted Higgs jets.
For both the nonresonant and resonant searches, triggers and a loose offline preselection for boosted jets are first applied, selecting two high-$p_T$ wide-radius jets with at least one loosely $b\bar{b}$-tagged. The key discriminating features between our signals and the predominantly QCD background are the masses of the two jets (plus the dijet mass in the resonant search) and the $b\bar{b}$ and $VV$ tagging scores. As described in Chapter 13, we use ParticleNet-based mass regression to improve the mass resolution of the two jets, and the established ParticleNet and new Particle Transformer mass-decorrelated taggers for $b\bar{b}$ and $VV$ tagging, respectively.
Following the event selection, which uses a combination of these features, QCD remains the dominant background in the signal regions. The shape and normalization of this background are predicted from data in control regions, defined by inverting tagger selections, multiplied by transfer factors assumed to be smoothly parametrized functions. Smaller contributions from top quark and vector boson backgrounds are predicted from MC simulations.
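The data-driven QCD estimate described above can be illustrated with a minimal sketch. It assumes a binned jet mass observable, a fail-region (inverted-tagger) data histogram, and a transfer factor modelled as a low-order polynomial in a scaled mass variable. The function name, the polynomial form, and all toy numbers are illustrative assumptions, not the configuration used in the analysis.

```python
import numpy as np

def predict_qcd_pass(fail_data, fail_other_mc, bin_centers, tf_coeffs):
    """Sketch of a per-bin QCD prediction in the pass (signal) region.

    fail_data     : observed yields in the fail (inverted-tagger) region
    fail_other_mc : simulated non-QCD yields (e.g. top, V+jets) in the fail region
    bin_centers   : centers of the mass bins
    tf_coeffs     : polynomial coefficients of the transfer factor (assumed form)
    """
    # QCD in the fail region: data minus the simulated non-QCD backgrounds
    qcd_fail = np.clip(fail_data - fail_other_mc, 0.0, None)
    # Scale the mass to [0, 1] so the polynomial coefficients are well behaved
    x = (bin_centers - bin_centers.min()) / (bin_centers.max() - bin_centers.min())
    # Smooth transfer factor: c0 + c1 * x + c2 * x^2 + ...
    transfer_factor = np.polynomial.polynomial.polyval(x, tf_coeffs)
    return qcd_fail * transfer_factor

# Toy inputs (hypothetical values)
bin_centers = np.array([60.0, 100.0, 140.0, 180.0])
fail_data = np.array([1000.0, 800.0, 600.0, 400.0])
fail_other_mc = np.array([100.0, 100.0, 100.0, 100.0])
tf_coeffs = [0.01, 0.01]  # TF(x) = 0.01 + 0.01 * x

qcd_pass = predict_qcd_pass(fail_data, fail_other_mc, bin_centers, tf_coeffs)
```

In a real fit the transfer-factor coefficients would be free parameters constrained by the data rather than fixed by hand as in this toy example.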
For the nonresonant analysis, an event-level boosted decision tree (BDT) is trained to further discriminate between the HH signals and the QCD multijet and top backgrounds. As the boosted regime is particularly sensitive to high deviations, the BDT is optimized simultaneously for both the SM ggF HH signal and the BSM VBF signal with $\kappa_{2V} = 0$, and includes information about smaller-radius forward jets, which are unique to the VBF mode. The $b\bar{b}$-tagged jet mass distribution exhibits a resonance peak for the HH signal (with better resolution than the $VV$-tagged jet mass), and it is therefore chosen as the observable used in the final step of the signal extraction procedure. The control region for the QCD background is defined by inverting the $b\bar{b}$ tagger cut.
In the resonant case, the two tagger scores, along with a Higgs mass window selection on the $b\bar{b}$-tagged jet, are used to define the signal region. The signal is then extracted from the 2D distribution of the dijet mass and the $VV$-tagged jet mass, with the QCD background predicted from a control region with both tagger scores inverted. As the analysis is currently blinded in the signal region, secondary validation pass and fail regions are defined with the same selections as above, but in the sidebands of the Higgs mass window. These are used to estimate the background in the signal region and derive expected sensitivities and upper limits.
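The region definitions for the resonant search can be summarized in a short sketch. The function below assigns an event to a pass or fail region, in either the Higgs mass window or its sidebands, from the two tagger scores and the mass of the jet on which the window is applied. All thresholds, window edges, and names are hypothetical placeholders chosen for illustration; they are not the working points of the analysis.

```python
def assign_region(
    tagger1_score,
    tagger2_score,
    higgs_cand_mass,
    tagger1_cut=0.98,                 # hypothetical working point
    tagger2_cut=0.90,                 # hypothetical working point
    mass_window=(110.0, 145.0),       # hypothetical Higgs mass window (GeV)
    sidebands=((90.0, 110.0), (145.0, 165.0)),  # hypothetical sidebands (GeV)
):
    """Return the region label for one event, or None if it falls in no region."""
    both_pass = tagger1_score >= tagger1_cut and tagger2_score >= tagger2_cut
    both_fail = tagger1_score < tagger1_cut and tagger2_score < tagger2_cut
    if not (both_pass or both_fail):
        # Events passing only one tagger are not used in this sketch
        return None
    category = "pass" if both_pass else "fail"

    in_window = mass_window[0] <= higgs_cand_mass < mass_window[1]
    in_sideband = any(lo <= higgs_cand_mass < hi for lo, hi in sidebands)

    if in_window:
        # Signal region (pass) and its QCD control region (fail)
        return f"signal_{category}"
    if in_sideband:
        # Validation regions used while the signal region is blinded
        return f"validation_{category}"
    return None
```

For example, with the placeholder cuts above, an event with scores (0.99, 0.95) and a candidate mass of 125 GeV is labelled `signal_pass`, while the same scores with a mass of 100 GeV give `validation_pass`.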