
16.3 Experiments

We evaluate the performance of the LGAE and baseline models on reconstruction and anomaly detection for simulated high-momentum jets from the JetNet dataset. We describe the dataset in Section 16.3.1, the models we consider in Section 16.3.2, the reconstruction and anomaly detection results in Sections 16.3.3 and 16.3.4 respectively, an interpretation of the LGAE latent space in Section 16.3.5, and finally experiments on the data efficiency of the different models in Section 16.3.6.

16.3.1 Dataset

We use 30-particle high-pT jets from the JetNet dataset, as described in Section 9.2, obtained using the JetNet library from Chapter 15. The models are trained on jets produced from gluons and light quarks, which are collectively referred to as quantum chromodynamics (QCD) jets.

As before, we represent each jet as a point cloud of particles, termed a “particle cloud”, with the respective 3-momenta, in absolute coordinates, as particle features. In the preprocessing step, each 3-momentum is converted to a 4-momentum, pμ = (|p|, p), where we consider the mass of each particle to be negligible. We use a 60%/20%/20% training/testing/validation split of the total 177,000 jets. For evaluating performance in anomaly detection, we consider jets from JetNet produced by top quarks, W bosons, and Z bosons as our anomalous signals.
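The preprocessing step above can be sketched as follows. This is a minimal NumPy illustration, not the chapter's exact pipeline; the helper names and the shuffling seed are assumptions.

```python
import numpy as np

def to_four_momentum(p3):
    """Promote massless 3-momenta to 4-momenta: p^mu = (|p|, px, py, pz).

    p3: array of shape (..., 3) holding (px, py, pz) per particle.
    Returns an array of shape (..., 4), with E = |p| since m is negligible.
    """
    energy = np.linalg.norm(p3, axis=-1, keepdims=True)
    return np.concatenate([energy, p3], axis=-1)

def split_dataset(jets, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle and split along the first axis into train/test/validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(jets))
    n_train = int(fractions[0] * len(jets))
    n_test = int(fractions[1] * len(jets))
    return (jets[idx[:n_train]],
            jets[idx[n_train:n_train + n_test]],
            jets[idx[n_train + n_test:]])
```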

We note that the detector and reconstruction effects in JetNet, and indeed in real data collected at the LHC, break the Lorentz symmetry; hence, Lorentz equivariance is generally an approximate rather than an exact symmetry of HEP data. We assume henceforth that the magnitude of the symmetry breaking is small enough that imposing exact Lorentz equivariance in the LGAE is still advantageous — and the high performance of the LGAE and classification models such as LorentzNet support this assumption. Nevertheless, important studies in future work may include quantifying this symmetry breaking and considering approximate symmetries in NNs.

16.3.2 Models

LGAE model results are presented using both the min-max (LGAE-Min-Max) and “mix” (LGAE-Mix) aggregation schemes for the latent space, which consists of varying numbers of complex Lorentz vectors — corresponding to different compression rates. We compare the LGAE to baseline GNN and CNN autoencoder models, referred to as “GNNAE” and “CNNAE” respectively.

The GNNAE model is composed of fully-connected MPNNs adapted from MPGAN (Section 10.1). We experiment with two types of encodings: (1) particle-level (GNNAE-PL), as in the PGAE [67] model, which compresses the features per node in the graph but retains the graph structure in the latent space, and (2) jet-level (GNNAE-JL), which averages the features across the nodes to form the latent space, as in the LGAE. Particle-level encodings yield better performance overall for the GNNAE, but the jet-level encoding provides a fairer comparison with the LGAE, which uses a jet-level encoding to achieve a high level of feature compression.

For the CNNAE, which is adapted from Ref. [248], the relative coordinates of each input jet's particle constituents are first discretized into a 40 × 40 grid. The particles are then represented as pixels in an image, with intensities corresponding to pTrel. Multiple particles per jet may correspond to the same pixel, in which case their pTrel's are summed. The CNNAE has neither Lorentz nor permutation symmetry; however, it does have built-in translation equivariance in η-ϕ space.
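The discretization into a jet image can be sketched with `numpy.histogram2d`; intensities of particles landing in the same pixel are summed automatically by the weighted histogram. The image half-width `extent` is an illustrative choice here, not necessarily the value used in the chapter.

```python
import numpy as np

def jet_to_image(eta_rel, phi_rel, pt_rel, n_bins=40, extent=0.4):
    """Discretize a particle cloud into an n_bins x n_bins jet image.

    Particles falling into the same pixel have their pT_rel summed,
    as in the CNNAE preprocessing.
    """
    image, _, _ = np.histogram2d(
        eta_rel, phi_rel,
        bins=n_bins,
        range=[[-extent, extent], [-extent, extent]],
        weights=pt_rel,  # pixel intensity = summed pT_rel
    )
    return image
```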

Hyperparameter and training details for all models can be found in Appendices E.1 and E.2, respectively, and a summary of the relevant symmetries respected by each model is provided in Table 16.1. The LGAE models are verified to be equivariant to Lorentz boosts and rotations up to numerical error, with details provided in Appendix E.3.
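An equivariance check of the kind referenced above compares "encode then transform" against "transform then encode". The sketch below uses a toy linear encoder standing in for the vector-valued part of the LGAE encoder; the function names and tolerance are assumptions for illustration.

```python
import numpy as np

def boost_matrix_z(beta):
    """Lorentz boost along the z-axis with velocity beta (in units of c)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[3, 3] = gamma
    L[0, 3] = L[3, 0] = -gamma * beta
    return L

def equivariance_error(encoder, p4, boost):
    """Relative deviation between encode-then-boost and boost-then-encode.

    `encoder` maps an (N, 4) particle cloud to (k, 4) latent vectors.
    For an exactly equivariant map, the two paths agree up to numerical error.
    """
    latent_then_boost = encoder(p4) @ boost.T
    boost_then_latent = encoder(p4 @ boost.T)
    num = np.linalg.norm(latent_then_boost - boost_then_latent)
    den = np.linalg.norm(latent_then_boost)
    return num / den

# Toy stand-in encoder that is trivially equivariant: sum of 4-momenta.
toy_encoder = lambda p4: p4.sum(axis=0, keepdims=True)
```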

Table 16.1. Summary of the relevant symmetries respected by each model tested.
 ------------------------------------------------------------------------------------------------------
  Model    Aggregation       Name            Lorentz symmetry    Permutation symmetry    Translation symmetry
 ------------------------------------------------------------------------------------------------------
  LGAE     Min-Max           LGAE-Min-Max    ✓ (equivariance)    ✓ (invariance)          ✓ (equivariance)
           Mix               LGAE-Mix        ✓ (equivariance)    ✗                       ✓ (equivariance)
  GNNAE    Jet-level         GNNAE-JL        ✗                   ✓ (invariance)          ✓ (equivariance)
           Particle-level    GNNAE-PL        ✗                   ✓ (equivariance)        ✓ (equivariance)
  CNNAE                      CNNAE           ✗                   ✗                       ✓ (equivariance)
 ------------------------------------------------------------------------------------------------------

16.3.3 Reconstruction


Figure 16.2. Jet image reconstructions by LGAE-Min-Max (τ(1/2,1/2) = 4, 56.67% compression), LGAE-Mix (τ(1/2,1/2) = 9, 61.67% compression), GNNAE-JL (dim(L) = 55, 61.11% compression), GNNAE-PL (dim(L) = 2 × 30, 66.67% compression), and CNNAE (dim(L) = 55, 61.11% compression).

We evaluate the performance of the LGAE, GNNAE, and CNNAE models, with the different aggregation schemes discussed, on the reconstruction of the particle and jet features of QCD jets. We consider the relative transverse momentum pTrel = pTparticle / pTjet and relative angular coordinates ηrel = ηparticle − ηjet and ϕrel = (ϕparticle − ϕjet) mod 2π as each particle's features, and the total jet mass, pT, and η as jet features. We define the compression rate as the ratio between the total dimension of the latent space and the number of features in the input space: 30 particles × 3 features per particle = 90.
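The relative coordinates and the compression rate defined above can be computed as follows; the wrapping of ϕrel into one period is one common convention, assumed here for illustration.

```python
import numpy as np

def to_relative_coordinates(pt, eta, phi, jet_pt, jet_eta, jet_phi):
    """Particle features relative to their jet, as used for evaluation."""
    pt_rel = pt / jet_pt
    eta_rel = eta - jet_eta
    # wrap the angular difference into [-pi, pi)
    phi_rel = np.mod(phi - jet_phi + np.pi, 2 * np.pi) - np.pi
    return pt_rel, eta_rel, phi_rel

def compression_rate(latent_dim, n_particles=30, n_features=3):
    """Ratio of latent dimension to input dimension (30 x 3 = 90)."""
    return latent_dim / (n_particles * n_features)
```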

Figure 16.2 shows random samples of jets, represented as discrete images in the angular-coordinate plane, reconstructed by the models with similar levels of compression in comparison to the true jets. Figure 16.3 shows histograms of the reconstructed features compared to the true distributions. The differences between the two distributions are quantified in Table 16.2 by calculating the median and interquartile ranges (IQR) of the relative errors between the reconstructed and true features. To calculate the relative errors of particle features for the permutation invariant LGAE and GNNAE models, particles are matched between the input and output clouds using the Jonker–Volgenant algorithm [303, 426] based on the L2 distance between particle features. Due to the discretization of the inputs to the CNNAE, reconstructing individual particle features is not possible; instead, only jet features are shown.1
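The particle matching step can be sketched with SciPy, whose `linear_sum_assignment` implements a variant of the Jonker–Volgenant algorithm; this is an illustrative stand-in for the chapter's exact evaluation code.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def match_particles(true_particles, reco_particles):
    """Reorder reconstructed particles to match the true particles.

    Solves the linear assignment problem on the pairwise L2 distance
    between particle feature vectors, so that relative errors can be
    computed per matched pair.
    """
    cost = cdist(true_particles, reco_particles)  # pairwise L2 distances
    _, col_idx = linear_sum_assignment(cost)
    return reco_particles[col_idx]
```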

From Figure 16.2, we observe that, while neither of the two permutation-invariant models reconstructs the jet substructure perfectly, the LGAE-Min-Max outperforms the GNNAE-JL. Perhaps surprisingly, the permutation-symmetry-breaking mix aggregation scheme improves the LGAE in this regard. Both visually in Figure 16.3 and quantitatively from Tables 16.2 and 16.3, we conclude that the LGAE-Mix has the best performance overall, significantly outperforming the GNNAE and CNNAE models at similar compression rates. The LGAE-Min-Max model outperforms the GNNAE-JL in reconstructing all features, and the GNNAE-PL in all but the IQR of the particle angular coordinates.


Figure 16.3. Top: particle momentum (pTrel, ηrel, ϕrel) reconstruction by LGAE-Min-Max (τ(1/2,1/2) = 4, 56.67% compression), LGAE-Mix (τ(1/2,1/2) = 9, 61.67% compression), GNNAE-JL (dim(L) = 55, 61.11% compression), and GNNAE-PL (dim(L) = 2 × 30, 66.67% compression). The reconstructions by the CNNAE are not included due to the discrete values of ηrel and ϕrel, as discussed in the text. Bottom: jet feature (M, pT, η) reconstruction by the four models. For the jet feature reconstruction by the GNNAEs, the particle features in relative coordinates were transformed back to absolute coordinates before plotting. The jet ϕ is not shown because it follows a uniform distribution in (−π, π] and is reconstructed well.

Table 16.2. Median and IQR of relative errors in particle feature reconstruction of selected LGAE and GNNAE models. In each column, the best-performing latent space per model is italicized, and the best model overall is highlighted in bold.
 -------------------------------------------------------------------------------------------------------
                                                       Particle pTrel    Particle ηrel     Particle ϕrel
  Model    Aggregation      Latent space               Median    IQR     Median    IQR     Median    IQR
 -------------------------------------------------------------------------------------------------------
  LGAE     Min-Max          τ(1/2,1/2) = 4  (56.67%)    0.006   0.562     0.002    1.8      0.003    1.8
                            τ(1/2,1/2) = 7  (96.67%)    0.002   0.640    −0.627    1.7     <10⁻³     1.7
           Mix              τ(1/2,1/2) = 9  (61.67%)   <10⁻³    0.011    <10⁻³     0.452   <10⁻³     0.451
                            τ(1/2,1/2) = 13 (88.33%)   <10⁻³    0.001    <10⁻³     0.022   <10⁻³     0.022
  GNNAE    Jet-level        dim(L) = 45     (50.00%)   −0.983   3.8       0.363    3.1      0.146    2.1
                            dim(L) = 90    (100.00%)   −0.627   3.5       4.4     14.7      0.146    2.6
           Particle-level   dim(L) = 2×30   (66.67%)   −0.053   0.906     0.009    0.191    0.013    0.139
                            dim(L) = 3×30  (100.00%)   −0.040   0.892    −0.037    0.177    0.005    0.243
 -------------------------------------------------------------------------------------------------------

Table 16.3. Median and IQR of relative errors in jet feature reconstruction by selected LGAE and GNNAE models, along with the CNNAE model. In each column, the best-performing latent space per model is italicized, and the best model overall is highlighted in bold.
 ------------------------------------------------------------------------------------------------------------------------
                                                         Jet mass          Jet pT            Jet η             Jet ϕ
  Model    Aggregation      Latent space               Median    IQR     Median    IQR     Median    IQR     Median    IQR
 ------------------------------------------------------------------------------------------------------------------------
  LGAE     Min-Max          τ(1/2,1/2) = 4  (56.67%)    0.096   0.134     0.097   0.109   <10⁻³     0.004   <10⁻³     0.002
                            τ(1/2,1/2) = 7  (96.67%)   −0.139   0.287    −0.221   0.609   <10⁻³     0.021   <10⁻³     0.007
           Mix              τ(1/2,1/2) = 9  (61.67%)   <10⁻³    0.003    <10⁻³    <10⁻³   <10⁻³    <10⁻³    <10⁻³    <10⁻³
                            τ(1/2,1/2) = 13 (88.33%)   <10⁻³    0.003    <10⁻³    <10⁻³   <10⁻³    <10⁻³    <10⁻³    <10⁻³
  GNNAE    Jet-level        dim(L) = 45     (50.00%)    0.326   0.667     0.030   0.088    0.005    0.040    0.001    0.021
                            dim(L) = 90    (100.00%)    3.7     2.6       0.030   0.089    0.292    0.433    0.006    0.021
           Particle-level   dim(L) = 2×30   (66.67%)    0.277   0.299     0.037   0.110    0.002    0.010   −0.001    0.005
                            dim(L) = 3×30  (100.00%)    0.339   0.244     0.050   0.094   −0.001    0.011   <10⁻³     0.005
  CNNAE    Linear layer     dim(L) = 55     (61.67%)   −0.030   0.042    −0.021   0.017   <10⁻³     0.017   <10⁻³     0.003
 ------------------------------------------------------------------------------------------------------------------------

16.3.4 Anomaly detection


Figure 16.4. Anomaly detection ROC curves for the top quark signal (upper left), W boson signal (upper right), Z boson signal (lower left), and the combined signal (lower right) by the selected LGAE-Min-Max (τ(1/2,1/2) = 7), LGAE-Mix (τ(1/2,1/2) = 2), GNNAE-JL (dim(L) = 30), GNNAE-PL (dim(L) = 2 × 30), and CNNAE (dim(L) = 55) models.

We test the performance of all models as unsupervised anomaly detection algorithms by pre-training them solely on QCD and then using the reconstruction error for the QCD and new signal jets as the discriminating variable. We consider top quark, W boson, and Z boson jets as potential signals and QCD as the “background”. We test the Chamfer distance, the energy mover's distance [325] (the earth mover's distance applied to particle clouds), and the MSE between input and output jets as reconstruction errors, and find the Chamfer distance the most performant for all graph-based models. For the CNNAE, we use the MSE between the input and reconstructed image as the anomaly score.
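The Chamfer distance between an input and output cloud can be sketched as below; the normalization shown is one common convention, and the chapter's exact definition may differ.

```python
import numpy as np
from scipy.spatial.distance import cdist

def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance between two particle clouds.

    For each particle, take the squared L2 distance to its nearest
    neighbour in the other cloud; average both directions and sum.
    """
    d2 = cdist(cloud_a, cloud_b, metric="sqeuclidean")
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```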

Receiver operating characteristic (ROC) curves showing the signal efficiencies (𝜀s) versus background efficiencies (𝜀b) for individual and combined signals are shown in Figure 16.4,2 and 𝜀s values at particular background efficiencies are given in Table 16.4. We see that in general the permutation equivariant LGAE and GNNAE models outperform the CNNAE, strengthening the case for considering equivariance in neural networks. Furthermore, LGAE models have significantly higher signal efficiencies than GNNAEs and CNNAEs for all signals when rejecting > 90% of the background (which is the minimum level we typically require in HEP), and LGAE-Mix consistently performs better than LGAE-Min-Max.
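The signal efficiency at a fixed background efficiency, as quoted in Table 16.4, can be computed by thresholding the anomaly score at the appropriate background quantile. This is a minimal sketch assuming higher score means more anomalous.

```python
import numpy as np

def signal_efficiency_at(scores_bkg, scores_sig, eps_b):
    """Signal efficiency at a fixed background efficiency eps_b.

    The threshold is set so that a fraction eps_b of background events
    pass; the fraction of signal events above it is returned.
    """
    threshold = np.quantile(scores_bkg, 1.0 - eps_b)
    return float(np.mean(scores_sig > threshold))
```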

Table 16.4. Anomaly detection metrics for selected LGAE and GNNAE models, along with the CNNAE model. In each column, the best-performing latent space per model is italicized, and the best model overall is highlighted in bold.
 --------------------------------------------------------------------------------------------
                                                                𝜀s at given 𝜀b
  Model    Aggregation      Latent space               AUC      𝜀s(10⁻¹)   𝜀s(10⁻²)   𝜀s(10⁻³)
 --------------------------------------------------------------------------------------------
  LGAE     Min-Max          τ(1/2,1/2) = 2  (30.00%)   0.7253    0.5706     0.1130     0.0011
                            τ(1/2,1/2) = 4  (56.67%)   0.7627    0.5832     0.1305     0.0007
                            τ(1/2,1/2) = 7  (96.67%)   0.7673    0.5932     0.0820     0.0009
           Mix              τ(1/2,1/2) = 2  (15.00%)   0.8023    0.6178     0.1662     0.0250
                            τ(1/2,1/2) = 4  (28.33%)   0.8023    0.6257     0.1592     0.0229
                            τ(1/2,1/2) = 7  (48.33%)   0.7967    0.6290     0.1562     0.0225
  GNNAE    Jet-level        dim(L) = 10     (11.11%)   0.5891    0.1576     0.0161     0.0014
                            dim(L) = 40     (44.44%)   0.6636    0.2293     0.0262     0.0013
                            dim(L) = 80     (88.89%)   0.7006    0.2240     0.0239     0.0010
           Particle-level   dim(L) = 2×30   (66.67%)   0.8195    0.4435     0.0564     0.0042
                            dim(L) = 3×30  (100.00%)   0.8095    0.4306     0.0762     0.0044
  CNNAE    Linear layer     dim(L) = 55     (61.67%)   0.7700    0.2473     0.0469     0.0053
 --------------------------------------------------------------------------------------------

16.3.5 Latent space interpretation


Figure 16.5. The correlations between the total momentum of the imaginary components in the τ(1/2,1/2) = 2 LGAE-Mix model and the target jet momenta. The Pearson correlation coefficient r is listed above.


Figure 16.6. Top: distributions of the invariant mass squared of the latent 4-vectors and jet momenta of the LGAE-Mix with τ(1/2,1/2) = 2 latent 4-vectors. Bottom: distributions of the invariant mass squared of two latent 4-vectors and jet momenta of the LGAE-Min-Max with τ(1/2,1/2) = 2 latent 4-vectors.

The outputs of the LGAE encoder are irreducible representations of the Lorentz group; they consist of a pre-specified number of Lorentz scalars, vectors, and potentially higher-order representations. This implies a significantly more interpretable latent representation of the jets than traditional autoencoders, as the information distributed across the latent space is now disentangled between the different irreps of the Lorentz group. For example, scalar quantities like the jet mass will necessarily be encoded in the scalars of the latent space, and jet and particle 4-momenta in the vectors.

We demonstrate the latter empirically on the LGAE-Mix model (τ(1/2,1/2) = 2) by looking at correlations between jet 4-momenta and the components of different combinations of latent vector components. Figure 16.5 shows that, in fact, the jet momentum is encoded in the imaginary component of the sum of the latent vectors.
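The correlation check can be reproduced along the following lines. The latent layout (complex latent 4-vectors of shape (n_jets, k, 4)) and the function name are assumptions for illustration; the observable is, as in the chapter, the imaginary part of the summed latent vectors.

```python
import numpy as np

def latent_jet_correlation(latent_vectors, jet_p4):
    """Pearson correlation between a latent observable and jet momenta.

    latent_vectors: complex array of shape (n_jets, k, 4) of latent
    4-vectors; jet_p4: (n_jets, 4) jet 4-momenta. The latent vectors
    are summed over k, and the imaginary part of each component is
    correlated with the corresponding jet component.
    """
    summed = latent_vectors.sum(axis=1).imag  # (n_jets, 4)
    return np.array([
        np.corrcoef(summed[:, mu], jet_p4[:, mu])[0, 1]
        for mu in range(4)
    ])
```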

We can also attempt to understand the anomaly detection performance by looking at the encodings of the training data compared to the anomalous signal. Figure 16.6 shows the individual and total invariant mass of the latent vectors of sample LGAE models for QCD and top quark, W boson, and Z boson inputs. We observe that despite the overall similar kinematic properties of the different jet classes, the distributions for the QCD background are significantly different from the signals, indicating that the LGAE learns and encodes the difference in jet substructure — despite substructure observables such as jet mass not being direct inputs to the network — explaining the high performance in anomaly detection.
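The invariant mass squared of the latent 4-vectors shown in Figure 16.6 is simply their Minkowski norm, computed with the (+, −, −, −) metric:

```python
import numpy as np

def invariant_mass_squared(p4):
    """Minkowski norm m^2 = E^2 - |p|^2 of 4-vectors of shape (..., 4)."""
    return p4[..., 0] ** 2 - np.sum(p4[..., 1:] ** 2, axis=-1)
```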

Finally, while in this section we showcased simple “brute-force” techniques for interpretability by looking directly at the distributions and correlations of latent features, we hypothesize that such an equivariant latent space would also lend itself effectively to the vast array of existing explainable AI algorithms [427428], which generically evaluate the contribution of different input and intermediate neuron features to network outputs. We leave a detailed study of this to future work.

16.3.6 Data efficiency


Figure 16.7. Median magnitude of relative errors of jet mass reconstruction by LGAE and CNNAE models trained on different fractions of the training data.

In principle, equivariant neural networks should require less training data for high performance, since critical biases of the data, which would otherwise have to be learned by non-equivariant networks, are already built in. We test this claim by measuring the performances of the best-performing LGAE and CNNAE architectures from Section 16.3.3 trained on varying fractions of the training data.

The median magnitude of the relative errors between the reconstructed and true jet masses of the different models and fractions is shown in Figure 16.7. Each model is trained five times per training fraction, with different random seeds, and evaluated on the same-sized validation dataset; the median of the five models is plotted. We observe that, in agreement with our hypothesis, both LGAE models maintain their high performance all the way down to training on 1% of the data, while the CNNAE's performance degrades steadily down to a 2% training fraction and then drops sharply.
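The data-efficiency protocol (several seeds per fraction, median reported) can be sketched as a scan skeleton. `train_and_eval` is a hypothetical stand-in for training a model on a fraction of the data and returning its validation error; the toy function in the test merely illustrates the bookkeeping.

```python
import numpy as np

def median_over_seeds(train_and_eval, fractions, n_seeds=5):
    """Data-efficiency scan: per fraction, train with several seeds.

    `train_and_eval(fraction, seed)` returns a scalar error (e.g. the
    median |relative error| of the jet mass on a fixed validation set);
    the median over seeds is reported per fraction, as in Figure 16.7.
    """
    return {
        f: float(np.median([train_and_eval(f, s) for s in range(n_seeds)]))
        for f in fractions
    }
```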

1These are calculated by summing each pixel’s momentum “4-vector” — using the center of the pixel as angular coordinates and intensity as the pTrel.
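The pixel summation described in footnote 1 can be sketched as follows; the image half-width `extent` is an illustrative assumption, and each pixel is treated as a massless particle at the pixel centre.

```python
import numpy as np

def jet_features_from_image(image, extent=0.4):
    """Approximate jet mass and pT from a jet image.

    Each pixel becomes a massless "particle" at the pixel centre with
    pT_rel equal to the pixel intensity; the pixel 4-vectors are summed
    to form the jet 4-vector.
    """
    n = image.shape[0]
    centers = np.linspace(-extent, extent, n, endpoint=False) + extent / n
    eta, phi = np.meshgrid(centers, centers, indexing="ij")
    pt = image
    px = (pt * np.cos(phi)).sum()
    py = (pt * np.sin(phi)).sum()
    pz = (pt * np.sinh(eta)).sum()
    e = (pt * np.cosh(eta)).sum()
    jet_mass = np.sqrt(max(e**2 - px**2 - py**2 - pz**2, 0.0))
    jet_pt = np.hypot(px, py)
    return jet_mass, jet_pt
```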

2Discontinuities in the top quark and combined signal LGAE-Min-Max ROCs indicate that at background efficiencies of 5 × 10⁻³, there are no signal events remaining in the validation dataset.