
16.1 Introduction

In this chapter, we present the first Lorentz-group-equivariant autoencoder (LGAE) for jets. As described in Chapter 7.3, autoencoders are networks that learn to encode input data into, and decode data from, a low-dimensional latent space, and thus have interesting applications in data compression [414, 415] and anomaly detection [67248250262416419]. Both tasks are particularly relevant for HEP: the former to cope with the storage and processing of the ever-increasing volume of data collected at the LHC, and the latter for model-agnostic searches for new physics.

Incorporating Lorentz equivariance into an autoencoder has the potential not only to improve performance on both tasks, but also to provide a more interpretable latent space and reduce training data requirements. As discussed in Chapter 7.2, Lorentz symmetry has recently been exploited successfully in HEP for jet classification [54420422], with competitive and even state-of-the-art (SOTA) results. In the same spirit, we aim to extend these developments to an autoencoder and explore its performance and interpretability. To do so, we employ the Fourier space approach discussed above, which uses the set of irreducible representations (irreps) of the Lorentz group as the basis for constructing equivariant maps. We also train alternative architectures, including GNNs and convolutional neural networks (CNNs), with different inherent symmetries, and find that the LGAE outperforms them on reconstruction and anomaly detection tasks.
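To make the notion of Lorentz equivariance concrete, the following minimal NumPy sketch checks it numerically for a deliberately simple map: summing the constituent four-momenta of a jet. It is an illustration only and not part of the LGAE architecture; the helper names (`boost_z`, `minkowski_dot`, `jet_four_momentum`), the metric signature (+, -, -, -), and the toy massless particles are all assumptions made for this example.

```python
import numpy as np

# Assumed convention: four-vectors are (E, px, py, pz), metric signature (+, -, -, -).
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def boost_z(beta):
    """Lorentz boost matrix along the z-axis for velocity beta (units of c)."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    boost = np.eye(4)
    boost[0, 0] = boost[3, 3] = gamma
    boost[0, 3] = boost[3, 0] = -gamma * beta
    return boost


def minkowski_dot(p, q):
    """Minkowski inner product of four-vectors stored along the last axis."""
    return np.einsum("...i,ij,...j->...", p, METRIC, q)


def jet_four_momentum(particles):
    """Toy equivariant map: sum of the constituent four-momenta."""
    return particles.sum(axis=0)


# Toy jet of five massless particles (hypothetical data, E = |p|).
rng = np.random.default_rng(0)
three_momenta = rng.normal(size=(5, 3))
energies = np.linalg.norm(three_momenta, axis=1, keepdims=True)
particles = np.concatenate([energies, three_momenta], axis=1)

boost = boost_z(0.6)
boosted_particles = particles @ boost.T  # apply the boost to every particle

jet = jet_four_momentum(particles)
boosted_jet = jet_four_momentum(boosted_particles)

# Equivariance: transforming the inputs transforms the output in the same way.
is_equivariant = np.allclose(boosted_jet, boost @ jet)

# Invariance: the squared jet mass is unchanged by the boost.
is_invariant = np.isclose(minkowski_dot(jet, jet), minkowski_dot(boosted_jet, boosted_jet))

print(is_equivariant, is_invariant)
```

An equivariant network is built so that the first property holds for every layer by construction, rather than being learned approximately from data.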

The principal results of this work demonstrate (1) that the advantage of incorporating Lorentz equivariance extends beyond whole-jet classification to applications with particle-level outputs and (2) the interpretability of Lorentz-equivariant models. The key challenges overcome in this work include: (1) training an equivariant autoencoder via particle-to-particle and permutation-invariant set-to-set losses (Section 16.3); (2) defining a jet-level compression scheme for the latent space (Section 16.2); and (3) optimizing the architecture for different tasks, such as reconstruction (Section 16.3.3) and anomaly detection (Section 16.3.4).
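The distinction between particle-to-particle and permutation-invariant set-to-set losses can be illustrated with a short sketch. The Chamfer distance is used here as one common example of a set-to-set loss; this choice, the function names, and the plain Euclidean distance on feature components are assumptions made for illustration, and the losses actually used are specified in Section 16.3.

```python
import numpy as np


def particle_to_particle_mse(inputs, outputs):
    """Mean squared error between particles matched by index.

    Sensitive to the ordering of the particles in each jet.
    """
    return np.mean(np.sum((inputs - outputs) ** 2, axis=-1))


def chamfer_loss(inputs, outputs):
    """Chamfer distance between two particle sets (illustrative choice).

    Each particle is matched to its nearest neighbour in the other set,
    so the result does not depend on the ordering of either set.
    """
    # Pairwise squared Euclidean distances, shape (n_inputs, n_outputs).
    diffs = inputs[:, None, :] - outputs[None, :, :]
    sq_dists = np.sum(diffs**2, axis=-1)
    return sq_dists.min(axis=1).mean() + sq_dists.min(axis=0).mean()


# Hypothetical jet with four particles and four features each.
rng = np.random.default_rng(1)
jet = rng.normal(size=(4, 4))
reordered_jet = jet[::-1]  # same particles, reversed order

mse_reordered = particle_to_particle_mse(jet, reordered_jet)
chamfer_reordered = chamfer_loss(jet, reordered_jet)

print(mse_reordered > 0.0, chamfer_reordered == 0.0)
```

A perfect reconstruction that merely reorders the particles is penalized by the index-matched loss but not by the set-to-set loss, which is why the latter suits unordered particle clouds.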

This chapter is structured as follows. We present the LGAE architecture in Section 16.2 and discuss experimental results on the reconstruction and anomaly detection of high-energy jets in Section 16.3. We also demonstrate the interpretability of the model by analyzing its latent space, as well as its data efficiency relative to baseline models. Finally, we conclude in Section 16.4.