New #Paper: '#FeatureSelection with Distance Correlation' (https://arxiv.org/abs/2212.00046) - a short #PaperSummary thread
We investigate how to automatically find a small number of features that - when put into a simple #NeuralNetwork - yield good performance (e.g. for classification)
Two possible uses:
- Explain the behavior of a #BlackBox classifier
- Build a light-weight classifier from scratch
arXiv.org
Feature Selection with Distance Correlation
Choosing which properties of the data to use as input to multivariate
decision algorithms -- a.k.a. feature selection -- is an important step in
solving any problem with machine learning. While there is a clear trend towards
training sophisticated deep networks on large numbers of relatively unprocessed
inputs (so-called automated feature engineering), for many tasks in physics,
sets of theoretically well-motivated and well-understood features already
exist. Working with such features can bring many benefits, including greater
interpretability, reduced training and run time, and enhanced stability and
robustness. We develop a new feature selection method based on Distance
Correlation (DisCo), and demonstrate its effectiveness on the tasks of boosted
top- and $W$-tagging. Using our method to select features from a set of over
7,000 energy flow polynomials, we show that we can match the performance of
much deeper architectures, by using only ten features and two
orders-of-magnitude fewer model parameters.
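
For readers who want a feel for the idea, here is a minimal sketch of the sample distance correlation and a plain greedy forward selection built on it. This is an illustration under stated assumptions, not necessarily the procedure used in the paper: the function names `distance_correlation` and `greedy_select` are hypothetical, and the selection rule (add whichever feature most increases the distance correlation between the selected set and the target) is a simple stand-in.

```python
import numpy as np

def _centered_distances(x):
    """Pairwise Euclidean distance matrix, double-centered."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D array as n samples of one feature
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())

def distance_correlation(x, y):
    """Sample distance correlation between x and y (values in [0, 1])."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0:
        return 0.0  # one of the inputs is constant
    return float(np.sqrt(max(dcov2, 0.0) / denom))

def greedy_select(X, y, k):
    """Hypothetical forward selection: grow the feature set one column
    at a time, keeping the column that maximizes the distance
    correlation between the selected columns and y."""
    X = np.asarray(X, dtype=float)
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(min(k, X.shape[1])):
        scores = [distance_correlation(X[:, selected + [j]], y)
                  for j in remaining]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `greedy_select(X, y, 10)` on an `(n_samples, n_features)` array returns ten column indices. Note that the pairwise distance matrices need O(n²) memory, so in practice this would be evaluated on a subsample or in batches.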