Nonuniform random feature models using derivative information

Dr. Konstantin Pieper

Abstract:

Neural networks are often treated as a "black box," and their interpretation and construction are based largely on empirical evidence. However, scientific applications need accurate and stable approximation of functions (and their derivatives), which requires interpretability. Moreover, mathematical analysis can enable the design of more reliable and efficient training methods. In this talk, we consider both fully trained, arbitrarily wide shallow neural networks with sparse regularization and randomly initialized, partially trained networks. Although the latter class of methods is easy to set up, train, and interpret through the lens of random feature models, theoretical and practical evidence shows a large gap in quality between fully and partially trained networks, owing to the lack of adaptivity of the inner weights.
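
As an illustration of the partially trained setting, the following is a minimal sketch (not code from the talk; the function name, ReLU activation, and ridge regularization are assumptions chosen for illustration) of a random feature model in which the inner weights remain at their random initialization and only the outer coefficients are fit by regularized least squares:

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def fit_random_feature_model(X, y, n_features=200, reg=1e-6, rng=None):
        """Shallow ReLU network with frozen random inner weights.

        Only the outer (linear) coefficients are trained, by ridge-regularized
        least squares; inner weights and biases stay at their random
        initialization, as in a standard random feature model.
        (Illustrative sketch, not the speaker's implementation.)
        """
        rng = np.random.default_rng(rng)
        d = X.shape[1]
        W = rng.standard_normal((d, n_features))      # random inner weights
        b = rng.uniform(-1.0, 1.0, size=n_features)   # random biases
        Phi = relu(X @ W + b)                         # feature matrix
        # Ridge-regularized least squares for the outer coefficients
        c = np.linalg.solve(Phi.T @ Phi + reg * np.eye(n_features), Phi.T @ y)
        return lambda Xnew: relu(Xnew @ W + b) @ c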

To narrow this gap, we propose new initialization strategies that improve the associated random feature models and reduce the need for nonlinear training. The nonuniform, data-driven parameter distributions are based on derivative data of the function to be approximated. We address the cases of Heaviside and ReLU activation functions and their smooth approximations (sigmoid and softplus). We extend recent analytic results that give exact representations and obtain densities that concentrate in regions of the parameter space corresponding to neurons that are well suited to model the local derivatives of the unknown function. Based on these results, we suggest simplifications of these exact densities that use approximate derivative data at the input points, allow for efficient sampling, and lead to random feature models with approximation quality close to that of optimal sparse networks in several scenarios.
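
To make the idea concrete, here is a simplified sketch of derivative-informed sampling for ReLU features. It is an assumption-laden illustration, not the exact densities from the talk: input points are sampled with probability proportional to the local gradient norm, each neuron's weight vector is aligned with the gradient at the sampled point, and the bias places the ReLU kink at that point; the names grad_f and derivative_informed_features are hypothetical.

    import numpy as np

    def derivative_informed_features(X, grad_f, n_features=200, rng=None):
        """Sample ReLU feature parameters from approximate derivative data.

        Simplified, illustrative density: neurons concentrate where the
        target function varies most, aligned with its local gradient.
        """
        rng = np.random.default_rng(rng)
        n, d = X.shape
        G = np.asarray([grad_f(x) for x in X])         # approximate gradients at the data
        norms = np.linalg.norm(G, axis=1) + 1e-12
        p = norms / norms.sum()                        # data-driven sampling density
        idx = rng.choice(n, size=n_features, p=p)
        W = (G[idx] / norms[idx, None]).T              # unit directions, shape (d, n_features)
        b = -np.einsum("ij,ji->i", X[idx], W)          # place the ReLU kink at the sampled point
        return W, b

The resulting (W, b) can be plugged into the least-squares fit above in place of the uniform random initialization.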

As necessary background, we briefly give an introduction to:

- interpretation through random feature models and reproducing kernel Hilbert spaces,

- sparse infinite feature regression and variation Banach spaces,

- connections of neural networks to classical spline-based approximation,

- conditions under which conventional random initialization and gradient-based training can be interpreted in terms of these frameworks.

Speaker’s Bio:

Konstantin Pieper is a staff mathematician in the Data Analytics and Machine Learning group, which he joined in June 2019. He obtained his Ph.D. from the Technical University of Munich in 2015 and, prior to joining Oak Ridge National Laboratory, worked as a postdoctoral researcher at Florida State University. His research interests include structure-preserving discretization methods, sparse optimization, inverse problems, sensor placement, surrogate models, and scientific machine learning.

September 12
9:00am - 10:00am