Understanding and Designing Deep Neural Networks Through Theory-Guided Training

Karthika Nasir; Aradhana Reva; Jai Sekhar

doi:10.31224/4790

Authors

Karthika Nasir, Bharti Centre for Communication, Indian Institute of Technology Bombay, India, https://orcid.org/0009-0000-6215-2980
Aradhana Reva, Bharti Centre for Communication, Indian Institute of Technology Bombay, India
Jai Sekhar, Bharti Centre for Communication, Indian Institute of Technology Bombay, India

DOI:

https://doi.org/10.31224/4790

Keywords:

Deep Neural Networks, Theory-Guided Training, Optimization Theory, Neural Tangent Kernel, Implicit Bias, Information Theory, Generalization, Robustness, Interpretability

Abstract

Deep Neural Networks (DNNs) have revolutionized numerous fields with their remarkable empirical success, yet their theoretical understanding remains incomplete. The emerging paradigm of theory-trained deep neural networks aims to bridge this gap by integrating rigorous theoretical principles into the design and training of deep models. This survey provides a comprehensive overview of the foundational theories, training methodologies, and practical applications that define this vibrant area of research. We categorize existing approaches based on the theoretical frameworks they leverage, including optimization landscapes, neural tangent kernels, implicit bias, and information-theoretic principles. Furthermore, we discuss empirical successes across diverse domains such as computer vision, natural language processing, scientific computing, and reinforcement learning. Finally, we outline key challenges and promising future directions, emphasizing the need for scalable, interpretable, and robust theory-guided learning algorithms. This survey serves as a resource for researchers and practitioners interested in the intersection of deep learning theory and practice.
