I emphasize mathematical and conceptual foundations because implementations of ideas (e.g. Torch, TensorFlow)
will keep evolving, but the underlying theory must be sound. Anybody with an interest in deep learning
can and should try to understand why things work.
I include neuroscience as a useful conceptual foundation for two reasons. First, it may provide inspiration
for future models and algorithms. Second, the success of deep learning can contribute useful hypotheses
and models to computational neuroscience.
Information theory is also a very useful foundation, as there's a strong connection between data compression
and statistical prediction: a model that assigns higher probability to the data lets you encode it in fewer bits.
In fact, both data compressors and machine learning models can be seen as approximating Kolmogorov complexity,
the length of the shortest program that reproduces the data, which is the (uncomputable) limit of compression.
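To make the compression–prediction link concrete, here is a minimal sketch (the character-level models and the toy string are my own illustrative choices, not taken from any particular paper): under an ideal entropy coder, encoding a symbol x costs about -log2 p(x) bits, so the model that predicts the data better also compresses it better.

```python
import math
from collections import Counter

def code_length_bits(text, probs):
    """Total bits an ideal entropy coder needs for `text` under the
    per-character distribution `probs` (-log2 p(ch) bits per symbol)."""
    return sum(-math.log2(probs[ch]) for ch in text)

text = "abracadabra abracadabra abracadabra"

# Model 1: uniform over the characters that occur (predicts nothing).
alphabet = set(text)
uniform = {ch: 1.0 / len(alphabet) for ch in alphabet}

# Model 2: empirical character frequencies (a better statistical predictor).
counts = Counter(text)
empirical = {ch: n / len(text) for ch, n in counts.items()}

print(f"uniform model:   {code_length_bits(text, uniform):.1f} bits")
print(f"frequency model: {code_length_bits(text, empirical):.1f} bits")
# The better predictive model yields the shorter code: prediction is compression.
```

Kolmogorov complexity pushes this idea to its limit: the shortest description over all computable models rather than one model family, which is also why it cannot be computed exactly and can only be approximated.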
You might notice that I haven't emphasized the latest benchmark-beating paper. My reason is that a good
theory ought to scale: it should be capable of explaining why deep models generalise, and we should be able to
bootstrap those explanations to more complex models built by composing simpler ones (e.g. recurrent networks,
which apply a deep model repeatedly along a sequence; see the sketch below). This is how all good science is done.
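As a small illustration of "composing a deep model along a sequence" (the cell, sizes, and random data below are arbitrary choices of mine, not a specific published architecture), a vanilla recurrent network simply reuses the same learned transition function at every time step:

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 3, 5, 7

# One set of parameters, reused at every step: the "same model composed
# along the sequence" view of an RNN.
W_x = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_h = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)

def cell(h, x):
    """Vanilla RNN transition: new hidden state from old state and input."""
    return np.tanh(W_h @ h + W_x @ x + b)

xs = rng.normal(size=(seq_len, input_dim))  # a toy input sequence
h = np.zeros(hidden_dim)
for x in xs:                                # unrolling = repeated composition
    h = cell(h, x)

print(h)  # final hidden state summarising the whole sequence
```

An explanation of why the single cell generalises should, in principle, carry over to this unrolled composition, which is the sense in which I want a theory to scale.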