Alice's Adventures in a Differentiable Wonderland -- Volume I, A Tour of the Land

Abstract

This book is a self-contained introduction to the design of modern (deep) neural networks. Because the term β€œneural” comes with a lot of historical baggage, I prefer the simpler term β€œdifferentiable models” in the text. The focus of this 250-page volume is on building efficient blocks for processing nD data, including convolutions, transformers, graph layers, and modern recurrent models (including linearized transformers and structured state-space models). Because the field is evolving quickly, I have tried to strike a good balance between theory and code, historical considerations and recent trends. I assume the reader has some exposure to machine learning and linear algebra, but I try to cover the preliminaries when necessary. The volume is a refined draft from a set of lecture notes for a course called Neural Networks for Data Science Applications that I teach at Sapienza. I do not cover many advanced topics (generative modeling, explainability, prompting, agents), which will be published over time on the companion website.

arXiv Link
Book Link