by Sebastian Raschka & Vahid Mirjalili
I primarily wanted to read this book for the PyTorch section and pretty much flipped through the scikit-learn section while absorbing and practicing with the PyTorch material. So my review is largely based on Chapter 13 and beyond. Apart from the official PyTorch documentation, there are not many comprehensive sources that can serve as a good reference with practical examples for PyTorch, in my view. This book aims to do that and pretty much hits the mark.
Chapters 1-11 are a good refresher on scikit-learn and set up all the foundational knowledge you need for the more advanced concepts in PyTorch. I do wish, though, that the authors had used different examples instead of resorting to ubiquitous ones like MNIST (as in chapters 11 and 13, among others) for explaining neural network concepts. While these are good for understanding the fundamentals, I find the really good books usually veer away from the standard examples found online and get creative.
Chapter 12 provides an excellent foundation in PyTorch and a primer for building a neural network model in PyTorch. The code examples are precise, with the data sources clearly defined, so I could follow along without any issues. I did not need a GPU or Colab to run the examples. It was good to see the section on writing custom loss functions, as that is a useful practical skill to have.
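To give a flavor of what that involves, a custom loss in PyTorch is usually just an `nn.Module` subclass with a `forward` method. Here is a minimal sketch of my own (the threshold idea is a hypothetical choice, not the book's example):

```python
import torch
import torch.nn as nn

# Minimal sketch of a custom loss: a mean squared error that ignores
# small residuals below a threshold (hypothetical example, not from the book).
class ThresholdedMSELoss(nn.Module):
    def __init__(self, threshold=0.1):
        super().__init__()
        self.threshold = threshold

    def forward(self, prediction, target):
        residual = prediction - target
        # Zero out residuals below the threshold, penalize the rest quadratically.
        masked = torch.where(residual.abs() < self.threshold,
                             torch.zeros_like(residual), residual)
        return (masked ** 2).mean()

loss_fn = ThresholdedMSELoss(threshold=0.05)
prediction = torch.randn(8, 1, requires_grad=True)
loss = loss_fn(prediction, torch.randn(8, 1))
loss.backward()  # behaves like any built-in loss
```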
Ch-14, which has us training a smile classifier to explain convolutional neural networks, is a useful example, especially for tricks like data augmentation that can be applied to other use cases.
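Those augmentation tricks translate directly to other image tasks. A typical torchvision pipeline (my own sketch, not the chapter's exact settings) looks like this, with the random transforms applied only at training time:

```python
from torchvision import transforms

# Training-time augmentation for face images (illustrative values).
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomResizedCrop(size=64, scale=(0.8, 1.0)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
])

# Deterministic preprocessing for validation/testing.
eval_transform = transforms.Compose([
    transforms.Resize(64),
    transforms.CenterCrop(64),
    transforms.ToTensor(),
])
```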
I skipped through the chapter on RNNs since transformers are all the rage now (Ch-16) and PyTorch already has structures like LSTMs implemented in its wrapper functions. Still, there is a lot of dense and useful material explaining the core concepts behind RNNs, plus some interesting text-generation models that use PyTorch's Categorical class to draw random samples.
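The sampling trick is simple once you see it: the model's logits over the vocabulary are wrapped in `torch.distributions.Categorical` and the next character is drawn at random rather than greedily. A quick sketch (the book's loop and temperature handling may differ):

```python
import torch
from torch.distributions import Categorical

logits = torch.randn(1, 80)          # placeholder logits over an 80-character vocabulary
temperature = 0.7                    # lower temperature -> less random text
dist = Categorical(logits=logits / temperature)
next_char_index = dist.sample()      # random draw instead of argmax
```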
The chapter on transformers is a must-read and will clear up a lot of foundational concepts. Another thing to mention is that the book has well-rendered color figures that make some of the dense material easier to understand. The contrast between the transformer approach and RNNs, built around concepts like the attention mechanism, is clearly explained. More interestingly, the book delves into building larger language models with unlabeled data, such as BERT and BART. I plan to re-read this chapter to strengthen my understanding of transformers and the modern libraries, such as Hugging Face, that they power.
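For anyone skimming, the core building block the chapter works up from is scaled dot-product attention, which is only a few lines of tensor code. This is my simplification rather than the book's full multi-head implementation:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_model); attention weights sum to 1 over the keys.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

q = k = v = torch.randn(2, 5, 16)                 # self-attention: same tensor for q, k, v
context = scaled_dot_product_attention(q, k, v)   # shape (2, 5, 16)
```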
The chapter on GANs was laborious, with more MNIST examples, and could have included far more practical examples.
Ch-18 on graph neural networks is a standout section of the book and provides useful code examples for building PyTorch graphs to treat as datasets, defining edges and nodes. For example, libraries like TorchDrug are mentioned, which use the PyTorch Geometric framework for drug discovery. Spectral graph convolution layers, graph pooling layers, and normalization layers for graphs are all explained, and I found this chapter to be a comprehensive summary that would save one hours of searching online for the fundamental concepts. GNNs definitely have a ton of interesting applications, and links to a lot of recent papers are provided.
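To give a flavor of the edges-and-nodes idea, a graph in PyTorch Geometric is just a tensor of node features plus an edge index. A tiny sketch, assuming torch_geometric is installed (this is not the chapter's dataset):

```python
import torch
from torch_geometric.data import Data

# A small triangle graph: 3 nodes with 2 features each, and directed edges
# listed as (source, target) pairs in a 2 x num_edges tensor.
x = torch.tensor([[0.0, 1.0],
                  [1.0, 0.0],
                  [0.5, 0.5]])
edge_index = torch.tensor([[0, 1, 2],
                           [1, 2, 0]], dtype=torch.long)

graph = Data(x=x, edge_index=edge_index)
print(graph.num_nodes, graph.num_edges)   # 3 3
```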
Ch-19 on reinforcement learning adds another dimension to a book that is largely focused on supervised and unsupervised learning in the prior chapters. The progression from dynamic programming to Monte Carlo to temporal-difference methods is clearly articulated. The standard OpenAI Gym examples are used to implement grid worlds that specify actions and rewards. I thought this chapter was great at explaining the theoretical concepts, but the examples were all the standard Q-learning fare you would find online. I would have loved to see a more realistic example, or pointers for applying the ideas to your own use cases.
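For reference, that standard fare boils down to a tabular Q-learning loop on a Gym grid world. A sketch of the general pattern (assuming gym >= 0.26, where reset and step return the extra info/terminated values; the book's code may use an older API):

```python
import numpy as np
import gym

env = gym.make("FrozenLake-v1", is_slippery=False)
q_table = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1   # learning rate, discount, exploration

for episode in range(2000):
    state, _ = env.reset()
    done = False
    while not done:
        # Epsilon-greedy action selection.
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_table[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # Temporal-difference update toward the bootstrapped target.
        target = reward + gamma * np.max(q_table[next_state])
        q_table[state, action] += alpha * (target - q_table[state, action])
        state = next_state
```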
All in all, I enjoyed the PyTorch examples and clearly explained concepts in this book, and it would be a good PyTorch reference to add to your library.