Keras Callbacks

Keras callbacks are incredibly useful when training neural networks, especially when running large models on Azure (these days), which I know will cost me time, compute, and money.

I pretty much used to painfully run models across all epochs before I discovered this gem. The official documentation describes it best, but it is essentially a mechanism that lets your model know when to stop training once a self-set threshold (such as a target accuracy) is reached. The callbacks I usually end up using are ModelCheckpoint and EarlyStopping.

ModelCheckpoint is used with model.fit() to save weights at a defined interval so the model can later be loaded in the state it was saved. A particularly useful pattern is saving only at stages when a monitored value improves, such as accuracy (acc), validation accuracy (val_acc), training loss (loss), or validation loss (val_loss).

The advantage of using ModelCheckpoint over calling save_weights() or save() manually is that it runs automatically during training and can save either the whole model or just the weights, depending on how it is configured.

Detailed parameters here in the source code: https://github.com/keras-team/keras/blob/master/keras/callbacks/callbacks.py#L632
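
The "save only on improvement" idea can be sketched in a few lines of plain Python. This is an illustrative sketch of the decision logic, not the Keras source; the function name and signature are my own:

```python
# Sketch of ModelCheckpoint's save-best-only decision: write a checkpoint
# whenever the monitored value (here val_loss, mode 'min') improves on the
# best value seen so far.

def checkpoint_epochs(val_losses):
    """Return the 0-based epochs at which a checkpoint would be written."""
    best = float("inf")
    saved = []
    for epoch, current in enumerate(val_losses):
        if current < best:      # improvement -> write checkpoint
            best = current
            saved.append(epoch)
    return saved

print(checkpoint_epochs([0.9, 0.7, 0.75, 0.6, 0.6]))  # -> [0, 1, 3]
```

Epoch 2 (0.75) and epoch 4 (0.6, no improvement on 0.6) trigger no save, which is exactly the disk-space and clutter win of save_best_only.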

When we call fit() on the model for training, Keras calls the following functions:

  • on_train_begin and on_train_end are called at the beginning and end of training, respectively; likewise, on_test_begin and on_test_end at the beginning and end of evaluation.
  • on_predict_begin and on_predict_end are called at the beginning and end of prediction.

In addition, the BaseLogger class accumulates the epoch averages of metrics via on_epoch_begin, on_epoch_end, on_batch_begin, and on_batch_end. This gives us flexibility: for example, EarlyStopping runs at the end of every epoch and compares the current monitored value with the best value seen until then.
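
The order in which those hooks fire can be sketched with a minimal dispatch loop. This is a plain-Python sketch of the idea only; the Recorder and fit names here are hypothetical, not Keras's implementation:

```python
# Minimal sketch of how a training loop dispatches callback hooks,
# in the order Keras invokes them.

class Callback:
    def on_train_begin(self, logs=None): pass
    def on_epoch_begin(self, epoch, logs=None): pass
    def on_batch_begin(self, batch, logs=None): pass
    def on_batch_end(self, batch, logs=None): pass
    def on_epoch_end(self, epoch, logs=None): pass
    def on_train_end(self, logs=None): pass

class Recorder(Callback):
    """Records the order in which hooks fire."""
    def __init__(self):
        self.events = []
    def on_train_begin(self, logs=None): self.events.append("train_begin")
    def on_epoch_begin(self, epoch, logs=None): self.events.append(f"epoch_begin:{epoch}")
    def on_batch_end(self, batch, logs=None): self.events.append(f"batch_end:{batch}")
    def on_epoch_end(self, epoch, logs=None): self.events.append(f"epoch_end:{epoch}")
    def on_train_end(self, logs=None): self.events.append("train_end")

def fit(epochs, batches, callbacks):
    for cb in callbacks: cb.on_train_begin()
    for epoch in range(epochs):
        for cb in callbacks: cb.on_epoch_begin(epoch)
        for batch in range(batches):
            for cb in callbacks: cb.on_batch_begin(batch)
            # ... forward/backward pass would happen here ...
            for cb in callbacks: cb.on_batch_end(batch)
        for cb in callbacks: cb.on_epoch_end(epoch)
    for cb in callbacks: cb.on_train_end()

rec = Recorder()
fit(epochs=2, batches=1, callbacks=[rec])
print(rec.events)
# -> ['train_begin', 'epoch_begin:0', 'batch_end:0', 'epoch_end:0',
#     'epoch_begin:1', 'batch_end:0', 'epoch_end:1', 'train_end']
```

Because on_epoch_end fires once per epoch with the epoch's logs, it is the natural place for EarlyStopping and ModelCheckpoint to make their decisions.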

Detailed parameters here in the source code: https://github.com/keras-team/keras/blob/master/keras/callbacks/callbacks.py#L732

You can use the EarlyStopping callback to stop training when val_acc stops increasing; otherwise the model will overfit the data. You can also spot this in cases where the training loss keeps decreasing while val_loss increases or stays stagnant. What usually works is to start with a simple EarlyStopping setup and plot the error loss with and without early stopping.
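
The core of that stopping rule is a patience counter. Here is an illustrative plain-Python sketch of the logic (monitoring val_loss, mode 'min'); the function name and signature are my own, not the Keras source:

```python
# Sketch of EarlyStopping's patience logic: stop once the monitored value
# has failed to improve for `patience` consecutive epochs.

def stopped_epoch(val_losses, patience=2):
    """Return the 0-based epoch at which training would stop,
    or None if it runs to completion."""
    best = float("inf")
    wait = 0
    for epoch, current in enumerate(val_losses):
        if current < best:      # improvement -> reset the counter
            best = current
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

print(stopped_epoch([0.9, 0.8, 0.81, 0.82, 0.5], patience=2))  # -> 3
```

Note that with patience=2, training stops at epoch 3 and never sees the improvement at epoch 4; tuning patience is the trade-off between wasted epochs and giving the model a chance to recover.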

A combination of both approaches:
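
A minimal runnable sketch combining the two, assuming TensorFlow's bundled Keras; the toy model, random data, and checkpoint file name are placeholders of mine, not from the original post:

```python
import numpy as np
import tensorflow as tf

# Toy data and model, stand-ins for a real problem.
x = np.random.rand(200, 8).astype("float32")
y = np.random.randint(0, 2, size=(200, 1))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

callbacks = [
    # Stop when val_loss has not improved for 3 epochs.
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                     restore_best_weights=True),
    # Save weights only when val_loss improves.
    tf.keras.callbacks.ModelCheckpoint("best.weights.h5", monitor="val_loss",
                                       save_best_only=True,
                                       save_weights_only=True),
]

history = model.fit(x, y, validation_split=0.2, epochs=20,
                    callbacks=callbacks, verbose=0)
print(len(history.history["loss"]))  # epochs actually run (<= 20)
```

If early stopping fires, history shows fewer than 20 epochs, and best.weights.h5 holds the weights from the best val_loss epoch rather than the last one.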

Plenty more callbacks in the documentation: https://keras.io/api/callbacks/

Peak

A great book, well worth sharing, that I recently read is Peak: Secrets from the New Science of Expertise by Anders Ericsson and Robert Pool.

What does it take to become an expert?

This is an intriguing subject which consciously or subconsciously plays a part in our professional and personal lives, as we constantly strive to better ourselves.

The book busts the myth that some of us are born with vastly superior talent far out of the reach of others. Using the example of the famous ‘child prodigy’, Mozart, Anders deconstructs the ideology of unattainable intrinsic talent. Yes, Mozart was gifted. At age 4 when most kids were playing with the 18th century wooden version of Lego in Salzburg, this son of a musician was surrounded by musical instruments and learning to shred on sonatas under the watchful eye of his father.

This runs in parallel to the '10,000 hours of practice' theory popularized by Malcolm Gladwell. But that's where the similarity ends. The book goes on to describe the futility of 'naive practice', or generic practice, as most kids forced to attend piano lessons every week would attest to.

‘Purposeful practice’ is defined by quantified goals with small steps that lead you to an improved ability to attain your goal. This resonated with me as I personally have fond memories of my teenage years wood-shedding on a musical instrument for seemingly infinite hours. This led me to attain a high proficiency in spite of not having any innate musical gene. Purposeful practice means focused practice with regular feedback from a mentor. It also involves pushing the boundaries gradually to advance to the next level.

However, mastery comes with a higher level of practice, which Ericsson terms 'deliberate practice'. In addition to purposeful practice, it is based on proven techniques developed by experts in the past. It entails intense, methodical, and sustained effort to fulfill your aim. For example, memorization has well-documented techniques that have relentlessly pushed human ability to retain information, as evidenced by memory contests. Similarly, violin training techniques developed over centuries, and studying virtuosos like Paganini offers a true path to mastery. The seeker must identify the absolute best in the field and carefully study their method to attain mastery.

Even with all the great points the author makes, it is still obvious that some endeavors, like gymnastics or specific sports that require certain physical attributes, may not be attainable regardless of practice. The book does not sufficiently counter this. Yet, regardless of what any book or expert says, there is really no all-encompassing, established way to mastery. Just because I have driven for many years, I am pretty sure I won't be a NASCAR-ruling "Ricky Bobby" anytime soon, because there has been no deliberate practice on nailing the finer aspects of racing. Similarly, just because I try to belt out the aria "Nessun dorma" in the shower does not mean I will emulate Pavarotti anytime soon. That's the point.

 “This is a fundamental truth about any sort of practice: If you never push yourself beyond your comfort zone, you will never improve.” – Anders Ericsson

If you want to do better than what you are doing right now, this book may benefit you. As a parent, the book was also a great reminder on the importance of focused methodical training as opposed to a one-size-fits-all curriculum that is sometimes inculcated in our children. Highly recommended.

Sonic Pi

As a hobbyist musician for many years, it's been a constant struggle to fuse the programming paradigm with musical ability. Without MIDI, it's usually a fruitless exercise trying to use some of the open-source tools available. But with Sonic Pi (http://sonic-pi.net/) and the power of a dynamically typed, interpreted language like Ruby, this experience has just been getting better over the years. It took me less than 10 minutes to compose a piece of music this morning (and, in the process, teach the inherent power of loops and iteration to my 6-year-old). A totally immersive experience with the power of meta-programming.

Results below (I used a tabla sample fused with an E minor pentatonic piano loop, in about 75 lines of code). If you use the software, consider donating to https://lnkd.in/ggKTaBT to keep Sonic Pi alive.

/gDqMdPD