Spark AI Summit 2020 Notes

Spark + AI Summit - Databricks

Spark AI Summit just concluded this week and, as always, there were plenty of great announcements. (Note: I was one of the speakers at the event, but this post is more about the announcements and the areas of Spark I'm personally interested in. The whole art of virtual public speaking is another topic.) The ML enhancements and their impact are a bigger topic, probably for another day, as I catch up with all the relevant conference talks and try out the new features.

Firstly, I think the online format worked for this instance. This summit (and I’ve been to it every year since its inception) was far more relaxing and didn’t leave me physically and mentally exhausted from information overload. Usually held at the Moscone Center in San Francisco, the event is a great opportunity to network with former colleagues, friends and industry experts, which is the most enjoyable part yet taxing in many ways given the limited time. The virtual interface was better than most of the online events I’ve attended – engaging and convenient. The biggest drawback was the networking aspect; the online networking options just don’t cut it. Video-conferencing fatigue probably didn’t hit since it was only 3 days, and the videos were available online instantly, so plenty of them are in my “Watch Later” list. (Note: the talks I refer to below are only the few I watched; there are plenty more interesting ones.)

The big announcement was the release of Spark 3.0 – hard to believe, but it’s been 10 years of evolution. I remember 2013 as the year I was adapting to the Hadoop ecosystem, writing map-reduce jobs in Java/Pig/Hive for large-scale data pipelines, when Spark started emerging as a fledgling competitor with an interesting distributed computation engine built on Resilient Distributed Datasets (RDDs). Fast-forward to 2020, and Spark is the foundation of large-scale data implementations across the industry; its ecosystem has evolved to include frameworks and engines like Delta and MLflow, which are also gaining a foothold as foundational to the enterprise across cloud providers. More importantly, smart investment in its DataFrames API has lowered the barrier to entry through SQL access patterns.

There were tons of new features introduced; below I focus on the ones I paid attention to. There has not been a major release of Spark for years, so this is pretty significant (2.0 was released in 2016).

Spark 3.0

  • Adaptive Query Execution: At its core, this helps change the number of reducers at runtime. It divides the SQL execution plan into stages earlier, instead of working off the usual RDD graph. The later stages have the full picture of the entire query plan, including all shuffle dependencies, so optimizations can be injected before the remaining stages execute. Execution plans can be auto-optimized at runtime, for example by changing a SortMergeJoin to a BroadcastJoin where applicable. This is huge in large-scale implementations, where I see tons of poorly formed queries eating a lot of compute thanks to skewed joins. More specifically, settings like the number of shuffle partitions (spark.sql.shuffle.partitions, which has defaulted to 200 since inception) can now be tuned automatically based on the reducers required for the map-stage output – i.e., set high for larger data and low for smaller data.
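    As a minimal sketch (property names as of Spark 3.0), AQE, its partition coalescing and its skew-join handling can be switched on via spark-defaults.conf:

```
spark.sql.adaptive.enabled                      true
spark.sql.adaptive.coalescePartitions.enabled   true
spark.sql.adaptive.skewJoin.enabled             true
```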

  • Dynamic partition pruning: Enables filter pushdown instead of full table scans by adding a partition-pruning filter at runtime. Consider a broadcast hash join between a fact and a dimension table: the enhancement intercepts the results of the broadcast and plugs them in as a dynamic filter on the fact table’s partitions, as opposed to the earlier approach of shipping the broadcast hash table derived from the dimension table to every worker and scanning the fact table in full to evaluate the join. This is huge for avoiding scans of irrelevant data. This session explains it well.
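    Under the hood this is governed by a single switch, on by default in Spark 3.0; a sketch of the relevant spark-defaults.conf entry:

```
spark.sql.optimizer.dynamicPartitionPruning.enabled   true
```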

  • Accelerator-aware scheduler: Traditionally, the bottlenecks have been small data partitions that GPUs handle poorly, cache-processing inefficiencies, slow disk I/O, UDFs that need CPU processing, and more. But GPUs are massively useful for high-cardinality datasets, matrix operations, window operations and transcoding. Originally termed Project Hydrogen, this feature makes Spark GPU-aware. The cluster managers now have GPU support that schedulers can request from; the schedulers understand GPU allocations to executors and assign GPUs to tasks appropriately. GPU resources still need to be configured explicitly, and we can request resources at the executor, driver and task levels. This also allows resources on the nodes and their assignments to be discovered. Supported in YARN, Kubernetes and Standalone modes.
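    A hedged sketch of the kind of configuration involved (the discovery-script path below is a placeholder you would supply yourself):

```
spark.executor.resource.gpu.amount            2
spark.task.resource.gpu.amount                1
spark.driver.resource.gpu.amount              1
spark.executor.resource.gpu.discoveryScript   /opt/spark/getGpus.sh
```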
  • Pandas UDF overhaul: Extensive use of Python type annotations – this becomes more and more imperative as codebases scale, since newer engineers otherwise take longer to understand and maintain the code effectively. Type hints surface errors early, instead of writing hundreds of test cases or, worse, finding out from irate users. Great documentation and examples here.
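    A minimal sketch of the new type-hinted style: the function body is plain pandas, and in PySpark 3.0 the UDF type is inferred from the hints rather than from the older functionType argument (registration shown in comments since it needs a Spark runtime):

```python
import pandas as pd

# Spark 3.0 infers the UDF variant (series-to-series here) from the
# Python type hints, instead of an explicit functionType argument.
def plus_one(s: pd.Series) -> pd.Series:
    return s + 1

# Registration in PySpark 3.0 would look like:
# from pyspark.sql.functions import pandas_udf
# plus_one_udf = pandas_udf(plus_one, returnType="long")
```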

  • PySpark UDF: Another feature I’ve looked forward to is PySpark handling Pandas vectorized UDFs over arrays natively. In the past, we had to jump through god-awful hoops like writing Scala helper functions and then switching over to Python in order to read these as arrays. ML engineers will welcome this.

  • Structured Streaming UI: Great to see more focus on the UI, with additional features appearing in the workspace interface, which frankly had gotten pretty stale over the last few years. The new tab shows more statistics for running and completed queries and, more importantly, will help developers debug exceptions quickly rather than poring through log files.

  • Proleptic Gregorian calendar: Spark switched to this from the previous hybrid (Julian + Gregorian) calendar. It uses Java 8 API classes from the java.time packages, which are based on ISO chronology. The “proleptic” part comes from extending the Gregorian calendar backward to dates before 1582, when it was officially introduced.

    Fascinating segue here –

The Gregorian calendar (named after Pope Gregory XIII – not the guy who gave us the awesome Gregorian chants; that was Gregory I) is what we use today as part of ISO 8601:2004. It replaced the Julian calendar, which had accumulated inaccuracies in determining the actual year and could not really account for the complexities of adding a leap year almost every 4 years. Catholics adopted it readily, while Protestants held out with suspicion for 200 years (!) before England and the colonies switched over, advancing the date from September 2 to September 14, 1752. Would you hand over 11 days of your life as a write-off? In any case, you can thank Gregory XIII for playing a part in this enhancement.

  • There was also a whole lot of talk on better ANSI SQL compatibility, which I need to look at more closely. Working with a large base of SQL users, this can only be good news.

  • A few smaller super useful enhancements:
    • “Show Views” command
    • “Explain” output formatted for better usability instead of a jungle of text
    • Better documentation for SQL
    • Enhancements on MLlib, GraphX

Useful talks:

Delta Engine/Photon/Koalas

Being a big Delta proponent, this was important to me, especially as adoption grows and large-scale implementations need continuous improvements in the product to justify rising storage costs on cloud providers.

The Delta Engine now has an improved query optimizer and a native vectorized execution engine written in C++. This builds on the optimized reads and writes of today’s NVMe SSDs, which eclipse the SATA SSDs of previous generations in throughput and seek times. Gaining these efficiencies from the CPU at the bare-metal level is significant, especially as data teams deal with more unstructured data at higher velocity. The C++ implementation helps exploit data-level and instruction-level parallelism, as explained in detail in the keynote by Reynold Xin, with some interesting benchmarks using regex over strings to demonstrate faster processing. Looking forward to more details on how the optimization works under the hood, along with implementation guidelines.

Koalas 1.0 now implements about 80% of the pandas API. We can invoke accessors to use the PySpark APIs from Koalas. Better type hinting and a ton of enhancements on DataFrames, Series and Indexes, with support for Python 3.8, make this another value proposition on Spark.

A lot of focus on Lakehouse in ancillary meetings was encouraging and augurs well for data democratization on a single unified stack versus fragmenting data across data warehouses and data lakes. The Redash acquisition will provide another option for large-scale enterprises for easy-to-use dashboarding and visualization on these curated data lakes. Hope to see more public announcements on that topic.


More announcements around MLflow model serving, with the Model Registry (announced in April) letting data scientists track the model lifecycle across stages such as Staging, Production, or Archived. With MLflow moving to the Linux Foundation, a vendor-independent non-profit now manages the project, which will help evangelize it to a larger audience.

  • Autologging: Enables automatic logging of Spark datasource information at read time, without explicit log statements. mlflow.spark.autolog() enables autologging for Spark data sources; with Delta Lake supplying the data versions, the managed Databricks implementation definitely looks slicker with the UI. Implementation is as easy as attaching the mlflow-spark JAR and then calling mlflow.spark.autolog(). More significantly, this enables the cloning of models.
  • On Azure – the updated mlflow.azureml.deploy API for deploying MLflow models to AzureML. This now uses the up-to-date Model.package() and Model.deploy() APIs.
  • Model schemas: Input and output schemas plus custom metadata tags for tracking, which means more metadata to track – which is great.
  • Model Serving: The ability to deploy models via a REST endpoint on hosted MLflow, which is great. I would have loved to see more turnkey methods to deploy to an agnostic endpoint, say a managed Kubernetes service – the current implementation targets Databricks clusters from what I noticed.

  • Lots of cool UI fixes, including highlighting different parameter values when comparing runs and UI plot updates that scale to thousands of points.

Useful talks:

Looking forward to trying out the new MLflow features, which go into public preview later in July.

Let it read – A character-based RNN generating Beatles lyrics

The Bob Spitz biography stares at me all day as I spend most of my waking hours at my home-office desk, taunting me to finish it one of these days. It probably subliminally influenced me to write a quick app that generates lyrics using a recurrent neural net. The Databricks Community Edition or Google Colab makes it a breeze to train these models on GPUs at zero or reasonable cost.

All the buzz around GPT-3 and my interest in Responsible AI are inspiring some late-night coding sessions, and I've been blown away by the accuracy of some of these pretrained language models. I also wanted to play around with Streamlit, which saves me from tinkering with JavaScript frameworks and whatnot while trying to deploy an app. The lack of a WSGI option limits some deployment options, but I found it relatively easy to containerize and deploy on Azure.

The official TensorFlow tutorial is great for boilerplate, which in turn is based on this magnum opus. The tutorial is pretty straightforward, and Google Colab is great for training on GPUs.

Plenty of libraries are available to scrape data – this one comes to mind. The model is character-based and predicts the next character given a sequence. I tweaked around training this on small batches of text and finally settled on around 150 characters to start seeing some coherent structures. My source data could do better – it stops after Help!, I believe. On my to-do list is embellishing it with the full discography to make the model better.

The model summary is as defined below:

I didn’t really have to tweak the sequential model much to start seeing decent output. Three layers were enough, along with a Gated Recurrent Unit (GRU). The GRU seemed to give me better output than an LSTM, so I let it be…
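A minimal sketch of that three-layer stack, following the shape of the TensorFlow text-generation tutorial (the vocabulary size depends on the scraped corpus, so the value below is an assumption):

```python
import tensorflow as tf

vocab_size = 65       # assumption: unique characters in the scraped lyrics
embedding_dim = 256
rnn_units = 1024

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.GRU(rnn_units,
                        return_sequences=True,
                        recurrent_initializer="glorot_uniform"),
    tf.keras.layers.Dense(vocab_size),  # logits over the character vocabulary
])
```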

As per the documentation, for each character the embedding is fed into the GRU, which is run for one timestep; the dense layer then generates logits predicting the log-likelihood of the next character.


The standard tf.keras.losses.sparse_categorical_crossentropy loss function is my usual go-to here because the classes are all mutually exclusive. I couldn’t get past 50 epochs on my MacBook Pro without the IDE hanging, so I shifted to a Databricks instance, which got the job done with no sweat on a 28 GB memory, 8-core machine.

All things must pass, and 30 minutes later we had a trained model, thanks to an early-stopping callback monitoring loss.

The coolest part of the RNN is the text-generating function from the documentation, where you can configure the number of characters to generate. It uses the start string and the RNN state to get the next character: the next character is sampled from the predicted categorical distribution, and that prediction is fed back in as the next input to the model. The state returned by the model is passed back in with each input, so the model accumulates context as it generates. This is the magical mystery of the model. Plenty more training to do as I wait for the Peter Jackson release.
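The sampling step can be sketched with plain NumPy (temperature is the usual knob from the tutorial; higher values give more surprising text):

```python
import numpy as np

def sample_next_char(logits, temperature=1.0):
    """Sample the next character index from the predicted categorical distribution."""
    logits = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```

In the TensorFlow tutorial the same step is done with tf.random.categorical on the model’s logits.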

Link to app:


Alesis V49

Finally bit the bullet and added the Alesis V49 to my motley collection of instruments. 49 keys are enough for a novice like me. I’ve been around MIDI keyboards a lot from my years of actively playing guitar but never really felt the inclination to own one. The piano-and-guitar combination is still my favourite duo when it comes to making music that feels good – think massive drum fills, shredding guitars with minor arpeggios, edgy bass, FM bells and a moody atmosphere, and you’ve got my code.

The main goal here is to get better at programming accompaniments to instrumental guitar and at arranging compositions. The 8 LED backlit pads respond OK, though I haven’t really used them much. Their sensitivity seems inconsistent, which makes them a bit of a bother for fast passages. The 4 assignable knobs are super convenient to program. I really like the Pitch and Mod wheels, which let you wail, so I can channel my inner Sherinian.

The 49 keys are full-sized and semi-weighted. Compared to a legit piano, they size up pretty well and feel great. The keys are slightly harder to press down than on a full-fledged digital keyboard, but it’s not too far off. Alesis also lets you alter the channel, transposition, octave and velocity curve for the keys – I did that immediately after unboxing, as most of my earlier research suggested fixing that early, before getting too comfortable with the “stiffness”.

Overall build quality seems solid. The 37-inch footprint is where its appeal lies for me: I can easily place it on my work desk without needing separate storage space. The package also comes with Ableton Live Lite Alesis Edition and MIDI editor software. The activation and account-creation steps were a breeze – a firmware upgrade and I was good to go in 15 minutes.


Leader Election

Citizens in a democratic society vote to elect their leaders from a pool of candidates. Distributed systems choose a master node in much the same way. What’s more, they can invoke “spot” elections when a leadership vacuum arises, which helps maintain redundancy.

Introducing redundancy in services may introduce new problems of duplicate processing and conflicting operations. A leader in a group of redundant servers serves the business logic, with the followers ready to take over leadership. It is not a trivial problem: multiple systems need to share state and reach consensus to elect the leader. Synchronization overhead and reducing network round trips are key considerations in this process.

Dealing with frameworks like Apache Spark on a daily basis makes you more aware of the inherent challenges stemming from a driver/worker architecture and the limited fault-tolerance options when the master node goes down, especially in high-availability scenarios – though several mitigation options exist (restarting with checkpoints, WALs, etc.).

As described here, stable leader election ensures the leader remains the leader until it crashes, irrespective of other behavior. Preferably, such algorithms keep communication overhead minimal, are fault tolerant, and elect a leader in constant time when the system is stable.

The leader election process should guarantee that there is at most one leader at a time, eliminating the situation where multiple leaders are elected simultaneously. That situation is possible in network-partitioned systems, where the elected leaders are unaware of each other. ZooKeeper, Multi-Paxos and Raft use temporary leaders to reduce the number of messages required to reach agreement.

Common Leader Election algorithms include the Bully algorithm or the Ring algorithm.

Bully Algorithm

Explained here

  • Each node gets a unique rank to help identify the leader.
  • An election begins when the leader stops responding; the node that notices first starts an election, notifying the nodes ranked higher than itself.
  • The highest-ranked responder takes over the process, announcing itself to the lower-ranked nodes and checking whether the fallen leader responds at all.
  • If the old leader does not respond, the new leader is confirmed.

This is vulnerable to the multi-leader scenario if the nodes get split into different partitions. Also, the bias toward the highest rank means unstable nodes can keep getting elected, which may mean frequent elections.

Here’s a quick barebones non-socket invocation of a 6-node election process.
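A minimal in-process sketch along those lines (the ranks and the failure scenario are made up for illustration):

```python
class Node:
    def __init__(self, rank, alive=True):
        self.rank = rank
        self.alive = alive

def bully_election(nodes, initiator_rank):
    """Initiator contacts all higher-ranked nodes; the highest alive rank wins."""
    higher = [n for n in nodes if n.alive and n.rank > initiator_rank]
    if not higher:
        return initiator_rank           # no one outranks the initiator
    return max(n.rank for n in higher)  # highest-ranked responder becomes leader

nodes = [Node(r) for r in range(1, 7)]  # 6 nodes, ranks 1..6
nodes[5].alive = False                  # the rank-6 leader has crashed
leader = bully_election(nodes, initiator_rank=2)
print(leader)                           # → 5
```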

Ring Algorithm

Explained here

  • All nodes in the system form a ring topology.
  • A new election starts when the leader fails; the node that detects the failure notifies the next node on the ring.
  • Each node on the ring then contacts the next available node (higher ranked within the ring topology).
  • When the initiating node sees its own ID in the circulating list, it knows the traversal of the ring is complete, and it picks the highest ID as the leader.
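The traversal above can be sketched in a few lines (node IDs and the crashed node are illustrative; `None` marks a node that is skipped over):

```python
def ring_election(ring, initiator):
    """Circulate an election message once around the ring, collecting live IDs."""
    n = len(ring)
    start = ring.index(initiator)
    collected = []
    for step in range(n):
        node_id = ring[(start + step) % n]
        if node_id is not None:     # crashed nodes are skipped over
            collected.append(node_id)
    return max(collected)           # traversal complete: highest ID wins

print(ring_election([3, 1, None, 5, 2], initiator=1))  # → 5
```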

Apart from these, there are several other interesting implementations such as:

  • Next-in-line failover: Each elected leader provides a list of failover nodes, and a leader failure starts a new election round.
  • Candidate/ordinary optimization: The nodes are split into candidates and ordinary nodes – only a candidate can become the leader.
  • Invitation algorithm: Allows nodes to invite other nodes to join their groups instead of trying to outrank them.

Alex Petrov’s book Database Internals does a great job of explaining election scenarios in a distributed context and has served as an inspiration for more detailed study of this topic.

Seven years

Seven years in Tibet by Heinrich Harrer

The author, along with his friend, the resourceful Peter Aufschnaiter, takes you on a journey of thousands of miles over hostile territory at a time of limited resources in the mid-20th century. The first part of the book deals with the author’s numerous escapes and hardships, overshadowed by his tenacity to get to the Tibetan capital, Lhasa.

Seven Years in Tibet is the tale of Austrian mountaineer Heinrich Harrer’s trek through Tibet and his relationship with the 14th Dalai Lama. The Tibetan culture described before the Chinese invasion points to a simple life under a feudal society. I found the read immersive and had to pause many times to reflect on a time that people today can never experience. The captivating descriptions of monks being carried on palanquins in Lhasa and the resplendent Potala Palace, home of the Dalai Lama, leave a lasting impression.

The Dalai Lama, forbidden to mingle with the locals, would gaze from the Potala’s roof at Lhasa street life through a telescope, and found a willing companion in the author, who eventually rose to become his tutor, regaling the young boy with his worldly experience. There are vivid descriptions of Tibetan rituals and the rustic but charming Tibetan way of life before the proceedings were rudely interrupted in 1950 by Mao Zedong’s invading army. The Dalai Lama’s flight to India ahead of a chasing army was marked by the consecration of every building he stayed in during the journey as a holy place. That amazing journey is documented here.

The story ends at the foothills of the Himalayas, where the Dalai Lama still resides in the upper reaches of North India’s Kangra Valley, in Dharamsala.

Harrer seems to have been one hell of a renaissance man – mountaineer, teacher, gardener, architect, civil servant and photographer, besides being part of the team that made the first ascent of the North Face of the Eiger in Switzerland. More importantly, he captures the times with his simple writing and takes the reader on a breathtaking journey. I found myself riveted by the journey more than the destination.

“One of the best characteristics of the Tibetan people is their complete tolerance of other creeds. Their monastic theocracy has never sought the conversion of infidels.”

Heinrich Harrer

Power Windows

Power Windows (1985) is the first Rush studio album I listened to, and it remains my seminal experience of the band’s 80s era (my other go-tos are Hemispheres and Counterparts). My perpetual fondness for this album is not based on any lurking nostalgia but on the sheer prodigious musicianship and lyrical genius it conveys. And though my focus is more on the guitar wizardry, the virtuosity of the drums and bass together makes this album what it ultimately became.


An album forged at the height of the 80s synth boom, it has prompted numerous arguments over the years with other fans who (perhaps rightfully) suggest it represents a low point after the grandeur of Moving Pictures/Permanent Waves or even the progressive heights of 2112 and Hemispheres.

The production quality is pristine, with every track painting a landscape that stands in its own right, be it the polyrhythms of “Mystic Rhythms” or the 162 BPM synchronicity of “Manhattan Project”. The overall sound is produced and engineered to excellence.

Most rabid fans claim the CD re-issues are not in the same league as the vinyl (which I don’t own), but the remasters seem to do the original Mercury recordings some justice. Incidentally, this was the first Rush album to be released directly to CD.

I’ve gone through a phase of spending a lot of time replicating Alex’s 80s tone to varying degrees of satisfaction. Essentially it amounted to a lot of chorus and flanging delays, not to mention tons of reverb. In some cases, chaining multiple layers of chorus one over the other got the tone closer. Arguably all over the top, as the 80s demanded. I plan to eventually put up a Guitar Rig patch that comes close to this tone.

Track Listing:

The Big Money (5:36) – Named after the John Dos Passos novel of the same name. A tribute to soulless capitalism, monetary value and fame. The opener delivers a banging riff thanks to a guitar tuned up a full step and open chords with ringing harmonics. The 80s were big on racks, and this was Alex on Roland and TC Electronic racks, as told to guitar magazines in that era. The live version on “A Show of Hands” is a worthy companion.

Grand Designs (5:05) – Another title from the Dos Passos trilogy. Deals with individualism and non-conformance to group-think: stuck in a two-dimensional life as part of the mass-production scheme, it takes courage and tenacity to go against the grain. The focus on “image” in the 80s and “form without substance” also led to the competing and overlapping music/bands the era is known for. Classic Rush style, with repetitive patterns and the drums churning out progressively more complicated parts. The Allan Holdsworth influence comes out in this track, with more whammy work and sonic feedback across tonal centers creating an atmosphere towering over the synth barrage. The 7-phrase ending is another nod to the progressive roots.

Manhattan Project (5:05) – About the birth of the nuclear age, with a poignant description of the time (WW2), a man (Oppenheimer) and a place (Los Alamos). The mention of the “Enola Gay”, which dropped the atomic bomb, completes the last chorus of this orchestral majesty. The powerful imagery is complemented by a barrage of synth in the foreground, with the opening verse melding into a tension-filled chorus. The lyrics shine in sections that drop out bass and guitar in a rare departure from the usual arrangements.

Marathon (6:09) – More inspirational fare about pacing your life, partly inspired by Ernest Hemingway’s quote, “In life, one must (first) last”. Set goals to achieve them, but don’t burn out too fast. A sublime chorus is overshadowed by a bassline that sprints through the song, which unveils the middle section in 7/4 it is renowned for. Ringing notes sustained by the whammy bar on Alex’s Signature Aurora kick off the song, and the texture continues into a mesmerizing middle section leading into a majestic solo.

Territories (6:19) – About nationalism. Beautiful lyrics backed by a modern post-punk pop sound over a big rock foundation. The context should suit the current times well – better to be proud of being a world citizen than to carry the bloody pride of association with a flag that divides. The Ibanez HD-1000 rack provides the sonic delay that resonates throughout the track. The solo, or lack of a true one, is understated, and the song does well without histrionics that would distract from the lyrics.

Middletown Dreams (5:15) – Similar themes to the suburban disillusion of “Subdivisions” from the Signals era: the power of dreams and the ability to challenge your current environment to achieve them. This excerpt sums it up best. The lyrics probably resonate with me the most of anything on this album, as an immigrant and sometimes stranger in a strange land (“Dreams transport the ones who need to get out of town”). Musically, the track ups the ante with more synths, a tempo change in the middle section and a key shift. The synth and multi-tap delays create some excellent passages, making way for the vocals, a busy bassline and a grand chorus. The sequencer around the 2-minute mark does get a little tiresome, especially after all the space in the previous passages, but that’s a minor blip.

Emotion Detector (5:10) – Seems to be not about anything in particular but a bunch of messages meshed together. The key standouts are lines like “Illusions are painfully shattered right where discovery starts” – we all have an illusion about who we really are. Probably the weakest track on the album, veering a bit too much to the pop side of the repertoire. The guitar tries in vain to save it with a brilliant solo accentuated by feedback and tasteful notes toward the climax. Guitar feedback at the beginning of the track washes over you with absolute sonic bliss until the synth authoritatively climbs all over it, dampening it a few notches. The vocals are the highlight and the standout. The synth drums have not aged well and sound a bit dated.

Mystic Rhythms (5:54) – A magnum opus to end the album. My interpretation is that it is about exploring our primal instinct for what lies behind the obvious and beyond, challenging our understanding of the universe. Surprisingly, this was an acquired taste for me, and over the years I’ve grown to appreciate the musicianship and the poetry in it more. The guitar texture and the repetitive staccato patterns are the thread that holds the song’s foundation, as arpeggios add an air of sublime mystery to the track. Lots of vocal harmony, and the imagery of the track takes you “under northern lights or the African sun”. The guitar parts are understated but effective as they give way to layers of synth.

Despite the synth barrage and the guitar buried in the mix, the album reflects the times. The songwriting is on another plane of existence, and this evolution in their sound is probably the most accessible to the average fan, if you can get past the 80s influence.

Keras Callbacks

Keras callbacks are incredibly useful while training neural networks, especially when running large models on Azure (these days) that I know will cost me time, computation and money.

I pretty much used to painfully run models across all epochs before I discovered this gem. The official documentation describes it best, but a callback is essentially a hook that lets you act on the model during training – for example, stopping once a self-set threshold of accuracy is reached. The callbacks I usually end up using are ModelCheckpoint and EarlyStopping.

Model checkpointing is used to check in weights at a defined interval, so the model can be loaded back in the state it was saved in. A great pattern is saving at stages based on a quantity you monitor, such as validation accuracy (val_acc), training loss (loss), or validation loss (val_loss).

The advantage of using ModelCheckpoint over calling save_weights or save manually is that it can save either the whole model or just the weights, depending on how it is configured, automatically as training progresses.

Detailed parameters here in the source code:
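A minimal sketch of a checkpoint callback (the filepath pattern is my own choice; any writable path works):

```python
from tensorflow import keras

checkpoint = keras.callbacks.ModelCheckpoint(
    filepath="weights.{epoch:02d}-{val_loss:.2f}.h5",
    monitor="val_loss",
    save_best_only=True,       # overwrite only when val_loss improves
    save_weights_only=False,   # False saves the full model, True just the weights
)
```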

When we call fit() on the model for training, Keras calls the following functions:

  • on_train_begin and on_train_end, called at the beginning and the end of training respectively; on_test_begin and on_test_end, called at the beginning and the end of evaluation respectively.
  • on_predict_begin and on_predict_end, at the beginning and the end of prediction respectively.

In addition, the BaseLogger class accumulates the epoch averages of metrics via on_epoch_begin, on_epoch_end, on_batch_begin and on_batch_end. This gives us flexibility – for example, EarlyStopping is invoked at the end of every epoch and compares the current value with the best value seen until then.
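These hooks can be sketched with a tiny custom callback (the class name and print format are my own):

```python
import tensorflow as tf

class EpochLossLogger(tf.keras.callbacks.Callback):
    """Logs the averaged loss Keras has accumulated by the end of each epoch."""
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        print(f"epoch {epoch}: loss={logs.get('loss')}")

cb = EpochLossLogger()
cb.on_epoch_end(0, {"loss": 0.5})  # normally invoked by fit(), not by hand
```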

Detailed parameters here in the source code:

You can use the EarlyStopping callback to stop training when val_acc stops increasing, else the model will overfit on the data. You can also see this in cases where the loss keeps decreasing while val_loss increases or stays stagnant. What usually works is to start with something like the below and plot the error loss with and without early stopping.
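A minimal sketch (the patience value is my own starting point, not a recommendation from the docs):

```python
from tensorflow import keras

early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",          # stop when validation loss stops improving
    patience=3,                  # allow 3 stagnant epochs before stopping
    restore_best_weights=True,   # roll back to the best epoch's weights
)
```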

A combination of both approaches:
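A hedged sketch of wiring both callbacks into fit() together (the model and data names are placeholders):

```python
from tensorflow import keras

callbacks = [
    keras.callbacks.ModelCheckpoint("best.h5", monitor="val_loss",
                                    save_best_only=True),
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=3,
                                  restore_best_weights=True),
]

# model.fit(x_train, y_train,
#           validation_data=(x_val, y_val),
#           epochs=100,
#           callbacks=callbacks)
```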

Plenty more callbacks in the documentation:


A great book, well worth sharing, that I recently read is Peak: Secrets from the New Science of Expertise by Anders Ericsson and Robert Pool.

What does it take to become an expert?

This is an intriguing subject which consciously or subconsciously plays a part in our professional and personal lives, as we constantly strive to better ourselves.

The book busts the myth that some of us are born with vastly superior talent far out of the reach of others. Using the example of the famous ‘child prodigy’, Mozart, Anders deconstructs the ideology of unattainable intrinsic talent. Yes, Mozart was gifted. At age 4 when most kids were playing with the 18th century wooden version of Lego in Salzburg, this son of a musician was surrounded by musical instruments and learning to shred on sonatas under the watchful eye of his father.

This runs parallel to the ‘10,000 hours of practice’ theory popularized by Malcolm Gladwell. But that’s where the similarity ends. The book goes on to describe the futility of ‘naive practice’, or generic practice – as most kids forced to attend piano lessons every week would attest.

‘Purposeful practice’ is defined by quantified goals with small steps that lead you to an improved ability to attain your goal. This resonated with me as I personally have fond memories of my teenage years wood-shedding on a musical instrument for seemingly infinite hours. This led me to attain a high proficiency in spite of not having any innate musical gene. Purposeful practice means focused practice with regular feedback from a mentor. It also involves pushing the boundaries gradually to advance to the next level.

However, mastery comes with a higher level of practice, which Anders terms as ‘Deliberate Practice’. In addition to Purposeful practice, it is based on proven techniques developed by experts in the past. It entails intense, methodical, and sustained efforts to fulfill your aim. For example, memorization has well known documented techniques that have relentlessly pushed human ability to retain information as evidenced by memory contests. Similarly, violin training techniques developed over centuries, and studying virtuosos like Paganini offer the true path to mastery.  The seeker must identify the absolute best in the field and carefully study the method to attain mastery.

Even with all the great points the author makes, it is still obvious that some endeavors, like gymnastics or specific sports that require certain physical attributes, may not be attainable regardless of practice. The book does not sufficiently counter this. Yet, regardless of what any book or expert says, there is really no all-encompassing established way to mastery. Just because I have driven for many years, I’m pretty sure I won’t be a NASCAR-ruling “Ricky Bobby” anytime soon, because there is no deliberate practice nailing the finer aspects. Similarly, just because I try to belt out the aria “Nessun dorma” in the shower does not mean I will emulate Pavarotti anytime soon. That’s the point.

 “This is a fundamental truth about any sort of practice: If you never push yourself beyond your comfort zone, you will never improve.” – Anders Ericsson

If you want to do better than what you are doing right now, this book may benefit you. As a parent, the book was also a great reminder on the importance of focused methodical training as opposed to a one-size-fits-all curriculum that is sometimes inculcated in our children. Highly recommended.

Sonic Pi

As a hobbyist musician for many years, it’s been a constant struggle to fuse the programming paradigm with musical ability. Without MIDI, it’s usually a fruitless exercise trying to use some of the open-source tools available. But with Sonic Pi (and the power of a dynamically typed, interpreted language like Ruby), this experience has kept getting better over the years. It took me less than 10 minutes to compose a piece of music (and, in the process, teach the inherent power of loops and iteration to my 6-year-old this morning). A totally immersive experience with the power of metaprogramming.

Results below (I used a tabla sample fused with an E minor pentatonic piano loop in about 75 lines of code). If you use the software, consider donating to keep Sonic Pi alive.