
“Is there any point to which you would wish to draw my attention?” | “To the curious incident of the dog in the night-time.” | “The dog did nothing in the night-time.” | “That was the curious incident,” remarked Sherlock Holmes.

Not everyone can infer like Sherlock Holmes, and even Holmes himself couldn’t have done much without data. Indeed he once exclaimed impatiently: “Data! Data! Data! I cannot make bricks without clay.”

Holmes was fortunate. He lived only in Conan Doyle’s make-believe world, where all the pieces of the jigsaw puzzle were available inside the cardboard box. He only had to ingeniously reconstruct the picture pasted on top of the box!

But what if some of the pieces are missing and you have to first go out and find them? What if some of the pieces required actually have to be fabricated? Or what if there are thousands of pieces available when only a few dozen are required to complete the picture?

These are fascinating problems that data analysts have to grapple with before they can make the right inference in the real-life world. ‘Finding’ the missing pieces is a bit like querying an external database; ‘fabricating’ a new piece is like undertaking some specialized data analysis; and choosing a small subset of useful data from a big data set requires compression techniques and evaluation of probabilities.

A technical discussion on such questions would be too tedious for readers of this blog. So let us create a ‘next gen’ Sherlock Holmes and return to the idyllic world of Hindi film music to get a flavour of this business of analytics and inference.

Listen to the song below first, and then we’ll meet our ‘next gen’ Holmes, and, of course, the ‘next gen’ Watson.


Here’s our question for the new avatar of Sherlock Holmes: Who is the music director of this song?

Given that unmistakable beat of horse hooves pounding away, Holmes would surely think of O P Nayyar. But can he prove this rigorously? Here’s how his thought process goes:

“The female voice is Asha, not Lata. O P Nayyar never used Lata’s voice; so it could indeed be OP …”

But Holmes needs irrefutable proof. He replays the video:

The hero is Manoj Kumar. The heroine is Sharmila Tagore … what does the hero-heroine pair of Manoj Kumar-Sharmila Tagore tell us?

At this point Holmes quickly constructs a database query to count the number of Manoj-Sharmila films, and a second later tells Watson that he knows the answer.

“Manoj-Sharmila had only one film together: Sawan Ki Ghata; and O P Nayyar was the music director of Sawan Ki Ghata. So it has to be OP!” QED.

Most whodunit plots provide clues that are ‘complete’ and ‘consistent’; a thoughtful chain of deductions will therefore always uniquely reveal the criminal. The big difference in our example is that the eventual solution required access to an external database.
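For the curious reader, here is a minimal sketch of the kind of query Holmes might have fired off. Everything about the database is an assumption: the table names `films` and `film_cast`, the helper functions, and the placeholder rows are invented for illustration, and only the Sawan Ki Ghata row comes from the story above.

```python
import sqlite3


def build_toy_db():
    """Build a tiny in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE films (title TEXT PRIMARY KEY, music_director TEXT);
        CREATE TABLE film_cast (title TEXT, actor TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO films VALUES (?, ?)",
        [
            ("Sawan Ki Ghata", "O P Nayyar"),      # taken from the post
            ("Placeholder Film A", "Composer X"),  # invented filler row
            ("Placeholder Film B", "Composer Y"),  # invented filler row
        ],
    )
    conn.executemany(
        "INSERT INTO film_cast VALUES (?, ?)",
        [
            ("Sawan Ki Ghata", "Manoj Kumar"),
            ("Sawan Ki Ghata", "Sharmila Tagore"),
            ("Placeholder Film A", "Manoj Kumar"),
            ("Placeholder Film A", "Actor P"),
            ("Placeholder Film B", "Sharmila Tagore"),
            ("Placeholder Film B", "Actor Q"),
        ],
    )
    return conn


def films_with_pair(conn, actor_a, actor_b):
    """Return (title, music_director) for every film featuring both actors."""
    return conn.execute(
        """
        SELECT f.title, f.music_director
        FROM films AS f
        JOIN film_cast AS a ON a.title = f.title AND a.actor = ?
        JOIN film_cast AS b ON b.title = f.title AND b.actor = ?
        ORDER BY f.title
        """,
        (actor_a, actor_b),
    ).fetchall()


def infer_music_director(conn, actor_a, actor_b):
    """Name a composer only if all of the pair's films share exactly one."""
    directors = {md for _, md in films_with_pair(conn, actor_a, actor_b)}
    return directors.pop() if len(directors) == 1 else None


conn = build_toy_db()
print(films_with_pair(conn, "Manoj Kumar", "Sharmila Tagore"))
print(infer_music_director(conn, "Manoj Kumar", "Sharmila Tagore"))
```

The deduction is airtight only because the query returns a single film; had the pair appeared in films by different composers, `infer_music_director` would return `None` and Holmes would need another clue.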

Since this seems like fun, let us construct another example based on a Hindi film song.

 

Our question to Holmes this time is: Who is the lady singing this duet with Rafi?

Holmes decides to ask Dr Watson to solve this puzzle even as he retires to relax with his violin. “You know my methods. Apply them”, he tells him.

Next morning, Watson can hardly contain his excitement: “Holmes, I have cracked your puzzle! This is a duet from a 1966 film called Mamta. While it is hard to decipher the lady’s voice in the duet, I was fortunate to find a solo version of the same song … which was sung by Lata Mangeshkar.”

Holmes merely smiled in response. “One should always look for a possible alternative, and provide against it”, he replied, even as he submitted an online query on his Android cellphone. A beep announced the response to his query. Holmes chuckled as he saw the response and then fired a second online request. “I would consider the year in which the film was released, Watson”, he said before walking away.

“So it was Lata Mangeshkar, wasn’t it?”, Watson asked as they sat down for dinner.

“Of course not! It was Suman Kalyanpur”, Holmes replied. “I was well aware that in the mid-1960s, at the peak of their careers, Rafi and Lata were not singing duets together. My first database query was to list all the female singers who sang duets with Rafi in Mamta. As I expected, Lata did not figure in this list. But, to my surprise, I found that Rafi had sung a duet with both Suman Kalyanpur and Asha Bhosle. I therefore had to submit an online request for a cepstrum analysis to analyse the individual voices and confirm that this duet was indeed sung by Suman Kalyanpur.”

Seeing Watson dumbfounded, Holmes told him: “Education never ends.”

This is indeed true. The science of inference is not just a matter of simple logic now: we also need to query databases and undertake digital analysis.
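Holmes’s first query in the Mamta puzzle, listing everyone who sang with Rafi in the film, might look something like the sketch below. The `song_singers` table, the `co_singers` helper and the song labels (“duet 1”, “duet 2”, “solo version”) are all hypothetical; the singer names are the ones mentioned in the story.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE song_singers (film TEXT, song TEXT, singer TEXT)")
conn.executemany(
    "INSERT INTO song_singers VALUES (?, ?, ?)",
    [
        # Song labels are placeholders; pairings follow the post's account.
        ("Mamta", "duet 1", "Rafi"),
        ("Mamta", "duet 1", "Suman Kalyanpur"),
        ("Mamta", "duet 2", "Rafi"),
        ("Mamta", "duet 2", "Asha Bhosle"),
        ("Mamta", "solo version", "Lata Mangeshkar"),
    ],
)


def co_singers(conn, film, singer):
    """List everyone who shares at least one song with `singer` in `film`."""
    rows = conn.execute(
        """
        SELECT DISTINCT other.singer
        FROM song_singers AS me
        JOIN song_singers AS other
          ON other.film = me.film AND other.song = me.song
        WHERE me.film = ? AND me.singer = ? AND other.singer <> me.singer
        ORDER BY other.singer
        """,
        (film, singer),
    ).fetchall()
    return [name for (name,) in rows]


candidates = co_singers(conn, "Mamta", "Rafi")
print(candidates)
```

The query rules Lata out but leaves two candidates standing, which is exactly why the database alone could not settle the matter and Holmes had to turn to voice analysis.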

Indeed the problem could get even harder, as Holmes and Watson discovered on a wet and dismal monsoon-ravaged evening in Mumbai when a distraught and dishevelled Mr Pestonji burst into their chamber. “Only you can save me sir”, he implored, “I have bet all my life’s earnings that this song is composed by Jaikishan, and I now have to provide the evidence!”

 

This Rafi solo is from the 1967 film Brahmachari with the musical score attributed to the Shankar-Jaikishan (SJ) duo. It is however well known that Shankar and Jaikishan composed their tunes separately for most of their career span. So was this song composed by Shankar or Jaikishan?

Holmes spent a troubled night pondering over the problem and playing every SJ song available on his iPod. He found a few leads but they were not enough to conclusively establish Mr Pestonji’s definitive assertion.

The film was released well before Jaikishan’s death in 1971 so it could certainly be Jaikishan | The song was written by Hasrat Jaipuri who collaborated far more frequently with Jaikishan than Shankar | A lengthy background interlude precedes this song; Jaikishan often handled background scores …

The real difficulty here is that this jigsaw puzzle has too many pieces, and it is not even certain if any subset of pieces could ever complete the jigsaw picture. To provide answers we would have to enter the realm of probability theory; the best answer we can perhaps give is that there is less than a 10% chance that this song is not by Jaikishan.
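One simple way to “balance probabilities” is to treat each lead as an independent clue and multiply likelihood ratios, in the naive Bayes style. The sketch below is only an illustration: the `posterior` helper is hypothetical, and every number in `clues` is invented for the example rather than estimated from any real data about Shankar and Jaikishan.

```python
def posterior(prior, clues):
    """Combine clues, assumed independent, into P(Jaikishan | clues).

    `clues` maps a label to (P(clue | Jaikishan), P(clue | Shankar)).
    """
    odds = prior / (1.0 - prior)
    for p_if_jaikishan, p_if_shankar in clues.values():
        odds *= p_if_jaikishan / p_if_shankar
    return odds / (1.0 + odds)


# All likelihoods below are made-up illustrative values.
clues = {
    "released before 1971": (1.0, 1.0),      # equally likely either way
    "lyrics by Hasrat Jaipuri": (0.8, 0.3),
    "long background interlude": (0.6, 0.3),
}

p = posterior(0.5, clues)
print(f"P(Jaikishan | clues) = {p:.3f}")
```

With these invented numbers the answer comes out near 84%, so Mr Pestonji would still need a few more leads, or stronger ones, before the doubt drops below 10%.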

As Holmes would later tell Watson: “We balance probabilities and choose the most likely. It is the scientific use of imagination.”

Holmes also envisioned a new wind coming (“such a wind as never blew before”). This “big data” wind would bring data of great volume, velocity and variety. But it will be “God’s own wind none the less, and a cleaner, better, stronger land will lie in the sunshine when the storm has cleared”.

And in that cleaner, better and stronger tomorrow of big data analytics we might finally know for sure that it was indeed Jaikishan who composed that memorable song in Brahmachari.

— Cepstrum analysis was used in the investigation of the Airbus A320 crash at Bangalore airport on February 14, 1990. But that’s another story for another time.
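For readers wondering what a cepstrum actually computes: it is the inverse Fourier transform of the logarithm of a signal’s magnitude spectrum. The toy sketch below is nowhere near real voice comparison; it only shows the basic trick on a synthetic signal (a click followed by a fainter copy 20 samples later), where the delay shows up as a peak in the cepstrum. The `dft` and `real_cepstrum` helpers are written for this illustration and use a slow, naive transform.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform; fine for short toy signals."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            v * cmath.exp(sign * 2j * math.pi * k * t / n)
            for t, v in enumerate(values)
        )
        out.append(total / n if inverse else total)
    return out


def real_cepstrum(signal, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of `signal`."""
    spectrum = dft(signal)
    # The floor guards against log(0) for empty frequency bins.
    log_mag = [math.log(max(abs(x), floor)) for x in spectrum]
    return [c.real for c in dft(log_mag, inverse=True)]


# Synthetic signal: a click plus a half-strength copy 20 samples later.
n, delay = 128, 20
signal = [0.0] * n
signal[0] = 1.0
signal[delay] = 0.5

cep = real_cepstrum(signal)
# Search the first half of the cepstrum, skipping index 0.
peak = max(range(1, n // 2), key=lambda q: cep[q])
print(peak)
```

The largest value lands at index 20, the delay we planted. Telling two singers apart calls for much more than this, but the same idea of pulling hidden periodic structure out of a spectrum is what makes the technique useful.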

18 thoughts on “The dog did nothing at night”

  1. Wow! This is an amazing piece! Loved reading every bit, and the way you have intertwined the two realms of data and entertainment. Makes things so compelling! Hindi film music has thrown open a huge opportunity for a variety of analytics. The standard and mix-and-match combinations to suit changing audience needs make for thrilling experimentation.

  2. wow…. you turned a dull boring topic into a wonderful easy to understand article… brilliant.. you truly are the BIG BROTHER of genius bhogle jr.

  3. This piece is absolutely awesome. Very imaginative, and you have used a very relevant and compelling example of deducing the composers and artists of various Bollywood compositions from available knowledge. The best part is that the focus is not just on knowing some fact; it is also about proving it with sound and quick information synthesis.

    On a side note, don’t you think the title of the blog post could have been more Sherlock-Bollywood like?

  4. Interesting!

    Every semester, in my course on risk management, I give the students that quote from Holmes. Of course, my PowerPoint slide says: “Datos Datos Datos….no puedo fabricar ladrillos sin arcilla.” In sixteen years, nobody has so far been able to tell me where it came from.

    But here is the thing about Sir Arthur Conan Doyle. He was extremely naive and gullible in real life. He thought the Cottingley Fairies were real. He also believed in spiritual mumbo jumbo. The point is, in real life, there are a lot more paths one can take than in the confines of a story, where such possibilities are ruled out.

    Right now, for example, I am struggling to understand why two betting markets are giving completely different probabilities of an Obama win next Tuesday and why there is no arbitrage. Theory tells us that it should not happen – two products cannot have two different prices in frictionless markets (and these are as close to frictionless markets as you can get).

    Real life is far more complicated than your Hindi song setup.

    Tapen

  5. That was a superb lesson in database queries, investigation and logical inferences. Using Holmes as a metaphor solving questions on Hindi movie songs was just brilliant. I would like to share it with some friends on a music oriented group in FB.

  6. Very well written. I liked the simplicity in explaining the new buzz word “Big Data” to people who may not be from tech background. Data Data Data – everywhere has become huge because it has outgrown our current ability to process them or cope with them efficiently or store them. The way we have stored data in the past and processed them are not a scalable model and hence a whole bunch of folks are into “Big Data” technology where we want to measure/analyze zettabytes of data in no time. Thanks to Jyoti for sharing this link.

    Thanks & Regards
    Santosh
