Thursday, August 15, 2013

Zen and the Art of Big Data

Zen approach
The Zen approach, aka the holistic approach, can be applied to many problems in life, and motorcycle maintenance is one of them. Yes, I am referring to the famous book by Robert Pirsig. The holistic approach looks at the problem at a higher level, in its entirety. Understanding every part of the problem completely is not its foundation.

Reductionist approach
The Zen approach is the exact opposite of the Reductionist approach, which relies on what is basically the "divide and conquer" technique: break the big problems into smaller ones until they are either completely understood or cannot be divided further. Computer programming in general has mostly been in the reductionist camp. Programs are designed to handle large, complex problems by splitting them into smaller, bite-size chunks. This has been the case since the first computers were designed.
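
To make "divide and conquer" concrete, here is a minimal sketch in Python. Merge sort is my own choice of illustration, not something the argument depends on, and the function name `merge_sort` is just a label for this example. The list is split in half until each piece is trivially understood (zero or one item), and then the solved pieces are combined.

```python
def merge_sort(items):
    """Sort a list by repeatedly splitting it into smaller pieces."""
    # A piece with zero or one item cannot be divided further:
    # it is already "completely understood" (sorted).
    if len(items) <= 1:
        return list(items)

    # Divide: break the big problem into two smaller ones.
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Conquer: combine the two solved halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 9, 1, 5, 6]))
```

Notice that nothing in the code ever looks at the whole list at once; it only ever reasons about halves of halves, which is the reductionist mindset in a nutshell.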

Enter big data. Big data can show you the entire picture, since it does not care so much about the individual sub-components, causality or correlation. It's just data. It seems to be the best shot that scientists in the Holistic camp have had in a long time in the digital age. But is it? I am excluding digital techniques in the life sciences area for one reason only: I do not fully understand the whole world of life sciences, genomics and related fields. :)

To get maximum bang for the big data buck, you need to articulate the problem statement with zero ambiguity: a problem statement that is well defined and can be represented by your big data dictionary. If it is not, you need to refine and redefine the problem. Anything else will not work. So, what is it that we did here? We basically ended up with the reductionist approach. I hate to disappoint the Zen crowd, but for now at least it is the only way to get returns on your big data investment.

There are some attempts in the commercial start-up world that might change this in the future but for now I am firmly in the Reductionist camp.
