Big data is big news. Companies like Amazon, Google and major supermarkets are delivering new services and competitive advantage through analysing their massive datasets. The Economist reports that 30% of Amazon’s sales are through its “you may also like” recommendations.
Organizations everywhere want to make similar use of their own data. Articles and conferences on "predictive analytics" and "data science" are popping up everywhere. Even the New York Times has been promoting careers in statistics as "cool".
But are we learning the right lessons from the successes of Amazon et al.? Should decision scientists be focusing their efforts on helping organizations make sense of the data they have?
Massive datasets are a byproduct of something the showcase "big data" companies do that is, arguably, more important: they crowdsource data in real time. Both crowdsourcing and real-time data collection are valuable. Together they are dynamite.
Making one small team within the organization responsible for collecting and "cleaning" the "official" data limits the volume of data that can be collected and increases the possibility of bias. Crowdsourcing mitigates those problems—and is cheaper.
The quality of data decays over time. Different industries experience different decay rates, but basing decisions on old data risks missing fundamental changes. Obviously, lagging data is almost useless when responding to a crisis. Real-time data collection means decisions can take the immediate situation into account.
Before you ask how you can draw insights from your existing databases, it may be advantageous to ask how you can build higher-quality databases in the first place.
Tools to assist decision-makers increasingly draw on existing data and combine it with the decision-makers’ assumptions and beliefs to predict outcomes and suggest action. These assumptions and beliefs are often discarded once the decision has been made, yet they are real-time insights from the front line. Capturing and storing them would allow decision-makers to tap into the current views of their peers, and to monitor shifts in these views over time.
In addition to designing decision-making tools to produce insights, we also need to design them to collect insights. The latter activity may be the real innovation.
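To make the idea of a tool that collects insights a little more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any existing product: the names `Assumption`, `AssumptionLog`, `current_view` and `shift` are hypothetical, each belief is assumed to reduce to a single number, and the "peer view" is taken to be a simple average of each contributor's most recent entry.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass(frozen=True)
class Assumption:
    """One belief recorded at decision time (all field names are hypothetical)."""
    contributor: str
    topic: str
    value: float          # e.g. expected sales growth, as a fraction
    recorded_at: datetime


class AssumptionLog:
    """Append-only store: assumptions are kept after the decision, not discarded."""

    def __init__(self):
        self._records = []

    def record(self, contributor, topic, value, recorded_at):
        self._records.append(Assumption(contributor, topic, value, recorded_at))

    def current_view(self, topic, as_of):
        """Mean of each contributor's latest value for `topic` at or before `as_of`."""
        latest = {}
        for r in sorted(self._records, key=lambda r: r.recorded_at):
            if r.topic == topic and r.recorded_at <= as_of:
                latest[r.contributor] = r.value  # later entries overwrite earlier ones
        return mean(list(latest.values())) if latest else None

    def shift(self, topic, start, end):
        """Change in the peer view of `topic` between two points in time."""
        before = self.current_view(topic, start)
        after = self.current_view(topic, end)
        if before is None or after is None:
            return None
        return after - before


# Hypothetical usage: two colleagues record a growth assumption, one later revises it.
log = AssumptionLog()
log.record("alice", "q3_growth", 0.05, datetime(2024, 1, 10))
log.record("bob", "q3_growth", 0.03, datetime(2024, 1, 12))
log.record("alice", "q3_growth", 0.01, datetime(2024, 2, 20))

january_view = log.current_view("q3_growth", datetime(2024, 1, 31))
change = log.shift("q3_growth", datetime(2024, 1, 31), datetime(2024, 2, 28))
```

The log is append-only on purpose: a revised assumption is stored as a new record rather than overwriting the old one, which is what makes it possible to look back at how views have shifted. A real tool would need richer belief types than a single number, and probably a weighting scheme other than a plain average.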
Privacy
Of course, there are potential privacy implications to be considered in crowdsourcing data. However, collecting data about an organization (as opposed to individuals), in the course of paid employment, and with full disclosure, raises few privacy issues. It is similar to writing and publishing a business report.