Creating a Data-Driven Organization: Practical Advice from the Trenches by Carl Anderson

What do you need to become a data-driven organization? Far more than having big data or a crack team of unicorn data scientists, it requires establishing an effective, deeply ingrained data culture. This practical book shows you how true data-drivenness involves processes that require genuine buy-in across your organization, from analysts and management to the C-Suite and the board. Through interviews and examples from data scientists and analytics leaders in a variety of industries, author Carl Anderson explains the analytics value chain you need to adopt when building predictive business models, from data collection and analysis to the insights and leadership that drive concrete actions. You'll learn what works and what doesn't, and why creating a data-driven culture throughout your organization is essential.

If that web server typically serves a particular geographic region, you may be missing a disproportionate amount of data from a given set of ZIP codes, and that could significantly impact the analysis. A more biased scenario is the following: imagine that you send out a customer survey and give recipients two weeks to respond. Any responses that arrive after the cutoff are excluded from the analysis. Now imagine a set of customers who are suffering from shipping issues and receive their orders late.
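The effect of the two-week cutoff can be sketched with a handful of made-up survey responses. Everything in this example is an assumption for illustration: the response data, the 1-to-5 satisfaction scale, and in particular the premise that customers whose orders ship late both respond later and report lower satisfaction. Under those assumptions, dropping late responses removes exactly the unhappy customers and inflates the average.

```python
from statistics import mean

# Hypothetical responses (invented for illustration):
# (days until the response arrived, satisfaction 1-5, order shipped late?)
responses = [
    (3, 5, False),
    (5, 4, False),
    (8, 5, False),
    (10, 4, False),
    (12, 4, False),
    (16, 2, True),   # late shipment, so the survey is answered late
    (18, 1, True),
    (20, 2, True),
]

CUTOFF_DAYS = 14  # the two-week response window from the text


def mean_satisfaction(rows):
    """Average satisfaction score over a list of response tuples."""
    return mean(score for _, score, _ in rows)


# Responses that arrive after the cutoff are excluded from the analysis.
kept = [r for r in responses if r[0] <= CUTOFF_DAYS]

all_mean = mean_satisfaction(responses)
kept_mean = mean_satisfaction(kept)
dropped_late_shippers = sum(
    1 for days, _, late in responses if days > CUTOFF_DAYS and late
)

print(f"Mean satisfaction, all responses:    {all_mean:.3f}")
print(f"Mean satisfaction, before cutoff:    {kept_mean:.3f}")
print(f"Late-shipment customers excluded:    {dropped_late_shippers}")
```

In this toy data the cutoff is not a neutral filter: the excluded group differs systematically from the included group, which is what makes the missing data biased rather than merely incomplete.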

As you can see, there are a lot of competing considerations that determine what new data source it makes sense to bring into the organization next. There is a delicate balance of the cost and complexity to deliver that new data versus the value that data provides to the analysts and the organization as a whole. For deeper analytics, there is even greater value when you start to link up “adjacent” data items. What do I mean by that?

In many cases, especially in medical and social sciences, data is very expensive to collect, and you might only have one chance to collect it. Imagine collecting blood pressure from a patient on the third day of a clinical trial; you can’t go back and repeat that. A core problem, a catch-22 situation in fact, is that the smaller the sample size, the more precious each record is. However, the less data an imputation algorithm has to work with, the worse its predictions will be. A single missing value within a record can render the whole record useless.
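A small sketch makes the catch-22 concrete. The records below are invented, and the field names (`age`, `systolic`, `diastolic`) are hypothetical. The example first drops every record that has any missing value (often called complete-case analysis), then applies mean imputation as one simple stand-in for an imputation algorithm; it is not presented as the method the book recommends.

```python
# Hypothetical clinical-trial records (invented values); None marks a missing reading.
records = [
    {"age": 54, "systolic": 132, "diastolic": 85},
    {"age": 61, "systolic": None, "diastolic": 90},
    {"age": 47, "systolic": 121, "diastolic": 79},
    {"age": None, "systolic": 140, "diastolic": 92},
    {"age": 58, "systolic": 128, "diastolic": None},
    {"age": 66, "systolic": 150, "diastolic": 95},
]


def is_complete(record):
    """True when no field in the record is missing."""
    return all(value is not None for value in record.values())


# Complete-case analysis: one missing value discards the whole record.
complete = [r for r in records if is_complete(r)]

total_values = sum(len(r) for r in records)
missing_values = sum(v is None for r in records for v in r.values())


def mean_impute(rows, field):
    """Fill missing values of `field` with the mean of the observed values.

    Returns new dicts for filled rows; the input rows are left unmodified.
    """
    observed = [r[field] for r in rows if r[field] is not None]
    fill = sum(observed) / len(observed)
    return [dict(r, **{field: fill}) if r[field] is None else r for r in rows]


imputed = mean_impute(records, "systolic")

print(f"Missing values: {missing_values} of {total_values}")
print(f"Complete records: {len(complete)} of {len(records)}")
print(f"Imputed systolic for record 2: {imputed[1]['systolic']:.1f}")
```

Here only 3 of 18 values are missing, yet half of the records are lost when incomplete records are dropped. The imputed value, in turn, rests on just five observed readings, which illustrates why imputation becomes less reliable as the sample shrinks.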

