Decision Mechanics

Data science and statistics

July 30, 2015 By editor

Prolific R developer Hadley Wickham provided an interesting perspective on data science and statistics in a recent Priceonomics article.

“There are definitely some academic statisticians who just don’t understand why what I do is statistics, but basically I think they are all wrong. What I do is fundamentally statistics. The fact that data science exists as a field is a colossal failure of statistics. To me, that is what statistics is all about. It is gaining insight from data using modelling and visualization. Data munging and manipulation is hard and statistics has just said that’s not our domain.”

This insight is at the heart of why the only way to get good at data science is to do it. Obtaining and preparing data prior to analysis is the bulk of a data scientist’s work. But it’s not a simple concept that you can tie up with a neat bow. It’s a messy, convoluted process involving:

  • trial and error
  • multiple, incompatible tools
  • missing information
  • organizational silos
  • quality issues
  • etc.

It’s very difficult to cover this kind of material in a book chapter or a traditional lecture. It’s like trying to teach automotive maintenance without putting on overalls: it all makes perfect sense until you attempt to change the pistons.

Filed Under: Data analysis, Data science, Decision science

Copyright © 2023 · Decision Mechanics Limited · info@decisionmechanics.com