FiveThirtyEight are sharing the data and code behind some of their articles. It’s a goldmine for those wishing to learn more about data science.
R comes with a range of datasets that can be used when learning the basics or trying out a new approach/package. mtcars and weather are popular choices. However, most of the common datasets are “toy” examples. They are great for practising basic techniques, but are useless when it comes to realistically simulating data science tasks. The […]
Choosing the right type of chart is an essential part of producing an effective data visualization. It’s pointless adding bells and whistles to something that’s fundamentally unsuited to the message you are trying to convey. The Financial Times Visual Journalism team have a Visual Vocabulary tool that helps them choose the correct chart for a […]
Karl Broman and Kara Woo offer some good advice on organizing data in spreadsheets. They advocate confining the use of spreadsheets to data entry and storage—moving calculations and visualizations to other tools. This certainly avoids some of the biggest problems with using spreadsheets. However, spreadsheets don’t enforce any discipline. It’s up to the user to […]
I have a lot of sympathy for the view expressed in the following tweet: “Good CS expert says: Most firms that thinks they want advanced AI/ML really just need linear regression on cleaned-up data.” (robin hanson, @robinhanson, November 28, 2016) Many organizations who dive into machine learning haven’t even started to extract value from […]