In a recent interview, Cameron Davies, head of corporate decision sciences for NBCUniversal, discussed the importance of data science being led by the needs of the business. He also discussed "data culture" and the tendency for data scientists to focus on the technology while losing sight of the wider context. Davies points out that decision-makers want to know: How does working with you, your …
Python in the Economist
The Economist has an article devoted to the popularity of the Python programming language. Citibank, Bain & Company and Boston Consulting Group are using it for their data science work. As Guido van Rossum, the inventor of Python, reportedly said: Python has become the language of choice for AI researchers [...] …
Microsoft Research Open Data
Microsoft Research have released over 50 free data sets via their Open Data site. They include 38 million tweets from the 2012 US presidential election and profiles of 1 million celebrities (1000 with images). Are there actually 1 million celebrities now?! Maybe someone can analyze the data to confirm. As in all data science, we'll need to start by firming up our definitions of terms, e.g. …
FiveThirtyEight data
FiveThirtyEight are sharing the data and code behind some of their articles. A goldmine for those wishing to learn more about data science. …
Real-world datasets for learning data science in R
R comes with a range of datasets that can be used when learning the basics or trying out a new approach or package. mtcars and weather are popular choices. However, most of the common datasets are "toy" examples. They are great for practising basic techniques, but of little use for realistically simulating data science tasks. The dslabs package provides datasets more suited to exploring …