Alice Kaerast from the Sky Betting & Gaming Engineering team has produced a comprehensive list of big data news sources, including blogs, podcasts, and newsletters. There were a couple of podcasts in it that I've added to my listening list. High praise indeed. I wish more sources would make their content available via RSS feeds. I really don't want news and articles in my inbox. If I can't add … [Read more...]
Scaling knowledge
The data team at Airbnb have written an interesting article on how to manage data science research as you bring more and more people on board. They developed an internal process and tool based on five key tenets. Reproducibility: There should be no opportunity for code forks. The entire set of queries, transforms, visualizations, and write-up should be contained in each contribution and be up … [Read more...]
Idiosyncratic Rater Effect
A colleague of mine cautions that performance ratings say more about the marriage of the person doing the assessing than the performance of the person being assessed. Turns out she may have a point. Most people have some experience with performance appraisals. Maybe as part of an annual salary review. Or even just completing a customer satisfaction survey. It's become a pretty ubiquitous process … [Read more...]
Spurious correlations
Most people know that correlation doesn't mean causation. Some people are fed up of hearing it. When there are studies showing that people who have more sex earn more money, you can see why people really want to make the inference. I find that most of the much-maligned link bait articles reporting fascinating correlations don't actually claim any causality. They leave that to the febrile mind of … [Read more...]
Non-transitive dice
Just took delivery of my non-transitive dice. Adding a bit of fun to my statistics talks. … [Read more...]