A colleague of mine cautions that performance ratings say more about the marriage of the person doing the assessing than the performance of the person being assessed. Turns out she may have a point. Most people have some experience with performance appraisals. Maybe as part of an annual salary review. Or even just completing a customer satisfaction survey. It's become a pretty ubiquitous process …
Spurious correlations
Most people know that correlation doesn't mean causation. Some people are fed up with hearing it. When there are studies showing that people who have more sex earn more money, you can see why people really want to make the inference. I find that most of the much-maligned link-bait articles reporting fascinating correlations don't actually claim any causality. They leave that to the febrile mind of …
Non-transitive dice
Just took delivery of my non-transitive dice. Adding a bit of fun to my statistics talks. …
The 5 most downloaded R packages
DataCamp have published an article on the five R packages with the most (direct) downloads. This is based on their leaderboard. Packages 3-5 are currently swapping positions. As I write this (18 November 2016), the top five are dplyr, devtools, ggplot2, cluster, and foreign. It's notable that the list of the most popular packages is heavily weighted towards the manipulation and display of data. This is …
Public data sources
Data science requires data. Yep. Insightful. Unless you work at a data-rich organization, data can be hard to obtain. You may want to try out a new technique or tool. Alternatively, you may need additional data to fuse with your own limited in-house data. In either case, Nathan Yau's updated list of public data sources might help. He lists sources for the following types of …