I recently conducted an inter-rater reliability study for a client. There was some confusion about what this measures. Inter-rater reliability measures agreement. It's a measure of precision, not accuracy. As anyone who's been on social media knows, it's possible for everyone to be in complete agreement, yet utterly wrong. The following diagram summarises the difference between precision and … [Read more...]
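To illustrate the point that agreement is not the same as correctness, here is a minimal sketch in Python. It assumes Cohen's kappa as the agreement statistic, which the excerpt does not name, and the ratings and "truth" labels are invented purely for illustration. Two raters agree perfectly with each other, yet both disagree with the reference labels on every item.

```python
import math
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two raters labelling the same items (assumed statistic)."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and the same length")
    n = len(a)
    # Proportion of items on which the two raters gave the same label.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return math.nan  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


def accuracy(ratings, truth):
    """Proportion of ratings that match the reference labels."""
    return sum(r == t for r, t in zip(ratings, truth)) / len(truth)


# Hypothetical data: both raters give identical labels, all of them wrong.
truth = ["yes", "no", "yes", "no", "yes", "no"]
rater_1 = ["no", "yes", "no", "yes", "no", "yes"]
rater_2 = ["no", "yes", "no", "yes", "no", "yes"]

kappa = cohen_kappa(rater_1, rater_2)
acc = accuracy(rater_1, truth)
print(f"kappa = {kappa:.2f}, accuracy = {acc:.2f}")
```

In this made-up case the agreement statistic is at its maximum while accuracy is zero: the raters are precise but not accurate.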
It’s not because we have insufficient data…
In 1998, Neil Postman wrote critically about the Age of Information. If there are children starving in the world—and there are—it is not because of insufficient information. [...] If there is violence on our streets, it is not because we have insufficient information. If women are abused, if divorce and pornography and mental illness are increasing, none of it has anything to do with insufficient … [Read more...]
Large Language Models
Stephen Wolfram has written a comprehensive description of how Large Language Models (LLMs), such as ChatGPT, work. … [Read more...]
Spreadsheet disasters
Matt Parker describes what happens when spreadsheets go wrong in the BBC's "More or Less: Behind the Stats" podcast. I've been writing about this for over a decade and, sadly, little progress has been made. The cartoon is from xkcd, obviously. … [Read more...]
10 ways to mislead with data visualization
PolicyViz have published a useful article on errors you really need to avoid when creating data visualizations. … [Read more...]