DataCamp have published an article on the five R packages with the most (direct) downloads. This is based on their leaderboard. Packages 3-5 are currently swapping positions. As I write this (18 November 2016) the top five are dplyr, devtools, ggplot2, cluster and foreign. It's notable that the list of the most popular packages is heavily weighted towards the manipulation and display of data. This is … [Read more...]
2016 Retail Executive Survey highlights big data capability gap
FTI Consulting has published the results of its 2016 Retail Executive Survey. One hundred C-suite executives were asked to rank 15 strategic priorities that they deemed to be "essential" or "high priority". While big data was 14th on this list, it's important to remember that all were considered to be at least "high priority". The story looked different when the executives were asked about their … [Read more...]
Statistics books to read for pleasure
I've read quite a few (really) dry, technical books in my time. But even I was shocked to see an article entitled "Statistics Books to Read for Pleasure". Isn't that just a step too far? However, it reminded me of one excellent book that should be required reading after this month's polling meltdown. "Everydata: The Misinformation Hidden in the Little Data You Consume Every Day" is an excellent, … [Read more...]
Five big data security challenges
Learning Tree have just published my article describing five of the main security challenges facing those who have, or are contemplating, big data deployments. … [Read more...]
Public data sources
Data science requires data. Yep. Insightful. Unless you work at a data-rich organization, data can be hard to obtain. You may want to try out a new technique or tool. Alternatively, you may need additional data to fuse with your own limited in-house data. In either case, Nathan Yau's updated list of public data sources might help. He lists sources for the following types of … [Read more...]