Spark 1.5.0 has now been released, and it is a significant release for the data science community. Databricks, in their announcement blog post, state: "Another major theme of this release is data science: Spark 1.5 ships several new machine learning algorithms and utilities, and extends Spark's new R API." Improvements of note include better coverage for the pipeline API and an MLlib API for …
Apple steps up recruitment of machine learning experts
Apple is looking to recruit another 86 artificial intelligence experts, according to an article on VentureBeat. The recruitment drive is driven by concerns that the company is falling behind Google, Amazon, Facebook and Microsoft in the area of machine learning. Competitors seem to be stealing a march on Apple by developing services that can anticipate users' requirements, services that rely on …
A walk through a Spark Random Forest
Learning Tree International have just published one of my articles on using Random Forest models with Spark. …
A visual introduction to machine learning
Interesting first post in a planned series that uses visualization to explain machine learning. …
Data science and statistics
Prolific R developer Hadley Wickham provided an interesting perspective on data science and statistics in a recent Priceonomics article: "There are definitely some academic statisticians who just don't understand why what I do is statistics, but basically I think they are all wrong. What I do is fundamentally statistics. The fact that data science exists as a field is a colossal failure of …"