IBM just published an article describing what they claim to be the top three spreadsheet errors of this decade, so far. Spreadsheet errors caused: the sale price of Tibco Software to be overstated by $100m; fatal errors in an oft-quoted fiscal austerity research project; and 10,000 sports fans to miss a synchronised swimming event at the 2012 Olympics. Some companies I work with have started taking … [Read more...]
Spark 1.5.0 released
Spark 1.5.0 has now been released, and it's a significant one for the data science community. Databricks, in their announcement blog post, state: "Another major theme of this release is data science: Spark 1.5 ships several new machine learning algorithms and utilities, and extends Spark's new R API." Improvements of note include better coverage for the pipeline API and an MLlib API for … [Read more...]
Apple steps up recruitment of machine learning experts
Apple is looking to recruit another 86 artificial intelligence experts, according to an article on VentureBeat. The recruitment drive is due to concerns that they are falling behind Google, Amazon, Facebook and Microsoft in the area of machine learning. Competitors seem to be stealing a march on Apple by developing services that can anticipate users' requirements, services that rely on … [Read more...]
A walk through a Spark Random Forest
Learning Tree International have just published one of my articles on using Random Forest models with Spark. … [Read more...]
A visual introduction to machine learning
Interesting first post in a planned series that uses visualization to explain machine learning. … [Read more...]