Databricks have just published a free e-book entitled "Apache Spark Analytics Made Simple". Contents include: an introduction to the Spark API for analytics; tips and tricks to simplify unified data access; and real-world case studies of how various companies are using Spark with Databricks to transform their business. There are more to come. Titles are Mastering Advanced Analytics with Apache … [Read more...]
17000 UK male pregnancies reported in 2012
A 2012 study of National Health Service data in the UK found that there were over 17000 male inpatient admissions to obstetric services, over 8000 male inpatient admissions to gynecology, and nearly 20000 male inpatient admissions to midwifery. Before jumping to the conclusion that the UK is at the forefront of an exciting/disturbing evolutionary trend we should probably look for a simpler … [Read more...]
Overview of Microsoft R Server
Learning Tree just published my overview of Microsoft R Server. … [Read more...]
The cost of bad data
Lemonly, a data visualization company, have produced an intriguing infographic on the cost of bad data. Read the full blog post. … [Read more...]
What Nathan Yau uses to visualize data
Nathan Yau of FlowingData has published an up-to-date list of what he uses to turn raw data into his impressive visualizations. What's always striking about the tools he uses is that there's no "quick fix". He uses a range of industry-standard data manipulation (e.g. R) and graphic design (e.g. Adobe Illustrator) tools in his work. I particularly liked his comments on processing and … [Read more...]