Decision Mechanics

Insight. Applied.

Machine learning at scale on HDInsight using Microsoft R Server on Spark

August 13, 2016 By editor

Microsoft have published an article on how to conduct a decision tree analysis using Microsoft R Server and Spark on Azure HDInsight.

Using four D4 worker nodes (8 cores and 28 GB of RAM each), they were able to process 170 million rows (37 GB) in around 5 minutes. This was 20% faster than using Spark’s own MLlib library, although there’s no comparison with Spark’s newer ML libraries.
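To put those figures in perspective, here is a rough back-of-envelope calculation of the implied throughput. It is only a sketch: it assumes the run took exactly 5 minutes, that a GB is 1,000 MB, and that all 32 worker cores were busy for the whole run, none of which is stated in the article.

```python
# Back-of-envelope throughput from the figures quoted above.
# Assumptions (not from the article): exactly 5 minutes of wall-clock time,
# 1 GB = 1,000 MB, and every worker core in use for the whole run.
ROWS = 170_000_000       # rows processed
DATA_GB = 37             # dataset size in GB
MINUTES = 5              # "around 5 minutes"
WORKERS = 4              # D4 worker nodes
CORES_PER_WORKER = 8     # cores per D4 node

seconds = MINUTES * 60
total_cores = WORKERS * CORES_PER_WORKER

rows_per_second = ROWS / seconds
mb_per_second = DATA_GB * 1000 / seconds
rows_per_core_per_second = rows_per_second / total_cores

print(f"{rows_per_second:,.0f} rows/s across the cluster")   # 566,667
print(f"{mb_per_second:.0f} MB/s across the cluster")        # 123
print(f"{rows_per_core_per_second:,.0f} rows/s per core")    # 17,708
```

In other words, the cluster was working through roughly half a million rows a second, or a little under 18,000 rows a second per core.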

Copyright © 2025 · Decision Mechanics Limited · info@decisionmechanics.com