Data science requires data. Yep. Insightful. Unless you work at a data-rich organization, data can be hard to obtain. You may want to try out a new technique or tool. Alternatively, you may need additional data to fuse with your own limited in-house data. In either case, Nathan Yau's updated list of public data sources might help. He lists sources for the following types of … [Read more...]
RStudio 1.0 released
RStudio have released version 1.0 of their eponymous R IDE. They are calling it their "...biggest [release] ever!" It certainly has a number of very significant features. Integrated support for Spark: Spark and R are core tools for data scientists. While Spark has an R API, support for the machine learning libraries is lagging. So, it's great to hear that RStudio now has integrated support … [Read more...]
R at Microsoft
David Smith, R Community Lead at Microsoft, talks about how they are using R. He covers both how it is being integrated into the product line and how it is used internally to analyse operational data. … [Read more...]
Machine learning algorithm cheat sheet
The recent explosion of interest in machine learning has resulted in a profusion of algorithms. It can be difficult to know which one is most suited to your problem. Recognizing this challenge, Microsoft have produced a machine learning algorithm cheat sheet. It's designed to allow you to choose between the algorithms available in Microsoft's Azure Machine Learning Studio, but, as many of the … [Read more...]
ScaleR package now available as part of free Microsoft R Client
The ScaleR package provides functions for performing scalable, extremely high-performance data management, analysis, and visualization in R. Until now, it was only available to those who had a Microsoft R Server license. With the introduction of the free Microsoft R Client for Windows tool, you can now work with the full set of ScaleR functions without having to part with a cent. Of course, … [Read more...]