RStudio have released version 1.0 of their eponymous R IDE. They are calling it their
"…biggest [release] ever!"
It certainly has a number of very significant features.
Integrated support for Spark
Spark and R are core tools for data scientists. While Spark has its own R API, that API's support for Spark's machine learning libraries is lagging.
So, it’s great to hear that RStudio now has integrated support for Spark and the sparklyr package. sparklyr provides extensive access to Spark’s Machine Learning Library (MLlib) and, through the rsparkling extension package, access to H2O’s distributed machine learning algorithms.
RStudio can be used to manage connections to Spark and run R functions on data held in the cluster. Data is read and transformed using Hadley Wickham’s excellent dplyr data manipulation package.
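To make that workflow concrete, here is a minimal sketch of a sparklyr session. It assumes a local Spark installation is available (sparklyr's spark_install() can download one) and uses the built-in mtcars data set purely for illustration; the table name and the model formula are arbitrary choices, not anything prescribed by RStudio.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally, e.g. via sparklyr::spark_install().
sc <- spark_connect(master = "local")

# Copy a small built-in data set into Spark (illustrative only).
mtcars_tbl <- copy_to(sc, mtcars, "mtcars", overwrite = TRUE)

# dplyr verbs are translated to Spark SQL and run inside Spark;
# collect() brings the (small) result back into R.
mpg_by_cyl <- mtcars_tbl %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE)) %>%
  arrange(cyl) %>%
  collect()

# Fit a linear regression with Spark MLlib through sparklyr.
fit <- ml_linear_regression(mtcars_tbl, mpg ~ wt + cyl)

spark_disconnect(sc)
```

The same connection would show up in the IDE's Spark pane, where the tables copied into the cluster can be browsed.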
R Notebooks
R Notebooks allow the creation of documents where computation can be interspersed with narrative. Code can be executed interactively and the document updated accordingly. Readers of an R Notebook can modify the code in place, execute it and see the new output, such as an updated chart. This is a particularly powerful tool for teaching R and data science.
Code profiling
I’ve used the profvis package many times to rescue clients from an analysis tool that takes hours to run. profvis provides an interactive graphical display of where your R code is spending time or eating memory.
This has now been integrated into RStudio, so you can select a block of code, click a menu option and see a visual representation of your code’s performance characteristics.
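The menu option is a front end to the profvis package, so the same profile can be produced from the console. The sketch below is only an illustration: slow_cumsum() is a made-up function that is slow on purpose, because it grows a vector one element at a time.

```r
library(profvis)

# Hypothetical example of slow code: the result vector is re-allocated
# and copied on every iteration.
slow_cumsum <- function(x) {
  out <- c()
  total <- 0
  for (v in x) {
    total <- total + v
    out <- c(out, total)
  }
  out
}

# Profile the block; the returned object is an interactive HTML widget.
p <- profvis({
  x <- runif(2e4)
  result <- slow_cumsum(x)
})

# In an interactive session, print(p) opens the profile viewer.
```

In a profile like this, most of the time and memory allocation should be attributed to the c(out, total) line, which is the kind of hotspot the display is designed to expose.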
What are you waiting for?
RStudio 1.0 is free and available now on Linux, OS X and Windows. Why are you still reading this? Go and download it.