High Performance Analytics Improves Productivity for Busy Research Institution

Revolution R Enterprise with Microsoft HPC Server 2008 Outperforms University’s Alternatives

Background

Erik Segur, Information Technologist, runs a mission-critical IT environment for Michigan State University’s Department of Statistics and Probability. Hundreds of researchers, including professors and graduate students, rely on the infrastructure he manages to process their analytics work. Revolution R Enterprise 5.0 delivered two important improvements for the department’s Statistical Computing Cluster compared with previous versions of Revolution R Enterprise and the Open Source distribution of R: automated scheduling of jobs on the cluster, and better performance.

Challenge

Automated Job Scheduling Increases Throughput of the Statistical Computing Cluster Fourfold

Prior to using Revolution R Enterprise 5.0, analytics jobs had to be scheduled manually on the Microsoft HPC platform, so Erik had to check the status of the Statistical Computing Cluster constantly to make sure everything was still running and to start a new job whenever the system became available. To complicate matters, the system did not properly close a stopped or killed session, and there was no way to know how long a job would take before running it. Manual scheduling also caused downtime: if a job ended shortly after Erik left for the day, a new job would not be started until the following day. Further, professors are commonly unavailable until the afternoon due to teaching commitments, so this powerful, valuable computing resource could remain idle for nearly 24 hours.

The number of jobs run on the Statistical Computing Cluster has increased fourfold thanks to the improvements delivered with Revolution R Enterprise 5.0. “We were able to run more than 800 jobs in a semester using Revolution R Enterprise 5.0, yet without it we would have only been able to complete about 200 in the same amount of time. This is awesome. People are very happy.” Erik also said that he now logs into the system on a weekly basis rather than multiple times a day.

Solution

Revolution R Enterprise with Microsoft HPC Server 2008 Outperforms University’s Alternatives

The performance of the Statistical Computing Cluster powered by Revolution R Enterprise 5.0 also stands in stark contrast to the other computing resources available to researchers at the university, which include single-core Open Source R running in the university’s HPCC. “There’s a tremendous benefit to running the analysis in a multi-core, multi-threaded environment. I worked with a researcher who had a model that took 3.5 months to run on the HPCC but with our Statistical Computing Cluster and Revolution R Enterprise, we got it down to just over a week. This is phenomenal because our users are able to produce more data and refine their findings because they have so much more time to run many more iterations of analysis.”

Results

Revolution Analytics’ R Productivity Environment (RPE) Improves Researchers’ Productivity

Many of the university’s researchers had been using Open Source R running on the HPCC and on their personal desktops. Once they learned about the performance improvements provided by the Department of Statistics and Probability’s Statistical Computing Cluster, demand for it grew. Erik ran a few training classes to show the researchers how they could easily use the Revolution Analytics RPE to convert their R code to run on the Revolution R Enterprise 5.0 cluster. As a result, the researchers now make far fewer requests for help and are able to do the work on their own using the Revolution Analytics RPE.

The Bottom Line

Erik reports that demand for Statistical Computing Cluster resources has boomed. “The Revolution R Enterprise 5.0 environment has delivered order of magnitude performance improvements, which has allowed our department to process four times the amount of analytics jobs. The researchers are now able to run, evaluate, modify and re-run their models multiple times to get more precise conclusions. It’s been amazing.”

About Company

About Revolution Analytics

Revolution Analytics was founded in 2007 to foster the R community, as well as support the growing needs of commercial users. Our name derives from combining the letter "R" with the word "evolution." It speaks to the ongoing development of the R language from an open-source academic research tool into commercial applications for industrial use.

Through our Revolution R products, we aim to make the power of predictive analytics accessible to every type of user and budget. We provide free and premium software and services that bring high performance, productivity and ease of use to R, enabling statisticians and scientists to derive greater meaning from large sets of critical data in record time.

We also offer our full-featured production-grade software to the academic community for FREE, in order to support the continued spread of R's popularity to the next generation of analysts. 

For customers such as Pfizer, Novartis, Yale Cancer Center, Bank of America and others, our flagship Revolution R Enterprise product stands for faster drug development, reduced time of data analysis, and more powerful and efficient financial models.
