
Introducing our dedicated Jupyter cluster available to all Vanderbilt students, faculty and staff

Posted on Thursday, January 30, 2020 in Website.

We are launching our dedicated Jupyter cluster, a big data service that is available to all Vanderbilt students, faculty, and staff, as well as VUMC employees, at no cost. You do not need to be a user of our traditional cluster to access the Jupyter cluster; all you need to do is sign in with your VUnetID and password. As ACCRE works extensively with the Compact Muon Solenoid (CMS) experiment at the CERN Large Hadron Collider, we are providing access to members of the CMS experiment outside of Vanderbilt as well.

The Jupyter cluster is separate from our general compute cluster: it has its own storage and compute resources, which are dedicated to Spark and HDFS workloads.

In 2015, ACCRE received a TIPs grant, “A Trans-institutional Big Data Architecture at Vanderbilt,” to develop a data-centric infrastructure and culture at Vanderbilt. As part of the grant, we opened our first big data cluster to the public two years later. Since that launch, we have seen Jupyter notebooks take off as a data analysis platform for academic research and data science. Notebooks let programmers combine code, documentation, analysis, and visualization in a single document that is accessible from a web interface and therefore easy to share with colleagues. Our new Jupyter cluster extends our big data mission by allowing anyone to create a Jupyter notebook to analyze their data and share it with others.

The Jupyter cluster is designed for research and teaching. Nearly 500 unique users at Vanderbilt and CMS have already tried the service. Many classes have used the Jupyter cluster for their coursework, including physics classes taught by Prof. Will Johns and big data classes taught by Profs. Daniel Fabbri and Abhishek Dubey. Within CMS, it has been used for large-scale tutorials ranging from one hour to an entire week. The Jupyter cluster has also been used extensively for physics research, both at Vanderbilt and CMS, as well as for cancer biology research: Akshitkumar Mistry and his lab have been using the cluster for over a year.

Which cluster is right for me?

If you don’t have an ACCRE account, if you are working with students who have a VUnetID but no ACCRE account, or if you need to use Spark or HDFS, use our dedicated Jupyter cluster.

Keep in mind that the Jupyter cluster cannot access files you have on the traditional cluster (including your /home directory), nor can it access Lmod software packages or submit jobs. If you have an ACCRE account and need access to data or software on the cluster, use the traditional cluster instead. For ACCRE users who wish to use a Jupyter notebook and must access files or programs on the traditional cluster, we provide an option through the ACCRE Visualization Portal.

Learn more

Visit the Jupyter at ACCRE website for more information on how to get started.

The Jupyter cluster is supported through a TIPs grant led by Prof. Paul Sheldon. It is run by Andrew Melo, a big data application developer at ACCRE, who is based at Fermi National Accelerator Laboratory (“Fermilab”) in Batavia, IL.