Monday, 29 September 2014

Collaborative Mathematics with SageMathCloud and Google Cloud Platform



(cross-posted on the Google for Education blog and Google Cloud Platform blog)

Modern mathematics research is distinguished by its openness. The notion of "mathematical truth" depends on theorems being published with proof, letting the reader understand how new results build on the old, all the way down to basic mathematical axioms and definitions. These new results become tools to aid further progress.

Nowadays, many of these tools come either in the form of software or theorems whose proofs are supported by software. If new tools produce unexpected results, researchers must be able to collaborate and investigate how those results came about. Trusting software tools means being able to inspect and modify their source code. Moreover, open source tools can be modified and extended when research veers in new directions.

In an attempt to create an open source tool to satisfy these requirements, University of Washington Professor William Stein built SageMathCloud (or SMC). SMC is a robust, low-latency web application for collaboratively editing mathematical documents and code. This makes SMC a viable platform for mathematics research, as well as a powerful tool for teaching any mathematically-oriented course. SMC is built on top of standard open-source tools, including Python, LaTeX, and R. In 2013, William received a 2013 Google Research Award which provided Google Cloud Platform credits for SMC development. This allowed William to extend SMC to use Google Compute Engine as a hosting platform, achieving better scalability and global availability.
SMC allows users to interactively explore 3D graphics with only a browser
SMC has its roots in 2005, when William started the Sage project in an attempt to create a viable free and open source alternative to existing closed-source mathematical software. Rather than starting from scratch, Sage was built by making the best existing open-source mathematical software work together transparently and filling in any gaps in functionality.

During the first few years, Sage grew to have about 75K active users, while the developer community matured with well over 100 contributors to each new Sage release and about 500 developers contributing peer-reviewed code.

Inspired by Google Docs, William and his students built the first web-based interface to Sage in 2006, called The Sage Notebook. However, The Sage Notebook was designed for a small number of users and would work for a small group (such as a single class), but soon became difficult to maintain for larger groups, let alone the whole web.

As the growth of new users for Sage began to stall in 2010, due largely to installation complexity, William turned his attention to finding ways to expand Sage's availability to a broader audience. Based on his experience teaching his own courses with Sage, and feedback from others doing the same, William began building a new Web-hosted version of Sage that can scale to the next generation of users.

The result is SageMathCloud, a highly distributed multi-datacenter application that creates a viable way to do computational mathematics collaboratively online. SMC uses a wide variety of open source tools, from languages (CoffeeScript, node.js, and Python) to infrastructure-level components (especially Cassandra, ZFS, and bup) and a number of in-browser toolkits (such as CodeMirror and three.js).

Latency is critical for collaborative tools: like an online video game, everything in SMC is interactive. The initial versions of SMC were hosted at UW, at which point the distance between Seattle and far away continents was a significant issue, even for the fastest networks. The global coverage of Google Cloud Platform provides a low-latency connection to SMC users around the world that is both fast and stable. It's not uncommon for long-running research computations to last days, or even weeks -- and here the robustness of Google Compute Engine, with machines live-migrating during maintenance, is crucial. Without it, researchers would often face multiple restarts and delays, or would invest in engineering around the problem, taking time away from the core research.

SMC sees use across a number of areas, especially:

  • Teaching: any course with a programming or math software component, where you want all your students to be able to use that component without dealing with the installation pain. Also, SMC allows students to easily share files, and even work together in realtime. There are dozens of courses using SMC right now.
  • Collaborative Research: all co-authors of a paper can work together in an SMC project, both writing the paper there and doing research-level computations.

Since launching SMC in May 2013, there are already more than 20,000 monthly active users who've started using Sage via SMC. We look forward to seeing if SMC has an impact on the number of active users of Sage, and are excited to learn about the collaborative research and teaching that it makes possible.

No comments:

Post a Comment