Open Source Project Livy Eases Spark Consumption for Application Developers and Data Scientists with Fine-Grained Job Submission and Result Retrieval over a Simple REST-Based Interface, Enabling New Use Cases and Architectures.
SPARK SUMMIT, SAN FRANCISCO, Calif., June 6, 2016 – Cloudera, the global provider of the fastest, easiest, and most secure data management and analytics platform built on Apache Hadoop and the latest open source technologies, today announced it has collaborated with Microsoft to reduce the burden on application developers leveraging Spark. Cloudera and Microsoft, together with other open source contributors, have built a new open source, Apache-licensed, REST-based Spark service called Livy, which is still in early alpha development.
Livy provides an easy way for applications to interface with Spark, submit jobs and programmatically retrieve results. At its core, Livy is a REST server for submitting, running, and managing Spark jobs and contexts. Its Client API enables fine-grained Spark job submission and retrieval of results, either synchronously or asynchronously. Clients can consume Spark as a multi-tenant service with session isolation, security and user impersonation, without having to worry about deployment, configuration or monitoring.
Livy’s key benefits include:
- Reduced Friction in Spark Consumption - Each client of Spark need not go through a Spark installation or configuration process to get started. Only a lightweight client that talks to an HTTP endpoint is needed.
- Enabling Third-Party Applications to Use Spark - Applications can build with REST-based client APIs in Java, Scala and Python for fine-grained Spark job submission, result retrieval and management of SparkContexts (the Scala and Python client APIs are under development). Spark can be invoked by applications written in diverse frameworks like Django for Python, Play for Scala or Java. Moreover, because it is REST-based, with a little work, you can also leverage Livy from applications written in languages like Node.js or Go.
- Enabling New Architectures - Livy makes it easy to integrate Spark into service-oriented or microservices-based architectures, which primarily interact through REST.
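As a rough illustration of the "lightweight client that talks to an HTTP endpoint" described above, the following Python sketch builds the kinds of requests such a client might send. It is a minimal sketch under stated assumptions, not Livy's official client: the base URL, the `/sessions` and `/statements` paths, and the JSON payload shapes are not given in this announcement and may differ in the alpha release, and every function name is hypothetical.

```python
import json
import urllib.request

# Hypothetical base URL; the host and port are assumptions for illustration.
LIVY_URL = "http://localhost:8998"


def build_request(base_url, path, payload=None):
    """Build a JSON HTTP request: POST if a payload is given, else GET."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="GET" if payload is None else "POST",
    )


def call(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def session_request(base_url):
    # Assumed endpoint: ask the server to start a Spark session.
    return build_request(base_url, "/sessions", {"kind": "spark"})


def statement_request(base_url, session_id, code):
    # Assumed endpoint: submit a code snippet to run in that session.
    return build_request(
        base_url, f"/sessions/{session_id}/statements", {"code": code}
    )


def result_request(base_url, session_id, statement_id):
    # Assumed endpoint: fetch (or poll for) the statement's result.
    return build_request(
        base_url, f"/sessions/{session_id}/statements/{statement_id}"
    )
```

Against a running server, a client would pass each request to `call(...)`, for example `call(session_request(LIVY_URL))`, and poll `result_request` until the result is ready. Only the Python standard library is needed on the client side; no Spark installation is involved.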
“Microsoft is focused on simplifying big data and advanced analytics to make technologies like Apache Hadoop and Spark available for everybody,” said Tiffany Wissner, senior director of Data Platform Marketing at Microsoft. “The collaboration on Project Livy was able to make interacting with Spark easier for developers through a REST web service and able to make Spark enterprise-ready as a robust back-end for running interactive notebooks.”
“Spark gives you fast big data processing with a general purpose flexible API. We see a natural tendency among our customers and partners to want to leverage Spark’s capabilities from client applications that can easily interface with Spark, and Livy makes that possible,” said Anand Iyer, senior product manager at Cloudera. “Livy will open Spark to new use cases, and we are hoping it attracts a community of developers that will not only build applications on top of Livy, but also contribute to it, help shape its API and enhance its functionality. It is still a very nascent project, and hence any contribution will have tremendous impact.”
Contact Us and Learn More at Spark Summit West
Cloudera will be attending Spark Summit West 2016 from June 6–8 at the Hilton San Francisco Union Square. Additionally, Cloudera will be presenting at the show.
- Cloudera and Microsoft to present Livy at Spark Summit:
  - Date and Time: Tuesday, June 7 from 4:50 – 5:20 PM PT
  - Location: Imperial Room
  - Session: LIVY: A REST WEB SERVICE FOR APACHE SPARK
  - Presenters: Anand Iyer (Cloudera) and Pravin Mittal (Microsoft Corporation)
- Cloudera to present a Developer session at Spark Summit:
  - Date and Time: Tuesday, June 7 from 4:15 – 4:45 PM PT
  - Location: Ballroom B
  - Session: HIGH-PERFORMANCE PYTHON ON SPARK
  - Presenter: Wes McKinney (Cloudera)
- Doug Cutting, Cloudera’s chief architect and co-founder of Hadoop, to deliver a keynote:
  - Date and Time: Wednesday, June 8 from 9:30 – 9:40 AM PT
  - Location: Ballroom A
Cloudera delivers the modern data management and analytics platform built on Apache Hadoop and the latest open source technologies. The world’s leading organizations trust Cloudera to help solve their most challenging business problems with Cloudera Enterprise, the fastest, easiest and most secure data platform available for the modern world. Our customers efficiently capture, store, process and analyze vast amounts of data, empowering them to use advanced analytics to drive business decisions quickly, flexibly and at lower cost than has been possible before. To ensure our customers are successful, we offer comprehensive support, training and professional services. Learn more at http://cloudera.com.
Connect with Cloudera
About Cloudera: cloudera.com/about
Follow us on Twitter: twitter.com/cloudera
Visit us on Facebook: facebook.com/cloudera
Join the Cloudera Community: community.cloudera.com
Cloudera, Cloudera's Platform for Big Data, Cloudera Enterprise Data Hub Edition, Cloudera Enterprise Flex Edition, Cloudera Enterprise Basic Edition, Cloudera Navigator Optimizer and CDH are trademarks or registered trademarks of Cloudera Inc. in the United States, and in jurisdictions throughout the world. All other company and product names may be trademarks of their respective owners.