Interactive Big Data Analytics with the Presto + Alluxio stack for the Cloud
As data analytics needs have grown with the explosion of data, the speed of analytics and the interactivity of queries have become dramatically more important.
In this webinar, we will introduce the Starburst Presto, Alluxio, and cloud object store stack for building a highly concurrent, low-latency analytics platform. This stack provides a strong solution for running fast SQL across multiple storage systems, including HDFS, S3, and others, in public cloud, hybrid cloud, and multi cloud environments.
You’ll learn about:
- The architecture of Presto, an open source distributed SQL engine, as well as innovations by Starburst such as its cost-based optimizer
- An overview of Alluxio, an open-source distributed file system, including its core concepts, architecture, and metadata and data paths
- How Presto, with Alluxio, can query data in cloud object storage like S3 quickly and cost-effectively
- How to achieve data locality and cross-job caching with Alluxio, no matter where the data is persisted, and reduce egress costs
In addition, we’ll present some real-world architectures and use cases from internet companies like JD.com and NetEase.com running the Presto and Alluxio stack at the scale of hundreds of nodes.
The live webinar has concluded. Register to receive access to the on-demand video.
Speaker: Bin Fan
Bin Fan is the founding engineer of Alluxio, Inc. and a PMC member of the Alluxio open source project. Prior to Alluxio, he worked at Google, where he won the Technical Infrastructure Award. Bin received his Ph.D. in Computer Science from Carnegie Mellon University, working on distributed systems.
VP, Community and Founding Member at Alluxio
Speaker: Matt Fuller
Matt Fuller is responsible for product engineering at Starburst Data, and has spent the past 10 years in the field of data warehousing and analytics. Prior to Starburst Data, Matt was Director of Engineering at Teradata leading engineering teams working on Presto. Prior to joining Teradata, Matt held product architect and tech lead positions at other distributed SQL technology pioneers, Hadapt (acquired by Teradata in 2014) and Vertica (acquired by HP).
Cofounder and VP of Engineering at Starburst
Alluxio is an open-source virtual distributed file system that provides a unified data access layer for hybrid and multi cloud deployments.
Alluxio resides between storage systems such as Amazon S3 or Apache HDFS and computation frameworks and applications such as Apache Spark or Presto.
With Alluxio, your data is centralized and applications have a single common interface and namespace for data access.