Interactive Big Data Analytics with the Presto + Alluxio stack for the Cloud
As data volumes have exploded and analytics needs have grown, the speed of analytics and the interactivity of queries have become dramatically more important.
In this webinar, we will introduce the Starburst Presto, Alluxio, and cloud object store stack for building a highly concurrent, low-latency analytics platform. This stack provides a strong solution for running fast SQL across multiple storage systems, including HDFS, S3, and others, in public cloud, hybrid cloud, and multi-cloud environments.
You’ll learn about:
- The architecture of Presto, an open source distributed SQL engine, as well as innovations by Starburst such as its cost-based optimizer
- An overview of Alluxio, an open source distributed file system, including its core concepts, architecture, and metadata and data paths
- How Presto can query data from cloud object storage like S3 quickly and cost-effectively with Alluxio
- How to achieve data locality and cross-job caching with Alluxio, no matter where the data is persisted, while reducing egress costs
In addition, we’ll present some real-world architectures and use cases from internet companies like JD.com and NetEase.com running the Presto and Alluxio stack at the scale of hundreds of nodes.
See the on-demand video now.
Speaker: Bin Fan
Bin Fan is the founding engineer of Alluxio, Inc. and a PMC member of the Alluxio open source project. Prior to Alluxio, he worked at Google, where he won the Technical Infrastructure Award. Bin received his Ph.D. in Computer Science from Carnegie Mellon University, where he worked on distributed systems.
VP, Community and Founding Member at Alluxio
Speaker: Matt Fuller
Matt Fuller is responsible for product engineering at Starburst Data, and has spent the past 10 years in the field of data warehousing and analytics. Prior to Starburst Data, Matt was Director of Engineering at Teradata leading engineering teams working on Presto. Prior to joining Teradata, Matt held product architect and tech lead positions at other distributed SQL technology pioneers, Hadapt (acquired by Teradata in 2014) and Vertica (acquired by HP).
Cofounder and VP of Engineering at Starburst
...a data orchestration layer for compute in any cloud. It unifies data silos on-premise and across any cloud to give you data locality, accessibility, and elasticity.
Whether it’s accelerating big data frameworks on the public cloud, running big data workloads in hybrid cloud environments, or enabling big data on object stores or multiple clouds, Alluxio reduces the complexities associated with orchestrating data for today’s big data and AI/ML workloads.