Apache Polaris: The Definitive Guide: Enriching Apache Iceberg Data Lakehouses with an Open Source Catalog 1st Edition
Revolutionize your understanding of modern data management with Apache Polaris (incubating), the open source catalog designed for Apache Iceberg, the industry-standard table format for data lakehouses. This comprehensive guide takes you on a journey through the intricacies of Apache Iceberg data lakehouses, highlighting the pivotal role of Iceberg catalogs.
Authors Alex Merced, Andrew Madson, and Tomer Shiran explore Apache Polaris's architecture and features in detail, equipping you with the knowledge needed to leverage its full potential. Data engineers, data architects, data scientists, and data analysts will learn how to seamlessly integrate Apache Polaris with popular data tools like Apache Spark, Snowflake, and Dremio to enhance data management capabilities, optimize workflows, and secure datasets.
- Get a comprehensive introduction to Iceberg data lakehouses
- Understand how catalogs facilitate efficient data management and querying in Iceberg
- Explore Apache Polaris's unique architecture and its powerful features
- Deploy Apache Polaris locally, and deploy managed Apache Polaris from Snowflake and Dremio
- Perform basic table operations on Apache Spark, Snowflake, and Dremio
- ISBN-13: 979-8341608146
- Edition: 1st
- Publisher: O'Reilly Media
- Publication date: October 21, 2025
- Language: English
- Dimensions: 7 x 2 x 9.19 inches
- Print length: 258 pages
From the brand
Sharing the knowledge of experts
O'Reilly's mission is to change the world by sharing the knowledge of innovators. For over 40 years, we've inspired companies and individuals to do new things (and do them better) by providing the skills and understanding that are necessary for success.
Our customers are hungry to build the innovations that propel the world forward. And we help them do just that.
From the Publisher
From the Preface
Welcome to Apache Polaris: The Definitive Guide. This book is designed to guide you through the journey of building and managing scalable, secure, and flexible data lakehouses with Apache Polaris, an innovative, community-driven catalog project. As data lakehouses continue to evolve, Polaris represents the next generation of catalog solutions, offering unified data management, role-based access control, and multi-catalog support, all while promoting open standards and interoperability across cloud and on-premise environments.
The story of Apache Polaris begins with the data lakehouse architecture and the critical role that Apache Iceberg plays in making data lakehouses performant, reliable, and accessible. In the first part of this book, we’ll dive deep into the origins and architecture of data lakehouses, explore the challenges they were designed to solve, and walk through the capabilities that Apache Iceberg brings to modern data lakes. As data becomes increasingly central to all aspects of business operations, Iceberg’s robust table format has emerged as an essential tool for managing data at scale, providing key features like ACID transactions, schema evolution, and efficient querying. We’ll also look at how Iceberg catalogs were originally developed to bring this table format to life, allowing data lakehouses to become more accessible and consistent.
But even with the power of Iceberg, the need for a new generation of catalogs has grown clearer. Chapter 2 introduces the diverse world of Iceberg catalogs, highlighting their advantages and the challenges that come with having multiple catalog options. From file-based catalogs to service-driven solutions, you’ll see how each catalog provides distinct features but also introduces complexity, especially when deployed across diverse environments and data tools. This leads us to the Apache Iceberg REST Catalog Specification, which was developed to streamline client interactions across catalog implementations, making cross-language support and integration with managed services simpler and more consistent.
The foundation of Polaris builds on this REST specification, taking it further by tackling some of the most pressing challenges in data management today. In Part II, we’ll explore Apache Polaris as a new kind of Iceberg catalog. Polaris brings a multi-catalog architecture, enabling organizations to maintain multiple catalogs with distinct roles and access controls, ensuring that each catalog serves its specific purpose while being centrally governed. Additionally, Polaris allows users to connect external catalogs that support the REST Spec, creating a unified environment where Iceberg tables are discoverable across catalog systems. In this part, you’ll gain a deeper understanding of Polaris’s security model, including role-based access control (RBAC), and learn best practices for managing permissions at scale. We’ll also delve into Git-for-Data, a unique ecosystem feature that allows for versioned data operations, branching, and tagging—powerful capabilities that make data versioning as straightforward as software versioning.
In Part III, we take a hands-on approach to working with Polaris, starting with deployment and configuration in Chapter 6. Here, you’ll learn how to set up Polaris locally, manage multiple catalogs, configure access roles, and integrate security controls. The following chapters provide practical guides on using Polaris with popular data tools, including Apache Spark, Snowflake, and Dremio. These chapters will walk you through setting up connections, executing queries, managing data, and utilizing each tool’s unique capabilities, demonstrating how Polaris can serve as the backbone of a robust, tool-agnostic data lakehouse environment.
Keep in mind that Apache Polaris, like any technology, will evolve. As things change, we will aim to reflect those updates in the book’s companion GitHub repository.
By the end of this book, you’ll be well-equipped to leverage the full power of Apache Polaris in your data lakehouse architecture. You’ll understand the theory and architecture behind catalogs and the practical steps needed to deploy Polaris as a central, scalable, and secure solution for data management. Whether you’re a data engineer, architect, or analyst, Apache Polaris: The Definitive Guide will provide the insights and tools you need to take your data lakehouse to the next level.
Editorial Reviews
About the Author
Andrew Madson is a data leader with 17 years of experience leading technical teams. Currently the Head of Evangelism and Education at Tobiko (the creators of SQLMesh and SQLGlot), Andrew has held senior leadership positions at institutions such as JP Morgan, LPL Financial, MassMutual, and Arizona State University. In addition to leading data teams, Andrew is a professor of data science and analytics at several universities, where he teaches graduate courses in machine learning, statistics, SQL, R, Python, Tableau, and Power BI.
Tomer Shiran is the Founder and Chief Product Officer of Dremio, an open data lakehouse platform that enables companies to run analytics in the cloud without the cost, complexity and lock-in of data warehouses. As the company's founding CEO, Tomer built a world-class organization that has raised over $400M and now serves hundreds of the world's largest enterprises, including 3 of the Fortune 5. Prior to Dremio, Tomer was the 4th employee and VP Product of MapR, a Big Data analytics pioneer. He also held numerous product management and engineering roles at Microsoft and IBM Research, founded several websites that have served millions of users and hundreds of thousands of paying customers, and is a successful author and presenter on a wide range of industry topics. He holds an MS in Computer Engineering from Carnegie Mellon University and a BS in Computer Science from Technion - Israel Institute of Technology.
Product details
- ASIN: B0FBRJ7J1Y
- Publisher: O'Reilly Media
- Publication date: October 21, 2025
- Edition: 1st
- Language: English
- Print length: 258 pages
- ISBN-13: 979-8341608146
- Item Weight: 1.01 pounds
- Dimensions: 7 x 2 x 9.19 inches
- Best Sellers Rank: #1,014,144 in Books
- #151 in Data Warehousing (Books)
- #290 in Data Modeling & Design (Books)
- #394 in Data Processing