
In a world where data keeps growing in volume and variety, the need for scalable, flexible, and high-performance infrastructure has never been greater. Today, we're excited to introduce DuckLake: an integrated data lake and catalog format that sets a new standard for how organizations manage and analyze data at scale.
DuckLake is a bold step forward in data architecture, combining the scalability of data lakes with the performance and structure of data warehouses, all powered by standard SQL. It is built on open principles and designed to give data teams speed, consistency, and room to grow without limits.
What Is DuckLake?
At its core, DuckLake is a SQL-native format that integrates a data lake and a metadata catalog. It brings together the best features of modern data platforms to solve long-standing challenges around performance, consistency, and scale.
DuckLake enables:
- Local compute for fast and efficient processing
- Central consistency through a unified metadata catalog
- Scalable storage that can grow with your data — without vendor lock-in
And because DuckLake is an open standard, it’s accessible, extensible, and future-proof.
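To make the idea concrete, here is a minimal sketch of what that combination looks like from DuckDB: a metadata catalog on one side and a directory of Parquet data files on the other. The file and directory names below are placeholders we chose for illustration, and ATTACH options can vary between releases, so check the DuckLake documentation for the exact syntax.

```sql
-- Minimal sketch (illustrative names and paths).
INSTALL ducklake;
LOAD ducklake;

-- 'metadata.ducklake' is the metadata catalog; table data is written as
-- Parquet files under 'data_files/'.
ATTACH 'ducklake:metadata.ducklake' AS my_lake (DATA_PATH 'data_files/');
USE my_lake;
```

In this setup, the catalog provides the central consistency while the data path provides the scalable storage, mirroring the two halves described above.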
Why DuckLake, and Why Now?
Modern data teams face a trade-off: data lakes offer scale, but often lack structure and reliability; data warehouses offer performance, but are expensive and inflexible. DuckLake removes this compromise by delivering:
- Performance without limits – Bring compute closer to your data for faster queries
- Consistency you can trust – Centralized metadata ensures accurate, repeatable results
- Simplicity through SQL – No need to learn new paradigms; use the language you already know
- Open and extensible design – Future-friendly architecture with community collaboration in mind
Whether you’re building large-scale analytics pipelines, querying millions of records, or managing multiple data consumers, DuckLake provides a streamlined and unified experience.
Powered by DuckDB
To bring DuckLake to life, we've implemented it as a DuckDB extension, simply named "ducklake". DuckDB is already well-loved for its speed, simplicity, and embedded analytics capabilities. Now, with DuckLake, it becomes even more powerful.
This tight integration makes it easy to spin up your own data lake with catalog support, without relying on massive cloud services or complex infrastructure. It’s local-first, SQL-first, and built for today’s data teams.
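As a rough quickstart, the sketch below shows what spinning up a local DuckLake and querying it might look like; it repeats the setup from the earlier example for completeness. The table and column names are hypothetical, and the exact statements may differ from the current extension, so treat the documentation at ducklake.select as the source of truth.

```sql
-- Hypothetical quickstart: a local DuckLake with catalog support.
INSTALL ducklake;
LOAD ducklake;
ATTACH 'ducklake:metadata.ducklake' AS my_lake (DATA_PATH 'data_files/');
USE my_lake;

-- Create a table and add some rows; the data lands as Parquet files under
-- the data path, and the change is recorded in the metadata catalog.
CREATE TABLE events (id INTEGER, name VARCHAR);
INSERT INTO events VALUES (1, 'signup'), (2, 'login');

-- Query it like any other SQL table.
SELECT count(*) AS event_count FROM events;
```

No separate catalog service or cloud deployment is involved here: everything runs inside a single DuckDB session on your machine.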
Built for the Community
DuckLake is not just a product; it’s a platform for open innovation. As an open standard, it encourages contributions, experimentation, and integrations from across the data ecosystem.
We’re committed to maintaining transparency, documentation, and community involvement as DuckLake evolves — ensuring it remains accessible and beneficial for everyone.
Explore DuckLake
Ready to dive in?
- Read the official announcement blog post to learn more about the vision, design, and technical details.
- Visit the new DuckLake website at ducklake.select to explore documentation, examples, and community resources.
DuckLake isn’t just another data tool — it’s a new foundation for building fast, scalable, and trustworthy data systems. Whether you’re a startup building on a laptop or a large enterprise managing petabytes, DuckLake is designed to meet your needs with elegance and efficiency.
The future of data lakes is here. It’s SQL-native. It’s open. It’s DuckLake.
Follow us for more updates.