Apache Iceberg and Trino: Powering Data Lakehouse Architecture
Open Source For You
December 2025
Apache Iceberg is a cornerstone of any open data lakehouse, providing the transactional foundation upon which highly scalable and flexible analytics can flourish. Along with Trino, it can be used to build a robust, scalable, and high-performance data lakehouse.
Over the past ten years, the emergence of Big Data has transformed how organisations store and process their data. Traditional data warehouses offer performance and reliability but lack flexibility and cost-effectiveness. Data lakes, by contrast, offer scale and affordability but struggle with governance, schema enforcement, and performance.
Enter the data lakehouse, a new data architecture that combines the scale-out storage of data lakes with the transactional and governance features of data warehouses. By running SQL workloads natively on object storage, with capabilities such as ACID transactions, schema evolution, and time-travel queries, the lakehouse offers a single platform for BI, data science, and real-time analytics.
What is a data lakehouse?
A data lakehouse combines the strengths of data lakes and data warehouses, bridging the gap between scalable, flexible storage and dependable, transactional analytics. It provides an integrated platform where organisations can manage the entire data lifecycle, from ingestion and processing to analytics and machine learning.
Traditional data lakes are designed for raw, bulk storage but lack the capabilities needed for enterprise-level analytics, including schema enforcement, data versioning, and ACID guarantees. Data warehouses have these features but are expensive, inflexible, and usually tied to proprietary vendors.
The lakehouse resolves this trade-off by bringing warehouse-like capabilities to open file formats (such as Parquet and ORC), while keeping the scalability and economics of object storage systems (such as S3, HDFS, or GCS).
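One way to picture how warehouse-like guarantees can sit on top of immutable files is a log of table snapshots. The sketch below is a toy, in-memory illustration only: the `ToyTable` and `Snapshot` classes and the file names are hypothetical, and it does not reproduce how any real table format stores its metadata. It assumes that each commit publishes a new snapshot (a schema plus a list of data files) and that older snapshots stay readable, which is enough to show atomic commits, schema evolution, and time travel in miniature.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """An immutable view of the table: a schema plus the data files it covers."""
    snapshot_id: int
    schema: tuple[str, ...]
    files: tuple[str, ...]


@dataclass
class ToyTable:
    """Hypothetical in-memory stand-in for a table's snapshot log."""
    snapshots: list[Snapshot] = field(default_factory=list)

    def current(self) -> Snapshot:
        return self.snapshots[-1]

    def commit(self, expected_id: int, schema: list[str], files: list[str]) -> Snapshot:
        # Optimistic concurrency: the commit succeeds only if no other
        # writer has published a snapshot since this writer read the table.
        latest_id = self.snapshots[-1].snapshot_id if self.snapshots else 0
        if expected_id != latest_id:
            raise RuntimeError("conflict: table changed, retry the commit")
        snap = Snapshot(latest_id + 1, tuple(schema), tuple(files))
        self.snapshots.append(snap)  # readers see all of the change or none of it
        return snap

    def as_of(self, snapshot_id: int) -> Snapshot:
        # Time travel: earlier snapshots remain readable because data
        # files are never modified in place.
        for snap in self.snapshots:
            if snap.snapshot_id == snapshot_id:
                return snap
        raise KeyError(snapshot_id)


table = ToyTable()
s1 = table.commit(0, ["id", "amount"], ["part-0001.parquet"])
# Schema evolution: the second snapshot adds a column and a new data file.
s2 = table.commit(
    s1.snapshot_id,
    ["id", "amount", "currency"],
    ["part-0001.parquet", "part-0002.parquet"],
)
old_view = table.as_of(1)
```

In this sketch, a writer holding a stale snapshot ID gets a conflict error instead of silently overwriting newer data, while a reader asking for snapshot 1 still sees the original two-column schema.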
Apache Iceberg: Modern table format for the lakehouse
This story is from the December 2025 issue of Open Source For You.
