MonetDB - The Column-Store Pioneer
Open Source For You|August 2019

Persistence of data is what makes application software usable. Data persistence has come a long way, from simple file-based storage to the latest sophisticated distributed databases. This article explores an efficient database system called MonetDB, the primary feature of which is column-based storage. MonetDB also comes with support for various languages such as Python, Ruby, R and PHP.

Dr K.S. Kuppusamy

Just as the physical world is made up of atoms, the cyber world is made up of data. The effectiveness with which the data is stored and retrieved defines the quality and usability of software applications. Persistence of data in the earlier days was carried out by simply putting the data in flat files. In contrast, modern-day databases are very sophisticated in terms of their architecture, data handling efficiency, etc. The world of database management systems is loaded with plenty of choices. This article explores an interesting option called MonetDB.

Most database management systems are row-major systems. They work on the assumption that all the values of a row are fetched at a time. However, there are instances when we need to perform aggregation operations on the values of a single column. In such circumstances, it is better if the values of each column are stored together, in contiguous blocks. Column-based databases can then read only the columns a query needs, which makes disk access more efficient. Such databases are better suited for online analytical processing (OLAP) workloads, such as data warehouses.
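The difference between the two layouts can be pictured with a small Python sketch. It is only an illustration: the sales records and the function names are hypothetical, and plain Python lists and arrays stand in for the on-disk structures a real database would use.

```python
# Toy comparison of row-major and column-major layouts (illustrative only).
from array import array

# Hypothetical sales records: (order_id, region, amount).
rows = [
    (1, "north", 120.0),
    (2, "south", 75.5),
    (3, "north", 42.25),
    (4, "east", 310.0),
]


def total_amount_row_major(records):
    """Sum the amount field; every full record is visited to reach it."""
    return sum(record[2] for record in records)


# Column-major: the same data, stored as one sequence per column.
columns = {
    "order_id": array("q", (r[0] for r in rows)),
    "region": [r[1] for r in rows],
    "amount": array("d", (r[2] for r in rows)),
}


def total_amount_column_major(cols):
    """Sum the amount column; the other columns are never touched."""
    return sum(cols["amount"])


row_total = total_amount_row_major(rows)
col_total = total_amount_column_major(columns)
print(row_total, col_total)
```

Both functions return the same total. The point of the column layout is that the aggregation scans one compact array instead of stepping through whole records.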

Some of the column-based databases (Figure 1) are listed below:

Apache Kudu

ClickHouse

InfiniDB

MonetDB

Apache Druid

Metakit

Though there are many similarities between the above options, each of them has some unique features.

MonetDB was developed by a team at Centrum Wiskunde & Informatica (CWI) in the Netherlands. Initially, it was called Monet, after the famous French painter Claude Monet. Later, it was renamed MonetDB. It is written in C and supports various platforms. The latest stable release was in April 2019. The main features (Figure 2) of MonetDB are listed below:

Column store database kernel

High-performance system

Multi-core parallel execution

Support for different query languages (this is achieved through its own algebraic language, called MonetDB Assembly Language or MAL)

Extensible database system

Support for a broad palette of application domains with the integration of external libraries such as PCRE, Raptor, libxml, Geos, etc.

Open-source

MonetDB has a three-layer architecture. The top layer provides the SQL interface, the middle layer holds the optimizers for MonetDB Assembly Language (MAL), and the bottom layer is the database kernel, which provides access to the Binary Association Tables (BATs).
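To get a feel for how a query can be evaluated one column at a time, here is a toy Python sketch. It assumes that a BAT can be pictured as a single column of values addressed by row position. The names `select_greater` and `project` and the sample data are hypothetical, and the code is in no way MonetDB's kernel or MAL.

```python
# Toy sketch of column-at-a-time query evaluation (not MonetDB's kernel).
# Each "BAT" is modelled as a Python list; the list index plays the role
# of the row identifier (OID). All names and data here are hypothetical.

name_bat = ["ann", "bob", "cy", "dee"]
age_bat = [31, 17, 45, 22]


def select_greater(bat, threshold):
    """Return the OIDs of the entries whose value exceeds the threshold."""
    return [oid for oid, value in enumerate(bat) if value > threshold]


def project(oids, bat):
    """Fetch the values stored in a column at the given OIDs."""
    return [bat[oid] for oid in oids]


# Roughly: SELECT name FROM people WHERE age > 21
candidates = select_greater(age_bat, 21)
adult_names = project(candidates, name_bat)
print(candidates, adult_names)
```

The filter reads only the age column and produces a list of matching row positions. The name column is consulted only for those positions.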

Installation

Detailed installation instructions for various platforms are available on the official site at https://www.monetdb.org/Downloads. Though 32-bit and 64-bit Windows installers are provided, the official documentation recommends the 64-bit Linux binaries for scenarios in which tables are expected to grow beyond 2GB of disk space. If you wish to compile from source, follow the instructions at https://www.monetdb.org/Developers/SourceCompile.
