The Revamped GridDB 5.0 Features a Pluggable Data Store Architecture

~ New architecture enables the use of multiple data models ~

April 22, 2022
Toshiba Digital Solutions Corporation

KAWASAKI―Toshiba Digital Solutions Corporation (Toshiba) today announced the general availability of GridDB 5.0 Enterprise Edition, a purpose-built database for IoT and Big Data workloads with a new architecture that can handle multiple data models.

In the past, GridDB introduced the Event-Driven Engine^*1 and the Autonomous Data Distribution Algorithm (ADDA)^*2 to support Big Data and IoT systems that require a fast, scalable, and reliable data store. In recent years, however, IoT data and its uses have diversified immensely, creating a need to handle different types of data with different data models. The current solutions are to use multiple Database Management Systems (DBMSs) or to store all data in a single data model, both of which introduce problems such as system complexity, high operational costs, and reduced performance.

GridDB 5.0 Enterprise Edition comes with a revamped architecture that features a pluggable data store, allowing multiple data models to be managed in a single DBMS. In addition to the current data store, which is optimized for high-frequency, high-volume data ingestion, other data stores can be incorporated, such as ones for executing complex analyses at high speed or for storing text such as logs.

Recently, in addition to the value IoT systems provide by storing and visualizing large amounts of sensor data, there has been a growing need to use that data in complex analyses to gain new business insights. Storing large volumes of high-frequency data and performing complex analyses at high speed are conflicting requirements for a DBMS.

The pluggable data store function makes it possible to integrate multiple data stores, each optimized for a specific workload, into a single DBMS. Rather than using multiple DBMSs, integrated processing can be executed in a single DBMS, thus avoiding increased system complexity and higher construction and operational costs.
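The idea of plugging several workload-specific stores into one DBMS can be illustrated with a minimal Python sketch. It is not GridDB's API or internal design: the names `DataStore`, `RowStore`, `ColumnStore`, `Database`, `plug_in` and `create_table` are all hypothetical, and the two toy stores merely stand in for an ingestion-optimized store and an analysis-optimized store.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class DataStore(ABC):
    """Hypothetical interface that every pluggable data store implements."""

    @abstractmethod
    def put(self, table: str, row: dict) -> None: ...

    @abstractmethod
    def scan(self, table: str) -> List[dict]: ...


class RowStore(DataStore):
    """Append-oriented toy store, standing in for an ingestion-optimized store."""

    def __init__(self) -> None:
        self._rows: Dict[str, List[dict]] = {}

    def put(self, table: str, row: dict) -> None:
        self._rows.setdefault(table, []).append(dict(row))

    def scan(self, table: str) -> List[dict]:
        return list(self._rows.get(table, []))


class ColumnStore(DataStore):
    """Column-oriented toy store, standing in for an analysis-optimized store.

    Assumes every row of a table has the same keys.
    """

    def __init__(self) -> None:
        self._cols: Dict[str, Dict[str, list]] = {}

    def put(self, table: str, row: dict) -> None:
        cols = self._cols.setdefault(table, {})
        for key, value in row.items():
            cols.setdefault(key, []).append(value)

    def scan(self, table: str) -> List[dict]:
        cols = self._cols.get(table, {})
        names = list(cols)
        return [dict(zip(names, values)) for values in zip(*cols.values())]


class Database:
    """Single entry point that routes each table to the store it was created on."""

    def __init__(self) -> None:
        self._stores: Dict[str, DataStore] = {}
        self._table_store: Dict[str, str] = {}

    def plug_in(self, name: str, store: DataStore) -> None:
        self._stores[name] = store

    def create_table(self, table: str, store_name: str) -> None:
        if store_name not in self._stores:
            raise KeyError(f"no data store named {store_name!r}")
        self._table_store[table] = store_name

    def put(self, table: str, row: dict) -> None:
        self._stores[self._table_store[table]].put(table, row)

    def scan(self, table: str) -> List[dict]:
        return self._stores[self._table_store[table]].scan(table)


# Usage: one database, two stores, each table placed on the store that suits it.
db = Database()
db.plug_in("row", RowStore())
db.plug_in("column", ColumnStore())
db.create_table("sensor_raw", "row")
db.create_table("sensor_daily", "column")
db.put("sensor_raw", {"ts": 1, "temp": 20.5})
db.put("sensor_daily", {"day": "2022-04-21", "avg": 20.1})
db.put("sensor_daily", {"day": "2022-04-22", "avg": 21.3})
```

In this sketch the application talks only to `Database`, so adding a further store later (for example, one for text) would mean one more `plug_in` call rather than a second DBMS to operate.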

A data store that can perform complex analyses at high speed and another optimized for storing text data will be released in stages in the future.

In GridDB 5.0, the unique checkpoint^*3 algorithm based on HCAL (Highly Efficient Checkpoint Algorithm) reduces the amount of log data written to files during a checkpoint and lowers the disk I/O load. This improves database performance by reducing the load on systems that frequently ingest and update data.

In addition, unique blocks can now be assigned to each table to speed up table scans^*4 and table deletion. This is particularly effective for data analysis queries, where table scans are used frequently. Table deletion performance also improves because the blocks belonging to the table to be deleted can be identified directly.
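Why per-table blocks help a scan can be seen in a toy Python model. This is a simplified sketch, not GridDB's storage format: the block size, the layouts and all function names are assumptions made for illustration. It counts how many blocks must be read to scan one table when rows from two tables share blocks, versus when each table has its own blocks.

```python
from collections import defaultdict

BLOCK_SIZE = 4  # rows per block (hypothetical value for the illustration)


def shared_layout(rows):
    """Pack rows from all tables into blocks in arrival order."""
    return [rows[i:i + BLOCK_SIZE] for i in range(0, len(rows), BLOCK_SIZE)]


def per_table_layout(rows):
    """Give each table its own list of blocks."""
    by_table = defaultdict(list)
    for table, value in rows:
        by_table[table].append((table, value))
    return {
        table: [trows[i:i + BLOCK_SIZE] for i in range(0, len(trows), BLOCK_SIZE)]
        for table, trows in by_table.items()
    }


def blocks_read_shared(blocks, table):
    """A scan must read every block that holds at least one row of the table."""
    return sum(1 for block in blocks if any(t == table for t, _ in block))


def blocks_read_per_table(layout, table):
    """A scan reads only the blocks assigned to the table."""
    return len(layout.get(table, []))


# Two tables, "a" and "b", receive 8 rows each in interleaved order.
rows = [(table, i) for i in range(8) for table in ("a", "b")]

shared = shared_layout(rows)
dedicated = per_table_layout(rows)

shared_reads = blocks_read_shared(shared, "a")
dedicated_reads = blocks_read_per_table(dedicated, "a")

# Deleting table "a" in the per-table layout drops its blocks as a unit
# and leaves the blocks of table "b" untouched.
remaining = {t: blocks for t, blocks in dedicated.items() if t != "a"}
```

With interleaved ingestion, every shared block in this model contains rows of both tables, so a scan of one table touches all of them, while the per-table layout confines the scan (and the deletion) to that table's own blocks.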

According to a database benchmark test (TPC-H^*5), these performance enhancements have resulted in improvements ranging from 17% to 46% (26% on average).

Development of GridDB and GridDB Cloud will continue in support of digital transformation and cyber-physical systems that use IoT and Big Data.

About GridDB
Toshiba developed GridDB entirely in-house, drawing on its extensive expertise in diverse industry verticals. GridDB is designed to accumulate massive volumes of time series data efficiently and to deliver scale-out performance. It features time series data management, petabyte-scale performance, excellent scalability, unwavering reliability, and a developer-friendly API, all essential characteristics of an IoT and Big Data database.

GridDB Product Site
http://www.griddb.com/
GridDB Open Source Site
https://github.com/griddb/
GridDB Developers Site
https://griddb.net/

[Note]

*1　Event-Driven Engine: A technology that reduces database overhead by minimizing the resources required for asynchronous data processing and by eliminating exclusive control (locking) of memory and disk access.

*2　Autonomous Data Distribution Algorithm (ADDA): An algorithm that automatically distributes data across multiple GridDB servers to balance the load.

*3　Checkpoint: The process of writing database changes to a file residing on an external storage device.

*4　Table scan: The process of searching for specific rows according to the search condition specified in the SQL statement. It takes time because it examines the rows one by one.

*5　TPC-H: A popular benchmark for comparing performance of database systems.

GridDB is a registered trademark of Toshiba Digital Solutions Corporation in Japan.