As objects around the world are increasingly connected to the Internet of Things (IoT), more and more attention is being drawn to cyber-physical systems (CPS). CPSs are systems that collect diverse information from IoT devices and sensors in the real world (physical space) and analyze it in real time in the virtual world (cyberspace) using technologies such as large-scale data processing. The information and value created in the virtual world are then returned to the real world to stimulate industry and help solve social problems. In this three-part series, we will explain the data platform technologies essential for creating CPSs and the database management system (DBMS) at the core of the data platform.
Part 1 looked at the background that led to the creation of NoSQL rather than RDBMS, and GridDB, which is Toshiba’s specialized database for big data and IoT systems. Among GridDB’s core features, we also introduced the key container data model as GridDB’s unique data model. In Part 2, we illustrated the features of the data platform technologies used by the constantly evolving GridDB and actual examples of the contributions GridDB has made to the utilization of petabyte-level manufacturing data. In Part 3, the final part of this series, we will discuss what has led to the need for the cloud-based Database as a Service (DBaaS), GridDB Cloud (GridDB’s DBaaS), and GridDB’s open source software (OSS) activities.
What prompted the need for DBaaS?
In the past, the most common database usage format was the on-premise deployment, in which servers and database management systems (DBMSs) were installed on-site and managed by the users themselves. With this format, users needed to handle the management of devices and various other elements of the system in-house. Subsequent technical innovations led to advances such as virtualization and high-speed internet connectivity, and from the late 2000s, cloud computing services became more widespread. With these services, service providers manage the IT devices and software, and users access them remotely via the internet to use servers and applications.
As the use of cloud computing services grew, there also arose a need for cloud services known as Database as a Service (DBaaS) for DBMSs. DBaaS use has grown tremendously over the past few years, and the DBaaS market is expected to surpass the on-premise DBMS market*.
* In the DBMS market, DBaaS and on-premise market shares are roughly equivalent. Microsoft has the largest market share, and Amazon (AWS) has surpassed Oracle to take the number 2 spot, according to research by Gartner (https://www.itmedia.co.jp/news/articles/2204/18/news063.html).
Let’s look at the differences between an on-premise DBMS and DBaaS (Fig. 1).
DBaaS offers various benefits over an on-premise DBMS. However, DBaaS cannot be used in all cases. For example, there are situations where servers must be located on user premises due to legal requirements or to ensure security. Although DBaaS is a highly effective option for many situations, it is important to select the appropriate DBMS based on an analysis of the objectives and requirements of the system with which it will be used.
GridDB Cloud for smooth data operation and management
In order to meet these rising expectations for DBaaS, Toshiba has developed GridDB Cloud. This is a managed service in which Toshiba’s own GridDB database management system is operated on Microsoft’s cloud service, Azure, making operation management even easier.
In addition to the technical strengths of GridDB, GridDB Cloud also has the following features.
- Can be deployed faster as design and construction are no longer required
- A wide array of data visualization functions
- Improved operational efficiency
- Can be easily integrated with OSS and Azure services
- Easy to add resources when data or processing volume increases
Let’s look at a few of these in greater detail.
GridDB Cloud has extensive graphical user interface (GUI) functions for visualizing data -- functions not available in previous versions of GridDB. These make it easy for application developers and database operators to determine the state of data collection and search for specific data. This helps with application debugging and the detection of missing data. For example, creating and displaying a graph of a year’s worth of daily aggregated data obtained from multiple sensors on GridDB Cloud’s operation screen is very simple (Fig. 2).
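To make the idea of daily aggregated data from multiple sensors concrete, the following is a minimal sketch in plain Python. It is not GridDB Cloud's implementation and does not use any GridDB API; the function name `daily_averages`, the sensor names, and the sample readings are all hypothetical, and the average is just one possible aggregate.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_averages(readings):
    """Group (sensor_id, ISO timestamp, value) readings by sensor and calendar day.

    Returns a dict mapping (sensor_id, date) to the mean value for that day.
    """
    buckets = defaultdict(list)
    for sensor_id, timestamp, value in readings:
        day = datetime.fromisoformat(timestamp).date()
        buckets[(sensor_id, day)].append(value)
    return {key: mean(values) for key, values in sorted(buckets.items())}


# Hypothetical sample data: two sensors, readings spread over two days.
sample_readings = [
    ("sensor_a", "2023-12-01T00:00:00", 20.0),
    ("sensor_a", "2023-12-01T12:00:00", 22.0),
    ("sensor_a", "2023-12-02T00:00:00", 19.0),
    ("sensor_b", "2023-12-01T06:00:00", 30.0),
    ("sensor_b", "2023-12-02T06:00:00", 31.0),
    ("sensor_b", "2023-12-02T18:00:00", 33.0),
]

# One row per sensor per day; this is the shape of data a daily graph would plot.
for (sensor_id, day), average in daily_averages(sample_readings).items():
    print(sensor_id, day.isoformat(), average)
```

Each output row corresponds to one point on a per-sensor daily graph. A year of data would simply produce up to 365 such rows per sensor.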
Furthermore, since GridDB Cloud is available on Azure, it can be integrated with Azure services such as Azure Functions and Azure IoT Hub, as well as Microsoft’s Power BI Service. In addition, GridDB Cloud can be integrated with OSS tools like Fluentd, for collecting log data, or Grafana, for analyzing and visualizing collected data (Fig. 3).
There are diverse GridDB Cloud plans to meet different needs and applications. The Standard, Professional, and Enterprise plans encompass a variety of CPU, memory, and solid state drive (SSD) storage options, and both single-node and three-node configurations are available. Users can select the options that best fit their own use cases (Fig. 4).
After starting GridDB Cloud, the configurations can be flexibly adjusted to accommodate users’ evolving needs. For example, in cases where resource augmentation becomes necessary due to increased frequency of database access or accumulating data volume, users can promptly address this by adding nodes or storage through available options.
GridDB Cloud is a cloud service that offers the DBaaS benefits of cost reductions, scalability, availability, and security, together with the features of an IoT database, such as high reliability, performance, and expandability.
GridDB’s OSS activities
Every year, interest in OSS rises. Let’s look at the example of GitHub, where developers can save or release their program code and design data. According to GitHub’s annual report, “The state of the Octoverse 2023,” released in November 2023, GitHub was used by over 100 million developers in 2023 (an increase of 26% compared to the previous year). Over 2.8 million of these developers are located in Japan, up 31% from the previous year. One of the issues that companies grapple with is bringing in new program development personnel, a need that GitHub is perfectly situated to assist with. In recent years, more and more OSS development projects are being carried out using GitHub, and the amount of investment in these projects is on the rise. 30% of the companies on the Fortune 100* have now established Open Source Program Offices (OSPOs) responsible for OSS management and strategy formulation. The number of companies using OSPOs to create effective strategies for leveraging open source software will continue to rise.
* Fortune 100: The top 100 companies in the Fortune 500, a ranked list of the top 500 American companies in terms of net sales, published each year by U.S.-based magazine Fortune.
A similar transformation is happening with DBMSs. According to DB-Engines, a site that ranks DBMS popularity, OSS DBMSs have been more popular than commercial DBMSs since 2021. This is because with OSS, DBMSs can be tested before actually building the entire system, making it easy to select the best-suited programs and services.
Toshiba has been offering GridDB since 2013, and benchmark tests*1 have confirmed its high level of performance as a NoSQL database. GridDB serves as the backbone for storing sensor data, and is being used in many IoT/Big Data systems. In 2016, we began open-sourcing GridDB, leveraging our past experience to foster more widespread use of big data technology. GridDB source code is available on GitHub*2 under the AGPL-3.0 open source license.
* 1: See page 17 of this presentation for the results of the benchmark testing.
* 2: GridDB source code is available from the GridDB repository on GitHub.
As of December 2023, GridDB has received more than 2,200 stars (the GitHub repository version of a “Like”) and almost 5,000 forks (repository clones). It has a powerful and active community. In addition to the database server source code, the repository includes 31 other modules. These additional modules provide connectors for connecting GridDB to open source software such as Kafka and Fluentd, drivers for various programming languages, such as Node.js and Rust, and more. This multifaceted approach has created a seamless ecosystem of diverse database technologies and software, with GridDB at its heart.
To bring GridDB to even more users, in addition to GridDB Cloud, Toshiba also offers GridDB Community Edition (GridDB CE) and GridDB Enterprise Edition (GridDB EE) (see the comparison of GridDB CE and GridDB EE). GridDB CE supports multiple operating systems and can be used for development in a variety of languages. Anyone can get started right away simply by downloading and installing it. It has been well received for both its high level of performance and its flexibility, and has already been downloaded hundreds of thousands of times by users around the world. The other edition, GridDB EE, offers data distribution functions that make it possible to continue using databases in the event of a data center accident or disaster, as well as functions for time-series data aggregation, missing data interpolation, and nanosecond-level processing. With products and support that are quality assured and professionally managed, companies can use GridDB with confidence in the mission-critical systems that support their business.
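As an illustration of what missing data interpolation means for time-series data, the sketch below estimates a value at a timestamp where no reading was recorded, using linear interpolation between the two neighboring readings. This is a plain-Python sketch under stated assumptions, not GridDB EE's implementation or API; the function name `interpolate_at` and the sample readings are hypothetical, and linear interpolation is only one possible method.

```python
from datetime import datetime


def interpolate_at(samples, target):
    """Estimate the value at `target` from time-ordered (datetime, value) samples.

    Returns the recorded value if a sample exists exactly at `target`,
    a linearly interpolated value if `target` falls between two samples,
    and None if `target` lies outside the sampled range.
    """
    for timestamp, value in samples:
        if timestamp == target:
            return value
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 < target < t1:
            fraction = (target - t0) / (t1 - t0)  # timedelta / timedelta gives a float
            return v0 + fraction * (v1 - v0)
    return None


# Hypothetical readings at 10-minute intervals; the 10:10 reading is missing.
samples = [
    (datetime(2023, 12, 1, 10, 0), 10.0),
    (datetime(2023, 12, 1, 10, 20), 16.0),
    (datetime(2023, 12, 1, 10, 30), 18.0),
]

estimate = interpolate_at(samples, datetime(2023, 12, 1, 10, 10))
print(estimate)
```

Here the missing 10:10 reading lies halfway between the 10:00 and 10:20 readings, so the estimate is halfway between their values.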
GridDB can be installed easily using Linux package managers such as Yellowdog Updater, Modified (YUM) or Advanced Package Tool (APT). Docker images for running GridDB on Docker* are also available on Docker Hub. For GridDB client applications, pre-compiled binaries are distributed through public repositories such as the Maven Central Repository for the GridDB JDBC driver and PyPI (the Python Package Index) for the GridDB Python library.
* Docker: A software platform that uses application containers for easy configuration, testing, and deployment. Docker images are template files that provide container operation environments.
We believe that the application developer community will be the heart that drives the promotion of GridDB usage. To provide developers with wide-ranging knowledge about GridDB, we have partnered with a third party to establish and operate the GridDB Developers website (Japanese/English). The GridDB Developers website offers technical blogs that are updated every week, providing information and insights to deepen knowledge about GridDB from various perspectives. In addition to these blogs, we also publish numerous technical papers covering topics such as performance comparisons of different time-series databases, architectural analyses of various NoSQL databases, and practical guides for building solutions using GridDB.
We noted that in a Stack Overflow Developer Survey conducted in 2023, over 75% of respondents said that they acquired their coding skills using online resources such as technical documentation, blogs, and how-to videos. That is why we offer technical documentation for GridDB users in both Japanese and English, and we have created a GridDB YouTube channel for users who prefer to learn through video. This channel contains both practical videos and an archive of past online seminars. Our online on-demand video courses are already being used by over 1,000 people in 85 countries. We will continue to provide support to ensure that GridDB is effectively utilized by a wide range of users.
Through this three-part series, we have discussed what led to the appearance of NoSQL and GridDB, the technical features and applications of GridDB, the provision of GridDB as a cloud service, and OSS utilization. Moving forward, we are committed to accelerating the evolution of GridDB through open innovation, aiming to contribute to society by providing a foundation that supports data utilization in business and society. We hope you will look forward to the future developments of GridDB.
* We actively welcome feedback from the GridDB community. Please share your comments through the GridDB GitHub page. For general inquiries, please use the product page inquiry form.
CHIBA Kazuki
Specialist
Software Development Dept. Group 2
Software Systems Research and Development Center
Toshiba Digital Solutions Corporation
Since joining Toshiba, CHIBA Kazuki has been involved in the research and development of IoT platforms. He is now involved in the development of cloud services for Toshiba’s GridDB database.
FUJITA Shinichi
Expert
Software Development Dept. Group 2
Software Systems Research and Development Center
Toshiba Digital Solutions Corporation
Upon joining Toshiba, FUJITA Shinichi was involved in the development of knowledge-related software such as Knowledge Meister, after which he became involved in the research and development of IoT platforms. He now develops GridDB cloud services.
SUHERMAN Angga
Specialist
Data Business Promotion Dept. New Business Development Group
ICT Solutions Division
Toshiba Digital Solutions Corporation
Since joining Toshiba, SUHERMAN Angga has been involved in GridDB (IoT Database) product planning.
- The corporate names, organization names, job titles and other names and titles appearing in this article are those as of January 2024.
- GridDB is a registered trademark of Toshiba Digital Solutions Corporation in Japan.
- All other company names or product names mentioned in this article may be trademarks or registered trademarks of their respective companies.