Virtualization Technology News and Information
The Three Stages of the Apache Ignite Distributed Database Journey

By Denis Magda, Vice President, Developer Relations, GridGain Systems

Enterprises seeking to implement real-time analytics and end-to-end business processes across their distributed organizations soon confront issues of performance and scale. They have massive amounts of data in siloed datastores spread across a hybrid multicloud environment, yet they want to bring subsets of this data together - along with streaming data sources - for a variety of specific real-time use cases, from creating 360-degree customer views to predictive maintenance to fraud detection. 

Apache Ignite, a distributed database designed for high-performance computing with in-memory speed, enables companies to implement these use cases and much more by addressing the performance and scale challenges inherent in disk-based databases.

Apache Ignite is typically deployed on a cluster of distributed machines and pools the available RAM, CPUs and storage resources of the cluster to create a high-performance data store. Ignite can be deployed on-premises, on core systems running on mainframes, in a public or private cloud, or in a hybrid environment. Industry leaders across a range of verticals rely on Apache Ignite for production environments, including American Airlines, DreamWorks, JP Morgan Chase, PayPal, UPS, Vertex Pharma and VMware.
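
To make the idea of pooling a cluster's memory more concrete, the following Python sketch spreads records across several nodes by hashing each key. It is a generic hash-partitioning illustration under stated assumptions, not Ignite's actual data distribution logic or API; the node names, the `owner_node` helper and the sample accounts are all hypothetical.

```python
import zlib


def owner_node(key, nodes):
    """Map a key to one node with a stable hash, so every client agrees."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


nodes = ["node-a", "node-b", "node-c"]      # hypothetical cluster members
storage = {node: {} for node in nodes}      # each node's share of the pooled memory

# Each record lands on exactly one node; together the nodes hold the full dataset.
for account in ("acct-1", "acct-2", "acct-3", "acct-4"):
    storage[owner_node(account, nodes)][account] = {"balance": 0}

total_records = sum(len(shard) for shard in storage.values())
```

Because the hash is deterministic, any client can compute which node owns a key without consulting a central directory.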

The logical steps of the Apache Ignite journey

While the benefits of Apache Ignite have been clearly and repeatedly demonstrated, bringing it into an infrastructure requires careful consideration to ensure optimal benefits with minimal disruption. Organizations with the largest number of Ignite deployments (clusters running in production) did not deploy Ignite and immediately switch most of their services to it. Instead, their Ignite adoption journeys have been incremental and steady, typically based on the following increasingly complex use cases:

  1. Application acceleration and data caching
  2. A high-performance computing environment for compute-intensive logic or multi-step business operations
  3. A distributed database for mixed workloads that grow beyond the available in-memory capacity

This step-by-step journey allows application designers and developers to fully understand how and where Ignite can deliver the best value to their organizations, the impact on existing applications of introducing an in-memory computing layer, and the types of internal and external resources they will need to speed development while ensuring optimal use of the platform and a highly reliable implementation.

Following is a more detailed explanation of each step in the Apache Ignite journey.

Step 1: Application acceleration and data caching

The Problem

With so much banking being conducted online, it is common for banks to experience exploding usage of mobile and web applications that rely on APIs to access multiple core systems. To complete most transactions and avoid impacting the core systems, the application must cache frequently accessed data about the customers and the accounts involved. However, with the existing infrastructure, scaling this cache to avoid application bottlenecks is extremely expensive and puts tremendous pressure on development resources.

The Solution

Apache Ignite configured as an in-memory data grid (IMDG) or an even more advanced digital integration hub (DIH) provides an extremely low-latency, scalable in-memory data store with support for a wide array of APIs, as well as SQL queries. This makes it easier to implement than other caching solutions.

Using Apache Ignite, a bank can build a DIH, using an IMDG to create a common data access layer for aggregating and processing data from multiple on-premises and cloud-based sources and streaming data feeds. The DIH architecture enables multiple customer-facing applications to access a single view of the aggregated data. Unlike other distributed caches, an Ignite-powered DIH can operate as an advanced cache that writes through changes made by the consuming applications back to underlying databases. Ignite also supports SQL, compute APIs and transactions. Queries are processed at in-memory speeds without movement of the data over the network.
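
As a rough illustration of the write-through behavior described above, the following Python sketch keeps frequently read data in memory and propagates every change back to the underlying store. It does not use the Ignite API; the `WriteThroughCache` class, the `core_system` dictionary and the account key are hypothetical stand-ins chosen for this example.

```python
class WriteThroughCache:
    """Toy write-through cache, for illustration only (not the Ignite API)."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stand-in for a system of record
        self.memory = {}                    # stand-in for the in-memory layer

    def get(self, key):
        # Serve from memory when possible; otherwise load from the store and cache.
        if key not in self.memory:
            if key not in self.backing_store:
                return None
            self.memory[key] = self.backing_store[key]
        return self.memory[key]

    def put(self, key, value):
        # Write-through: update the underlying store, then the in-memory copy.
        self.backing_store[key] = value
        self.memory[key] = value


core_system = {"acct-1": 100}           # pretend core banking database
cache = WriteThroughCache(core_system)

balance = cache.get("acct-1")           # loaded from the core system, then cached
cache.put("acct-1", balance + 50)       # the change also reaches the core system
```

After the `put`, both the cache and the pretend core system hold the same balance, which is the property that lets several applications share one consistent view of the data.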

Further, while other caching solutions are limited to caching, with Ignite, caching is just the first step in the journey, as the following sections describe.

Step 2: High-performance computing

The Problem

In addition to the challenge of caching ever-growing amounts of hybrid data, enterprises eventually encounter the challenge of performance. Once again, financial services provides an easy-to-understand use case. Portfolio managers rely on exposure management applications to determine the level of risk in response to activities in a particular industry or movement related to financial vehicles such as U.S. stocks or bonds. To support such an application, the bank must essentially build a very large, very fast pivot table with customer analytics running as the pivot table views are generated. The processing requirements are enormous: hundreds of thousands of calculations must be run per second with sub-second response times. The application also requires large and constantly shifting datasets as inputs, as well as the ability to support custom queries - and all these activities must be auditable. In addition, data must be continually moved over the network to the application, which is very expensive and has a dramatic impact on performance.

The Solution

Apache Ignite enables applications to execute custom logic, such as functions and lambdas, directly on the data in the in-memory computing cluster with massively parallel processing, dropping the time required for calculations from minutes to milliseconds. Because the logic runs where the data resides, it also dramatically reduces the movement of data over the network. Further, with traditional databases, in-place calculations require stored procedures written in a language such as PL/SQL. With Ignite, custom tasks can be developed in modern JVM languages, C# or C++ and executed across the distributed database.
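
The pattern of sending logic to the data rather than data to the logic can be sketched in a few lines of Python. This is a simplified, single-process illustration that assumes threads as stand-ins for cluster nodes; it is not Ignite's compute API, and the `partitions` data, `local_exposure` and `total_exposure` names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical positions, already partitioned across three "nodes".
partitions = [
    [("AAPL", 120.0), ("MSFT", 80.0)],
    [("AAPL", 30.0), ("TLT", 200.0)],
    [("MSFT", 20.0), ("TLT", 50.0)],
]


def local_exposure(partition):
    """Runs next to one partition and returns only a small per-symbol summary."""
    totals = {}
    for symbol, amount in partition:
        totals[symbol] = totals.get(symbol, 0.0) + amount
    return totals


def total_exposure(parts):
    """Runs the function on every partition in parallel and merges the summaries."""
    merged = {}
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        for partial in pool.map(local_exposure, parts):
            for symbol, amount in partial.items():
                merged[symbol] = merged.get(symbol, 0.0) + amount
    return merged


exposure = total_exposure(partitions)
```

Only the small per-partition summaries are combined at the end; the raw positions never leave the partition that holds them, which is where the network savings would come from in a real cluster.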

Step 3: A distributed database for mixed workloads

The Problem

A nationwide bank wants to implement a 24x7 omnichannel banking platform, including interactions via phone, web, chat, social media, mobile, etc. This would require creating a DIH that can rapidly query data aggregated from a large number of different source systems - customer accounts, customer service, risk analysis, fraud analysis, streaming social data, etc. - and return results in seconds to ensure a positive user experience. However, with the goal of eventually supporting millions of individual users and corporate customers, the bank must have the flexibility to rapidly scale capacity beyond the available pooled memory in the DIH.

The Solution

With Apache Ignite in place, the bank would be able to support millions of individual and corporate users, including hundreds of thousands of concurrent users accessing the system across multiple web, mobile and social media channels. In this advanced configuration, Ignite functions as a distributed database that can utilize RAM, SSDs, Intel Optane Persistent Memory and other storage devices to achieve persistence, so datasets far larger than the available memory can be kept in the cluster without worrying about data loss or inconsistencies, while also achieving optimal performance by allocating RAM only for the hot data. Because of this, the bank can balance cost and speed requirements, keeping some lower-priority data on disk instead of in memory. The persistence feature also allows the bank to recover faster from downtime, minimizing the impact on the customer experience.
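
The idea of keeping the full dataset on durable storage while reserving RAM for hot data can be illustrated with a small Python sketch. It is a toy model built on the standard library, not Ignite's persistence implementation; the `TieredStore` class, its least-recently-used eviction policy and the `hot_capacity` parameter are assumptions made for this example.

```python
import os
import shelve
import tempfile
from collections import OrderedDict


class TieredStore:
    """Toy store: every record is on disk, only recently used ones stay in RAM."""

    def __init__(self, path, hot_capacity):
        self.disk = shelve.open(path)   # stand-in for the persistent tier
        self.hot = OrderedDict()        # stand-in for RAM, least recently used first
        self.hot_capacity = hot_capacity

    def _touch(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        if len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)    # evict from RAM only; disk copy remains

    def put(self, key, value):
        self.disk[key] = value          # always persisted
        self._touch(key, value)

    def get(self, key):
        # Raises KeyError if the record exists in neither tier.
        value = self.hot[key] if key in self.hot else self.disk[key]
        self._touch(key, value)
        return value

    def close(self):
        self.disk.close()


path = os.path.join(tempfile.mkdtemp(), "accounts")
store = TieredStore(path, hot_capacity=2)
for i in range(5):
    store.put(f"acct-{i}", i * 100)
hot_keys = list(store.hot)              # only the two most recently used records
store.close()

restarted = TieredStore(path, hot_capacity=2)   # RAM is empty after a "restart"
recovered = restarted.get("acct-1")             # still available from disk
restarted.close()
```

In this sketch, a restart loses only the in-memory tier; the data itself is read back from disk on demand, which mirrors the faster-recovery argument made above.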


For companies that are just starting their digital transformation journey and need only application acceleration and data caching, it may be tempting to look for a simpler or cheaper alternative to Apache Ignite. However, if there is any chance a company will eventually require a high-performance computing solution or need to scale mixed workloads beyond available memory, choosing Ignite is the wiser option. First, all the time and money invested in building the solution are applicable to future use cases - there would be no need to rip and replace a legacy caching solution. Equally important, the expertise developed in deploying Ignite for acceleration and caching is also applicable and will help the company evolve to the next step faster and more efficiently with a better network design, accelerating time to value.

To learn more about Apache Ignite, consider attending some of the many ongoing webinars, conferences and training sessions that are available, or explore Ignite's capabilities in depth on YouTube.



Denis Magda 

Denis Magda is a Vice President, Developer Relations, at GridGain Systems, provider of enterprise-grade in-memory computing solutions powered by the Apache Ignite distributed database. An Apache Software Foundation Member, Apache Ignite committer, Project Management Committee member, and open-source software enthusiast, Denis helps software engineers and architects develop their expertise in in-memory computing. A well-known industry speaker, Denis can often be found at conferences, workshops, and other events sharing his knowledge about Apache Ignite, distributed systems, and open-source communities. 
Published Tuesday, August 10, 2021 7:04 AM by David Marshall