Virtualization Technology News and Information
Kinetica 2023 Predictions: The Rise in Spatial and Temporal Data Analysis


Industry executives and experts share their predictions for 2023.  Read them in this 15th annual VMblog.com series exclusive.

The Rise in Spatial and Temporal Data Analysis

By Chad Meley, CMO at Kinetica

In 2023, there will be a significant increase in the prevalence of analytic databases designed specifically for spatial and time-series analysis. Traditional analytic databases are often not well-suited to dealing with spatial and time-series data, which can be complex and difficult to analyze. As a result, we expect to see a rise in the development and adoption of new analytic databases that are specifically designed to handle this type of data. These systems will use advanced algorithms and specialized data structures to efficiently store and analyze spatial and time-series data, allowing businesses to gain valuable insights and make better-informed decisions. Overall, the increased use of these specialized analytic databases will enable businesses to more fully leverage their sensor data and drive growth and innovation in the coming years.

The cost of sensors and devices capable of broadcasting their longitude and latitude as they move through time and space is falling rapidly, and their numbers are growing accordingly. By 2025, projections suggest that 40% of all connected IoT devices will be capable of sharing their location, up from 10% in 2020. Spatial thinking will help innovators optimize existing operations and drive long-promised digital transformations in smart cities, connected cars, transparent supply chains, proximity marketing, new energy management techniques, and more.

Spatial data, also known as geospatial data or geographic information, refers to data that has a geographic component, such as the location of a physical object or the shape of a geographical feature. Spatial data can be represented in many different forms, including coordinates, points, lines, polygons, and raster images. This data can be collected using a variety of methods, such as global positioning systems (GPS), remote sensing, and aerial or satellite imagery. Spatial data is often stored and managed in specialized databases.

Temporal data, also known as time-series data, refers to data that has a time-based component. This type of data is often used to track changes or trends over time, and can be collected at regular intervals or at specific points in time. Examples of temporal data include financial data, weather data, and mechanical readings such as vibrations or temperature. This data, too, has typically been stored and managed in specialized time-series databases.

Spatio-temporal databases combine spatial and temporal data to create a more complete picture of a system or process. They store and analyze data that changes over both time and space, which makes them well suited to applications such as tracking the movement of objects, monitoring changes in geographic features, and analyzing the spread of disease. They provide a way to store and query data that is constantly changing, along with the ability to display it in real time. At the start of this decade, spatio-temporal databases began to see production deployments by innovators in telecommunications, logistics, defense, financial services, energy, transportation, retail, and healthcare.
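To make the idea of querying across both space and time concrete, here is a minimal sketch in Python. It is illustrative only and is not how any particular database implements such queries; the `Observation` record, the `query_window` helper, and the sample data are all hypothetical names invented for this example. It selects observations that fall inside a bounding box and a time window at once.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Observation:
    """Hypothetical record: where an object was and when it was seen."""
    object_id: str
    lon: float
    lat: float
    observed_at: datetime


def query_window(observations: List[Observation],
                 min_lon: float, min_lat: float,
                 max_lon: float, max_lat: float,
                 start: datetime, end: datetime) -> List[Observation]:
    """Return observations inside the bounding box AND inside [start, end)."""
    return [
        o for o in observations
        if min_lon <= o.lon <= max_lon
        and min_lat <= o.lat <= max_lat
        and start <= o.observed_at < end
    ]


# Hypothetical sample data: three sightings of two vehicles.
observations = [
    Observation("van-1", -122.40, 37.78, datetime(2023, 1, 1, 9, 0)),
    Observation("van-1", -122.10, 37.40, datetime(2023, 1, 1, 11, 0)),
    Observation("van-2", -122.41, 37.77, datetime(2023, 1, 2, 9, 30)),
]

# Who was inside this box on the morning of January 1?
hits = query_window(
    observations,
    min_lon=-122.5, min_lat=37.7, max_lon=-122.3, max_lat=37.8,
    start=datetime(2023, 1, 1, 8, 0), end=datetime(2023, 1, 1, 10, 0),
)
print([o.object_id for o in hits])
```

A full scan like this is fine for a handful of rows; a production system would rely on spatial and temporal indexing rather than checking every record.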

While spatial and time-series functions have been "features" in conventional analytic databases for years, they have failed to produce breakthrough results due to performance and scale limitations. Spatial and temporal joins are particularly taxing on even the most advanced distributed, columnar, memory-first, cloud databases. Unlike traditional primary and foreign key joins (e.g., customer_id in table one joined to customer_id in table two), a spatial join may involve matching a longitude and latitude in table one to a polygon in table two. Just as the big data revolution was fueled by Web 2.0 data and a rethinking of the systems used to store and analyze it, new technology in the form of vectorized databases has emerged to satisfy the unique requirements of spatio-temporal analytics.
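The difference from a key join can be seen in a small Python sketch. This is an assumption-laden illustration, not a description of any vendor's engine: the table contents, the `point_in_polygon` and `spatial_join` helpers, and the zone names are hypothetical. Instead of comparing two keys for equality, every candidate pair requires a geometric test (here, the standard ray-casting point-in-polygon check), which is why a naive spatial join is so much more expensive.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # (longitude, latitude)
Polygon = List[Point]         # vertices in order; first vertex is not repeated


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray casting: toggle 'inside' each time a rightward ray crosses an edge."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(readings, zones):
    """Naive nested-loop join: pair each reading with every zone containing it."""
    matches = []
    for device_id, lon, lat in readings:
        for zone_name, polygon in zones:
            if point_in_polygon((lon, lat), polygon):
                matches.append((device_id, zone_name))
    return matches


# Hypothetical "table one": device locations.
readings = [
    ("truck-1", 0.5, 0.5),
    ("truck-2", 2.5, 0.5),
    ("truck-3", 5.0, 5.0),
]
# Hypothetical "table two": named polygons.
zones = [
    ("depot", [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]),
    ("port", [(2.0, 0.0), (3.0, 0.0), (3.0, 1.0), (2.0, 1.0)]),
]

joined = spatial_join(readings, zones)
print(joined)
```

The nested loop performs one geometric test per reading per zone, so the cost grows with the product of the two table sizes and with the number of vertices in each polygon, whereas an equality join on customer_id can be resolved with a hash lookup.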

Vectorized databases are a type of distributed analytic database that uses vectorized query execution to boost performance. In contrast, conventional distributed analytic databases typically process data on a row-by-row basis, which can be slower and require more computational resources. In a vectorized query engine, data is stored in fixed-size blocks called vectors, and query operations are performed on these vectors in parallel, rather than on individual data elements. This allows the query engine to process multiple data elements simultaneously, resulting in faster query execution and improved performance.
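The contrast between row-by-row and vector-at-a-time processing can be sketched in a few lines of Python. This is only a sketch of the control flow under simplifying assumptions: pure Python gains no hardware speedup from it, the blocks are processed one after another rather than in parallel, and `VECTOR_SIZE`, the function names, and the sample temperatures are hypothetical. A real engine would hand each block to SIMD instructions or GPU threads.

```python
from typing import Dict, List

VECTOR_SIZE = 4  # hypothetical block size; real engines use far larger blocks


def sum_row_at_a_time(rows: List[Dict[str, float]], threshold: float) -> float:
    """Row-oriented execution: filter and aggregate one row per step."""
    total = 0.0
    for row in rows:
        if row["temperature"] > threshold:
            total += row["temperature"]
    return total


def filter_vector(values: List[float], threshold: float) -> List[float]:
    """One operator call handles a whole block of column values."""
    return [v for v in values if v > threshold]


def sum_vector_at_a_time(column: List[float], threshold: float) -> float:
    """Vector-oriented execution: walk the column in fixed-size blocks."""
    total = 0.0
    for start in range(0, len(column), VECTOR_SIZE):
        vector = column[start:start + VECTOR_SIZE]
        total += sum(filter_vector(vector, threshold))
    return total


# Hypothetical sensor readings, stored once as rows and once as a column.
temperatures = [18.0, 21.0, 25.0, 30.0, 19.0, 22.0, 27.0, 16.0, 24.0]
rows = [{"temperature": t} for t in temperatures]

row_total = sum_row_at_a_time(rows, threshold=20.0)
vector_total = sum_vector_at_a_time(temperatures, threshold=20.0)
print(row_total, vector_total)
```

Both functions return the same answer; the difference is that the vector version calls each operator once per block rather than once per value, which is the property that lets hardware apply one instruction to many values.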

Vectorized databases combine the latest advances in NVIDIA GPUs and vectorized CPUs from Intel with software that processes data in large blocks, allowing them to execute queries more quickly and efficiently. This can be particularly beneficial for complex queries and spatio-temporal joins that involve large amounts of data, as it can reduce the amount of time and resources required to execute the query. Overall, vectorized databases offer improved performance and scalability compared to conventional distributed analytic databases. Based on TPC-DS benchmarks, the leading vectorized database is Kinetica. Last year, Intel's Jeremy Rader, GM, Enterprise Strategy & Solutions for the Data Platforms Group, proclaimed, "Kinetica's fully-vectorized database (sic) significantly outperforms traditional cloud databases for big data analytics."

##

ABOUT THE AUTHOR


Chad Meley is CMO at Kinetica. With over 20 years of experience as a leader in SaaS, big data, advanced analytics, and data-driven marketing, Chad is known for innovative thinking, delivering high-impact results, and building inclusive, global, forward-thinking teams. Prior to joining Kinetica, Chad was VP of Marketing at Teradata, and held a variety of leadership roles centered around data and analytics while at Electronic Arts, Dell, and FedEx.

Professional awards include Best Practice Award for Driving Business Results in Data Warehousing from The Data Warehouse Institute and Marketing Excellence Award from the Direct Marketing Association. Chad is a regular speaker at conferences, including The O'Reilly AI Conference, Strata, Constellation Connected Enterprise, and Analytics Universe. He is often quoted by major media publications such as CIO Magazine, Forbes, InformationWeek, and others.

Published Monday, December 12, 2022 7:34 AM by David Marshall