Edge computing, along with 5G, is a revolutionary technology for business, and companies are starting to double down on edge computing solutions. So as we fast-forward through 2020, just how important will 5G and edge be?
To better understand all of this, VMblog sat down with an industry expert on the topic, Simon Crosby, Chief Technology Officer at Swim. Swim focuses on streaming data, which is most valuable when it is handled, analyzed and acted on immediately, rather than stored first and analyzed later, an approach that delays its usefulness. Telcos and their enterprise customers, for example, need a new solution that can handle all this data in real time, so users can derive insights and act on them immediately.
VMblog: There's a key shift in application architecture thinking from store-then-process to process-then-store. What's driving this?
Simon Crosby: Enterprises were seduced a decade ago by the big-data promise of "if it works for Google, it must work for me," but there are two challenges. First, nowadays everything is connected, so data volumes are huge and data sources never stop; storing it all is not possible. Second, what's the application? Things that generate data do so in context. You might get a sensor reading of "5," but what does it mean? The answer is always contextual, e.g., "the sensor on the wall next to the door on the left side of the building reads 5 door openings." Applications make meaning out of the data from things, and enterprises have lacked the tools to easily build them.
The big change is the need to continuously process data. There's no way out, because competition demands insights - from products, production, supply chains and so on - so that decisions can be made in real time. Data needs to be analyzed on the fly as it is received, in a context that can make meaning from it. Sure, you can store the raw data if you want, but remember that disks and other storage operate a million times slower than a CPU and memory. The profound technological change is the need to derive meaning - to analyze, learn and predict - from data that is boundless. Batches, or even the mini-batches used in Spark, are not useful. What matters is continuous analysis of the state of real-world things and the constant availability of insights. So we need digital twins of things that make sense of their own data in the context of their real-world deployments and relationships, and that self-train and learn, rather than big batch-style models that require a store-then-analyze approach.
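To make the process-then-store idea concrete, here is a minimal sketch in Java of per-event analysis: state is updated and an insight recomputed the instant each reading arrives, and persisting the raw event afterwards becomes optional. The class and method names are illustrative assumptions, not Swim's implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch of process-then-store: every event is analyzed the
// moment it arrives, so the latest insight is always available in memory.
// (Hypothetical names; not Swim's implementation.)
public class StreamingSensorAnalyzer {
  private final Deque<Double> window = new ArrayDeque<>();
  private final int windowSize;
  private double runningSum = 0.0;

  public StreamingSensorAnalyzer(int windowSize) {
    this.windowSize = windowSize;
  }

  // Called on arrival of each reading: analyze first, store later (or never).
  public double onEvent(double reading) {
    window.addLast(reading);
    runningSum += reading;
    if (window.size() > windowSize) {
      runningSum -= window.removeFirst();  // evict the oldest reading
    }
    return runningSum / window.size();     // insight is current at all times
  }

  public static void main(String[] args) {
    StreamingSensorAnalyzer analyzer = new StreamingSensorAnalyzer(3);
    for (double reading : new double[] {5, 7, 6, 9}) {
      System.out.printf("reading=%.0f rollingMean=%.2f%n",
          reading, analyzer.onEvent(reading));
    }
  }
}
```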
VMblog: What technologies are making it possible to analyze
lots of data fast, close to the edge?
Crosby: There is a profound shift from the traditional database-as-truth mentality to one in which continuous analysis happens as data arrives. After all, why take the latency hit of a database access to see the state of some data element, and deliver a result a million times more slowly than CPU and memory can? Digital twins are essentially concurrent, active, in-memory objects that maintain the state of the real-world things they represent and can compute on their own data, so as new events arrive they can compute all sorts of insights on the fly. That in its own right is pretty cool, but recall that information is contextual. If digital twins link to each other, building a graph that captures the relationships that give contextual meaning to data, an even more profound opportunity presents itself: digital twins can combine the states of the other "things" to which they are linked with their own data, to accomplish analysis in context. Here's an example: at traffic.swim.ai you will find an app in which digital twins of the intersections with free-running lights in Palo Alto, CA, use their own sensor data (about 80 sensors per intersection) plus the sensor data from their neighbors, in context, to continuously analyze and predict the future signal state at each intersection, all in the blink of an eye, on more than 4 TB of data per day. The insights are continuously streamed via an API (in Azure).
In a nutshell: digital twins that analyze, learn and
predict, in a geospatial context, deliver streaming insights in the blink of an
eye.
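As a rough illustration of what such a linked twin might look like in code, here is a minimal Java sketch: an in-memory object that holds its own live sensor state, links to neighboring twins, and combines both to analyze in context. The names and the prediction rule are assumptions for illustration, not the traffic.swim.ai implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of a linked digital twin: an in-memory object holding the
// live state of one intersection plus references to its neighbors, so each
// new event can be analyzed in context. Names and the prediction rule are
// illustrative, not the traffic.swim.ai implementation.
public class IntersectionTwin {
  private final String id;
  private final List<IntersectionTwin> neighbors = new ArrayList<>();
  private volatile double vehicleRate;  // latest state from this twin's own sensors

  public IntersectionTwin(String id) { this.id = id; }

  public void linkTo(IntersectionTwin neighbor) { neighbors.add(neighbor); }

  // Each sensor event updates local state immediately: no database round trip.
  public void onSensorEvent(double vehiclesPerMinute) {
    this.vehicleRate = vehiclesPerMinute;
  }

  // Contextual analysis: combine this twin's state with the live state of
  // the twins it is linked to.
  public double predictedDemand() {
    double contextual = vehicleRate;
    for (IntersectionTwin n : neighbors) {
      contextual += 0.5 * n.vehicleRate;  // neighbors contribute, discounted (illustrative weight)
    }
    return contextual;
  }

  public static void main(String[] args) {
    IntersectionTwin a = new IntersectionTwin("El Camino & Page Mill");
    IntersectionTwin b = new IntersectionTwin("El Camino & Stanford Ave");
    a.linkTo(b);
    a.onSensorEvent(12.0);
    b.onSensorEvent(8.0);
    System.out.println(a.predictedDemand());  // 12 + 0.5 * 8 = 16.0
  }
}
```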
VMblog: Will IoT drive 5G, or will consumers?
Crosby: 5G is inevitable. Carriers have to do it just to be competitive. So yes, its appeal to consumers plays a big role, but there are strong enterprise-facing features too. One of the most frequently touted new features of 5G, beyond vastly increased bandwidth, is the opportunity for carriers to deliver secure, isolated network services to enterprises and smart cities via network slicing. A customer can avoid the complexity of a dedicated or wired network for remote or temporary sites, or ensure that traffic from mobile devices in the hands of employees is cleanly isolated from the Internet. Slicing offers an opportunity for carriers to eliminate the complexity of VPN management and to offer services that isolate and secure enterprise traffic - a significant opportunity given the ever-present risk of cyber-attacks and the rapid growth of distributed workforces.
IoT is really a catch-all term for two things: consumer devices other than mobile phones, like home control, and the increasing instrumentation and automation of production lines, monitoring of premises, instrumentation of products, and so on. It's not a market per se, but a good catch-all for "everything going online."
VMblog: So is 5G a real boon to consumers, or are there
enterprise use cases?
Crosby: Sure, it's a boon to consumers, but the consumer appetite for bandwidth is driven by most media consumption moving online. That's a cloud-to-consumer delivery model. The interesting problem that enterprise use cases present is that they are heavily edge-to-cloud oriented, so 5G networks will have to be architected to deal with streaming in the opposite direction - from devices to a cloud fabric that can analyze the data. Slicing offers a new business opportunity to carriers because it provides both isolation and QoS, which in turn guarantee protection against denial-of-service attacks by third parties and assure that enterprise traffic will receive priority.
VMblog: Fast networking is a great opportunity - but will edge computing, i.e. an "edge cloud" such as Ericsson Edge Gravity, actually be a thing?
Crosby: I hear the "edge cloud" marketing a lot. Look, there is a clear need for a hybrid architecture to support real-time continuous analysis, learning and prediction. Data comes from the edge and needs to be reduced and cleaned before it can be analyzed. There are two predominant architectures that I see. In the first, the "things" are connected to an enterprise network - so we know where the data is coming from - and cleaning, and sometimes analysis, occurs in the "enterprise cloud," or in both the enterprise and a public cloud. In the second, the "things" are mobile and data comes over the Internet, or perhaps over a carrier network. Since the things are mobile, their context changes continuously, and the graph of relationships is continually changing. Typically, when there's no clear location for the "things," analysis in the public cloud makes the most sense.
Back to the question of specific offerings like Ericsson's Edge Gravity: this is a serious thing, mostly because it allows carriers to run customers' software close to the edge of the network. It also allows the carrier to run some of its own low-latency service offerings close to the edge, where the data is generated.
VMblog: Kubernetes and "edge workloads," what's the
lowdown?
Crosby: Most enterprises "at the edge," e.g., oil and gas companies or metal-bending manufacturers, lack "cloud native" skills in their development organizations. Put it this way: if you're great at Kubernetes, you're likely working for a cloud company or some startup. So getting these companies to a point of proficiency in the cloud, all the services available, and the individual software packages is very, very tough. That said, abstractions like Kubernetes are crucial because they simplify what application developers need to do. So yes, Kubernetes-based infrastructure in your favorite cloud, or on-prem, is crucial. I'm a huge fan of what VMware has done with Kubernetes, because by wrapping it into their offering they have reduced, and will continue to reduce, its complexity for real-world IT people. Swim also dramatically simplifies Kubernetes deployments. Swim is usually deployed in containers using Kubernetes. Thereafter all the complexity is hidden - the app just runs, and Swim automatically load-balances the execution instances. Our goal is to make Swim applications self-managing in terms of security, resource consumption, resilience and scalability.
VMblog: Finally, how will application developers wrap their heads around all this?
Crosby: As I said, it's hard. DevOps in general is a rapidly changing and strategic area for application development. My advice is this: embrace it, and let infrastructure become a service offered by your favorite cloud provider. Write applications. Swim is quite remarkable in this respect. With a tiny Java program (a few thousand lines), any developer can quickly encode the contextual relationships between things - for example, sensors are contained by intersections, which are linked to their neighbors based on geospatial position - and then use Swim's capabilities for continuous, stateful analysis to produce startling insights, as the sketch below suggests. When data arrives, Swim builds the digital twins that represent every real-world thing, on the fly. They consume their own data, link to the digital twins in their context, and then analyze. So from a simple program, it's easy to scale to a graph of tens of millions of related "things" that analyze, learn, predict and stream their insights. The graph is built by the data, and the vertices in the graph are digital twins that learn. All the infrastructure challenges - "Where does this run?", "How do I visualize or access it?", "How do I make the app resilient and scalable?" - are addressed by the underlying Swim runtime.
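To give a flavor of that idea, here is a hedged Java sketch of a graph being built by the data: twins are materialized on first contact with an event, keyed by the identity the event carries, and containment links (sensor to intersection) are recorded as they are discovered. The class names and event shape are hypothetical; this is not the Swim runtime.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of "the graph is built by the data": twins are created on first
// contact, keyed by the identity carried in each event. Hypothetical shape,
// not the Swim runtime.
public class TwinGraph {

  // A minimal twin: remembers which sensors it contains and their latest values.
  static final class IntersectionTwin {
    final String id;
    final Map<String, Double> sensorValues = new ConcurrentHashMap<>();

    IntersectionTwin(String id) { this.id = id; }

    void onSensorEvent(String sensorId, double value) {
      sensorValues.put(sensorId, value);  // sensor linked to intersection as data arrives
    }
  }

  private final Map<String, IntersectionTwin> intersections = new ConcurrentHashMap<>();

  // Route an event to its twin, creating the twin on first contact.
  public void route(String intersectionId, String sensorId, double value) {
    intersections
        .computeIfAbsent(intersectionId, IntersectionTwin::new)  // twin built by the data
        .onSensorEvent(sensorId, value);
  }

  public static void main(String[] args) {
    TwinGraph graph = new TwinGraph();
    graph.route("ElCamino_PageMill", "loop-3", 5.0);  // twin created on the fly
    graph.route("ElCamino_PageMill", "loop-7", 2.0);  // same twin, new sensor link
    System.out.println(graph.intersections.get("ElCamino_PageMill").sensorValues);
  }
}
```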
##