Virtualization Technology News and Information
VMblog Expert Interview: Simon Crosby of Swim Talks 5G Deployments at the Edge, Data and Use Trends


Edge computing - along with 5G - is a revolutionary technology for business.  Companies are starting to double down on edge computing solutions.  So as we fast forward our way through 2020, just how important will 5G and edge computing be?

To better understand all of this, VMblog sat down with an industry expert on this topic, Simon Crosby, Chief Technology Officer at Swim.  Swim focuses on streaming data, which is most valuable when it is handled, analyzed and acted on immediately. Today's common approach of storing the data first and then accessing and analyzing it delays its usefulness.  As an example, telcos and their enterprise customers need a new solution that can handle all this data in real time so users can derive insights and act on those insights (or consumer needs) immediately.

VMblog:  There is a key shift in application architecture thinking, from store-then-process to process-then-store. What's driving this?

Simon Crosby:  Enterprises were seduced a decade ago by the big-data promise of "if it works for Google, it must work for me," but there are two challenges. First, nowadays everything is connected, so data volumes are huge - and data sources never stop. Storing it all is not possible. Second, what's the application? Things that generate data do so in context. You might get a sensor reading of "5" - what does it mean? The answer is always contextual, e.g. "the sensor on the wall next to the door on the left side of the building reads 5 door openings." Applications make meaning out of the data from things, and enterprises have lacked the tools to easily build them.

The big change is the need to continuously process data. There's no way out, because competition demands insights - from products, production, supply chains and so on - so that decisions can be made in real time. Data therefore needs to be analyzed on the fly as it is received, in a context that can make meaning from it. Sure, you can store the raw data if you want, but remember that disks and other storage operate a million times slower than CPU and memory. The profound technological change is the need to derive meaning - analyze, learn and predict - from data that is boundless. Batches, or even mini-batches such as those used in Spark, are not useful. What matters is continuous analysis of the state of real-world things and the constant availability of insights. That calls for digital twins of things that make sense of their own data in the context of their real-world deployments and relationships, and that self-train and learn, rather than big batch-style models that require a store-then-analyze approach.
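
To make the process-then-store idea concrete, here is a minimal Python sketch of continuous analysis over an unbounded stream. It is not Swim code: the names `RunningStats` and `analyze` are hypothetical, and the running mean and variance (computed with Welford's online algorithm) simply stand in for whatever insight an application needs. The point is that each event updates the state in constant memory as it arrives, with no batch to wait for.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class RunningStats:
    """Constant-memory summary of an unbounded stream (Welford's algorithm)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold one new event into the summary."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of everything seen so far."""
        return self.m2 / self.count if self.count else 0.0


def analyze(stream: Iterable[float]) -> Iterator[Tuple[float, float]]:
    """Yield (value, running_mean) as each event arrives; nothing is buffered."""
    stats = RunningStats()
    for value in stream:
        stats.update(value)
        yield value, stats.mean
```

Because `analyze` is a generator, it works equally well on a finite list or on a source that never ends. Storing the raw events afterwards remains optional and sits off the critical path.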

VMblog:  What technologies are making it possible to analyze lots of data fast, close to the edge?

Crosby:  There is a profound shift from the traditional database-as-truth mentality to one in which continuous analysis happens as data arrives. After all, why take the latency hit of a database access to see the state of some data element, and deliver a result a million times more slowly than CPU and memory would? Digital twins are essentially concurrent, in-memory objects that maintain the state of the real-world thing they represent. They are active objects that can also compute on their own data, so as new events arrive they can compute all sorts of insights on the fly. That in its own right is pretty cool, but recall that information is contextual. If digital twins link to each other, building a graph that captures the relationships that give contextual meaning to data, then an even more profound opportunity presents itself: digital twins can use the states of the other "things" to which they are linked, together with their own data, to accomplish analysis in context.  Here's an example: there is an app in which digital twins of the intersections with free-running lights in Palo Alto, CA, use both their own sensor data (about 80 sensors per intersection) and the sensor data from their neighbors, in context, to predict the future signal state at the intersection, all in the blink of an eye, on more than 4TB per day. The digital twin at each intersection uses its own traffic data plus that of its neighbors to continuously analyze and predict. The insights are continuously streamed via an API (in Azure).

In a nutshell: digital twins that analyze, learn and predict, in a geospatial context, deliver streaming insights in the blink of an eye.
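
As an illustration of the linked digital twins Crosby describes, the following Python sketch shows an in-memory object that keeps the latest state of one "thing" and recomputes an insight from its own data plus that of its linked neighbors whenever an event arrives. Everything here is an assumption made for illustration: the `DigitalTwin` class is not Swim's API, and the "insight" is a deliberately trivial average rather than the signal prediction used in the Palo Alto example.

```python
from typing import List, Optional


class DigitalTwin:
    """In-memory stand-in for one real-world thing (hypothetical, not Swim's API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.latest: Optional[float] = None  # current state of the thing
        self.neighbors: List["DigitalTwin"] = []

    def link(self, other: "DigitalTwin") -> None:
        """Create a two-way link so each twin can read the other's state."""
        if other not in self.neighbors:
            self.neighbors.append(other)
            other.neighbors.append(self)

    def on_event(self, reading: float) -> float:
        """Update own state, then recompute the insight in context."""
        self.latest = reading
        return self.contextual_estimate()

    def contextual_estimate(self) -> float:
        """Toy insight: mean of own reading and all known neighbor readings."""
        values = [self.latest] + [n.latest for n in self.neighbors]
        known = [v for v in values if v is not None]
        if not known:
            raise ValueError("no readings have arrived yet")
        return sum(known) / len(known)
```

A twin with no neighbor data yet falls back to its own reading. Once a neighbor reports, the same call returns an estimate that reflects both, which is the "analysis in context" idea in miniature.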

VMblog:  Will IoT drive 5G, or will consumers?

Crosby:  5G is inevitable. Carriers have to do it just to be competitive. So yes, their appeal to consumers plays a big role, but there are strong enterprise-facing features too. One of the most frequently touted new features of 5G, beyond vastly increased bandwidth, is the opportunity for carriers to deliver secure, isolated network services to enterprises and smart cities.  A customer can avoid the complexity of a dedicated or wired network for remote or temporary sites, or ensure that traffic from mobile devices in the hands of employees is isolated cleanly from the Internet. Network slicing offers an opportunity for carriers to eliminate the complexity of VPN management, and to offer services that isolate and secure enterprise traffic - a significant opportunity given the ever-present risk of cyber-attacks and the rapid growth of a distributed workforce.

IoT is really a catch-all term for two things: consumer devices other than mobile phones - like home control, etc. - and the increasing instrumentation and automation in production lines, monitoring in premises, instrumentation of products, and so on. It's not a market per se, but a good catch-all for "everything going online."

VMblog:  So is 5G a real boon to consumers, or are there enterprise use cases?

Crosby:  Sure, it's a boon to consumers, but the consumer appetite for bandwidth is driven by most media consumption moving online. It's a cloud-to-consumer delivery model. The interesting problem that enterprise use cases present is that they are heavily edge-to-cloud oriented. 5G networks will be architected to deal with streaming in the opposite direction - from devices to a cloud fabric that can analyze the data. Slicing offers a new business opportunity to carriers because it offers both isolation and QoS, and this in turn offers a guarantee against denial-of-service attacks from third parties and an assurance that enterprise traffic will receive priority.

VMblog:  Fast networking is a great opportunity - but will edge computing, e.g. an "edge cloud" such as Ericsson's Edge Gravity, actually be a thing?

Crosby:  I hear the "edge cloud" marketing a lot. Look, there is a clear need for a hybrid architecture to support real-time continuous analysis, learning and prediction. Data comes from the edge and needs to be reduced and cleaned before it can be analyzed. There are two predominant architectures that I see. First, where the "things" are connected to an enterprise network - so we know where the data is coming from - and cleaning and sometimes analysis occurs in the "enterprise cloud," or in both the enterprise and a public cloud. Second, when the "things" are mobile and data comes over the Internet, or perhaps over a carrier network.  Since things are mobile, their context changes continuously, and the graph of relationships is continually changing. Typically, when there's no clear location of the "things," analysis in the public cloud makes the most sense.

Back to the question of specific offerings like Ericsson's Edge Gravity: this is a serious thing, mostly because it allows carriers to run customers' software close to the edge of the network.  It also allows the carrier to run some of their low-latency service offerings close to the edge, where data is generated.

VMblog:  Kubernetes and "edge workloads," what's the lowdown?

Crosby:  Most enterprises "at the edge," e.g. oil and gas companies or metal-bending manufacturers, lack "cloud native" skills in their development organizations. Put it this way: if you're great at Kubernetes, you're likely working for a cloud company or some startup. So getting these companies to a point of proficiency in cloud and all the services available, and the individual software packages, is very tough. That said, abstractions like Kubernetes are crucial because they simplify what application developers need to do.  So yes, Kubernetes-based infrastructure in your favorite cloud, or on-prem, is crucial.  I'm a huge fan of what VMware has done with Kubernetes, because by wrapping it into their offering they have reduced, and will continue to reduce, its complexity for real-world IT people.  Swim also dramatically simplifies Kubernetes deployments. Swim is usually deployed in containers using Kubernetes. Thereafter all the complexity is hidden - the app just runs and Swim automatically load balances the execution instances. Our goal is to make Swim applications self-managing in terms of security, resource consumption, resilience and scalability.

VMblog:  Finally, how will application developers wrap their heads around this stuff?

Crosby:  As I said, it's hard. DevOps in general is a rapidly changing and strategic area for application development. My advice is this: embrace it and let infrastructure become a service offered by your favorite cloud provider. Write applications. Swim is quite remarkable in this respect. With a tiny (a few thousand lines) program in Java, any developer can quickly encode the contextual relationships between things - for example, sensors are contained by intersections, which are linked to their neighbors based on geospatial position - and then use Swim's capabilities for continuous, stateful analysis to produce startling insights. When data arrives, Swim builds the digital twins that represent every real-world thing, on the fly. They consume their own data, link to digital twins in their context and then analyze.  So from a simple program, it's easy to scale to a graph of tens of millions of related "things" that analyze, learn and predict, and stream their insights. The graph is built by the data, and the vertices in the graph are digital twins that learn. All the infrastructure challenges, like "Where does this run?", "How do I visualize or access it?" and "How do I make the app resilient and scalable?", are addressed by the underlying Swim runtime.
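
The idea that "the graph is built by the data" can be sketched in a few lines of Python. This is an illustration only, not Swim's Java API: the `TwinGraph` and `Intersection` names, the planar `(x, y)` positions in meters, and the 300 m linking radius are all assumptions. Twins are created the first time an event mentions them, sensors are contained by their intersection, and intersections are linked to neighbors by geospatial distance.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

LINK_RADIUS_M = 300.0  # hypothetical threshold for "neighboring" intersections


@dataclass
class Intersection:
    ident: str
    position: Tuple[float, float]  # (x, y) in meters, assumed planar
    sensors: Dict[str, float] = field(default_factory=dict)  # sensor id -> latest value
    neighbors: Set[str] = field(default_factory=set)  # ids of linked intersections


class TwinGraph:
    """Graph of twins that grows as events arrive; no topology is declared up front."""

    def __init__(self) -> None:
        self.intersections: Dict[str, Intersection] = {}

    def ingest(self, intersection_id: str, position: Tuple[float, float],
               sensor_id: str, value: float) -> Intersection:
        """Route one event to its twin, creating and linking the twin if new."""
        twin = self.intersections.get(intersection_id)
        if twin is None:
            twin = Intersection(intersection_id, position)
            self._link_by_distance(twin)
            self.intersections[intersection_id] = twin
        twin.sensors[sensor_id] = value  # the sensor is contained by its intersection
        return twin

    def _link_by_distance(self, new: Intersection) -> None:
        """Link a new twin to every existing twin within LINK_RADIUS_M."""
        for other in self.intersections.values():
            if math.dist(new.position, other.position) <= LINK_RADIUS_M:
                new.neighbors.add(other.ident)
                other.neighbors.add(new.ident)
```

Feeding events through `ingest` in any order yields the same containment and neighbor relationships. The linear scan in `_link_by_distance` is fine for a sketch, but a graph of millions of things would need a spatial index.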


Published Thursday, March 19, 2020 8:47 AM by David Marshall