Just ahead of VMworld, I was able to catch up with Steve Francis, founder and chief evangelist at LogicMonitor, to learn more about the company's monitoring solution, the challenges facing today's enterprise organizations, industry trends, and more.
VMblog: To kick the conversation off, can you give me an idea of how LogicMonitor works in the enterprise?

Steve Francis: LogicMonitor was conceived as a platform for SaaS companies - our initial market approach didn't even consider enterprises. However, it turns out that what we provide - comprehensive, automated monitoring for everything in a datacenter, delivered through a SaaS platform, without requiring your best engineers to run or configure the monitoring - resonates strongly with enterprises. So we've had a variety of large enterprises find us and deploy us, such as Zendesk, Nielsen, and Pacific Life Insurance.
But to answer what you meant - LogicMonitor works in the enterprise the way people hope their monitoring systems will work, but they never do. Our SaaS-based, agentless architecture means it's very quick to deploy and get up and running, without requiring any infrastructure. Our built-in automation and knowledge of what to monitor mean the monitoring is always current with the state of your devices and doesn't reduce the agility of your dev teams. Our elegant UI means you don't need to centralize control in an expert monitoring group. Our analytics mean you can extract meaning from your data and easily identify anomalies among hundreds of systems. And our flexibility means you can extend the monitoring to your custom business metrics, getting only the right alerts to the right people, by the right mechanisms, at the right time.
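To make "the right alerts to the right people" concrete, here is a minimal sketch of what an alert-routing rule might look like. The rule structure, team names, and channels are illustrative assumptions, not LogicMonitor's actual configuration model:

```python
# Illustrative only: a minimal alert-routing sketch, not LogicMonitor's
# actual configuration model. Each rule maps an alert's metric and
# severity to a team and a delivery mechanism.
from dataclasses import dataclass

@dataclass
class Alert:
    resource: str   # e.g. "prod-web-03"
    metric: str     # e.g. "cpu.contention"
    severity: str   # "warn", "error", or "critical"

SEVERITY_ORDER = {"warn": 0, "error": 1, "critical": 2}

# Hypothetical routing table: (metric prefix, minimum severity) -> (team, channel)
ROUTES = [
    ("storage.", "warn",     ("storage-team", "ticket")),
    ("cpu.",     "error",    ("virtualization-team", "chat")),
    ("app.",     "critical", ("on-call-dev", "page")),
]

def route(alert: Alert):
    """Return (team, channel) for the first matching rule, else a default."""
    for prefix, min_severity, target in ROUTES:
        if (alert.metric.startswith(prefix)
                and SEVERITY_ORDER[alert.severity] >= SEVERITY_ORDER[min_severity]):
            return target
    return ("noc", "email")  # catch-all default

print(route(Alert("prod-web-03", "cpu.contention", "error")))
# -> ('virtualization-team', 'chat')
```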
VMblog: What does it mean when you talk about "full stack" monitoring?
Francis: It really means considering the business use of the IT infrastructure, not just its discrete functions. For example, an enterprise may be running a customer-facing application that consists of a load balancer, several front-end web servers running on VMware, and a database. "Full stack" monitoring means monitoring all the components, both vertically and horizontally.
So, that means you need to monitor the chassis of the VMware host for hardware issues such as a failed power supply. You also need to monitor the hypervisor, to alert on things like CPU contention. But you also need to monitor the virtual machines, the web servers on those VMs, and so on - everything on all devices, from the physical layer up to the application layer. That's one definition of full stack.
However, to get a view of your infrastructure as it relates to its business function, you also need to monitor the load balancer, the storage array containing the datastore, the Fibre Channel switches, and the database, so you can see their performance together.
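As a sketch of what that full-stack view covers, the outline below enumerates the vertical and horizontal layers of the hypothetical customer-facing application above; the device names are made up, and check() stands in for a real collector:

```python
# Illustrative sketch of one "full stack" service view that aggregates
# checks across every layer, vertical (chassis -> hypervisor -> VM ->
# application) and horizontal (load balancer, storage, network, database).
# Device names are hypothetical; check() is a placeholder for a real collector.

SERVICE_STACK = {
    "customer-portal": [
        ("chassis",       "esx-host-01",  "power supply status"),
        ("hypervisor",    "esx-host-01",  "CPU contention"),
        ("vm",            "web-vm-01",    "CPU / memory / disk"),
        ("application",   "web-vm-01",    "HTTP response time"),
        ("load-balancer", "lb-01",        "active connections"),
        ("storage",       "array-01",     "datastore latency"),
        ("network",       "fc-switch-01", "port errors"),
        ("database",      "db-01",        "query latency"),
    ],
}

def check(layer: str, device: str, metric: str) -> str:
    return "ok"  # a real collector would poll SNMP, WMI, an API, etc.

def service_health(service: str) -> dict:
    """Evaluate every layer of a service together, so a datastore latency
    problem isn't misdiagnosed as a slow web application."""
    return {(layer, device, metric): check(layer, device, metric)
            for layer, device, metric in SERVICE_STACK[service]}

for component, status in service_health("customer-portal").items():
    print(component, "->", status)
```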
Having point monitoring tools looking at each of these components individually has been shown to result in more outage incidents and longer repair times. Compare that to a tool that can provide visibility into the full stack and show you all the components at once - with it, you'll identify the real cause of an issue much more quickly.
VMblog: What do you see as some of the biggest challenges facing enterprises today, and how is monitoring helping to alleviate them?
Francis: Enterprises are dealing with the velocity of change in the technical space, and it's not an easy issue. It's a good challenge, though - the faster an enterprise can deliver and iterate on applications, the more quickly it learns, and the sooner it can start to reveal and harness value.
But there's a lot of complexity in trying to manage change - not just technological change, like Docker and the cloud, but organizational change - DevOps, and so forth. How do you get your engineering teams to release faster without sacrificing quality, performance, or the relationship with the ops teams?
Well, one thing to consider is making sure your monitoring doesn't impede the velocity of your teams.

If you have a good process aligning dev and ops, you will likely be able to release more quickly - but if production releases are delayed by the need to get new servers or containers or cloud resources into monitoring, and that is a manual task, it negates a lot of the value you get from your agility. You need a monitoring system that can be automated and integrated into the tools you use.
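For illustration, integrating monitoring into a provisioning pipeline might look something like the hook below, called right after a host or container comes up. The endpoint URL and payload fields are hypothetical stand-ins, not LogicMonitor's documented API:

```python
# Illustrative sketch: registering a freshly provisioned host with a
# monitoring system as a post-provision hook, so releases are never
# blocked on a manual "add it to monitoring" step. The URL and payload
# fields are hypothetical, not LogicMonitor's documented API.
import requests

MONITORING_API = "https://monitoring.example.com/api/devices"  # hypothetical endpoint
API_TOKEN = "REDACTED"  # supplied by your secrets manager

def register_host(hostname: str, ip: str, group: str) -> None:
    """Called from the provisioning pipeline (Terraform hook, CI job, etc.)
    right after a server or container comes up."""
    response = requests.post(
        MONITORING_API,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json={"displayName": hostname, "name": ip, "hostGroup": group},
        timeout=10,
    )
    response.raise_for_status()

register_host("web-vm-07", "10.0.4.17", "prod/web")
```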
VMblog: What applications or trends are you seeing the most right now?
Francis: Two related trends. Enterprises are diving headfirst into enterprise transformation initiatives, and that is driving a lot of change: cloud adoption, DevOps, adopting SaaS where they can for non-core applications, and also new technologies like Kafka and containers. And then the realization comes that they've just shifted the bottleneck - they've improved things, but to get the full benefit of the agility that all the new technologies and processes they've adopted can enable, they have to remove other bottlenecks. Monitoring is often one of those bottlenecks, but container and cloud resource orchestration is the bigger, more impactful change. The rapid adoption and evolution of tools like Terraform and Kubernetes to automate even further and maximize software development agility is compelling.
VMblog: What's in the future for monitoring, and what can we look forward to?
Francis: Monitoring will have to ride along with the transformation and automation/orchestration waves. Monitoring is essential to production systems, and that is the direction they are going, so monitoring will necessarily move that way too. LogicMonitor is releasing tight integrations with Kubernetes, for example, so that as the orchestration systems create or move resources, the monitoring is updated.
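One way such an integration can stay in sync is by watching the orchestrator's event stream. The sketch below uses the official `kubernetes` Python client to mirror pod lifecycle events into a monitoring system; the add/remove functions are placeholders, not LogicMonitor's actual integration:

```python
# Sketch of keeping monitoring in sync with an orchestrator: watch the
# Kubernetes API for pod lifecycle events and mirror them into monitoring.
# Uses the official `kubernetes` Python client; the add/remove functions
# are placeholders, not LogicMonitor's actual integration.
from kubernetes import client, config, watch

def add_to_monitoring(pod_name: str) -> None:
    print(f"monitoring: + {pod_name}")  # placeholder for a real API call

def remove_from_monitoring(pod_name: str) -> None:
    print(f"monitoring: - {pod_name}")  # placeholder for a real API call

config.load_kube_config()  # use config.load_incluster_config() inside a cluster
v1 = client.CoreV1Api()

for event in watch.Watch().stream(v1.list_pod_for_all_namespaces):
    pod = event["object"]
    if event["type"] == "ADDED":
        add_to_monitoring(pod.metadata.name)
    elif event["type"] == "DELETED":
        remove_from_monitoring(pod.metadata.name)
```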
The job of monitoring, like the rest of IT, is to advance the business's goals. Monitoring in the future will deal with objects coming and going as part of services. The services will expose a lot of application-specific data, but the underlying health of the infrastructure and supporting software (storage arrays, container hosts, operating systems, etc.) will always be vital. A challenge then will be to extract meaning from all that data, and to deliver only the issues that are meaningful and actionable, to the right people, at the right time, via the right method.

Machine learning may have a part to play here - though less likely in production, because assuming you remediate incidents and do proper postmortems, they shouldn't recur, which removes the repeating patterns that ML can be trained on. Earlier in the development cycle, however, monitoring can alert you to issues in releases. Serving a web page used to take two database requests; now it takes ten. Identifying deviations in performance between production and dev code, and determining whether they are significant, is going to be an increasingly important role of monitoring as development agility increases.
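That queries-per-page example can be reduced to a simple baseline comparison. Here's a minimal sketch of such a release-regression check; the metric values and the three-sigma threshold are illustrative:

```python
# Minimal sketch of the release-regression check described above: compare a
# per-release metric (database queries per page view) against the production
# baseline and flag significant deviations. Values and the three-sigma
# threshold are illustrative.
from statistics import mean, stdev

def deviates(baseline: list, candidate: float, sigmas: float = 3.0) -> bool:
    """Flag the candidate build if its metric falls outside `sigmas`
    standard deviations of the production baseline."""
    mu, sd = mean(baseline), stdev(baseline)
    return abs(candidate - mu) > sigmas * max(sd, 1e-9)

production_queries_per_page = [2.0, 2.1, 1.9, 2.0, 2.2]  # historical baseline
new_build_queries_per_page = 10.0                         # candidate release

if deviates(production_queries_per_page, new_build_queries_per_page):
    print("alert: queries-per-page regressed versus the production baseline")
```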