Innovation in AI is
exploding, with companies eager to leverage the technology to deliver new
applications and services. But the development of AI applications is
significantly limited by the high costs of compute resources, and both the cost
and demand for compute continue to grow. To understand a new approach to AI
inference compute management, I spoke with Neetu Pathak and Sushant Tripathy,
the co-founders of Skymel and its CEO and CTO, respectively. Skymel emerged from
stealth today with NeuroSplit, an intriguing "adaptive inferencing" technology
that changes the way AI applications consume compute resources.
VMblog:
Tell us about Skymel. What's the backstory, and why did you found the company?
Neetu Pathak: Skymel was founded by my co-founder, Sushant Tripathy, and me. I
serve as CEO, and Sushant is our CTO.
Previously, I worked at Redis and Fortella while Sushant led machine-learning
initiatives at Google and PayPal. While working on an on-device ML project for
Google Assistant, Sushant realized that a great deal of compute power on user
devices (smartphones, laptops, and desktops) sits idle. This underutilization is
a byproduct of today's prevalent cloud-centric services: to deliver personalized
services, most companies prefer to upload user data rather than process it
on-device.
At the same time, I noticed a few emerging trends. Companies were ready to pay a
premium to reduce the response latency of their user-facing products.
Additionally, most companies lacked the proper skill sets, or suffered from poor
communication across verticals, and ended up with poorly organized network
configurations for their cloud assets, which drove costs up further.
Because we were both aware of these points, we decided to create Skymel's
NeuroSplit, a technology that harnesses idle compute on users' devices to
enhance the user experience and give companies the lower response latencies they
crave, while cutting their AI inference costs on existing infrastructure by up
to 60%.
VMblog: It sounds like NeuroSplit is your core offering. What does it
do, and who is it for? How is it differentiated from what's already out there?
Sushant Tripathy: NeuroSplit dynamically assesses the idle compute available on
the user's device and uses it first to power the AI application, drawing on
cloud-based compute as needed to keep application performance optimal. It is
specifically engineered to intelligently offload AI compute tasks onto users'
devices, matching the device's available idle compute against the compute
requirements of appropriate stub models that partially process the user data.
NeuroSplit then routes the stub models' outputs to the right cloud model
endpoints to finish the AI inference tasks.
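To make the stub-model idea concrete, here is a minimal sketch in generic
PyTorch. NeuroSplit's internals and API have not been published, so the toy
model, the split point, and the way the "cloud endpoint" is represented are all
assumptions made purely for illustration.

```python
# Illustrative sketch only: this is NOT Skymel's implementation. It shows the
# general idea of split ("stub model") inference -- early layers run on the
# device, the rest stands in for a cloud endpoint. The split point is arbitrary.
import torch
import torch.nn as nn

# A toy network standing in for a larger production model.
full_model = nn.Sequential(
    nn.Linear(128, 256), nn.ReLU(),   # early layers: candidate "stub"
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),               # later layers: stay in the cloud
)

SPLIT_AT = 2  # hypothetical split index; an adaptive system would choose this
              # based on the device's measured idle compute.

stub_model = full_model[:SPLIT_AT]    # would run on the user's device
cloud_model = full_model[SPLIT_AT:]   # stands in for the cloud model endpoint

def run_split_inference(x: torch.Tensor) -> torch.Tensor:
    """Partially process input locally, then finish 'in the cloud'."""
    with torch.no_grad():
        intermediate = stub_model(x)      # on-device partial processing
        return cloud_model(intermediate)  # only activations leave the device

print(run_split_inference(torch.randn(1, 128)).shape)  # torch.Size([1, 10])
```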
NeuroSplit scales better than pure cloud offerings, as every new user brings a
device. It enables AI-powered application providers to reduce both the
specifications of their AI servers (in terms of average video random-access
memory (VRAM) and compute capacity) and the over-provisioning, usually around
30%, used to handle peak traffic.
NeuroSplit reduces the compute application developers need to around 60-70% of
what was previously required, enabling the use of fewer and/or lower-cost GPUs
to develop and deliver AI-powered applications. For example, an application that
previously required multiple Nvidia A100s at an average cost of $2.74 per hour
can instead use a single A100, or multiple V100/V100S GPUs at 83 cents per hour,
when using NeuroSplit. It does so while enhancing the user experience by
reducing response latency between application servers and the end user.
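As a rough back-of-the-envelope illustration of those hourly rates, the sketch
below compares monthly GPU bills; the fleet sizes are assumptions chosen for
illustration, not Skymel benchmarks.

```python
# Back-of-the-envelope comparison using the hourly rates quoted above.
# The GPU counts are illustrative assumptions, not measured Skymel results.
A100_HOURLY = 2.74   # USD per A100-hour (figure from the interview)
V100_HOURLY = 0.83   # USD per V100/V100S-hour (figure from the interview)
HOURS_PER_MONTH = 24 * 30

baseline   = 2 * A100_HOURLY * HOURS_PER_MONTH  # assumed: two A100s today
one_a100   = 1 * A100_HOURLY * HOURS_PER_MONTH  # option 1: single A100
v100_fleet = 3 * V100_HOURLY * HOURS_PER_MONTH  # option 2: three V100-class GPUs

for label, cost in [("Baseline (2x A100)", baseline),
                    ("NeuroSplit (1x A100)", one_a100),
                    ("NeuroSplit (3x V100)", v100_fleet)]:
    print(f"{label:22s} ${cost:8,.2f}/month  ({1 - cost / baseline:.0%} saved)")
```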
NP: No one else today is splitting a single AI model's inference between user
devices and cloud servers. Our NeuroSplit technology does so dynamically,
adapting to the idle compute available on the user's device and thus ensuring an
optimal user experience, especially when multiple applications are running
simultaneously. The other companies that look like us fall into three
categories. Some, like Lambda and CoreWeave, lease specialized hardware in the
cloud. Others, like Hugging Face, Baseten, Replicate, Together.ai and Groq,
offer managed cloud platforms for serving AI models. Still others, like Deci and
Tiny ML, offer edge-optimized models. None of these services optimally utilizes
the combined power of local and cloud compute. Skymel works with existing cloud
backends and can provide up to 60% reductions in ongoing cloud costs. Further,
Skymel serves large, complex models that cannot fit on local devices.
VMblog:
What about funding? Where are you in your fundraising journey, and how will you
use your current funding to expand Skymel?
NP: We raised $500,000 in pre-seed equity funding from Unusual Ventures as part
of the firm's Unusual Academy, and we have an additional $25,000 in angel
funding. We will use these funds, along with the seed round we are currently
raising, to expand the engineering team so that we can accelerate NeuroSplit to
market and serve the needs of our existing enterprise customers. Sushant and I
are currently the only full-time employees, supported by a couple of
contractors. Our plan is to have around 10 full-time employees by the end of the
year, which will allow us to speed up development and get this technology to
more users.
I would add that we are currently pre-revenue and not yet projecting ARR. But
given the savings we offer application developers and providers, the high cost
of AI compute, and the shortage of GPUs and data-center space, we aspire to play
a key role in addressing these issues and bolstering the incredible growth of AI
use. We have also received enthusiastic interest from executives at large
enterprises. In view of all of this, we expect Skymel to build a healthy revenue
stream.
VMblog: You mentioned that NeuroSplit offloads AI compute onto
end-user devices. Is there any concern for users about the data on their
devices interacting with Skymel?
ST: Skymel's NeuroSplit runs AI inference pipelines on user data directly on
users' devices. Neither the data nor the inference results are the property of
Skymel; Skymel is merely the orchestration agent and doesn't store any data. Any
data retention policies are determined by the enterprises that own the AI
applications using Skymel's NeuroSplit, so our customers can be confident that
neither their data nor their user experience will be affected as a result of
using our technology.
VMblog: What does NeuroSplit mean for the data or technical
decision-maker? For example, why should a C-suite-level manager in the IT
department care about it; what's the enterprise angle?
NP: Most enterprises that run applications on end-user devices are highly
motivated to provide an engaging user experience while keeping their cloud costs
low. Skymel's NeuroSplit caters to both of these needs without requiring a
codebase overhaul or a change to cloud infrastructure. It can simply be included
in the client-facing application with a few lines of code.
VMblog: Are there any key challenges you are facing while bringing a
new company into the AI industry?
NP: The need for more and better talent is definitely a challenge in this space.
Still, we are confident that, with the uniqueness of our offering and vision and
the funds we are raising in our seed round, we can attract and retain the right
people. The other big challenge is the broadness of messaging and positioning in
the AI space. Even though no one is doing what we do, and many of the VCs and
others we talk to say they didn't think it could be done, there is a lot of
messaging out there that leads people to conflate leased-GPU and other offerings
with NeuroSplit. Cutting through the clutter in AI is tough right now, with so
much focus on the industry.
That said, our burn rate
is very low and our value prop is directly tied to customer savings and
increased user engagement. Given that, we feel really good about our prospects
and are certain we will build a strong team to deliver NeuroSplit's value to more
customers.
VMblog: Is there anything else we should
know? What can VMblog readers expect from Skymel in the future?
NP: We launched out of stealth on May 22, and NeuroSplit is currently available
as a preview. Companies can contact us directly, and we will work with them to
deploy and trial the solution. NeuroSplit will be available in beta in the
coming months, and we are excited to see the benefits it brings to customers and
the innovation it unlocks.
##