Virtualization Technology News and Information
VMblog Expert Interview: Skymel Launches NeuroSplit Adaptive Inferencing, Tackling AI Compute Costs by Letting You Run Latest GenAI Models on Older GPUs


Innovation in AI is exploding, with companies eager to leverage the technology to deliver new applications and services. But the development of AI applications is significantly limited by the high cost of compute resources, and both the cost of and the demand for compute continue to grow. To understand a new approach to AI inference compute management, I spoke with Neetu Pathak and Sushant Tripathy - the founders and, respectively, CEO and CTO of Skymel. Skymel emerged from stealth today with NeuroSplit, an "adaptive inferencing" technology that changes the way AI applications consume resources.

VMblog: Tell us about Skymel. What's the backstory, and why did you found the company?

Neetu Pathak: Skymel was founded by me and my co-founder, Sushant Tripathy. I serve as CEO while Sushant is our CTO. Previously, I worked at Redis and Fortella while Sushant led machine-learning initiatives at Google and PayPal. While working on an on-device ML project for Google Assistant, Sushant realized that a lot of compute power is present but under-utilized on user devices (such as smartphones, laptops, and desktops). This under-utilization is part and parcel of the cloud-centric services that are prevalent today. In order to provide personalized services, most companies prefer to upload user data rather than process it on-device.

At the same time, I noticed a few emerging trends. Customer companies were ready to pay a premium for decreasing the response latency of their user-facing products. Additionally, most companies lacked the proper skillset or had poor communication amongst different verticals, which resulted in poorly organized network configuration of cloud assets. This led to additional cost increases.

As both of us were cognizant of these points, we decided to create Skymel's NeuroSplit, a technology that will harness idle compute on users' devices to enhance their user experience and provide corporations with the lower response latencies that they crave while reducing their AI inference costs on existing infrastructure (by up to 60%).

VMblog: It sounds like NeuroSplit is your core offering. What does it do, and who is it for? How is it differentiated from what's already out there?

Sushant Tripathy: NeuroSplit dynamically assesses available idle compute on the user device, using it first to power the AI application while utilizing cloud-based compute as needed to ensure optimal AI application performance. It is specifically engineered to smartly off-load AI compute tasks onto users' devices based on the match between the device's available idle compute and the compute requirement of appropriate stub models that partially process the user data. NeuroSplit also routes the stub model's outputs to the right cloud model endpoints to finish the AI inference tasks.
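To make the matching step concrete, here is a minimal sketch of how a client might choose a stub model from the device's idle compute and decide where to send its output. This is an illustration only, not Skymel's implementation or API: the `StubModel` type, the `pick_stub` and `route` functions, the compute units, and the endpoint URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class StubModel:
    """Hypothetical description of an on-device partial model."""
    name: str
    required_compute: float  # assumed unit: any consistent measure of compute
    cloud_endpoint: str      # assumed endpoint that finishes the inference


def pick_stub(stubs: Sequence[StubModel], idle_compute: float) -> Optional[StubModel]:
    """Return the most demanding stub that still fits in idle compute.

    Returns None when no stub fits, meaning the device does no local work.
    """
    fitting = [s for s in stubs if s.required_compute <= idle_compute]
    return max(fitting, key=lambda s: s.required_compute, default=None)


def route(stubs: Sequence[StubModel], idle_compute: float,
          full_cloud_endpoint: str) -> Tuple[str, str]:
    """Return (what runs locally, which cloud endpoint receives the result)."""
    stub = pick_stub(stubs, idle_compute)
    if stub is None:
        # Fall back to running the whole model in the cloud.
        return ("cloud-only", full_cloud_endpoint)
    return (stub.name, stub.cloud_endpoint)


# Hypothetical stubs and endpoints, for illustration only.
STUBS = [
    StubModel("small-stub", 2.0, "https://example.invalid/tail-small"),
    StubModel("large-stub", 8.0, "https://example.invalid/tail-large"),
]
FULL = "https://example.invalid/full-model"
```

With these made-up numbers, `route(STUBS, 5.0, FULL)` selects `small-stub`, while a device with only 1.0 units of idle compute falls back to the full cloud model. A real system would re-evaluate idle compute continuously rather than once.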

NeuroSplit scales better than pure cloud offerings, as every new user brings a device. It enables AI-powered application providers to reduce both the specifications of their AI servers (in terms of average video random-access memory (VRAM) and compute capacity) and their over-provisioning (usually around 30%, to handle peak traffic).

NeuroSplit reduces the amount of compute needed by application developers to around 60-70% of what was previously needed, enabling the use of fewer and/or lower-cost GPUs to develop and deliver AI-powered applications. For example, an application that previously required multiple Nvidia A100s at an average cost of $2.74 per hour can use either a single A100 or multiple V100 or V100S GPUs at 83 cents per hour when using NeuroSplit. It does so while enhancing user experience by reducing response latency between application servers and the end user.
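The arithmetic behind the GPU example can be sketched as below. The two hourly rates come from the interview; treating them as per-GPU rates and the GPU count of four are assumptions made purely for illustration, as are the function names.

```python
A100_RATE = 2.74  # USD per hour, from the interview (assumed to be per GPU)
V100_RATE = 0.83  # USD per hour, from the interview (assumed to be per GPU)


def hourly_cost(gpu_count: int, rate_per_gpu: float) -> float:
    """Total hourly cost of a fleet of identical GPUs."""
    return gpu_count * rate_per_gpu


def savings_fraction(before: float, after: float) -> float:
    """Fraction of the original hourly cost that is saved."""
    return 1.0 - after / before


# Hypothetical baseline: four A100s before adopting device offloading.
baseline = hourly_cost(4, A100_RATE)

# Option 1: a single A100.
single_a100 = hourly_cost(1, A100_RATE)

# Option 2: the same number of GPUs, but V100-class instead of A100.
four_v100 = hourly_cost(4, V100_RATE)

print(f"baseline:    ${baseline:.2f}/h")
print(f"single A100: ${single_a100:.2f}/h "
      f"({savings_fraction(baseline, single_a100):.0%} saved)")
print(f"four V100s:  ${four_v100:.2f}/h "
      f"({savings_fraction(baseline, four_v100):.0%} saved)")
```

Under these assumed counts, both options cut the hourly GPU bill by well over half. Actual savings depend on how many GPUs an application needs before and after offloading, which the interview does not specify.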

NP: No one else is splitting compute for a single AI model's inferences between user devices and cloud servers today. Our NeuroSplit technology does so dynamically, adapting to the idle compute available on the user's device, thus ensuring optimal user experience - especially in simultaneously running applications. Other companies that look like us fall into three categories. Some, like Lambda and CoreWeave, offer leased specialized hardware on the cloud. Others, like Hugging Face, Baseten, Replicate, Together.ai and Groq, offer a managed cloud platform for serving AI models. Still others, like Deci and Tiny ML, offer edge-optimized models. None of these services are utilizing the combined power of local and cloud compute optimally. Skymel works with the currently existing cloud backends and will provide up to 60% cost reductions on ongoing cloud costs. Further, Skymel serves large complex models that cannot fit on local devices.

VMblog: What about funding; where are you in your fundraising journey, and how will you use current funding to expand Skymel?

NP: We raised $500,000 in pre-seed equity funding from Unusual Ventures as part of the firm's Unusual Academy and have an additional $25,000 in angel funding. We will use these funds and the funds from the seed round we are currently raising to expand the engineering team so that we can accelerate NeuroSplit to market to serve the needs of our existing enterprise customers. Sushant and I are currently the only full-time employees, and we have a couple of contractors. Our plan is to have around 10 full-time employees by the end of the year, which will allow us to really speed up our development and get this technology to more users.

I would add that we are currently pre-revenue and are not yet projecting ARR, but with the savings we offer application developers/providers and the high cost of AI compute, compounded by the shortage of GPUs and data center space, we aspire to play a key role in addressing these issues to bolster the incredible growth of AI use. We have also received enthusiastic interest from executives of large enterprise companies. In view of all these observations, we expect that Skymel will have a healthy revenue stream.

VMblog: You mentioned that NeuroSplit offloads AI compute onto end-user devices. Is there any concern for users about the data on their devices interacting with Skymel?

ST: Skymel's NeuroSplit runs AI inference pipelines on user data on their devices. Neither the data nor the inference results are the property of Skymel - Skymel is merely the orchestration agent and doesn't store any data. Any data retention policies are determined by the enterprises that own the AI application that uses Skymel's NeuroSplit, so our customers can be confident that neither their data nor their user experience will be affected as a result of using our technology.

VMblog: What does NeuroSplit mean for the data or technical decision-maker? For example, why should a C-suite-level manager in the IT department care about it; what's the enterprise angle?

NP: Most enterprises that run applications on end-user devices are extremely motivated to provide a highly engaging user experience while keeping their cloud costs low. Skymel's NeuroSplit caters to both of these needs without requiring a codebase overhaul or cloud infrastructure change. It can be simply included in the client-facing application with a few lines of code.

VMblog: Are there any key challenges you are facing while bringing a new company into the AI industry?

NP: The need for more and better talent is definitely a challenge in this space. Still, we are confident that with the uniqueness of our offering and vision and with the funds we are raising in our seed round, we can attract and retain the right people. The other big challenge is the broadness of the messaging and positioning in the AI space. Even though no one is doing what we do, and so many of the VCs and others we are talking to say they didn't even think it could be done, there is a lot of messaging out there that confuses people into thinking that the leased GPU and other offerings are like NeuroSplit. Cutting through the clutter in AI is tough right now, with so much focus on the industry.

That said, our burn rate is very low and our value prop is directly tied to customer savings and increased user engagement. Given that, we feel really good about our prospects and are certain we will build a strong team to deliver NeuroSplit's value to more customers. 

VMblog: Is there anything else we should know? What can VMblog readers expect from Skymel in the future?

NP: We launched out of stealth on May 22, and NeuroSplit is currently available as a preview. Companies can contact us directly and we will work with them to deploy and trial the solution. NeuroSplit will be made available as a beta in the coming months, and we are excited to see the benefits it brings to customers and the innovation it will unlock.

##

Published Thursday, May 23, 2024 11:57 AM by David Marshall