Virtualization Technology News and Information
How to make ML accessible to all

By Josh Mesout, Chief Innovation Officer, Civo

Generative AI may be garnering all of the attention right now, but there are many branches of Artificial Intelligence (AI) that start-ups and enterprises have been looking to capitalize on for a long time. Machine Learning (ML) is no exception. Indeed, 74% of executives anticipate that AI will deliver more efficient business processes, and 55% said it will enable the creation of new products and services.

Yet in reality, 85% of ML projects fail to deliver and only 53% of projects make it from prototype to production. Engineers and developers know the high potential that ML has for their organizations, but there are huge challenges in realizing it.

The hurdles

Part of the issue with ML projects is that it takes a vast amount of time and resources to build the supporting infrastructure for only a minimal amount of ML insight. These components range from complex areas such as feature extraction to more labor-intensive tasks such as setting up process management tools.

According to D. Sculley at Google Research, eight hours of ML engineering depend on 96 hours of infrastructure engineering, 32 hours of data engineering, and 24 hours of feature engineering. Broken down as percentages of the 160 total hours, 60% are spent on infrastructure engineering, 20% on data engineering, 15% on feature engineering, and only 5% on ML engineering.
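The percentage split follows directly from the hour counts quoted above. As a minimal sketch (the names `hours` and `effort_shares` are illustrative assumptions, not from the cited research), the arithmetic can be reproduced like this:

```python
# Hours per activity, as quoted in the article.
hours = {
    "infrastructure engineering": 96,
    "data engineering": 32,
    "feature engineering": 24,
    "ML engineering": 8,
}


def effort_shares(hours_by_activity):
    """Return each activity's share of the total hours, as a percentage."""
    total = sum(hours_by_activity.values())
    return {name: 100 * h / total for name, h in hours_by_activity.items()}


shares = effort_shares(hours)
for name, pct in shares.items():
    print(f"{name}: {pct:.0f}%")
```

Under these figures, every hour of ML engineering carries 19 hours of supporting work, which is the overhead the rest of this article is concerned with reducing.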

Organizations are having to spend vast amounts of time reconfiguring their adjacent infrastructure to achieve their ML goals. Smaller organizations with more limited resources do not have that time to spare for such a small reward.

Open sourcing 

As a result of these significant internal demands to run ML, more and more engineers are becoming dependent on open source to help resolve these issues. According to Anaconda's State of Data Science 2022 report, 65% of companies lack the investment in tooling to enable high-quality ML production, with 87% of organizations already leveraging open source software.  

Organizations look towards open source machine learning products for a variety of reasons.

For start-ups or smaller organizations, open source delivers the most cost- and resource-effective method for running ML algorithms. Spending two months learning how to use AWS SageMaker before accessing ML insights isn't a feasible use of time for many businesses, especially when non-proprietary infrastructure is available instantly.

On top of being more economical, open source often offers the most in-demand tooling alongside superior product quality. Not being stuck behind proprietary dependencies means the tooling can be easily adjusted for specific cases, reducing the complexity involved in extracting ML's valuable insights.

Open source also allows organizations to leverage the latest ML expertise available. Many of the most popular projects, such as Kubeflow, are frequently contributed to by some of the best and the brightest minds in the industry, so organizations can capitalize on external knowledge that may otherwise be out of their reach and focus their domain expertise on the problem.

All of these benefits go some way to resolving the common issues found with ML. Yet there is always more that can be done: 32% of ML engineers want to see further simplification in the data science community, which would further smooth the learning curve for drawing insights from ML.

What else can be done?

To drive the accessibility of ML to the next level, a constructive ecosystem needs to be built and maintained. The users and resources are already there but need to be channeled in the correct manner. Investing in open source cloud ecosystems can remove barriers to the adoption of ML and make it more accessible.

The first port of call is developing more interoperable tooling. It is clear which tools developers favor, and the correct infrastructure and maintenance can help to support them sustainably. Increasing the accessibility of the tooling developers already know reduces setup time, because developers don't have to go through a learning period when setting up algorithms for new use cases and instances.

Barriers to ML can also be reduced through a variety of different solutions. GPU Edge boxes will allow ML to be run effectively in on-prem, hybrid, and edge-based use cases, making them ideal for secure workloads that need to be kept in-house.

GPU instances provide streamlined methods for running ML, with fast launch times and bandwidth pooling. More importantly for organizations, however, GPU instances provide a transparent pricing model. As such, there will be fewer unknown costs that can take smaller companies by surprise and leave a huge dent in their budget.

Fractional GPU instances provide similar benefits to GPU instances but may be more appropriate for those operating at a smaller scale, whether small businesses or hobbyists. By bringing into the ecosystem those who traditionally may not have had access to ML, understanding and accessibility can be increased for all.

ML shouldn't be a closed shop, where its potential is only realized by those of scale. Through prioritization of developers' needs and open source tooling, ML can be made accessible to all.

##

To learn more about the transformative nature of cloud native applications and open source software, join us at KubeCon + CloudNativeCon Europe 2023, hosted by the Cloud Native Computing Foundation, which takes place from April 18-21.    

ABOUT THE AUTHOR

Josh Mesout, Chief Innovation Officer, Civo


Mesout spent seven years at AstraZeneca, where he led the teams building the company's enterprise machine learning and AI platforms. In addition, Mesout has led the technical implementation of deep learning-based clinical diagnostics using cloud native technologies, built rapid prototypes in its Innovation Lab, and won an award for his contribution towards implementing ML into the AstraZeneca COVID-19 vaccine. Mesout has a long history with cloud and ML, assisting in the creation of learning materials, qualifications, and exams for AWS' ML platform, SageMaker.
 
Mesout now works to accelerate Civo's mission of building a better cloud-native world. He currently leads Civo's ML program, building ML infrastructure that will lessen the workload on developers, significantly reducing the time from ML idea to insight.

Published Monday, April 10, 2023 10:09 AM by David Marshall