Industry executives and experts share their predictions for 2023. Read them in this 15th annual VMblog.com series exclusive.
Synthetic Data and Generative AI Will Simulate the World - What to Expect in 2023
By Yashar Behzadi, CEO and Founder of Synthesis AI
Artificial Intelligence (AI) continues to fuel innovation across
industries, and the field is poised to heat up even more in the year ahead.
With many predicting a recession, there will be a greater focus on faster
R&D cycles and cutting costs from "data inflation" - the upward creep of
expenses associated with collecting and labeling data for machine learning
models. Additionally, the widespread adoption of generative AI is bringing
ethical and privacy concerns to the forefront.
Synthetic data will play a major role in combating data inflation
and addressing consumer privacy issues in computer vision model development. A report from
Synthesis AI found that 89% of decision-makers believe that synthetic data will
be critical to the future of many industries and will lead to widespread
change, indicating that these technologies will only continue to impact
organizations in all sectors as we enter the new year.
Synthetic Data: The Key to Addressing Generative AI Ethical Concerns
Generative AI has dominated headlines, and the hype surrounding the
technology is continuing to grow. Data remains the most critical aspect in
building generative AI systems, but using real-world data poses ethical and
privacy concerns. The use of real-world data is only becoming more challenging
as individual countries and economic blocs implement a patchwork of regulations
for data collection, data storage, and more.
Development teams will increasingly use synthetic data when
creating ML models to limit bias and address privacy concerns associated with
datasets collected from the real world. AI adoption is steadily rising, with
over 55% of organizations reporting AI as a core function in 2021, up from 50%
in 2020. As innovation in the space continues to accelerate, it will be imperative for
organizations to invest in the tools and technologies that help mitigate bias
and ensure generative AI models are built in a more ethical and
privacy-compliant way.
Ramping Innovation, Reducing Costs: How to Address Data Inflation in AI Development
Do more with less: It's the mantra of corporate America, especially
during times of economic uncertainty. During cyclical downturns, it's important
to remember that scaling back does not have to stifle innovation. Organizations
will need to continue investing in the tools and technology required to advance
their processes, products and services, but in a much smarter and more
efficient way.
Over the past few years, several factors have led to spiraling costs for
machine learning training data. First, the hardware used for data collection has suffered from
the same supply-chain challenges as the rest of the economy, with many
specialized providers relying on a small number of niche semiconductor
companies for chip components. Even as supply chain issues resolve over time,
the fact remains that it's expensive to build and deploy customized hardware
arrays for collecting data. Second, labeling data is a complicated task for
humans, with no economies of scale. As computer vision systems increase in
complexity, so, too, does the work required to label data for training them,
which increases labor costs. And third, more and more limitations are being
placed on the commercial usage of public datasets coming from academia as
awareness of the problem of "data laundering" becomes more widespread.
Limiting supply (by reducing commercial
access to public datasets) while increasing demand (as more companies invest in
ML model development), all other factors being equal, results in higher costs
for labeled, real-world data.
Synthetic data provides an elegant solution for addressing all of
these cost drivers. No specialized hardware is required for data collection.
Synthetic data is labeled perfectly during the creation process, completely
eliminating the need for human annotation. Finally, there are very few supply
constraints with synthetic data - its availability is functionally limitless -
and because it's typically owned and managed by the company that creates it, ML
practitioners don't have to worry about whether their training data has been
ethically sourced.
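To make the labeling point concrete, here is a minimal sketch of why a synthetic sample arrives with its annotation already attached. It is a toy illustration only, not how Synthesis AI or any production pipeline works: the function names `make_synthetic_sample` and `bounding_box_from_pixels`, the 32x32 image size, and the single-rectangle "scene" are all assumptions made up for this example. The idea is simply that the generator chooses the object's position before rendering it, so the bounding box is known by construction rather than drawn by a human afterward.

```python
import random


def make_synthetic_sample(width=32, height=32, seed=None):
    """Render one toy image containing a single bright rectangle.

    Returns (image, label), where label is the exact bounding box
    (x_min, y_min, x_max, y_max) that was used to draw the rectangle.
    All names and sizes here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)

    # Choose the rectangle first: these parameters ARE the ground-truth label.
    x_min = rng.randint(0, width - 2)
    y_min = rng.randint(0, height - 2)
    x_max = rng.randint(x_min + 1, width - 1)
    y_max = rng.randint(y_min + 1, height - 1)

    # Render the image from the same parameters.
    image = [[0] * width for _ in range(height)]
    for y in range(y_min, y_max + 1):
        for x in range(x_min, x_max + 1):
            image[y][x] = 255

    return image, (x_min, y_min, x_max, y_max)


def bounding_box_from_pixels(image):
    """Recover the bounding box by scanning pixels (what an annotator would do)."""
    xs = [x for row in image for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(image) if any(row)]
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    image, label = make_synthetic_sample(seed=0)
    # The label produced at generation time matches what the pixels show.
    print(label == bounding_box_from_pixels(image))
```

In a real computer vision pipeline the "scene" would be a rendered 3D environment and the labels might include depth, segmentation masks, or landmarks, but the principle is the same: the annotation is a by-product of generation.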
Generative AI Changed How We Create Pictures. Emerging 3D Generative AI Will Simulate the World
Generative AI enabled the creation of 2D images from text prompts,
creating new opportunities for artistic expression. Over the next year, we'll
see the technology spur further transformation as companies take these models
one step further to generate 3D models. This emerging capability will change
the way games are built, visual effects are produced, and immersive 3D
environments are developed. For industrial uses, democratizing this technology
will create opportunities for digital twins and simulations to train complex
computer vision systems, such as autonomous vehicles.
There is no doubt that the momentum around AI and other
technologies will continue to transform our lives. Each wave of AI innovation
builds upon the last, and generative AI's opportunity to create virtual 3D
worlds by simulation is no different.
Despite being nascent technologies only beginning to scratch the
surface with enterprise adoption, synthetic data and generative AI hold great
promise to disrupt the AI paradigm as we know it. As we enter 2023, the
potential for synthetic data and generative AI is boundless.
##
ABOUT THE AUTHOR
Yashar Behzadi, CEO and Founder of Synthesis AI
Yashar Behzadi is an experienced entrepreneur who has built
transformative businesses in AI, medical technology, and IoT markets. Now the
CEO at Synthesis AI, he has spent the last 14 years in Silicon Valley building and
scaling data-centric technology companies. His work at Proteus Digital Health
was recognized by Wired as one of the top 10 technological breakthroughs of
2008 and as a Technology Pioneer by the World Economic Forum. Yashar has over
30 patents and patents pending and a Ph.D. in Bioengineering from UCSD.