Virtualization Technology News and Information
Generative AI Startup DataCebo Launches to Bring Synthetic Data to All Enterprises With $8.5 Million in Seed Funding

DataCebo emerged with SDV Enterprise, a commercial offering of the popular open source product, Synthetic Data Vault (SDV). With SDV Enterprise, developers can easily build, deploy and manage sophisticated generative AI models for enterprise-grade applications when real data is limited or unavailable. SDV Enterprise's models create higher quality synthetic data that is statistically similar to original data so developers can effectively test applications and train robust ML models. SDV Enterprise is currently in beta with the Global 2000.
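
To make "statistically similar" concrete, here is a minimal sketch in plain Python. It is not SDV's or SDV Enterprise's algorithm or API: the `fit` and `sample` helpers and the toy `real` table are hypothetical, invented only for illustration. The sketch learns a per-column summary of a small table and draws new rows from it. It treats columns as independent, which production generative models do not.

```python
import random
import statistics


def fit(rows):
    """Learn a per-column summary of the real rows (hypothetical helper).

    Numeric columns are summarized by mean and standard deviation;
    all other columns by the observed category frequencies.
    """
    model = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        if all(isinstance(v, (int, float)) for v in values):
            model[col] = ("numeric", statistics.mean(values), statistics.pstdev(values))
        else:
            cats = sorted(set(values))
            model[col] = ("categorical", cats, [values.count(c) for c in cats])
    return model


def sample(model, n, seed=0):
    """Draw n synthetic rows, sampling each column independently."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        row = {}
        for col, spec in model.items():
            if spec[0] == "numeric":
                # Assumption: a normal distribution is a good enough fit.
                row[col] = rng.gauss(spec[1], spec[2])
            else:
                row[col] = rng.choices(spec[1], weights=spec[2])[0]
        out.append(row)
    return out


# Toy "real" table, made up for this example.
real = [
    {"age": 34, "plan": "basic"},
    {"age": 45, "plan": "premium"},
    {"age": 29, "plan": "basic"},
    {"age": 52, "plan": "basic"},
    {"age": 41, "plan": "premium"},
    {"age": 38, "plan": "basic"},
]

synthetic = sample(fit(real), 1000)
real_mean = statistics.mean(r["age"] for r in real)
synthetic_mean = statistics.mean(r["age"] for r in synthetic)
```

The synthetic rows contain none of the original records, yet their column averages and category mix track the real table. Because each column is sampled on its own, this toy model loses any relationship between columns and between tables.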

Today, Global 2000 organizations have 500 to 2,000 applications for which they need to create synthetic data 12 times a year.

"Our developers spend a lot of time creating data manually to test their applications. We had been looking for generative AI-based solutions that can automate and create high quality synthetic data for our needs. Of all the solutions we looked at, DataCebo's SDV Enterprise was the best fit to handle the complexity of our data. With SDV Enterprise, we were able to generate synthetic data within hours - what otherwise took our developers days or weeks in some cases. Currently we have used SDV Enterprise in 13 applications and the demand is growing exponentially," said Wim Blommaert, product owner of AI-powered synthetic data generation at ING Belgium, one of the biggest banks in the world.

"Synthetic data can help companies reduce the bias and privacy risks that are common with real-world data. DataCebo helps data teams generate synthetic tabular datasets using generative adversarial networks (GANs) in a rapid, accurate way, so they can train more ML models in a given timeframe," said Kevin Petrie, vice president of research at Eckerson Group.

DataCebo co-founders Kalyan Veeramachaneni (CEO) and Neha Patki (vice president of product) created SDV while at MIT's Data to AI Lab. SDV lets developers build a proof-of-concept generative AI model for small tabular and relational datasets with simple schemas and use it to create synthetic data. SDV has been downloaded more than a million times and has the largest community around synthetic data. DataCebo was founded in 2020 to revolutionize developer productivity at enterprises by leveraging generative AI.

Veeramachaneni said: "The ability to build generative models on-prem is critical for enterprises. Their data is proprietary and is very specific. In our first year, we quickly learned that this unique capability that SDV Enterprise provides is a massive enabler for them. Our customers often ask whether they need massive hardware or specific hardware requirements to use SDV Enterprise. They are often surprised that with SDV Enterprise, they can train generative models on a single machine. This opens up a new horizon of possibilities for training and using these models and applying them to a variety of use cases. As one customer said, if we have to spend $100,000 to train a model, it simply reduces the number of use cases we can use it for." 

DataCebo's first product, SDV Enterprise, takes SDV to the next level by providing every team with synthetic data 1,000x faster with 10x the quality. Key features include:

  • Scalability: developers can train a generative AI model with much larger datasets and complex schemas with hundreds of interconnected tables
  • Deep Data Understanding: developers can train models that understand the deeper meaning behind real world data concepts like the structure of a phone number and which geographical areas it represents
  • Programmability: developers can fine-tune the generative AI model stack using low-code APIs by supplying their data schema, business logic and evaluation criteria
  • Integration: developers can deploy synthetic data applications by ingesting and exporting data in a variety of different formats
  • Management: developers can manage multiple synthetic data applications, track changes and update their generative AI models as their applications grow and change

DataCebo Raises $8.5 Million in Seed Funding

DataCebo also announced that it has raised $8.5 million in seed funding co-led by Link Ventures and Zetta Venture Partners. Uncorrelated Ventures also participated. The company plans to use the funding to advance its product and to build a go-to-market team. 

"We are thrilled to support this world-class MIT team. Their leadership in generative modeling for complex enterprise data is unlike others in the synthetic data industry. We are confident in this team's ability to lead the category, enabling the next users of AI models to connect statistical to computational outcomes and sew the fabric of open to closed source synthetic data generation," said Dave Blundin, co-founder and managing partner at Link Ventures and a DataCebo board member.

"The huge enthusiasm of the open source community and the ROI enjoyed by early commercial adopters have shown DataCebo to be a product leader in the emerging field of generative AI for synthetic data. It is rare to find a company whose products serve as both pathbreakers and standard-bearers, and we are very excited to invest in this amazing team from MIT knowing that they will continue to push the envelope," said Mark Gorenberg, founder and managing director at Zetta Venture Partners and a DataCebo board member.

Published Thursday, December 07, 2023 2:36 PM by David Marshall