DataCebo, a company founded by Kalyan Veeramachaneni and Neha Patki from MIT’s Data to AI Lab, has launched an enterprise version of its open-source synthetic data library, Synthetic Data Vault (SDV). With a vision of creating data using generative AI, DataCebo has developed software that lets businesses generate synthetic data from relational and tabular databases. Companies can then use realistic business data for purposes such as training and testing machine learning models without compromising sensitive information.
Key Takeaway
DataCebo has released an enterprise version of its synthetic data library, enabling businesses to generate custom synthetic data using generative AI. The technology gives businesses a scalable way to build and test models without exposing sensitive information. With $8.5 million in seed funding, DataCebo plans to expand its operations in the coming year.
Revolutionizing Data Generation
Traditionally, companies have relied on manual methods for creating synthetic data, a labor-intensive and error-prone process that is difficult to scale. DataCebo’s generative AI technology simplifies that process. Users describe the data they want, and the software automatically analyzes the characteristics of the original data set to generate a realistic synthetic counterpart. Businesses get meaningful data sets to work with, while the sensitive records in the original stay private.
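To make the idea of "analyzing the characteristics of the original data set" concrete, here is a deliberately simplified sketch in plain Python. It is not DataCebo's algorithm and does not use the SDV library: the function names `fit_column_models` and `sample_rows`, the example columns, and the per-column approach are all assumptions made for illustration. It learns basic statistics for each column independently and then samples new rows from them, whereas a real synthetic data tool would also model relationships between columns and tables.

```python
import random
import statistics
from collections import Counter


def fit_column_models(rows):
    """Learn a simple model for each column from a list of dict rows.

    Hypothetical helper for illustration only: numeric columns are
    summarized by mean, standard deviation, and observed range;
    all other columns by the frequency of each category.
    """
    models = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        if is_numeric:
            models[column] = (
                "numeric",
                statistics.mean(values),
                statistics.pstdev(values),
                min(values),
                max(values),
            )
        else:
            counts = Counter(values)
            models[column] = (
                "categorical",
                list(counts.keys()),
                list(counts.values()),
            )
    return models


def sample_rows(models, num_rows, seed=0):
    """Draw synthetic rows from the learned per-column models."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(num_rows):
        row = {}
        for column, model in models.items():
            if model[0] == "numeric":
                _, mean, std, low, high = model
                value = rng.gauss(mean, std)
                # Clamp to the observed range so values stay plausible.
                row[column] = min(max(value, low), high)
            else:
                _, categories, weights = model
                row[column] = rng.choices(categories, weights=weights, k=1)[0]
        synthetic.append(row)
    return synthetic


# Made-up "real" data standing in for a sensitive business table.
real_rows = [
    {"age": 34, "plan": "basic", "monthly_spend": 20.0},
    {"age": 51, "plan": "premium", "monthly_spend": 75.5},
    {"age": 27, "plan": "basic", "monthly_spend": 18.0},
    {"age": 45, "plan": "premium", "monthly_spend": 80.0},
]

models = fit_column_models(real_rows)
synthetic_rows = sample_rows(models, num_rows=5)
for row in synthetic_rows:
    print(row)
```

The synthetic rows follow the overall shape of the input (similar age range, same plan categories) without copying any single record. Because the columns are sampled independently, correlations such as premium customers spending more are lost, which is the kind of structure a generative model is meant to capture.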
The Power of Open Source
DataCebo initially developed an open-source version of SDV, which has drawn more than a million downloads and an active community of users. This public platform has validated the company’s core algorithms and allowed for continuous improvement. Through user feedback and bug reports, DataCebo has been able to fine-tune its software and address issues promptly.
Scaling for Enterprise
While the open-source version of SDV was designed to handle a limited number of tables, the enterprise version is built to accommodate up to a hundred tables. This scalability lets businesses build complex models that draw on a larger number of data sources. DataCebo’s commitment to providing a comprehensive solution for enterprise-scale data generation has positioned the company as a leader in the field.
Driving Innovation and Growth
With $8.5 million in seed funding led by Link Ventures and Zetta Venture Partners, with participation from Uncorrelated Ventures, DataCebo is well-positioned to drive innovation and expand its operations. The company currently employs 11 people and, depending on business growth, expects to have a team of around 20 within the next year.