ChatGPT Maker Raises Concerns: Chinese AI Models May Leverage OpenAI Data

by Daniel, Feb 20, 2025

OpenAI suspects that China's DeepSeek AI models, which are significantly cheaper than their Western counterparts, may have been trained using OpenAI data. The allegation, coupled with DeepSeek's rapid rise in popularity, coincided with a stock market plunge for major AI players. Nvidia, a key supplier of GPUs for AI, suffered the largest single-day loss of market value for any company in Wall Street history, shedding nearly $600 billion. Microsoft, Meta, Alphabet, and Dell Technologies also experienced significant drops.

DeepSeek's R1 model, built on the open-source DeepSeek-V3, boasts significantly lower training costs (estimated at $6 million) compared to Western models. While this claim is disputed, it highlights the potential threat of cheaper alternatives and raises concerns about the massive investments made by American tech companies in AI.

OpenAI and Microsoft are investigating whether DeepSeek used OpenAI's API to perform "distillation," a technique in which a smaller model is trained on the outputs of a larger one. Doing so would violate OpenAI's terms of service. OpenAI says that Chinese companies constantly attempt to replicate leading US AI models and emphasizes its efforts to protect its intellectual property, including collaborating with the US government.

Donald Trump's AI advisor, David Sacks, supports the claim that DeepSeek used distillation and suggests that leading AI companies will implement measures to prevent the practice.

The situation is ironic, given OpenAI's own history. OpenAI previously argued that creating AI models like ChatGPT is impossible without using copyrighted material, a stance it set out in its submission to the UK's House of Lords. That position has come under scrutiny in lawsuits from the New York Times and 17 authors alleging copyright infringement. OpenAI maintains that its training practices constitute "fair use." The ongoing debate underscores the complexities surrounding copyright in the rapidly evolving landscape of generative AI. A 2018 US Copyright Office ruling that AI-generated art is not copyrightable further complicates the legal issues.