OpenAI Can’t Fix Sora’s Copyright Infringement Problem Because It Was Built With Stolen Content

Source

OpenAI Can’t Fix Sora’s Copyright Infringement Problem Because It Was Built With Stolen Content, by Emanuel Maiberg at 404media. Accessed November 13, 2025.

Summary

This article describes how difficult it is to prevent large GenAI models such as Sora 2 from generating content that infringes intellectual property.
It discusses the difficulty of controlling the instructions (prompts) given to the model, since users find ways to evade the restrictions.
It also notes that, even when the output does not visibly infringe intellectual property, the training data still consists of works created by other people, and the big AI corporations are not willing to pay for them.
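
The article names keyword blocking as the simplest and cheapest moderation method, and says it fails when a prompt alludes to the target without using a banned word. The following is a minimal sketch of that idea, not OpenAI's actual filter: the blocklist terms, the `is_blocked` helper, and both sample prompts are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical blocklist for illustration only; not OpenAI's real list.
BLOCKED_TERMS = {"mario", "pikachu", "nintendo"}


def is_blocked(prompt: str) -> bool:
    """Return True if the prompt contains any blocked term as a whole word."""
    words = set(re.findall(r"[a-z0-9]+", prompt.lower()))
    return not BLOCKED_TERMS.isdisjoint(words)


# A prompt that names the character directly is caught.
direct = "Mario racing a go-kart"

# A prompt that only alludes to the character passes the filter.
allusive = "Italian plumber with a red cap and mustache racing a go-kart"

print(is_blocked(direct))    # True
print(is_blocked(allusive))  # False
```

The filter only inspects the words in the prompt, so it says nothing about what the model can produce from its training data, which is the gap the article points to.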

Excerpts

The flaw in OpenAI’s attempt to stop users from generating videos of Nintendo and popular cartoon characters exposes a fundamental problem with most generative AI tools: it is extremely difficult to completely stop users from recreating any kind of content that’s in the training data, and OpenAI can’t remove the copyrighted content from Sora 2’s training data because it couldn’t exist without it.
There are several ways to moderate generative AI tools, but the simplest and cheapest method is to refuse to generate prompts that include certain keywords.
However, this method is prone to failure because users find prompts that allude to the image or video they want to generate without using any of those banned words.
The biggest AI companies in the world have admitted that they need this copyrighted content, and that they can’t pay for it.
The reason OpenAI and other AI companies have such a hard time preventing users from generating certain types of content once users realize it’s possible is that the content already exists in the training data.
For OpenAI to actually stop the copyright infringement it needs to make its Sora 2 model “unlearn” copyrighted content, which is incredibly expensive and complicated.
Even when the generated video isn’t recognizably lifting from someone else’s work, that’s what it’s doing. There’s literally nothing else there. It’s just other people’s stuff.

Intellectual Property, ICT, Digital Culture, GenAI

All rights to the quoted material belong to their respective owners.