Copyright Group Takes Down Dutch Language AI Dataset

Posted by msmash from the tussle-continues dept.

Netherlands-based copyright enforcement group BREIN has taken down a large language dataset that was being offered for use in training AI models, the organization said on Tuesday. From a report: The dataset included information collected without permission from tens of thousands of books, news sites, and Dutch language subtitles harvested from “countless” films and TV series, BREIN said in a statement. Director Bastiaan van Ramshorst told Reuters it was not clear whether or how widely the dataset may already have been used by AI companies. “It’s very difficult to know, but we are trying to be on time” to avoid future lawsuits, he said. He added that the European Union’s AI Act will require AI firms to disclose what datasets they have used to train their models.

