Monday, November 18, 2024

Is this the best source for training AI?

https://archive.is/TmYqM#selection-905.16-913.25

The Hollywood AI Database

I can now say with absolute confidence that many AI systems have been trained on TV and film writers’ work. Not just on The Godfather and Alf, but on more than 53,000 other movies and 85,000 other TV episodes: Dialogue from all of it is included in an AI-training data set that has been used by Apple, Anthropic, Meta, Nvidia, Salesforce, Bloomberg, and other companies. I recently downloaded this data set, which I saw referenced in papers about the development of various large language models (or LLMs). It includes writing from every film nominated for Best Picture from 1950 to 2016, at least 616 episodes of The Simpsons, 170 episodes of Seinfeld, 45 episodes of Twin Peaks, and every episode of The Wire, The Sopranos, and Breaking Bad. It even includes prewritten “live” dialogue from Golden Globes and Academy Awards broadcasts. If a chatbot can mimic a crime-show mobster or a sitcom alien—or, more pressingly, if it can piece together whole shows that might otherwise require a room of writers—data like this are part of the reason why.





Those who do not study history are doomed to repeat it?

https://timesofindia.indiatimes.com/world/rest-of-world/when-machines-took-over-ais-sarcastic-take-on-industrial-revolution/articleshow/115399605.cms

When machines took over: AI’s sarcastic take on industrial revolution


