This Week in AI: AI isn’t world-ending — but it’s still plenty harmful

Hiya, folks, welcome to TechCrunch’s regular AI newsletter.

This week in AI, a new study shows that generative AI really isn’t all that harmful — at least not in the apocalyptic sense.

In a paper submitted to the Association for Computational Linguistics’ annual conference, researchers from the University of Bath and the Technical University of Darmstadt argue that models like those in Meta’s Llama family can’t learn independently or acquire new skills without explicit instruction.

The researchers conducted thousands of experiments to test the ability of several models to complete tasks they hadn’t come across before, like answering questions about topics that were outside the scope of their training data. They found that, while the models could superficially follow instructions, they couldn’t master new skills on their own.

“Our study shows that the fear that a model will go away and do something completely unexpected, innovative and potentially dangerous is not valid,” Harish Tayyar Madabushi, a computer scientist at the University of Bath and co-author on the study, said in a statement. “The prevailing narrative that this type of AI is a threat to humanity prevents the widespread adoption and development of these technologies, and also diverts attention from the genuine issues that require our focus.”

There are limitations to the study. The researchers didn’t test the newest and most capable models from vendors like OpenAI and Anthropic, and benchmarking models tends to be an imprecise science. But the research is far from the first to find that today’s generative AI tech isn’t humanity-threatening — and that assuming otherwise risks regrettable policymaking.

In an op-ed in Scientific American last year, AI ethicist Alex Hanna and linguistics professor Emily Bender made the case that corporate AI labs are misdirecting regulatory attention to imaginary, world-ending scenarios as a bureaucratic maneuver. They pointed to OpenAI CEO Sam Altman’s appearance in a May 2023 congressional hearing, during which he suggested, without evidence, that generative AI tools could go “quite wrong.”

“The broader public and regulatory agencies must not fall for this maneuver,” Hanna and Bender wrote. “Rather we should look to scholars and activists who practice peer review and have pushed back on AI hype in an attempt to understand its detrimental effects here and now.”

Theirs and Madabushi’s are key points to keep in mind as investors continue to pour billions into generative AI and the hype cycle nears its peak. There’s a lot at stake for the companies backing generative AI tech, and what’s good for them — and their backers — isn’t necessarily good for the rest of us.

Generative AI might not cause our extinction. But it’s already causing harm in other ways: see the spread of nonconsensual deepfake porn, wrongful facial recognition arrests and the hordes of underpaid data annotators. Hopefully policymakers see this too, or come around eventually. If not, humanity may very well have something to fear.

News

Google Gemini and AI, oh my: Google’s annual Made By Google hardware event took place Tuesday, and the company announced a ton of updates to its Gemini assistant — plus new phones, earbuds and smartwatches. Check out TechCrunch’s roundup for all the latest coverage.

AI copyright suit moves forward: A class action lawsuit filed by artists who allege that Stability AI, Runway AI and DeviantArt illegally trained their AIs on copyrighted works can move forward, but only in part, the presiding judge decided on Monday. In a mixed ruling, several of the plaintiffs’ claims were dismissed while others survived, meaning the suit could end up at trial.

Problems for X and Grok: X, the social media platform owned by Elon Musk, has been targeted with a series of privacy complaints after it helped itself to the data of users in the European Union for training AI models without asking people’s consent. X has agreed to stop EU data processing for training Grok — for now.

YouTube tests Gemini brainstorming: YouTube is testing an integration with Gemini to help creators brainstorm video ideas, titles and thumbnails. Called Brainstorm with Gemini, the feature is currently available only to select creators as part of a small, limited experiment.

OpenAI’s GPT-4o does weird stuff: OpenAI’s GPT-4o is the company’s first model trained on voice as well as text and image data. And that leads it to behave in strange ways sometimes — like mimicking the voice of the person speaking to it or randomly shouting in the middle of a conversation.

Research paper of the week

There are tons of companies out there offering tools they claim can reliably detect text written by a generative AI model, which would be useful for, say, combating misinformation and plagiarism. But when we tested a few a while back, the tools rarely worked. And a new study suggests the situation hasn’t improved much.

Researchers at UPenn built a dataset and accompanying leaderboard, the Robust AI Detector (RAID), containing more than 10 million AI-generated and human-written recipes, news articles, blog posts and more, to measure the performance of AI text detectors. They found the detectors they evaluated to be “mostly useless” (in the researchers’ words), working only when applied to specific use cases and to text similar to what they were trained on.

“If universities or schools were relying on a narrowly trained detector to catch students’ use of [generative AI] to write assignments, they could be falsely accusing students of cheating when they are not,” Chris Callison-Burch, professor in computer and information science and a co-author on the study, said in a statement. “They could also miss students who were cheating by using other [generative AI] to generate their homework.”   

There’s no silver bullet when it comes to AI text detection, it seems — the problem’s an intractable one.

Reportedly, OpenAI itself has developed a new text-detection tool for its AI models — an improvement over the company’s first attempt — but is declining to release it over fears it might disproportionately impact non-English users and be rendered ineffective by slight modifications in the text. (Less philanthropically, OpenAI is also said to be concerned about how a built-in AI text detector might impact perception — and usage — of its products.)

Model of the week

Generative AI is good for more than just memes, it seems. MIT researchers are applying it to flag problems in complex systems like wind turbines.

A team at MIT’s Computer Science and Artificial Intelligence Lab developed a framework, called SigLLM, that includes a component to convert time-series data — measurements taken repeatedly over time — into text-based inputs a generative AI model can process. A user can feed these prepared data to the model and ask it to start identifying anomalies. The model can also be used to forecast future time-series data points as part of an anomaly-detection pipeline. 
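To make the idea concrete, here is a minimal Python sketch of what turning time-series readings into text might look like, along with a simple way to flag readings that stray far from a forecast. This is an illustration only, not SigLLM’s actual code: the function names, the comma-separated integer encoding and the mean-plus-k-standard-deviations threshold are all assumptions made for the example, and the forecast is assumed to come from whatever model is in use.

```python
from statistics import mean, pstdev


def series_to_text(values, decimals=2):
    """Encode a numeric series as a comma-separated string of integers.

    Hypothetical encoding for illustration (not taken from SigLLM):
    shift the series so its minimum is 0, scale by 10**decimals,
    round to integers and join with commas.
    Returns the text and the offset needed to decode it.
    """
    offset = min(values)
    scale = 10 ** decimals
    tokens = [str(round((v - offset) * scale)) for v in values]
    return ",".join(tokens), offset


def text_to_series(text, offset, decimals=2):
    """Decode text produced by series_to_text back into floats."""
    scale = 10 ** decimals
    return [int(tok) / scale + offset for tok in text.split(",")]


def flag_anomalies(actual, forecast, k=1.5):
    """Return indices where the forecast error is unusually large.

    Assumed rule: an error counts as anomalous if it exceeds the mean
    error plus k population standard deviations.
    """
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    threshold = mean(errors) + k * pstdev(errors)
    return [i for i, e in enumerate(errors) if e > threshold]
```

In this sketch, a sensor trace would be passed through series_to_text before being handed to a model as a prompt, the model’s text output would be decoded with text_to_series, and flag_anomalies would compare the decoded forecast against what the sensor actually recorded.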

The framework didn’t perform exceptionally well in the researchers’ experiments. But if its performance can be improved, SigLLM could, for example, help technicians flag potential problems in equipment like heavy machinery before they occur.

“Since this is just the first iteration, we didn’t expect to get there from the first go, but these results show that there’s an opportunity here to leverage [generative AI models] for complex anomaly detection tasks,” Sarah Alnegheimish, an electrical engineering and computer science graduate student and lead author on a paper on SigLLM, said in a statement.

Grab bag

OpenAI upgraded ChatGPT, its AI-powered chatbot platform, to a new base model this month — but released no changelog (well, barely a changelog).

there’s a new GPT-4o model out in ChatGPT since last week. hope you all are enjoying it and check it out if you haven’t! we think you’ll like it 😃

— ChatGPT (@ChatGPTapp) August 12, 2024

So what to make of it? What can one make of it, exactly? There’s nothing to go on but anecdotal evidence from subjective tests.

I think Ethan Mollick, a professor at Wharton studying AI, innovation and startups, had the right take. It’s hard to write release notes for generative AI models because the models “feel” different from one interaction to the next; they’re largely vibes-based. At the same time, people use, and pay for, ChatGPT. Don’t they deserve to know what they’re getting into?

It could be that the improvements are incremental, and OpenAI believes it’s unwise for competitive reasons to signal this. Less likely is that the model relates somehow to OpenAI’s reported reasoning breakthroughs. Regardless, when it comes to AI, transparency should be a priority. There can’t be trust without it, and OpenAI has lost plenty of that already.