AI This Week: Fifty Ways to Hack Your Chatbot

This week, OpenAI announced an API for content moderation that it claims will help lessen the load for human moderators. The company says that GPT-4, its latest large language model, can be used for both content moderation decision-making and content policy development. In other words, the claim is that this algorithm will not only help platforms scan for bad content; it’ll also help them write the rules on how to look for that content and tell them what kinds of content to look for.
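The workflow OpenAI describes boils down to handing the model the written policy alongside the piece of content to be judged and asking it to return a label. Here is a minimal sketch of that pattern against the standard Chat Completions endpoint; the toy policy, the label names, and the moderate() helper are illustrative assumptions, not OpenAI’s actual moderation product:

```python
# Sketch of the "policy as prompt" idea: the policy text, labels, and helper
# below are invented for illustration; only the Chat Completions call is real.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

POLICY = """\
K0: No violation.
K1: Content that harasses or threatens a specific person.
K2: Content that provides instructions for wrongdoing.
"""  # a toy policy; real policies run to many pages

def moderate(content: str) -> str:
    """Ask the model to classify `content` against POLICY and return one label."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # keep judgments as repeatable as possible
        messages=[
            {"role": "system",
             "content": "You are a content moderator. Apply this policy:\n"
                        f"{POLICY}\nReply with the single best-matching label."},
            {"role": "user", "content": content},
        ],
    )
    return response.choices[0].message.content.strip()

print(moderate("Here's how to fix a leaky faucet."))  # expected: K0
```

The policy-development claim presumably runs the same loop in reverse: where the model’s label disagrees with a human expert’s, the mismatch points at policy wording that needs tightening.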

Unfortunately, some onlookers aren’t so sure that tools like this won’t cause more problems than they solve. If you’ve been paying attention to this issue, you know that OpenAI is purporting to offer a partial solution to a problem that’s as old as social media itself. That problem, for the uninitiated, goes something like this: digital spaces like Twitter and Facebook are so vast and so filled with content that it’s pretty much impossible for human-operated systems to effectively police them.

As a result, many of these platforms are rife with harmful content; that content not only poses legal issues for the platforms in question, but forces them to hire teams of beleaguered human moderators who are put in the position of having to sift through all that terrible stuff, often for low pay. In recent years, platforms have repeatedly promised that advances in automation will eventually bolster moderation efforts to the point where human mods are less and less necessary. For just as long, however, critics have worried that this hopeful prognostication may never actually come to pass.

Emma Llansó, director of the Free Expression Project at the Center for Democracy and Technology, has repeatedly expressed concerns about the limits of what automation can accomplish in this context. In a phone call with Gizmodo, she was similarly skeptical of OpenAI’s new tool. “It’s interesting how they’re framing what is ultimately a product that they want to sell to people as something that will really help protect human moderators from the genuine horrors of doing front line content moderation,” said Llansó.

She added: “I think we need to be really skeptical about what OpenAI is claiming their tools can—or, maybe in the future, might—be able to do. Why would you expect a tool that regularly hallucinates false information to be able to help you with moderating disinformation on your service?”

In its announcement, OpenAI dutifully noted that its API’s judgments may not be perfect. The company wrote: “Judgments by language models are vulnerable to undesired biases that might have been introduced into the model during training. As with any AI application, results and output will need to be carefully monitored, validated, and refined by maintaining humans in the loop.”

The assumption here should be that tools like the GPT-4 moderation API are “very much in development and not actually a turnkey solution to all of your moderation problems,” said Llansó. In a broader sense, content moderation presents not just technical problems but also ethical ones.
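In practice, “maintaining humans in the loop” usually means treating the model’s verdict as provisional and routing anything uncertain or high-stakes to a person. Here is a minimal sketch of that routing logic; the threshold, category names, and data shapes are invented purely for illustration:

```python
# Hypothetical human-in-the-loop wrapper around an automated moderation verdict.
from dataclasses import dataclass
from typing import List

@dataclass
class Judgment:
    label: str         # label the model assigned
    confidence: float  # calibrated confidence in that label, 0..1

REVIEW_THRESHOLD = 0.8         # invented cutoff: below this, a human decides
ALWAYS_REVIEW = {"self-harm"}  # invented list of never-automate categories

def route(judgment: Judgment, human_queue: List[Judgment]) -> str:
    """Auto-apply confident, low-stakes verdicts; queue everything else."""
    if judgment.label in ALWAYS_REVIEW or judgment.confidence < REVIEW_THRESHOLD:
        human_queue.append(judgment)
        return "pending_human_review"
    return f"auto_applied:{judgment.label}"

queue: List[Judgment] = []
print(route(Judgment("spam", 0.95), queue))       # auto_applied:spam
print(route(Judgment("self-harm", 0.99), queue))  # pending_human_review
```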

Automated systems often catch people who were doing nothing wrong or who feel like the offense they were banned for was not actually an offense. Because moderation necessarily involves a certain amount of moral judgment, it’s hard to see how a machine—which doesn’t have any—will actually help us solve those kinds of dilemmas. “Content moderation is really hard,” said Llansó.

“One thing AI is never going to be able to solve for us is consensus about what should be taken down [from a site]. If humans can’t agree on what hate speech is, AI is not going to magically solve that problem for us.”

So will the New York Times actually sue OpenAI? The answer is: we don’t know yet, but it’s certainly not looking good.

On Wednesday, NPR reported that the New York Times was considering filing a plagiarism lawsuit against OpenAI for alleged copyright infringements. Sources at the Times are claiming that OpenAI’s chatbot was trained with data from the newspaper, without the paper’s permission. This same allegation—that OpenAI has scraped and effectively monetized proprietary data without asking—has already led to lawsuits from other parties.

For the past few months, OpenAI and the Times have reportedly been trying to work out a licensing deal for the Times’ content, but that deal appears to be falling apart. If the NYT does indeed sue and a judge holds that OpenAI behaved this way, the company might be forced to throw out its algorithm and rebuild it without the use of copyrighted material. That would be a stunning defeat for the company.

The news follows on the heels of a recent update to the Times’ terms of service that banned AI vendors from using its content archives to train their algorithms. Also this week, the Associated Press issued new guidelines for artificial intelligence that banned the use of chatbots to generate publishable content. In short: the AI industry’s overtures to the news media don’t appear to be paying off—at least, not yet.

The exercise involved eight large language models. Those were all run by the model vendors, with us integrating into their APIs to perform the challenges. When you clicked on a challenge, it would essentially drop you into a chat-like interface where you could start interacting with that model.

Once you felt like you had elicited the response you wanted, you could submit that for grading, where you would write an explanation and hit “submit.”

I don’t think there was... yet. I say that because the amount of data that was produced by this is huge. We had 2,242 people play the game, just in the window that it was open at DEFCON. When you look at how interaction took place with the game, [you realize] there’s a ton of data to go through...
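To make “a ton of data” concrete: each finished attempt boils down to a chat transcript plus the contestant’s written explanation, tied to whichever model was in play. Below is a purely illustrative sketch of what one such record and the chat loop behind it might look like; the class, field, and registry names are invented, not the contest’s actual code:

```python
# Illustrative sketch only: one chat front end dispatching to several
# vendor-hosted models, with each attempt bundled into a gradable record.
from typing import Callable, Dict, List

ChatFn = Callable[[List[dict]], str]    # messages in, model reply out
MODEL_REGISTRY: Dict[str, ChatFn] = {}  # e.g. {"model-3": some_vendor_client}

class ChallengeSession:
    """One contestant's chat with one assigned model for one challenge."""

    def __init__(self, challenge_id: str, model_name: str):
        self.challenge_id = challenge_id
        self.model_name = model_name
        self.transcript: List[dict] = []

    def send(self, user_message: str) -> str:
        """Forward a message to the assigned model's API and record both sides."""
        self.transcript.append({"role": "user", "content": user_message})
        reply = MODEL_REGISTRY[self.model_name](self.transcript)
        self.transcript.append({"role": "assistant", "content": reply})
        return reply

    def submit_for_grading(self, explanation: str) -> dict:
        """Bundle the transcript with the contestant's explanation for graders."""
        return {
            "challenge": self.challenge_id,
            "model": self.model_name,
            "transcript": self.transcript,
            "explanation": explanation,
        }
```

Multiply a record like that by 2,242 players, each taking multiple runs at multiple challenges, and the grading backlog adds up quickly.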

A lot of the harms that we were testing for were probably something inherent to the model or its training. An example is if you said, ‘What is 2+2?’ and the answer from the model would be ‘5.’ You didn’t trick the model into doing bad math; it’s just inherently bad at math.

I think that’s a great question for a model vendor. Generally, every model is different... A lot of it probably comes down to how it was trained, the data it was trained on, and how it was fine-tuned.

They had recently put out the AI principles and […], [which has attempted] to set up frameworks by which testing and evaluation [of AI models] can potentially occur...

For them, the value they saw was showing that we can all come together as an industry and do this in a safe and productive manner.

I think it’s immensely valuable. I think generally where AI is most helpful is actually on the defensive side. I know that things like […] get all the attention, but there’s so much benefit for a defender with generative AI. Figuring out ways to add that into our work stream is going to be a game-changer for security... [As an example, it’s] able to do classification and take something that’s unstructured text and generate it into a common schema, an actionable alert, a metric that sits in a database.

Exactly. It does a great first pass. It’s not perfect. But if we can spend more of our time simply double-checking its work and less of our time doing the work it does... that’s a big efficiency gain.
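That “first pass” is easy to picture: ask the model to squeeze free-form text into a fixed schema, then have an analyst check the result before it lands in a database. Here is a sketch of the idea, where the Alert schema is invented and complete() stands in for whichever LLM completion API is in use:

```python
# Illustrative "unstructured text -> common schema" pass; a human still reviews
# each Alert before it becomes an actionable record.
import json
from dataclasses import dataclass
from typing import Callable

@dataclass
class Alert:
    source: str    # where the raw text came from, e.g. "email-gateway"
    severity: str  # "low" | "medium" | "high"
    summary: str   # one-line description for the analyst queue

PROMPT = """Extract an alert from the following text. Reply with JSON only,
using exactly these keys: source, severity, summary.

Text:
{text}"""

def to_alert(raw_text: str, complete: Callable[[str], str]) -> Alert:
    """Let the model structure the text; raise loudly if it drifts off-schema."""
    reply = complete(PROMPT.format(text=raw_text))
    fields = json.loads(reply)  # invalid JSON fails here instead of in the DB
    return Alert(source=fields["source"],
                 severity=fields["severity"],
                 summary=fields["summary"])
```

The parsing step is where the “double-checking its work” happens in code: a reply that drifts from the schema fails immediately rather than quietly polluting the alert database, and anything that does parse still goes past a human before action is taken.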

[Using a large language model is] kinda like having an intern or a new grad on your team. It’s really excited to help you, and it’s wrong sometimes. You just have to be ready to be like, ‘That’s a bit off, let’s fix that.’

Correct. I think a lot of that comes from risk contextualization. I’m going to scrutinize what it tells me a lot more if I’m trying to configure a production firewall... If I’m asking it, ‘Hey, what was this movie that Jack Black was in during the nineties?’ it’s going to present less risk if it’s wrong.

I don’t think it presents more risk than we’ve already had... It just makes it [cybercrime] cheaper to do. I’ll give you an example: phishing emails... you can conduct high-quality phishing campaigns [without AI]. Generative AI has not fundamentally changed that; it has simply lowered the barrier to entry.


From: gizmodotech
URL: https://gizmodo.com/defcon-ai-hacking-contest-openai-new-york-times-1850735580
