OpenAI's Dilemma with Text Watermarking

OpenAI used to have a 'text classifier' for detecting AI-generated text, but discontinued it because of its low accuracy. The company has since developed a much better way to detect AI-generated text, but it is now weighing whether to release it, given its potential effects on different layers of society.

Image credits: The Verge

AI-generated text detectors play an important role, especially in academia. For teachers, they help curb students turning to AI to do their work for them, which undermines the actual learning experience. But with how far AI has advanced in so little time, AI-generated text has become much harder to detect.

OpenAI claims that its new method is highly accurate and "99.9% effective." It works by making small changes to how ChatGPT puts words together when it generates text, creating a recognizable pattern, an invisible watermark, that a separate tool can then detect as AI-generated. This method applies only to ChatGPT models, not to AI models from other companies.
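OpenAI has not disclosed how its watermark actually works. A common approach in the research literature, however, is "green list" statistical watermarking: the generator biases each word choice toward a pseudorandom subset of the vocabulary derived from the preceding context, and a detector that knows the scheme measures how often that bias shows up. The sketch below is a toy illustration of that general idea only, not OpenAI's method; the vocabulary, function names, and bias strength are all invented for demonstration.

```python
import hashlib
import random

# Tiny stand-in vocabulary; a real system operates on the model's tokens.
VOCAB = ["the", "cat", "sat", "on", "a", "mat", "dog", "ran", "fast", "home"]

def green_list(prev_word, fraction=0.5):
    # Seed a PRNG with a hash of the previous word so the generator and
    # the detector derive the exact same "green" subset of the vocabulary.
    seed = int(hashlib.sha256(prev_word.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * fraction)))

def generate(length=50, watermark=True, seed=0):
    rng = random.Random(seed)
    words = ["the"]
    for _ in range(length):
        greens = green_list(words[-1])
        if watermark:
            # Watermarked generation: restrict choices to green-listed words
            # (a real scheme would merely boost their probability).
            pool = [w for w in VOCAB if w in greens]
        else:
            pool = VOCAB
        words.append(rng.choice(pool))
    return words

def green_fraction(words):
    # Detector: count how often each word falls in the green list derived
    # from its predecessor. Unbiased text hovers near the green fraction
    # (0.5 here); watermarked text scores far higher.
    hits = sum(1 for prev, w in zip(words, words[1:]) if w in green_list(prev))
    return hits / (len(words) - 1)
```

In this toy version, watermarked output lands in the green list essentially every time, while ordinary text hovers around 0.5; a real detector turns that gap into a statistical confidence score over thousands of tokens, which is what makes the pattern invisible to readers yet detectable by the tool.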

This may turn out to be a boon for teachers in academic systems, but it poses potential issues, especially for those who don't speak English natively. "While it has been highly accurate and even effective against localized tampering, such as paraphrasing," OpenAI says in a blog post, "it is less robust against globalized tampering, like using translation systems, rewording with another generative model, or asking the model to insert a special character in between every word and then deleting that character, making it trivial to circumvention by bad actors."

OpenAI acknowledges that some people legitimately rely on AI assistance with their writing; for many, augmenting their work with AI is already part of daily life. With this new method of detecting AI-generated text, "it could stigmatize the use of AI as a useful writing tool" for non-native English speakers. Genuine work that was merely polished with AI could be branded inauthentic, since the detector simply flags the invisible watermarks wherever they are present.

“The text watermarking method we’re developing is technically promising but has important risks we’re weighing while we research alternatives, including susceptibility to circumvention by bad actors and the potential to disproportionately impact groups like non-English speakers,” an OpenAI spokesperson said.

Interestingly, OpenAI is also concerned about the effect this watermarking could have on ChatGPT users. In a survey, about 30 percent of users said they would use the software less if this watermarking method were deployed.

The concern mirrors that of non-native English speakers: no one wants their work flagged as AI-generated every time they happen to work with AI.

The company admits that this method is "trivial to circumvention by bad actors," but it has at least developed something that works in some cases and could be very effective there. While OpenAI is unsure how a rollout would play out, it is confident that there will not be false positives.



Posted Using InLeo Alpha


