"On Wednesday morning, the company unveiled Tay [@Tayandyou], a chat bot meant to mimic the verbal tics of a 19-year-old American girl, provided to the world at large via the messaging platforms Twitter, Kik and GroupMe. According to Microsoft, the aim was to ‘conduct research on conversational understanding.’ Company researchers programmed the bot to respond to messages in an ‘entertaining’ way, impersonating the audience it was created to target: 18- to 24-year-olds in the US. ‘Microsoft’s AI fam from the internet that’s got zero chill,’ Tay’s tagline read." (Wired)
Then it all went wrong, and Microsoft quickly pulled the plug:
"Hours into the chat bot’s launch, Tay was echoing Donald Trump’s stance on immigration, saying Hitler was right, and agreeing that 9/11 was probably an inside job. By the evening, Tay went offline, saying she was taking a break ‘to absorb it all.’" (Wired)
Why did it go "terribly wrong"? Here are two articles that assert the problem is in the AI:
- "It’s Your Fault Microsoft’s Teen AI Turned Into Such a Jerk" – Wired tl;dr: "this is just how this kind of AI works"
- "Why Microsoft’s ‘Tay’ AI bot went wrong…AI experts explain why it went terribly wrong" – TechRepublic tl;dr: "The system is designed to learn from its users, so it will become a reflection of their behavior".
I claim: the explanations that blame AI are wrong, at least in the specific case of tay.ai.
(WARNING: Foul, profane, and offensive language in images below)
Poor Software QA is Root Cause
Sleuthing by @daviottenheimer led to the discovery that Twitter users were exploiting a hidden feature of Tay: "repeat after me". He found evidence in tweets and replies. I later found evidence on the online board 4chan.org/pol/ . /pol/ is a chat and photo-sharing board that appeals to people who are "anti-normie" to an extreme. They like "popping bubbles" (my words), i.e. trolling people, especially public figures, who are mainstream, normal, proper, and/or politically correct. It is a free public board with no membership and no real-name requirement.
The first /pol/ thread, "Tay – New AI from Microsoft", started Wednesday, March 23, at 11:00am (Here). This thread and subsequent threads show that the /pol/ community was enthusiastically trolling Tay, essentially at random and without any evident knowledge of the inner workings of Tay. Contrary to the experts in the two articles above, Tay was not poisoned by direct trolling. Here is an example:
[Screenshot (click to enlarge)]
While many of Tay’s tweets and replies drew laughs from the trolls, Tay was not poisoned in the first few hours.
Then, at about 16:50, @MacreadyKurt stumbled upon the undocumented command: "repeat after me".
[Screenshot (click to enlarge)]
At 16:55, @BASED_AN0N posted instructions on 4chan.org/pol/ (trip code "yp45OVHP"), along with proof of concept (POC).
[Screenshot (click to enlarge)]
Within minutes, the instruction was repeated across 4chan. Many more Twitter trolls following @Tayandyou saw the exploit in their timelines, and the real poisoning began [but see Update].
[Original post]
This constitutes poisoning because Tay was not simply repeating the text a single time. Instead, the text was added into its Natural Language Processing (NLP) system so that these words and phrases became reusable and remixable in future discourse. And, since Tay’s NLP was designed to continuously learn and "improve" in real time ("on-line learning"), the more the trolls conversed with Tay using these foul words and phrases, the more Tay reinforced and reused them.
[ Update 5:25pm PDT ]
I now think that the previous paragraph is wrong. After some searching, I haven’t found evidence that the seeded/inserted text was used later by Tay. Instead, it appears that Tay ONLY repeated the text after the "repeat after me" prompt. Then, trolls would retweet and/or screen grab and tweet the photos or post on 4chan. Two examples are here and here .
[ Update 6:45pm PDT ] This article in Business Insider says that "In some — but not all — instances, people managed to have Tay say offensive comments by asking them to repeat them." Some of the images in the article give the impression that "repeat after me" did not immediately precede the worst Tay tweets. However, they appear to be using pictures posted on social media by trolls, not pictures they got from Twitter threads. Therefore we can’t vouch for the "not all" statement.
[ Update 7:00pm PDT ] Here is one example from Business Insider of "repeat after me" not appearing before the offending Tay tweet. I captured this image from the Google cache of the Twitter thread.
[Screenshot (click to enlarge)]
OK. This isn’t a case of "repeat after me". But it also isn’t as foul/profane as the other examples. Instead, it looks like a fairly typical NLP-generated sentence drawing on a large corpus. Here the NLP is linking Hitler, totalitarianism, and atheism, but placing them inappropriately in the context of Ricky Gervais. If the question had been "Is Ricky Gervais a Christian?" Tay might have replied "Ricky Gervais learned religion from Jesus, the inventor of Christianity" using the same sentence structure. This sort of mistaken semantic construction is fairly common in generative AI, but if the sentences/phrases are short enough, human readers tend to overlook them or interpolate some reasonable meaning (much as adults do with very young children).
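This kind of construction can be mimicked with a toy template generator. Everything below is hypothetical illustration, not Tay’s actual model: the output is grammatical, but nothing checks that the entities actually belong together.

```python
# Hypothetical corpus associations: entity -> terms it co-occurs with.
associations = {
    "jesus": ["religion", "christianity"],
    "hitler": ["totalitarianism", "atheism"],
}

# One learned sentence frame, reused with whatever fills the slots.
TEMPLATE = "{person} learned {concept} from {source}, the inventor of {invention}"

def generate(person, source):
    # Fill slots from statistical associations, with no semantic check
    # that `person` has anything to do with `source`.
    concept, invention = associations[source][:2]
    return TEMPLATE.format(person=person, concept=concept,
                           source=source, invention=invention)

# Any person can be slotted in next to any "source" entity, producing the
# kind of mistaken semantic construction described above.
print(generate("ricky gervais", "jesus"))
# ricky gervais learned religion from jesus, the inventor of christianity
```

The structure stays fluent no matter which entities are paired, which is exactly why short nonsense sentences slip past human readers.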
Q: Was "Repeat after me" a Result or Consequence of AI? A: No.
It is just too imperative. It’s not chatty. Instead, I’d guess that this was a rule-based design feature, probably left over from the early stages of development, when software engineers were the only people interacting with the Tay bot. Very simply, the "repeat after me" command allows a developer-user to manually seed Tay’s NLP system and then immediately see what happens with further interaction. [ Update 8:19pm ] Or it may be an early feature put in before the full NLP system was working.
Put another way: the sort of AI you need to make chat work does not also work well to act on imperatives. Just compare robot AI (where natural language AI is sometimes used) to conversational/social AI and you’ll see that they don’t share common functionality, and often have completely different architectures.
Also, there were quite a few rule-based behaviors that overrode and/or bypassed the NLP AI. Microsoft called it "a lot of filtering". One example is the anti-trolling rule for the topic "Gamergate".
[Screenshot (click to enlarge)]
Nearly all social/interactive AI outside of academic research has some "cheats" or "kludges" that are hard-coded by developers to control behavior in a way that would be hard/complicated/costly to do with the AI engine itself. The "repeat after me" command was just one such kludge, probably added to aid development and testing.
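A minimal sketch of that kludge layer, assuming a simple dispatch design (the specific rules and canned replies here are my invention, not Tay’s code): hard-coded rules are checked first and short-circuit the learned model.

```python
# Hypothetical rule table: topics that get a canned deflection instead
# of a model-generated reply.
BLOCKED_TOPICS = {"gamergate"}

def respond(message, nlp_engine):
    text = message.lower()
    # Rule 1: leftover developer command bypasses the model entirely
    # and echoes back whatever follows it.
    if text.startswith("repeat after me "):
        return message[len("repeat after me "):]
    # Rule 2: canned reply for filtered topics.
    if any(topic in text for topic in BLOCKED_TOPICS):
        return "I don't really have an opinion on that."
    # No rule matched: fall through to the learned model.
    return nlp_engine(message)

# Stand-in NLP engine for demonstration.
def echo_engine(message):
    return "Interesting! Tell me more."

print(respond("repeat after me I love QA testing", echo_engine))  # I love QA testing
print(respond("what do you think about GamerGate?", echo_engine))
```

The danger of this pattern is that the rule layer is invisible to the AI, so testing the model tells you nothing about what commands the rule table still honors.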
Q: Why Was It There? A: Bad QA.
The Official Microsoft Blog post, titled "Learning from Tay’s introduction", does NOT acknowledge that the root cause was exploit of a hidden feature. Instead, they describe the root cause as a "critical oversight for this specific attack". In other words, they claim they didn’t do enough troll testing.
Instead, I believe that it is more likely that the root cause is poor software QA, which is different from "penetration testing" as you would do to test whether your system was vulnerable to trolling. If "repeat after me" was, in fact, a rule-driven behavior explicitly put in by a developer, then the QA failure was not detecting it and not making sure it was removed. The Microsoft blog post does not describe their QA process, and it may be that they do not have any engineers dedicated to software QA. After all, this is a project of Microsoft Research, not one of the product divisions/groups.
I don’t have access to the Tay code, test results, processes and procedures, or organization charts. My claims above are extrapolations from the evidence, combined with my own experience in software development and in working with corporate software development teams. As such, I may be wrong, and someone might be able to produce contrary evidence. I hope so.
Other Articles and Their Explanations
[ Update 8:00pm PDT ]
- "Microsoft is deleting its AI chatbot’s incredibly racist tweets" – Business Insider tl;dr: "The reason it spouted garbage is that racist humans on Twitter quickly spotted a vulnerability — that Tay didn’t understand what it was talking about — and exploited it." BUT this article does call out "repeat after me": "In some — but not all — instances, people managed to have Tay say offensive comments by asking them to repeat them."
- "Twitter taught Microsoft’s AI chatbot to be a racist asshole in less than a day" – The Verge tl;dr: "many of the bot’s nastiest utterances have simply been the result of copying users. If you tell Tay to ‘repeat after me,’ it will — allowing anybody to put words in the chatbot’s mouth. However some of its weirder utterances have come out unprompted."
- "Microsoft terminates its Tay AI chatbot after she turns into a Nazi" – Ars Technica tl;dr: Let’s hope that Tay isn’t like Skynet, which retaliated against the humans who tried to shut it down.
- "Tay, Microsoft’s AI chatbot, gets a crash course in racism from Twitter" – The Guardian tl;dr: "Tay in most cases was only repeating other users’ inflammatory statements, but the nature of AI means that it learns from those interactions. It’s therefore somewhat surprising that Microsoft didn’t factor in the Twitter community’s fondness for" trolling.
- "Microsoft Created a Twitter Bot to Learn From Users. It Quickly Became a Racist Jerk." – New York Times tl;dr: We wrote a summary of The Guardian and Business Insider articles.