To the French electorate, last Friday’s eleventh-hour dump of hacked Emmanuel Macron documents was a snoozer. Voters yawned Macron into the presidency with nearly twice the votes of his far-right opponent Marine Le Pen. But if you’d weighed the import of MacronLeaks by the social media reaction you’d think it was an exposé of Eiffel Tower proportions. From the moment a right-wing commenter announced the leaks on Twitter, a relentless drumbeat of tweets and retweets kept it trending for hours. One account alone pushed out 486 MacronLeaks tweets in the first twelve hours: a tweet every minute and a half.
That made us wonder: were those real people doing all the tweeting? The Macron hack and the Twitter push that amplified it struck many as eerily reminiscent of Russia’s meddling in the US election, when Vladimir Putin used hack attacks, leaks and a mix of human and automated social media trolls to hurt Hillary Clinton and help Donald Trump. So The Daily Beast decided to investigate with a bot of our own. We wrote software to capture each of the 150 thousand #MacronLeaks tweets that crossed Twitter during the first thirteen hours after the leak. Then we ran the users through a bot-detection algorithm called BotOrNot.
Created by scientists at the Indiana University School of Informatics and Computing, with funding from the Defense Department and the National Science Foundation, BotOrNot is a complex machine learning system that distills more than 1,000 indicators from a Twitter account to weigh the likelihood that software, rather than a human, is controlling it. Factors including the account’s interactions with other users, profile creation time and reported location, characteristics of the account’s followers, timing patterns in status updates, and linguistic and emotional cues parsed from the content of those updates are all boiled together to produce a bot score between zero and 100. The higher the number, the more certain the account is to be a bot.
When it was released in 2014, BotOrNot provided a pretty reliable indicator of whether any given Twitter user had a heartbeat. Today the most advanced social bots can fool any bot-detector not armed with the Voight-Kampff machine from Blade Runner. In their most recent study last March, the BotOrNot team charted a 15 percent false positive rate (in which a human is given a bot score above 50) and an 11 percent false negative rate. False positives tend to accrue to highly active accounts run by multiple people with the aid of tweet-scheduling software, like @TheDailyBeast (bot score: 53). False negatives occur when a bot is packing some serious artificial intelligence under the hood.
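For readers who want the two error rates spelled out, here is a minimal Python sketch. It assumes a list of accounts whose true nature is already known, and it assumes a false negative means a bot scored at 50 or below (the mirror image of the false positive definition above). The function name `error_rates` and the sample data are hypothetical, made up for illustration; they are not the BotOrNot team's code or figures.

```python
def error_rates(labeled, threshold=50):
    """Compute (false positive rate, false negative rate).

    labeled: list of (bot_score, is_bot) pairs with known ground truth.
    A human scoring above the threshold is a false positive; a bot
    scoring at or below it is a false negative (assumed definition).
    """
    humans = [score for score, is_bot in labeled if not is_bot]
    bots = [score for score, is_bot in labeled if is_bot]
    false_positive_rate = sum(score > threshold for score in humans) / len(humans)
    false_negative_rate = sum(score <= threshold for score in bots) / len(bots)
    return false_positive_rate, false_negative_rate

# Made-up example: four known humans and four known bots.
sample = [(10, False), (20, False), (53, False), (30, False),
          (90, True), (45, True), (70, True), (60, True)]
fp_rate, fn_rate = error_rates(sample)
print(fp_rate, fn_rate)  # 0.25 0.25
```

In this toy sample one human in four lands above 50 and one bot in four slips under it, so both rates come out at 25 percent.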
“[H]uman accounts and simple bots are very easy to identify, both by other humans and by our models,” wrote BotOrNot’s lead scientist, Emilio Ferrara, in March. However, “there exist a family of sophisticated social AIs that systematically escape identification by our models and by human snap-judgment.”
For that reason, we won’t accuse any particular Twitter account of being a bot, just in case it’s one of the 15 percent that fall into the false positive bucket. But BotOrNot works well for sniffing out the overall automation level of a large group of users, like the ones tweeting about the Macron documents.
We began our unscientific survey at the beginning, May 5th at 2:49:13 PM EDT, the moment far-right video commentator Jack Posobiec first publicized the leak on Twitter, pointing his followers to an anonymous post on the politics message board at 4chan.org. “Massive doc dump at /pol/,” he wrote. “‘Correspondence, documents, and photos from Macron and his team.’” We ended a little over 13 hours later, having captured 152,328 tweets from 49,472 distinct Twitter accounts.
The top MacronLeaks tweet in our sample was WikiLeaks’ declaration that the documents marked “a significant leak.” WikiLeaks was the dominant presence in MacronLeaks, writing six of the top ten tweets, which together accounted for 19,138 retweets by the 13-hour mark. The remaining four are, in ascending order of popularity: a tweet in French mistakenly crediting WikiLeaks for MacronLeaks; a tweet by a Parisian technologist mocking the leak as inconsequential; a tweet from a Le Pen advisor expressing hope that the leaks would reveal dirt about Macron that French journalists concealed; and an incorrect claim by Posobiec that France had blocked access to 4chan.
To provide a baseline of botness, we also grabbed Twitter data for thousands of tweets Monday night containing the name “Draymond Green,” the NBA all-star who’d just led the Golden State Warriors in a four-game sweep of the conference semifinals against the Utah Jazz.
Running both collections of tweets through BotOrNot reveals a clear difference between the two sets. On a per-tweet basis, the median bot score for MacronLeaks is 38 percent higher than for Draymond Green, with MacronLeaks tweets chalking up a score of 36 versus Green’s 26.
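The arithmetic behind that comparison is simple: take the median bot score across each set of tweets, then express the gap as a percentage of the lower one. A minimal Python sketch, assuming the scores have already been collected, one per tweet. The score lists below are invented placeholders chosen only so the medians match the figures above, and `percent_higher` is a hypothetical helper, not part of the original analysis.

```python
from statistics import median

def percent_higher(a, b):
    """Return how much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100

# Placeholder per-tweet bot scores, NOT the real sample.
macronleaks_scores = [12, 30, 36, 41, 77]
green_scores = [8, 19, 26, 33, 90]

macron_median = median(macronleaks_scores)  # 36
green_median = median(green_scores)         # 26
print(round(percent_higher(macron_median, green_median)))  # 38
```

The median is used rather than the mean because a handful of accounts with extreme scores would otherwise drag the average around.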
Both groups have large numbers of users with bot scores between 40 and 80, a range the Indiana University scientists dub the “gray zone,” where human-like bots and bot-like humans mingle freely, giving BotOrNot the most trouble. So we next looked at the tweet volume from accounts at the extreme ends of the bot curve.
Accounts with bot scores above 80 – the I’m-a-bot-and-I-don’t-care-who-knows-it level – contributed virtually nothing to MacronLeaks, while on Draymond Green Twitter they accounted for 14 percent of the tweets, some teasing stories about the NBA but linking to services that pay for social media traffic.
The starkest difference between the two groups comes at the human end of the scale, where even sophisticated AI-driven bots are unlikely to be found. Accounts with a bot score of 25 or less accounted for fully 52 percent of the Draymond Green tweets. In MacronLeaks they made up a piddling 12 percent.
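The banding described in the last few paragraphs can be sketched in a few lines of Python: label each tweet by its author's bot score, then report each band's share of the total. The cutoffs (25 or less, 40 to 80, above 80) come from the text; the band names, the treatment of scores from 26 to 39 as "unlabeled", and the sample scores are assumptions made for illustration.

```python
from collections import Counter

def band(score):
    """Assign a bot score to one of the bands discussed in the article."""
    if score <= 25:
        return "likely human"
    if score > 80:
        return "overt bot"
    if score >= 40:
        return "gray zone"   # 40 through 80 inclusive
    return "unlabeled"       # 26 through 39: no band named in the text

def tweet_share_by_band(scores):
    """Return each band's share of tweets, given one score per tweet."""
    counts = Counter(band(score) for score in scores)
    total = len(scores)
    return {name: count / total for name, count in counts.items()}

# Made-up per-tweet scores, NOT the real sample.
example_scores = [10, 20, 30, 45, 60, 70, 85, 90, 25, 50]
shares = tweet_share_by_band(example_scores)
print(shares["likely human"], shares["gray zone"], shares["overt bot"])
# 0.3 0.4 0.2
```

Because the shares are counted per tweet rather than per account, a single hyperactive account weighs as heavily as its output, which is the point when measuring who is driving a hashtag.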
Our conclusion: MacronLeaks was as botty as a Japanese assembly line.
Out of sheer curiosity, we also looked for crossover between the MacronLeaks Twitter campaign and the similar social media push that accompanied WikiLeaks’ “Vault 7” dump of CIA documents. That leak revealed interesting tidbits about the CIA’s hacking capability that garnered tweets across the board. But at the same time thousands of Twitter accounts used the opportunity to push the false claim that Vault7 exonerated Russia of hacking the Democratic National Committee and fingered the CIA as the “real Russian hackers.”
We found that one-third of the users tweeting about MacronLeaks last week were doing the same with Vault7 in March, with a few tweaking their profiles in between, for example by adding “Le Pen” to their account description or, in the case of one account, changing the baffling sentiment “HaveMoreWhiteBabies” to the more on-point “MakeFranceGreatAgain.”
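Measuring that kind of crossover is a set intersection: collect the distinct accounts seen in each campaign and count how many appear in both. A minimal Python sketch, assuming each campaign is available as a list of account identifiers, one per tweet. The function name `crossover_share` and the sample handles are hypothetical.

```python
def crossover_share(campaign_users, earlier_users):
    """Share of distinct accounts in one campaign that also appear in another."""
    campaign = set(campaign_users)   # de-duplicate repeat tweeters
    if not campaign:
        return 0.0
    return len(campaign & set(earlier_users)) / len(campaign)

# Made-up account handles, NOT real users.
macronleaks_tweeters = ["a", "b", "c", "a", "d", "e", "f"]
vault7_tweeters = ["b", "f", "z"]
print(round(crossover_share(macronleaks_tweeters, vault7_tweeters), 2))  # 0.33
```

Matching on a stable account identifier rather than display name or profile text matters here, since some of the accounts changed their descriptions between the two campaigns.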
In the aggregate, this group’s Vault7 retweets focused largely on WikiLeaks, which won seven of the top 10 spots on their most-retweeted list. The remaining three tweets, once again in ascending order of popularity: a tweet from white nationalist Mike Cernovich attacking the CIA and defending Russia’s intelligence services; a tweet from the conspiratorial right-wing Infowars radio show pushing the false CIA Framed Russia meme; and a tweet from the account @TenGOP (“The Unofficial Twitter of Tennessee Republicans”) seeking Michael Flynn’s reappointment as Trump’s national security advisor, a job Flynn lost after press revelations that he’d lied about his telephone discussions with the Russian ambassador after the election hacks. “In light of the #Vault7 release I think that General Flynn should be reinstated!!!,” this tweet urged. “Retweet if you agree!”
And retweet they did: two thousand retweets from our MacronLeaks users, 11 thousand overall. But so far Flynn hasn’t returned to the White House. Now with last week’s Twitter storm failing to install Marine Le Pen in the Élysée Palace, the influence of social media bots appears to have momentarily stalled. But if there’s one thing we know from science fiction movies, in the end, the robots always win.