New Tool in the Works To Allow Social Media Companies To Censor Content More Easily
Of course it came from the University of California at Berkeley. Of course it did.
I’m talking about a tool that would build an “Online Hate Index” for social media companies. Thus, “hate speech” could be taken off without any sort of human involvement, according to Campus Reform.
The idea, according to an article in the University of California alumni magazine, sounds profoundly innocent.
“It started with a conversation. About two years ago, Claudia von Vacano, executive director of UC Berkeley’s social science D-Lab, had a chat with Brittan Heller, the then-director of technology and society for the Anti-Defamation League (ADL),” the article reads.
“The topic: the harassment of Jewish journalists on Twitter. Heller wanted to kick the offending trolls off the platform, and Vacano, an expert in digital research, learning, and language acquisition, wanted to develop the tools to do it. Both understood that neither humans nor computers alone were sufficient to root out the offending language. So, in their shared crusade against hate speech and its malign social impacts, a partnership was born.”
The D-Lab program works via a “scalable detection” system which utilizes “artificial intelligence, machine learning, natural language processing, and good old human brains to winnow through terabytes of online content.”
“The tools that were — and are — available are fairly imprecise and blunt,” D-Lab head Vacano said, “mainly involving keyword searches. They don’t reflect the dynamic shifts and changes of hate speech, the world knowledge essential to understanding it. (Hate speech purveyors) have become very savvy at getting past the current filters — deliberately misspelling words or phrases.”
It’s not just a matter of deliberately misspelling words and phrases, either. Take “Shrinky Dinks,” the 1970s children’s toy that shrinks in the oven. Anti-Semites have started using the name as a euphemism for Jewish people, invoking the Nazis’ use of ovens in the extermination camps of the Third Reich.
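To see why Vacano calls keyword searches “imprecise and blunt,” consider a rough sketch of such a filter. This is not the D-Lab’s code or any platform’s actual filter; the function name `keyword_flag`, the blocklists, and the placeholder term “badword” are all assumptions made up for illustration.

```python
import re

# Hypothetical blocklist; "badword" is a placeholder standing in for a real slur.
BLOCKLIST = {"badword"}


def keyword_flag(post: str, blocklist: set[str] = BLOCKLIST) -> bool:
    """Flag a post if any whole word exactly matches a blocklisted term."""
    words = re.findall(r"[a-z0-9]+", post.lower())
    return any(word in blocklist for word in words)


exact = "you are a badword"                 # exact match: caught
misspelled = "you are a b4dword"            # deliberate misspelling: slips through
toy_talk = "my kids love shrinky dinks"     # innocent use of the toy's name

print(keyword_flag(exact))       # True
print(keyword_flag(misspelled))  # False

# Adding the euphemism to the blocklist catches the trolls, but it also
# flags the parent talking about an actual toy.
wider = BLOCKLIST | {"shrinky"}
print(keyword_flag(toy_talk, wider))  # True
```

The filter sees only strings, not intent. Tighten it and the misspellers walk past; widen it and the innocent get swept up.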
This all sounds very well-meaning until you look at a graphic included in the article which identifies some of the words they say are “strongly identified with hate speech.”
Some of them are probably guessable: “hate” is an obvious one, as are “Jew,” “race,” “black,” “white,” that sort of thing.
Some, however, seem more of a problem if they’re integrated into a wider system of social media speech suppression. For instance, “people,” “culture,” “country” and “nation” are included, as is the f-word. And speaking of problem, “problem” is also included.
I’d love an explanation for that last one, but either way this is, well, a problem. The terms “country” and “nation” seem to indicate that the system has an explicit bias not only against nationalism but against the very concept of a country; apparently, Westphalian sovereignty is now potentially hateful. Careful what phrases you post on social media when you realize the import of this, as you might end up using another potential hate-speech word which begins with the sixth letter of the alphabet.
And then there’s the problem of what social media companies consider hate speech and how it might differ from what you do. For instance, “misgendering” a transgender person can now get your account shut down on Twitter. These are the eventual clients of this technology, and what they want it for won’t be just for the kind of anti-Semitic trolls evoked by the article in the alumni magazine. If your religious or scientific beliefs don’t necessarily jibe with, say, au courant theories on transgenderism, this is probably going to end up with you caught in its wide net, as well.
But, of course, the developers are distancing themselves from the idea that creating the technology to remove what that technology considers hate speech from social media would be tantamount to actually removing it.
“We are developing tools to identify hate speech on online platforms, and are not legal experts who are advocating for its removal,” Vacano said via email. “We are merely trying to help identify the problem and let the public make more informed choices when using social media.”
This is a bit like the members of the Manhattan Project saying they didn’t take any responsibility for the atomic bomb being detonated because they just split the atom. One wonders whether any of them will ever have an Oppenheimer moment where they tell themselves, “Now I am become death, the destroyer of speech.” Given that they’re from Berkeley, probably not. After all, they probably view this as having far greater moral impact than the atomic bomb.
Another problem with the “scalable detection” system, unmentioned in the article, is that hate-speech euphemisms can often be reclaimed by those who detest them. One which Twitter or Reddit users might be familiar with is the use of triple-parentheses around Jewish or Jewish-sounding names popularized by members of the alt-right.
The practice was originally used to imply typically rebarbative Jewish conspiracies of power, but soon Jewish social media users — and even non-Jewish social media users looking to make a point about the foolishness of the alt-right and to identify with the targets of anti-Semitic hate — would put the parentheses around their own names.
While we’re unlikely to see something as mephitic as “Shrinky Dinks” reclaimed by the Jewish community, the article doesn’t make it clear just how an instance like the triple-parentheses would be dealt with by the “scalable detection” system, if it could be dealt with at all. In a case like that, it would seem difficult unless the system were to assign different classes of people along the political spectrum a likelihood of committing hate speech — and that’s another problem, provided we can still use the word “problem.”
Furthermore, see if you can identify the problem here, or at least identify bold text while I identify it: “D-Lab initially enlisted ten students of diverse backgrounds from around the country to ‘code’ the posts, flagging those that overtly, or subtly, conveyed hate messages. Data obtained from the original group of students were fed into machine learning models, ultimately yielding algorithms that could identify text that met hate speech definitions with 85 percent accuracy, missing or mislabeling offensive words and phrases only 15 percent of the time.”
Granted, this is still in what seems to be an early trial phase, but first, this assumes the lack of subjectivity in terms of “offensive words and phrases.” Second, they still missed by “only” a solid 15 percent on the very shaky assumption of an objective standard.
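To put that 15 percent in concrete terms, here is some back-of-the-envelope arithmetic. The 85 percent figure comes from the alumni magazine; the daily post volume is a round number assumed purely for illustration, and the function name `expected_errors` is made up for this sketch. The article’s figure lumps “missing” and “mislabeling” together, so this can’t say how many of the errors are innocent posts wrongly flagged.

```python
ACCURACY_PERCENT = 85  # figure quoted in the alumni magazine article


def expected_errors(posts_reviewed: int, accuracy_percent: int = ACCURACY_PERCENT) -> int:
    """Posts missed or mislabeled if the quoted accuracy held at this volume."""
    return posts_reviewed * (100 - accuracy_percent) // 100


# Assumption: a hypothetical platform reviewing one million posts a day.
HYPOTHETICAL_DAILY_POSTS = 1_000_000

print(expected_errors(HYPOTHETICAL_DAILY_POSTS))  # 150000
```

That works out to 150,000 wrong calls a day on this hypothetical platform, and that assumes the accuracy measured on a small student-coded sample would hold up in the wild.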
If this were rolled out on a wider scale — where the objectivity of “offensive words and phrases” wasn’t just being judged by the Berkeley D-Lab but by a wider social media community — what would the miss rate be and what would be considered acceptable? How much speech, in other words, is it acceptable to censor? And keep in mind, while bigger names would be able to appeal suspensions based on erroneous technology or assumptions, medium- to small-size names would not.
In other words, if Twitter or Facebook’s shiny new Berkeley algorithm brands you a hate speaker, dear reader, you pretty much don’t have any recourse. After all, it’s at least 85 percent right — that’s too much of a chance to countenance the danger of having you around. The algorithm hath spoken.
And people wonder why conservatives don’t trust social media.