Using AI, machine learning to understand extent of online hate

ADL says the research led to several other interesting findings, including that when searching for one kind of hate, it’s easy to find hate of all kinds. In the initial results, several words appeared far more frequently in hate speech than in non-hate speech. The five words most strongly associated with hate were: Jew, white, hate, women, and black.
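A word-association finding like this can be illustrated with a simple frequency comparison. The sketch below (not ADL’s actual method, which the article does not detail) ranks words by how much more often they appear in hate-labeled comments than in non-hate comments, using add-one smoothing so unseen words don’t divide by zero; the sample comments are invented for illustration.

```python
from collections import Counter

def hate_associated_words(hate_comments, other_comments, top_n=5):
    """Rank words by the ratio of their frequency in hate-labeled
    comments to their frequency in non-hate comments (add-one smoothed).
    Illustrative only; the real analysis is not specified in the article."""
    hate_counts = Counter(w for c in hate_comments for w in c.lower().split())
    other_counts = Counter(w for c in other_comments for w in c.lower().split())
    vocab = set(hate_counts) | set(other_counts)
    ratio = {w: (hate_counts[w] + 1) / (other_counts[w] + 1) for w in vocab}
    return sorted(vocab, key=ratio.get, reverse=True)[:top_n]
```

A real analysis would work over thousands of labeled comments and typically use a statistically grounded measure (such as log-odds ratios) rather than a raw frequency ratio.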

The project also found patterns in the construction of hateful language.

— Hateful comments typically contained more words, on average, than non-hateful comments.

— Hateful comments contained slightly more all-caps words than non-hateful ones.

— Sentences in hateful comments were slightly longer than in non-hateful comments.
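The three stylistic signals above are straightforward to compute. A minimal sketch, assuming whitespace tokenization and sentence breaks on terminal punctuation (choices the article does not specify):

```python
import re

def style_features(comment):
    """Compute the stylistic signals described above: total word count,
    number of all-caps words, and average sentence length in words."""
    words = comment.split()
    n_words = len(words)
    # Count words of length > 1 written entirely in capitals (e.g. "NEVER").
    n_caps = sum(1 for w in words if len(w) > 1 and w.isupper())
    # Split sentences on runs of terminal punctuation; drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", comment) if s.strip()]
    avg_sentence_len = n_words / len(sentences) if sentences else 0
    return {"words": n_words, "all_caps": n_caps,
            "avg_sentence_len": avg_sentence_len}
```

For example, `style_features("THIS is bad. Really bad!")` reports 5 words, 1 all-caps word, and an average sentence length of 2.5 words.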

The goal of the Online Hate Index is to examine speech from multiple social media sites and develop a model that will help companies better understand the extent of hateful content on their platforms by creating community-based definitions of hate speech.

For the first phase of the project, researchers collected 9,000 comments from a handful of communities on Reddit during two months in 2016. They chose to start their research with Reddit because of the site’s community structure, its large volume of easily accessible comments, and because speech on the platform is typical of what is seen in everyday conversations, both online and offline. In future phases of the study, the researchers intend to apply their findings to speech on other social media platforms.

At the same time, the D-Lab developed a social science methodology based on a specific definition of hate speech. The lab then assembled a team of researchers with diverse backgrounds, trained them on that definition and methodology, and had them manually label each comment as either hate or not hate.

Once the researchers completed labeling the comments, they fed them into the machine learning model. The model established its rules by evaluating the many examples people had classified as hate speech or not hate speech.
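The supervised approach described here, in which a model learns its rules from human-labeled examples rather than a fixed word list, can be sketched with a tiny Naive Bayes text classifier. This is a generic illustration, not the ADL/D-Lab model, and the training comments below are invented:

```python
import math
from collections import Counter

class LabeledTextClassifier:
    """Minimal Naive Bayes sketch of learning from human-labeled comments:
    estimate word likelihoods per label, then predict the more probable
    label for new text. Illustrative only, not the Online Hate Index model."""

    def fit(self, comments, labels):
        self.word_counts = {lab: Counter() for lab in set(labels)}
        self.label_counts = Counter(labels)
        for text, lab in zip(comments, labels):
            self.word_counts[lab].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        scores = {}
        total_docs = sum(self.label_counts.values())
        for lab, counts in self.word_counts.items():
            # Log prior plus add-one-smoothed log likelihood of each word.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.label_counts[lab] / total_docs)
            for w in text.lower().split():
                score += math.log((counts[w] + 1) / denom)
            scores[lab] = score
        return max(scores, key=scores.get)
```

The point the quote below makes is visible even in this toy: nothing in the code hard-codes what counts as hate; the boundary comes entirely from the labeled examples it is trained on.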

“The machine learning algorithms can decipher whether text is hate speech or not,” said Claudia von Vacano, Executive Director of the D-Lab and the Digital Humanities at U.C. Berkeley. “Therefore, the Online Hate Index model does not have a static definition, but instead ingests labeled data that informs the predictive model.”

ADL notes that the next phase of the project will go beyond this simple hate analysis and evaluate specific populations in more detail. Additionally, the D-Lab will identify strategies to scale the comment-labeling process so the model can be deployed broadly. While there is still a long way to go with AI and machine-learning-based solutions, ADL and the D-Lab believe the technologies hold promise for finding new ways to curb the vast amount of online hate speech.