An AI system that detects infractions of social norms

The APA Dictionary of Psychology provides a comprehensive definition of social norms as socially determined standards that indicate typical and appropriate behaviors within a specific social context. These norms may either be universal, applying broadly across cultures, or contextual, specific to particular cultural settings.

While social norms vary across cultures and contexts, social norm violations can often be grouped into a few general categories. These categories capture common themes that transcend cultural boundaries.

The automatic identification of social norms and their violations poses a significant challenge. To address this challenge effectively, the first step is to identify the features, signals, or variables that indicate when a social norm has been violated.

Researchers at Ben-Gurion University of the Negev have studied the automatic identification of social norm violations and designed an AI system that can detect them. The study aimed to bridge the gap between the social sciences and data science, recognizing that integrating the two fields could yield deeper insights into human behavior and societal dynamics.

The researchers built this system from three components: zero-shot text classification, GPT-3, and automatic rule discovery. Zero-shot classification is a specialized form of natural language inference (NLI) in which the goal is to determine how likely it is that a given class label can be inferred, or entailed, from a textual premise. GPT-3 was used to generate synthetic data and, together with human domain expertise, to identify violated social norms. Because the number of social norms is enormous, the researchers grouped them into a limited set of ten social emotions, which the system uses as its categories.

The researchers trained the system to detect ten emotions: competence, politeness, trust, discipline, caring, agreeableness, success, conformity, decency, and loyalty. The system can classify a given text into one of these emotions and can further classify it as positive or negative.
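To make the NLI framing of zero-shot classification concrete, here is a minimal sketch, not the researchers' implementation. Each of the ten labels is turned into a hypothesis sentence, an entailment scorer rates each hypothesis against the text, and a softmax turns the scores into label probabilities. The hypothesis template, the function names, and the keyword-based `toy_entail_logit` scorer are all assumptions made for illustration; in practice the scorer would be a pretrained NLI model.

```python
import math
from typing import Callable, Dict, List

# The ten categories named in the article.
LABELS: List[str] = [
    "competence", "politeness", "trust", "discipline", "caring",
    "agreeableness", "success", "conformity", "decency", "loyalty",
]

# Hypothetical hypothesis template; the article does not give the study's wording.
TEMPLATE = "This text is about {}."

# Hypothetical cue words used only by the toy scorer below.
TOY_CUES = {
    "politeness": ("rude", "thank", "interrupted"),
    "trust": ("lied", "cheated", "honest"),
    "caring": ("ignored", "helped", "comforted"),
}


def toy_entail_logit(premise: str, hypothesis: str) -> float:
    """Stand-in for a real NLI model: scores a hypothesis by counting
    cue words in the premise for the label the hypothesis mentions."""
    words = [w.strip(".,!?") for w in premise.lower().split()]
    score = 0.0
    for label, cues in TOY_CUES.items():
        if label in hypothesis:
            score += 2.0 * sum(1 for w in words if w in cues)
    return score


def zero_shot_scores(
    text: str,
    labels: List[str],
    entail_logit: Callable[[str, str], float],
) -> Dict[str, float]:
    """Score every label by how strongly the text entails its hypothesis,
    then normalize the scores with a softmax."""
    logits = [entail_logit(text, TEMPLATE.format(label)) for label in labels]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


scores = zero_shot_scores(
    "He interrupted me and was rude to the waiter.", LABELS, toy_entail_logit
)
top_label = max(scores, key=scores.get)
print(top_label)
```

Keeping the scorer as a parameter means the same label-to-hypothesis logic works unchanged when the toy function is swapped for an actual entailment model.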

The researchers first employed zero-shot classification to automatically identify social emotions in short textual data. They then used GPT-3 for generating synthetic data and identifying violated social norms through human domain expertise, resulting in a high-level taxonomy of norms represented by ten top-level categories. Additionally, they developed seven simple models based on features measuring social emotions, norm violation, and other factors for classifying cases involving norm violation or confirmation. These models were tested on two separate massive datasets of short texts.

The system performed well: the zero-shot classifier’s top emotion matched the emotions identified by human subjects in 64% of cases. To achieve this, the researchers used the EmpatheticDialogues dataset, which contains around 25,000 conversations labeled with 32 different emotions. Their focus was on situations involving norm violations and emotions.

By leveraging this labeled data, they trained the models to automatically identify social norms and categorize them into top-level groups. The results were quite encouraging, with an accuracy of approximately 94% and a precision of about 96% in detecting norm violations.
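For readers unfamiliar with the two metrics, the sketch below shows how accuracy and precision are computed for a binary "norm violation" classifier. The label lists are made-up illustrative values, not data from the study, and the function name is an assumption.

```python
from typing import Sequence, Tuple


def accuracy_and_precision(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (accuracy, precision) for binary labels where 1 = violation."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Accuracy: share of all cases labeled correctly.
    accuracy = (tp + tn) / len(pairs)
    # Precision: share of predicted violations that really are violations.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision


# Illustrative labels only (1 = norm violation, 0 = no violation).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]
acc, prec = accuracy_and_precision(y_true, y_pred)
print(acc, prec)  # 0.875 0.8
```

High precision, as reported here, means that when the system flags a text as a norm violation, the flag is rarely a false alarm.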

The researchers describe the study as preliminary work, but say it provides strong evidence that their approach is correct and can be scaled up to include more social norms.