Towards Social AI: A Survey on Understanding Social Interactions

Read original: arXiv:2409.15316 - Published 10/2/2024 by Sangmin Lee, Minzhi Li, Bolin Lai, Wenqi Jia, Fiona Ryan, Xu Cao, Ozgur Kara, Bikram Boote, Weiyan Shi, Diyi Yang, James M. Rehg

Overview

This paper is a survey on understanding social interactions, focusing on verbal, non-verbal, and multimodal cues in multi-party scenarios.
It examines how AI systems can be designed to better perceive and model social interactions.
The key topics covered include language, beliefs, and cognitive mechanisms underlying social intelligence.

Plain English Explanation

This research paper is a comprehensive review of how we can create AI systems that can better understand and engage in social interactions. Social interactions involve more than just the words people say - they also include body language, facial expressions, and other non-verbal cues.

The paper looks at how AI can be designed to perceive and model these different types of social signals, in both one-on-one and group interactions. It examines the underlying cognitive mechanisms that allow humans to navigate complex social situations, such as our ability to infer the beliefs and intentions of others.

The goal is to develop AI that can more naturally and effectively interact with people, by giving it a deeper understanding of the nuances of human social behavior. This could have applications in areas like virtual assistants, social robots, and even mental health support.

Technical Explanation

The paper first provides an overview of verbal cues in social interactions, such as turn-taking, grounding, and deixis. It discusses how language is used not only to convey information but also to build rapport, negotiate, and signal social status.

The paper then explores non-verbal cues, such as facial expressions, gestures, and proxemics. It examines how these non-verbal signals are closely tied to the cognitive and affective processes underlying social interactions.

A key focus of the paper is on multimodal interactions, where verbal and non-verbal cues are combined. The authors discuss computational approaches for jointly modeling these different modalities to gain a more holistic understanding of social dynamics.
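To make the idea of joint modeling concrete, here is a minimal sketch of late fusion, one common way to combine modalities. It is not taken from the survey: the function names, the label set, the scores, and the `verbal_weight` parameter are all assumptions chosen for illustration, and real systems typically learn the fusion rather than fixing a weight by hand.

```python
from typing import Dict


def late_fusion(
    verbal: Dict[str, float],
    nonverbal: Dict[str, float],
    verbal_weight: float = 0.5,
) -> Dict[str, float]:
    """Combine per-label scores from two modalities by weighted averaging.

    Hypothetical helper: each input maps a label to a score from a
    separate (assumed) verbal or non-verbal classifier.
    """
    labels = set(verbal) | set(nonverbal)
    fused = {
        label: verbal_weight * verbal.get(label, 0.0)
        + (1.0 - verbal_weight) * nonverbal.get(label, 0.0)
        for label in labels
    }
    total = sum(fused.values())
    # Renormalize so the fused scores sum to 1 when possible.
    if total > 0:
        return {label: score / total for label, score in fused.items()}
    return fused


def predict(fused: Dict[str, float]) -> str:
    """Return the label with the highest fused score."""
    return max(fused, key=fused.get)


# Illustrative (made-up) scores: the words sound positive,
# but the facial expression looks negative, as in sarcasm.
verbal_scores = {"positive": 0.8, "negative": 0.2}
nonverbal_scores = {"positive": 0.1, "negative": 0.9}

fused_scores = late_fusion(verbal_scores, nonverbal_scores, verbal_weight=0.4)
label = predict(fused_scores)
print(label, fused_scores)
```

In this toy case the verbal channel alone would predict "positive", while the fused scores favor "negative", which is the kind of disagreement between modalities that a multimodal model is meant to resolve.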

The survey also covers multi-party interactions, where there are more than two people involved. This introduces additional complexities, as the AI system must track the beliefs, intentions, and social relationships between multiple individuals.
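As a rough illustration of what tracking beliefs across several participants can involve, the sketch below keeps a separate belief state per agent and updates it only for agents who are present when an event happens. The `BeliefTracker` class, its methods, and the scenario (a version of the classic Sally-Anne false-belief test) are assumptions made for this example, not an approach described in the survey.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class BeliefTracker:
    """Toy per-agent belief tracker (hypothetical, for illustration only)."""

    present: Set[str] = field(default_factory=set)
    world: Dict[str, str] = field(default_factory=dict)
    beliefs: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def enter(self, agent: str) -> None:
        # An agent who enters starts observing later events.
        self.present.add(agent)
        self.beliefs.setdefault(agent, {})

    def leave(self, agent: str) -> None:
        # An agent who leaves keeps their old beliefs but stops updating.
        self.present.discard(agent)

    def move(self, obj: str, location: str) -> None:
        # The true world state always changes...
        self.world[obj] = location
        # ...but only agents who witness the move update their beliefs.
        for agent in self.present:
            self.beliefs[agent][obj] = location


tracker = BeliefTracker()
tracker.enter("sally")
tracker.enter("anne")
tracker.move("marble", "basket")  # both agents see this
tracker.leave("sally")
tracker.move("marble", "box")     # only anne sees this

print(tracker.world["marble"])             # true location
print(tracker.beliefs["sally"]["marble"])  # sally's outdated belief
print(tracker.beliefs["anne"]["marble"])   # anne's up-to-date belief
```

The point of the sketch is that the system must represent what each person believes separately from what is actually true; with more participants, the number of such states (and of beliefs about other people's beliefs) grows quickly.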

Throughout the paper, the authors highlight the cognitive mechanisms that underpin social intelligence, such as theory of mind, joint attention, and social learning. They discuss how these capacities emerge in human development and how they might be replicated in artificial systems.

Critical Analysis

The paper provides a thorough and well-structured overview of the current state of research on understanding social interactions from an AI perspective. The authors do a commendable job of covering a wide range of topics, from low-level verbal and non-verbal cues to higher-level cognitive processes.

One potential limitation is that the survey is quite broad, and does not delve deeply into the specific technical details or empirical findings of the research it cites. This may make it less useful for readers looking for a more in-depth, technical understanding of the field.

Additionally, the paper does not discuss some of the potential challenges and ethical considerations that may arise as AI systems become more adept at perceiving and modeling human social behavior. For example, there could be privacy concerns around the collection and use of such personal data, or issues around bias and fairness if the AI systems fail to generalize across different cultural contexts.

Overall, this paper serves as a valuable introduction and roadmap for researchers interested in the emerging field of "social AI." However, further work is needed to translate these conceptual insights into practical, deployable systems that can truly engage with people in a natural and meaningful way.

Conclusion

This survey paper provides a comprehensive overview of the state of research on understanding social interactions from an AI perspective. It examines verbal, non-verbal, and multimodal cues in both one-on-one and multi-party scenarios, with a focus on the cognitive mechanisms that underpin social intelligence.

The insights from this paper could inform the development of AI systems that can more effectively and naturally interact with humans, with potential applications in areas like virtual assistants, social robots, and mental health support. However, the field still faces significant technical and ethical challenges that will need to be addressed as this technology continues to evolve.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Towards Social AI: A Survey on Understanding Social Interactions

Sangmin Lee, Minzhi Li, Bolin Lai, Wenqi Jia, Fiona Ryan, Xu Cao, Ozgur Kara, Bikram Boote, Weiyan Shi, Diyi Yang, James M. Rehg

Social interactions form the foundation of human societies. Artificial intelligence has made significant progress in certain areas, but enabling machines to seamlessly understand social interactions remains an open challenge. It is important to address this gap by endowing machines with social capabilities. We identify three key capabilities needed for effective social understanding: 1) understanding multimodal social cues, 2) understanding multi-party dynamics, and 3) understanding beliefs. Building upon these foundations, we classify and review existing machine learning works on social understanding from the perspectives of verbal, non-verbal, and multimodal social cues. The verbal branch focuses on understanding linguistic signals such as speaker intent, dialogue sentiment, and commonsense reasoning. The non-verbal branch addresses techniques for perceiving social meaning from visual behaviors such as body gestures, gaze patterns, and facial expressions. The multimodal branch covers approaches that integrate verbal and non-verbal multimodal cues to holistically interpret social interactions such as recognizing emotions, conversational dynamics, and social situations. By reviewing the scope and limitations of current approaches and benchmarks, we aim to clarify the development trajectory and illuminate the path towards more comprehensive intelligence for social understanding. We hope this survey will spur further research interest and insights into this area.

10/2/2024

Social Learning through Interactions with Other Agents: A Survey

Dylan Hillier, Cheston Tan, Jing Jiang

Social learning plays an important role in the development of human intelligence. As children, we imitate our parents' speech patterns until we are able to produce sounds; we learn from them praising us and scolding us; and as adults, we learn by working with others. In this work, we survey the degree to which this paradigm -- social learning -- has been mirrored in machine learning. In particular, since learning socially requires interacting with others, we are interested in how embodied agents can and have utilised these techniques. This is especially in light of the degree to which recent advances in natural language processing (NLP) enable us to perform new forms of social learning. We look at how behavioural cloning and next-token prediction mirror human imitation, how learning from human feedback mirrors human education, and how we can go further to enable fully communicative agents that learn from each other. We find that while individual social learning techniques have been used successfully, there has been little unifying work showing how to bring them together into socially embodied agents.

8/1/2024

A social path to human-like artificial intelligence

Edgar A. Duéñez-Guzmán, Suzanne Sadedin, Jane X. Wang, Kevin R. McKee, Joel Z. Leibo

Traditionally, cognitive and computer scientists have viewed intelligence solipsistically, as a property of unitary agents devoid of social context. Given the success of contemporary learning algorithms, we argue that the bottleneck in artificial intelligence (AI) progress is shifting from data assimilation to novel data generation. We bring together evidence showing that natural intelligence emerges at multiple scales in networks of interacting agents via collective living, social relationships and major evolutionary transitions, which contribute to novel data generation through mechanisms such as population pressures, arms races, Machiavellian selection, social learning and cumulative culture. Many breakthroughs in AI exploit some of these processes, from multi-agent structures enabling algorithms to master complex games like Capture-The-Flag and StarCraft II, to strategic communication in Diplomacy and the shaping of AI data streams by other AIs. Moving beyond a solipsistic view of agency to integrate these mechanisms suggests a path to human-like compounding innovation through ongoing novel data generation.

5/28/2024

Human-machine social systems

Milena Tsvetkova, Taha Yasseri, Niccolo Pescetelli, Tobias Werner

From fake social media accounts and generative-AI chatbots to financial trading algorithms and self-driving vehicles, robots, bots, and algorithms are proliferating and permeating our communication channels, social interactions, economic transactions, and transportation arteries. Networks of multiple interdependent and interacting humans and autonomous machines constitute complex social systems where the collective outcomes cannot be deduced from either human or machine behavior alone. Under this paradigm, we review recent research from across a range of disciplines and identify general dynamics and patterns in situations of competition, coordination, cooperation, contagion, and collective decision-making, with context-rich examples from high-frequency trading markets, a social media platform, an open-collaboration community, and a discussion forum. To ensure more robust and resilient human-machine communities, researchers should study them using complex-system methods, engineers should explicitly design AI for human-machine and machine-machine interactions, and regulators should govern the ecological diversity and social co-evolution of humans and machines.

7/15/2024