By Eran Yessodi, August 14th, 2023

In recent years, the field of social signal processing (i.e., the automatic analysis of nonverbal cues in human-human social interactions) has made significant progress. From understanding social traits to social roles and interaction dynamics, this area of research has the potential to transform applications such as intelligent vehicles and social robots. However, several challenges still need to be addressed to unlock the full potential of this technology. In this blog post, we’ll explore the current state of this research and discuss future directions to consider.

The studies in this field have made considerable progress in integrating various Artificial Intelligence (AI) concepts, computational methodologies, nonverbal cues, interaction environments, and sensor setups. Researchers have developed novel machine learning and deep learning techniques that are capable of detecting and estimating a wide range of social and psychological phenomena in different scenarios. The diversity of sensors used in these studies has also grown significantly, allowing for more unconstrained, in-the-wild, and long-duration interactions.


Automatic analysis of nonverbal cues in human social interactions.


Despite the progress, there are several limitations and challenges that need to be addressed. Some of these include:

  1. Limited dataset availability: Many of the datasets used in these studies are not publicly available, which hinders reproducibility of the results and the development of new methodologies.
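  2. Annotation reliability: Some studies do not report a reliability analysis of their annotations (for example, how consistently different annotators labeled the same data), which makes it hard to judge how trustworthy the results are.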
  3. Scalability: The existing datasets are relatively small compared to those in other AI research domains, which poses challenges for applying deep learning techniques that typically require large amounts of data for effective training.
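
To make the annotation reliability point concrete, here is a minimal sketch of one common agreement measure, Cohen's kappa, for two annotators who labeled the same items. The survey does not prescribe this particular measure; the function name `cohen_kappa` and the example labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Illustrative sketch only: the survey does not specify which
    reliability measure studies should report.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items")

    n = len(labels_a)

    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both annotators pick the same
    # label if each labels at random with their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four video segments.
annotator_a = ["dominant", "neutral", "neutral", "dominant"]
annotator_b = ["dominant", "neutral", "dominant", "dominant"]
print(cohen_kappa(annotator_a, annotator_b))
```

A kappa near 1 indicates agreement well above chance, while a value near 0 means the annotators agree about as often as chance alone would predict. Reporting a number like this alongside a dataset gives readers a way to judge how much to trust the labels.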

To advance this field, several future research directions have been proposed:

  1. Unsupervised pre-training: Using unsupervised pre-training can lower data annotation effort and help develop more effective models.
  2. Generalization and domain adaptation: Investigating the generalization ability of proposed methods and integrating domain adaptation methods can enhance the performance of models across different datasets.
  3. Explainability and interpretability: Improving the explainability and interpretability of models will enable better communication across experts from different disciplines and ensure that the AI models are making decisions in a transparent and understandable manner.
  4. In-the-wild datasets and adaptive methods: Developing datasets captured in real-world settings and employing online learning and adaptive methods can help tackle real-life challenges when deploying the technology.
  5. Privacy-preserving approaches: Developing privacy-preserving nonverbal behavior representations, sensors, and computational models will ensure that individuals’ privacy is respected when analyzing social interactions.

The field of automatic analysis of nonverbal cues in human social interactions holds great promise for various applications. By addressing the challenges and limitations, and focusing on the proposed future directions, researchers can continue to advance this area and develop systems that can effectively analyze and respond to human social interactions in real-world scenarios, ultimately leading to improved human-computer interactions and enhanced understanding of human behavior.



CIGDEM BEYAN, Department of Information Engineering and Computer Science (DISI), University of Trento, Italy
ALESSANDRO VINCIARELLI, School of Computing Science, University of Glasgow, UK. Advisory board member at Substrata
ALESSIO DEL BUE, Pattern Analysis and Computer Vision (PAVIS), Istituto Italiano di Tecnologia (IIT), Italy


Face-to-Face Co-Located Human-Human Social Interaction Analysis using Nonverbal Cues: A Survey