
A research team led by Professor Hyun Ghang Jeong of the Department of Psychiatry at Korea University College of Medicine (Korea University Guro Hospital), in collaboration with a research team at Geovision Inc., has published a large-scale validation study examining whether artificial intelligence (AI) can detect self-harm behavior early in psychiatric wards. The study was published in the journal Scientific Reports.
Early detection of self-harm behavior in closed psychiatric wards plays a critical role in ensuring patient safety. However, continuous human monitoring presents structural challenges, including staffing limitations and blind spots in observation. To address this issue, the research team evaluated how accurately video-based AI action recognition technology can detect self-harm behavior in real clinical settings, and whether AI models trained in controlled laboratory environments maintain their performance when applied to real-world hospital conditions.
The research team generated 1,120 simulated self-harm video samples in a studio environment designed to closely replicate psychiatric ward conditions. In addition, 118 real clinical video samples collected from the closed psychiatric ward at Korea University Guro Hospital were used as validation data. All clinical video data were fully de-identified prior to analysis, and clinical reliability was ensured through cross-verification with medical records. The team then trained and evaluated six state-of-the-art deep learning-based action recognition AI models under identical conditions to compare their performance in detecting self-harm behaviors.
The results showed that while AI models demonstrated relatively high performance in the simulated studio environment, their performance significantly declined when applied to real clinical video data. Even the latest transformer-based AI models struggled to generalize effectively due to the variability of real-world conditions, including diverse behavioral patterns, occlusion, and irregular movement characteristics. In particular, subtle and repetitive self-harm behaviors, such as scratching and skin picking, were identified as the most difficult types for existing AI models to detect.
Professor Jeong stated, “The most significant contribution of this study is that it quantitatively demonstrated both the potential and the limitations of AI-based self-harm detection technology. By systematically comparing simulated and real clinical datasets, we were able to clearly identify what improvements are needed for clinical implementation of AI models at the current technological level. Furthermore, the studio-based self-harm behavior dataset established through this study is expected to contribute to advancing research in psychiatry and medical AI.”
Publication details
Kanghee Lee et al., Benchmarking action recognition models for self-harm detection in studio and real-world datasets, Scientific Reports (2026). DOI: 10.1038/s41598-026-36999-w
