2025.acl-long.1049@ACL

Total: 1

#1 PsyDial: A Large-scale Long-term Conversational Dataset for Mental Health Support [PDF2] [Copy] [Kimi2] [REL]

Authors: Huachuan Qiu, Zhenzhong Lan

Dialogue systems for mental health counseling aim to alleviate client distress and assist individuals in navigating personal challenges. Developing effective conversational agents for psychotherapy requires access to high-quality, real-world, long-term client-counselor interaction data, which is difficult to obtain due to privacy concerns. Although removing personally identifiable information is feasible, this process is labor-intensive. To address these challenges, we propose a novel privacy-preserving data reconstruction method that reconstructs real-world client-counselor dialogues while mitigating privacy concerns. We apply the RMRR (Retrieve, Mask, Reconstruct, Refine) method, which facilitates the creation of the privacy-preserving PsyDial dataset, with an average of 37.8 turns per dialogue. Extensive analysis demonstrates that PsyDial effectively reduces privacy risks while maintaining dialogue diversity and conversational exchange. To fairly and reliably evaluate the performance of models fine-tuned on our dataset, we manually collect 101 dialogues from professional counseling books. Experimental results show that models fine-tuned on PsyDial achieve improved psychological counseling performance, outperforming various baseline models. A user study involving counseling experts further reveals that our LLM-based counselor provides higher-quality responses. Code, data, and models are available at https://github.com/qiuhuachuan/PsyDial, serving as valuable resources for future advancements in AI psychotherapy.

Subject: ACL.2025 - Long Papers