Comparing Chatbot Performance Enhanced with Persistent Homology

2026-06-29 | Machine Learning

AI summary

The authors study how to improve chatbots, especially for mental health, when only small or private datasets are available. They use a method called persistent homology (PH) to transform the data in a way that might help training without needing more data. They compare chatbot performance with and without this PH addition across several tests. Their results show that while PH doesn't always help, it can sometimes significantly improve chatbot accuracy at almost no extra cost.

chatbots, mental health support, persistent homology, data augmentation, machine learning, private datasets, vectorization, model performance, training data, data topology
Authors
Nithisha Raghavaraju, Barbara Giunti, Bastian Rieck
Abstract
Chatbots have become increasingly prevalent across various domains, offering automated assistance in many areas, notably mental health support. Training them typically requires extremely large datasets, which are not always available in highly specialized domains. Moreover, it would sometimes be ideal to train the chatbot on personal information about patients, which cannot be done on shared servers without violating patient confidentiality. Hence, being able to improve the performance of a chatbot, possibly trained locally and on a restricted dataset, without enlarging the dataset itself would be extremely beneficial. In this work, we enhance the input datasets using persistent homology (PH) vectorizations computed from the raw datasets themselves. We then compare, across several metrics, the performance of multiple chatbot models with and without the PH enhancement. Our experiments suggest that, while the PH enhancement is at times not particularly beneficial, it sometimes brings remarkable advantages at virtually no cost.
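
To make the idea of a PH vectorization concrete, the following is a minimal sketch and not the authors' pipeline. The abstract does not say how point clouds are built from the data or which vectorization is used. The sketch assumes each sample comes with a small point cloud (for instance, hypothetical token embeddings) and uses one simple choice, the Betti-0 curve of the Vietoris-Rips filtration. The function names `h0_death_times`, `betti0_curve`, and `enhance` are illustrative only.

```python
from itertools import combinations
from math import dist


def h0_death_times(points):
    """Finite death times of the 0-dimensional persistent homology classes
    of a Vietoris-Rips filtration. Every component is born at scale 0, and
    the finite deaths are the edge lengths of a minimum spanning tree,
    found here with Kruskal's algorithm and a union-find structure."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge at this scale, so one class dies.
            parent[root_i] = root_j
            deaths.append(length)
    return deaths  # the single class that never dies is omitted


def betti0_curve(points, thresholds):
    """Number of connected components at each scale (a fixed-length vector)."""
    deaths = h0_death_times(points)
    return [len(points) - sum(d <= t for d in deaths) for t in thresholds]


def enhance(features, points, thresholds):
    """Append the PH vectorization to an existing feature vector.
    Concatenation is an assumption; the source does not specify it."""
    return list(features) + betti0_curve(points, thresholds)


if __name__ == "__main__":
    cloud = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0)]
    print(enhance([0.1, 0.2], cloud, thresholds=[0.5, 2.0, 6.0]))
```

Because the added features are computed from the sample itself, this kind of enhancement needs no extra data. Its cost in this sketch is dominated by sorting the pairwise distances of each point cloud. Dedicated libraries would be the natural choice for higher-dimensional homology or larger point clouds.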