Real-Time Toxicity Filtering for Open-Source Code Reviews

2026-04-10 · Software Engineering

AI summary

The authors created ToxiShield, a browser extension that spots and rewrites rude or toxic comments in open-source code reviews. It uses different AI models to detect toxic language, explain why it is toxic, and rewrite the comments to be more civil while trying to keep their meaning. In their tests, detection reached a 97% F1-score and the rewriting model scored above 95% on style transfer accuracy and fluency, while categorizing why a comment is toxic proved harder (42% F1). Feedback from 10 software developers suggests the tool can help make open-source communities friendlier and reduce harmful interactions within teams.

toxicity detection, code review, BERT classifier, multiclass classification, Llama model, style transfer, fluency, content preservation, prompt engineering, open-source collaboration
Authors
Md Awsaf Alam Anindya, Showvik Biswas, Anindya Iqbal, Jaydeb Sarker, Amiangshu Bosu
Abstract
Toxic interactions in open-source software development harm community collaboration. To combat this, we propose ToxiShield, a real-time browser extension that identifies and detoxifies toxic code reviews. The framework comprises three modules: toxicity identification, reasoned multiclass classification, and code review detoxification. Our fine-tuned BERT-based binary classifier achieved a 97% F1-score on 38,761 code review texts. For multiclass classification, Claude 3.5 Sonnet with prompt engineering achieved a 39% MCC and a 42% F1-score on 1,200 samples. Finally, our fine-tuned Llama 3.2 detoxification model reached 95.27% style transfer accuracy, 97.03% fluency, 67.07% content preservation, and an 84% J-score. Validation with 10 software developers suggests ToxiShield effectively fosters a more inclusive open-source environment.
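
The three-module structure described in the abstract can be pictured as a simple chain. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `ReviewResult`, `run_pipeline`, and the `toy_*` functions are hypothetical, the keyword-based stand-ins replace the actual BERT, Claude, and Llama models, the category label `"insult"` is invented for illustration, and running the later stages only on flagged comments is an assumption rather than something the abstract states.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReviewResult:
    original: str
    is_toxic: bool
    category: Optional[str] = None   # set only when the comment is flagged
    rewritten: Optional[str] = None  # set only when the comment is flagged


def run_pipeline(
    comment: str,
    detect: Callable[[str], bool],
    classify: Callable[[str], str],
    detoxify: Callable[[str], str],
) -> ReviewResult:
    """Chain the three stages (assumption: stages 2 and 3 run only on flagged text)."""
    if not detect(comment):
        return ReviewResult(original=comment, is_toxic=False)
    return ReviewResult(
        original=comment,
        is_toxic=True,
        category=classify(comment),
        rewritten=detoxify(comment),
    )


# Hypothetical toy stand-ins for the real models, for illustration only.
def toy_detect(text: str) -> bool:
    return "stupid" in text.lower()


def toy_classify(text: str) -> str:
    return "insult"  # invented label, not taken from the paper


def toy_detoxify(text: str) -> str:
    return "This change needs rework; please revisit the approach."


if __name__ == "__main__":
    for comment in ["This is a stupid patch.", "Looks good to me."]:
        print(run_pipeline(comment, toy_detect, toy_classify, toy_detoxify))
```

Passing the stages in as callables keeps the chain independent of any particular model, so a fine-tuned classifier or a prompted LLM could be swapped in without changing `run_pipeline`.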