Sentiment and Emotion Classification of Indonesian E-Commerce Reviews via Multi-Task BiLSTM and AutoML Benchmarking
2026-04-27 • Computation and Language
AI summary
The authors study how to classify sentiment and emotion in Indonesian online product reviews, whose mix of standard words, slang, and emoji makes lexicon-based tools unreliable. They compare two approaches on the PRDECT-ID dataset: TF-IDF features with a PyCaret AutoML sweep across standard classifiers, and a bidirectional LSTM network that reads each review in both directions and predicts sentiment and emotion from a shared encoder. Reviews are cleaned in 14 steps, including slang-dictionary normalization, and four model configurations are benchmarked. The source code is public, and both approaches are deployed as Gradio applications on Hugging Face Spaces.
Sentiment Analysis, Emotion Classification, TF-IDF, BiLSTM, PyCaret, PyTorch, Natural Language Processing, Indonesia Marketplace Reviews, Slang Dictionary, Gradio
Authors
Hermawan Manurung, Ibrahim Al-Kahfi, Ahmad Rizqi, Martin Clinton Tosima Manullang
Abstract
Indonesian marketplace reviews mix standard vocabulary with slang, regional loanwords, numeric shorthands, and emoji, making lexicon-based sentiment tools unreliable in practice. This paper describes a two-track classification pipeline applied to the PRDECT-ID dataset, which contains 5,400 product reviews from 29 Indonesian e-commerce categories, each labeled for binary sentiment (Positive/Negative) and five-class emotion (Happy, Sad, Fear, Love, Anger). The first track applies TF-IDF vectorization with a PyCaret AutoML sweep across standard classifiers. The second track is a PyTorch Bidirectional Long Short-Term Memory (BiLSTM) network with a shared encoder and two task-specific output heads. A preprocessing module applies 14 sequential cleaning steps, including a 140-entry slang dictionary assembled from marketplace corpora. Four configurations are benchmarked: BiLSTM Baseline, BiLSTM Improved, BiLSTM Large, and TextCNN. Training uses class-weighted cross-entropy loss, ReduceLROnPlateau scheduling, and early stopping. Both tracks are deployed as Gradio applications on Hugging Face Spaces. Source code is publicly available at https://github.com/ikii-sd/pba2026-crazyrichteam.
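To make the second track concrete, the following is a minimal PyTorch sketch of a BiLSTM with a shared encoder and two task-specific heads, trained with class-weighted cross-entropy. It is an illustration under stated assumptions, not the authors' implementation: the class name MultiTaskBiLSTM, the helper multitask_loss, the layer sizes, the vocabulary size, the uniform class weights, and the unweighted sum of the two task losses are all hypothetical choices. Only the two-class sentiment and five-class emotion output sizes come from the abstract. Sequence packing, ReduceLROnPlateau scheduling, and early stopping are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiTaskBiLSTM(nn.Module):
    """Shared BiLSTM encoder with separate sentiment and emotion heads.

    All sizes below are illustrative assumptions, not values from the paper.
    """

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=32,
                 n_sentiment=2, n_emotion=5, pad_idx=0):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
        self.encoder = nn.LSTM(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        # Both heads read the same encoder features.
        self.sentiment_head = nn.Linear(2 * hidden_dim, n_sentiment)
        self.emotion_head = nn.Linear(2 * hidden_dim, n_emotion)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)            # (batch, seq, embed)
        _, (h_n, _) = self.encoder(embedded)            # h_n: (2, batch, hidden)
        # Concatenate the final forward and backward hidden states.
        features = torch.cat([h_n[0], h_n[1]], dim=1)   # (batch, 2 * hidden)
        return self.sentiment_head(features), self.emotion_head(features)


def multitask_loss(sent_logits, emo_logits, sent_y, emo_y, sent_w, emo_w):
    """Sum of class-weighted cross-entropy losses (equal task weighting assumed)."""
    sent_loss = F.cross_entropy(sent_logits, sent_y, weight=sent_w)
    emo_loss = F.cross_entropy(emo_logits, emo_y, weight=emo_w)
    return sent_loss + emo_loss


# Usage on random token ids; real inputs would come from the cleaned reviews.
torch.manual_seed(0)
model = MultiTaskBiLSTM(vocab_size=100)
batch = torch.randint(1, 100, (4, 12))                  # 4 reviews, 12 tokens each
sent_logits, emo_logits = model(batch)
loss = multitask_loss(
    sent_logits, emo_logits,
    torch.tensor([0, 1, 1, 0]),                         # sentiment labels
    torch.tensor([0, 1, 2, 3]),                         # emotion labels
    torch.ones(2), torch.ones(5),                       # placeholder class weights
)
loss.backward()
```

Because both heads share one encoder, a single backward pass updates the shared weights with gradients from both tasks. In practice the class weights would be derived from label frequencies in the training split rather than set to one.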