A HYBRID CONTEXTUAL EMBEDDING BASED CLUSTERING AND CLASSIFICATION TECHNIQUE FOR UNSUPERVISED IMPLICIT ASPECT CATEGORIZATION IN INDONESIAN REVIEWS

Nur Hayatin
Suraya Alias
Lai Po Hung

Abstract

Aspect categorization groups reviews according to aspect categories defined for the review domain. A problem arises when a review contains only sentiment features as clues, so the aspect is implicit and must be inferred. Implicit aspects nevertheless play an important role in generating a summary: without them, important words needed for analyzing users' reviews are likely to be lost. Existing techniques have difficulty exploiting implicit aspects because of limited resources and high computational cost. Hence, we propose an implicit aspect categorization model based on a hybrid contextual embedding-based clustering and classification technique. The model is developed with an unsupervised learning approach that requires no labelled data for training. A contextual embedding-based clustering technique generates training data from explicit sentences, and this data is then used to train a classifier that categorizes implicit aspects. The proposed model consists of four steps: data preprocessing, sentence feature selection, clustering-based generation of training data, and categorization of implicit aspects with a classification technique. To find the best combination, we experiment with several classification techniques: Logistic Regression, Support Vector Machine, Naïve Bayes, Decision Tree, and Random Forest. In our experiments, combining contextual embedding-based clustering with the Random Forest algorithm yields higher accuracy than the other classification techniques, reaching an accuracy of 72.04% and an F1 score of 0.6788.
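
The four steps named in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: bag-of-words counts stand in for contextual embeddings, a tiny k-means stands in for the clustering step (the abstract does not name the clustering algorithm), a 1-nearest-neighbour rule stands in for the classifier (the abstract reports Random Forest as the best performer), and the English toy sentences and stopword list are hypothetical.

```python
from collections import Counter

# Hypothetical toy stopword list; not taken from the paper.
STOPWORDS = {"the", "was", "is", "and", "too", "very"}

def preprocess(sentence):
    """Step 1: lowercase, split on whitespace, drop stopwords."""
    return [t for t in sentence.lower().split() if t not in STOPWORDS]

def embed(tokens, vocab):
    """Step 2 stand-in: bag-of-words counts instead of contextual embeddings."""
    counts = Counter(tokens)
    return [float(counts[w]) for w in vocab]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=20):
    """Step 3: cluster explicit sentences; cluster ids serve as pseudo-labels."""
    # Deterministic farthest-point initialisation (an assumption, for repeatability).
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = []
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def classify(vector, train_vectors, train_labels):
    """Step 4 stand-in: 1-nearest-neighbour over the pseudo-labelled data."""
    best = min(range(len(train_vectors)),
               key=lambda i: sq_dist(vector, train_vectors[i]))
    return train_labels[best]

# Explicit sentences mention the aspect term; implicit ones carry only sentiment.
explicit = [
    "the food was delicious",
    "the food tasted bland",
    "delicious food and tasty",
    "the price is expensive",
    "cheap price and affordable",
    "price too expensive",
]
implicit = ["very delicious", "too expensive"]

explicit_tokens = [preprocess(s) for s in explicit]
vocab = sorted({t for toks in explicit_tokens for t in toks})
train_vectors = [embed(toks, vocab) for toks in explicit_tokens]

# No human labels are used: the clusters themselves label the training data.
pseudo_labels = kmeans(train_vectors, k=2)
predictions = [
    classify(embed(preprocess(s), vocab), train_vectors, pseudo_labels)
    for s in implicit
]
print(pseudo_labels, predictions)
```

On this toy data the food sentences fall into one cluster and the price sentences into the other, and each implicit sentence is assigned to the cluster whose explicit sentences share its sentiment word. In a realistic setting the embedding function would be replaced by a contextual language model and the classifier by one of the algorithms compared in the abstract.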

How to Cite
Hayatin, N., Alias, S., & Hung, L. P. (2025). A HYBRID CONTEXTUAL EMBEDDING BASED CLUSTERING AND CLASSIFICATION TECHNIQUE FOR UNSUPERVISED IMPLICIT ASPECT CATEGORIZATION IN INDONESIAN REVIEWS. Malaysian Journal of Computer Science, 38. Retrieved from https://mjcs.um.edu.my/index.php/MJCS/article/view/63775