
EHSAN: Leveraging ChatGPT in a Hybrid Framework for Arabic Aspect-Based Sentiment Analysis in Healthcare

NU author(s): Eman Alamoudi, Dr Ellis Solaiman

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).


Abstract

Arabic-language patient feedback remains under-analysed because dialect diversity and scarce aspect-level sentiment labels hinder automated assessment. To address this gap, we introduce EHSAN, a data-centric hybrid pipeline that merges ChatGPT pseudo-labelling with targeted human review to build the first explainable Arabic aspect-based sentiment dataset for healthcare. Each sentence is annotated with an aspect label and a sentiment label (positive, negative, or neutral), forming a dataset aligned with healthcare themes; a ChatGPT-generated rationale accompanies each label to enhance transparency. To evaluate the impact of annotation quality on model performance, we created three versions of the training data: a fully supervised set with all labels reviewed by humans, a semi-supervised set with 50% human review, and an unsupervised set with only machine-generated labels. We fine-tuned two transformer models on these datasets for both aspect and sentiment classification. Experimental results show that our Arabic-specific model achieved high accuracy even with minimal human supervision, with only a minor performance drop when trained on ChatGPT-only labels. Reducing the number of aspect classes notably improved classification metrics across the board. These findings demonstrate an effective, scalable approach to Arabic aspect-based sentiment analysis in healthcare, combining large language model annotation with human expertise to produce a robust and explainable dataset. Future directions include generalisation across hospitals, prompt refinement, and interpretable data-driven modelling.
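
The three training-set versions described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the `Record` fields, the `build_variants` helper, and the random choice of which half of the sentences receive human review are all assumptions made for illustration, since the abstract does not say how the reviewed 50% was selected.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    text: str
    gpt_label: str    # machine pseudo-label (hypothetical field name)
    human_label: str  # label after human review (hypothetical field name)


def build_variants(records, review_fraction=0.5, seed=0):
    """Build three labelled training sets from the same sentences.

    "full"    uses the human-reviewed label for every sentence,
    "partial" uses it for a random fraction and the machine label elsewhere,
    "machine" uses only machine-generated labels.
    Random selection of the reviewed subset is an assumption.
    """
    rng = random.Random(seed)
    n_reviewed = round(len(records) * review_fraction)
    reviewed = set(rng.sample(range(len(records)), n_reviewed))

    full = [(r.text, r.human_label) for r in records]
    partial = [
        (r.text, r.human_label if i in reviewed else r.gpt_label)
        for i, r in enumerate(records)
    ]
    machine = [(r.text, r.gpt_label) for r in records]
    return {"full": full, "partial": partial, "machine": machine}
```

Each returned list pairs a sentence with one label, so the same fine-tuning routine could be run on all three versions and the results compared directly.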


Publication metadata

Author(s): Alamoudi E, Solaiman E

Publication type: Conference Proceedings (inc. Abstract)

Publication status: Published

Conference Name: Database Engineered Applications. IDEAS 2025

Year of Conference: 2025

Pages: 17–33

Print publication date: 01/11/2025

Online publication date: 01/11/2025

Acceptance date: 23/06/2025

Date deposited: 15/08/2025

ISSN: 0302-9743

Publisher: Springer

URL: https://doi.org/10.1007/978-3-032-06744-9_2

DOI: 10.1007/978-3-032-06744-9_2

ePrints DOI: 10.57711/nxhp-2162

Data Access Statement: An anonymised version of the EHSAN dataset and the experimental code has been archived on Zenodo for perpetual access (https://doi.org/10.5281/zenodo.15418860).

Series Title: Lecture Notes in Computer Science

ISBN: 9783032067432

