Combining Rule-Based, Lexicon-Based, and Support Vector Machine to Improve Accuracy in Sentiment Analysis of ChatGPT Usage
Keywords:
ChatGPT, Large Language Models, sentiment analysis, lexicon-based, rule-based, Support Vector Machine
Abstract
ChatGPT is an artificial intelligence (AI) chatbot built on a Large Language Model (LLM). Its ability to interact and provide natural, human-like responses has caused controversy in various news media, and this controversy has created public skepticism about its brand image. Therefore, a sentiment analysis was carried out targeting Indonesian-speaking Twitter users, with a focus on the ChatGPT brand image. Various studies have analyzed user sentiment towards ChatGPT by applying machine learning and deep learning to tweets about it. This research uses a combination of rule-based, lexicon-based, and Support Vector Machine (SVM) methods. Rule-based classification uses emoticons and comparatives, while lexicon-based classification uses the BabelSenticNet lexicon. The dataset, obtained by crawling Twitter, initially consisted of 2,500 tweets, which were reduced to 1,728 tweets after cleaning. After classification, the proposed model was evaluated by comparing its performance with that of its constituent models. The proposed model performed better, achieving 86.3% accuracy, 87.1% precision, 86.5% recall, and 86.6% F-measure. Sentiment prediction on the 1,728 tweets yielded 739 positive, 385 negative, and 604 neutral tweets. In conclusion, ChatGPT's brand image in Indonesia tends to be positive, although views and objective assessments differ.
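The abstract does not say how the three classifiers are combined, so the following is only a minimal sketch under an assumed design: a cascade in which an emoticon rule decides first, a lexicon score decides next, and a learned model handles whatever remains. The emoticon table, the four-word lexicon, the threshold of 0.3, and the `svm_stub` function are all hypothetical placeholders. They stand in for the paper's emoticon and comparative rules, the BabelSenticNet lexicon, and a trained SVM, none of which are reproduced here. Comparative rules are omitted.

```python
from typing import Callable, Dict, Optional

# Hypothetical toy resources, NOT the paper's actual rules or BabelSenticNet.
EMOTICONS: Dict[str, str] = {":)": "positive", ":D": "positive", ":(": "negative"}
LEXICON: Dict[str, float] = {
    "bagus": 0.8,      # "good"
    "membantu": 0.6,   # "helpful"
    "buruk": -0.7,     # "bad"
    "bahaya": -0.6,    # "danger"
}


def rule_based(tweet: str) -> Optional[str]:
    """Return the label of the first matching emoticon, or None if none match."""
    for emoticon, label in EMOTICONS.items():
        if emoticon in tweet:
            return label
    return None


def lexicon_based(tweet: str, threshold: float = 0.3) -> Optional[str]:
    """Sum word polarities; return a label only when the total is decisive."""
    score = sum(LEXICON.get(token, 0.0) for token in tweet.lower().split())
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return None


def hybrid_classify(tweet: str, fallback: Callable[[str], str]) -> str:
    """Assumed cascade: rules first, then lexicon, then the learned model."""
    return rule_based(tweet) or lexicon_based(tweet) or fallback(tweet)


def svm_stub(tweet: str) -> str:
    """Placeholder for a trained SVM; always predicts 'neutral'."""
    return "neutral"


if __name__ == "__main__":
    for text in ["ChatGPT sangat membantu :)",
                 "chatgpt itu buruk",
                 "saya mencoba chatgpt"]:
        print(text, "->", hybrid_classify(text, svm_stub))
```

In this sketch the cheaper, higher-precision signals get the first say and the SVM only sees tweets that neither rules nor lexicon can decide. Other combinations, such as majority voting or feeding rule and lexicon outputs to the SVM as features, are equally plausible readings of the abstract.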