HATE SPEECH PREDICTION USING K-MEANS ALGORITHM
Abstract
Hate speech is now common on social media. Motivated by this problem, this research applies data mining algorithms and methods to predict and classify it. Using a dataset collected from Twitter, the research focuses on identifying hate speech. Before the algorithm is applied, the dataset is cleaned and then converted to numeric values using TF-IDF. N-Gram features are used to make the final accuracy more stable. After preprocessing, the K-Means algorithm is applied. The results show that Tri-Gram gives better accuracy than Bi-Gram and Uni-Gram, reaching a maximum of 80%.
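The abstract gives no implementation details, so the following is only a minimal sketch of the described pipeline (cleaning, N-Gram extraction, TF-IDF weighting, K-Means) in plain Python. The cleaning rules, the smoothed IDF formula, all function names, and the sample tweets are assumptions made for illustration, not the authors' implementation.

```python
import math
import random
import re
from collections import Counter


def clean(text):
    """Lowercase, drop URLs, @mentions and non-letters (an assumed cleaning recipe)."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.sub(r"[^a-z\s]", " ", text).split()


def ngrams(tokens, n):
    """Return the word n-grams of a token list (n=1 uni-gram, 2 bi-gram, 3 tri-gram)."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs):
    """Map each list of terms to an L2-normalised TF-IDF vector (dict term -> weight)."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption): ln((1 + N) / (1 + df)) + 1
        vec = {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def dist2(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b))


def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd-style K-Means; returns one cluster id per input vector."""
    rng = random.Random(seed)
    centroids = [dict(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist2(v, centroids[c])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if not members:
                continue  # keep the old centroid if a cluster is empty
            total = Counter()
            for v in members:
                total.update(v)
            centroids[c] = {t: w / len(members) for t, w in total.items()}
    return labels


# Hypothetical sample tweets, used only to exercise the functions.
tweets = [
    "@user you people are awful and i hate you http://t.co/x",
    "i hate you people, you are awful",
    "what a lovely sunny day in the park",
    "such a lovely day in the park today!",
]
for n in (1, 2, 3):
    docs = [ngrams(clean(t), n) for t in tweets]
    print(n, kmeans(tfidf(docs), k=2))
```

K-Means is unsupervised, so reporting an accuracy figure would additionally require mapping each cluster to a label (for example by majority vote against annotated tweets); how the study did this is not stated in the abstract.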
DOI: https://doi.org/10.24167/proxies.v3i2.12430
Copyright (c) 2024 Proxies : Jurnal Informatika