Tweet Sender Detection System Using the Naive Bayes Classification Method

Maresha Caroline Wijanto


As of January 2015, social media users made up 29% of the world's population. In Indonesia, 28% of the total population were active users. Social media use has both positive and negative effects. One negative effect is the rising number of fraud cases carried out through SMS or social media such as Twitter. Many people are deceived by tweets sent from the account of a user they know when in fact the sender is someone else. A system is therefore needed to detect whether a tweet was sent by the same person or not. A Naive Bayes classifier is used for this classification. The features are tokens selected under two models: tokens that occur at least n times, and tokens with the n highest numbers of occurrences. Each tweet is also processed into six different tweet types, such as formal tweets and lowercase tweets. Testing uses ten-fold cross-validation and is measured by accuracy, precision, recall, and F-score. The overall result shows an accuracy of 82.145%. The second token-selection model yields consistent accuracy across all tweet types. The fifth tweet type also achieves the highest accuracy under both token-selection models.
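
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the whitespace tokenizer, the add-one (Laplace) smoothing, the round-robin fold assignment, and the reading of the second selection model as "keep the n most frequent tokens" are all assumptions made here for illustration. Lowercasing stands in for just one of the six tweet types the abstract mentions.

```python
import math
from collections import Counter


def tokenize(tweet):
    # Assumption: lowercase + whitespace split (roughly the "lowercase tweet" type).
    return tweet.lower().split()


def select_min_count(tweets, n):
    # Selection model 1: keep tokens that occur at least n times.
    counts = Counter(tok for t in tweets for tok in tokenize(t))
    return {tok for tok, c in counts.items() if c >= n}


def select_top_n(tweets, n):
    # Selection model 2 (assumed reading): keep the n most frequent tokens.
    counts = Counter(tok for t in tweets for tok in tokenize(t))
    return {tok for tok, _ in counts.most_common(n)}


def train(tweets, labels, vocab):
    # Count tweets per class and in-vocabulary token occurrences per class.
    class_docs = Counter(labels)
    token_counts = {c: Counter() for c in class_docs}
    for tweet, label in zip(tweets, labels):
        token_counts[label].update(t for t in tokenize(tweet) if t in vocab)
    return class_docs, token_counts, vocab


def predict(model, tweet):
    # Multinomial Naive Bayes in log space with add-one smoothing (assumption).
    class_docs, token_counts, vocab = model
    total_docs = sum(class_docs.values())
    best, best_score = None, -math.inf
    for c in class_docs:
        score = math.log(class_docs[c] / total_docs)
        denom = sum(token_counts[c].values()) + len(vocab)
        for tok in tokenize(tweet):
            if tok in vocab:
                score += math.log((token_counts[c][tok] + 1) / denom)
        if score > best_score:
            best, best_score = c, score
    return best


def metrics(gold, pred, positive):
    # Accuracy, precision, recall and F-score for one positive class.
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return accuracy, precision, recall, f_score


def cross_validate(tweets, labels, k, select, n):
    # k-fold cross-validation; tokens are selected on the training folds only.
    correct = 0
    for fold in range(k):
        train_idx = [i for i in range(len(tweets)) if i % k != fold]
        test_idx = [i for i in range(len(tweets)) if i % k == fold]
        train_tweets = [tweets[i] for i in train_idx]
        vocab = select(train_tweets, n)
        model = train(train_tweets, [labels[i] for i in train_idx], vocab)
        correct += sum(predict(model, tweets[i]) == labels[i] for i in test_idx)
    return correct / len(tweets)
```

With a labeled tweet collection, `cross_validate(tweets, labels, 10, select_top_n, n)` would correspond to the ten-fold setup described above; the labels (for example "owner" versus "other") are hypothetical names, and the abstract does not state the values of n that were tested.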





Copyright (c) 2015 Maresha Caroline Wijanto