Tweet Sender Detection System Using the Naive Bayes Classification Method

Maresha Caroline Wijanto


As of January 2015, social media users made up 29% of the world's population. In Indonesia, active users accounted for 28% of the country's total population. The use of social media has both positive and negative effects. One negative effect is the growing number of fraud cases carried out through SMS or social media such as Twitter. Many people are deceived by tweets sent from a known user account when the actual sender is someone else. A system is therefore needed to detect whether or not a tweet was written by the account's usual sender. The Naive Bayes classification method is used for this task. The input data consists of tokens selected under two models: a minimum of n occurrences, and the n-th highest number of occurrences. Each tweet is also processed into six different tweet types, such as formal tweets or lowercase tweets. Testing uses tenfold cross-validation and is measured by accuracy, precision, recall, and F-score. In general, the results show an accuracy of 82.145%. The second token selection model yields a consistent level of accuracy across all tweet types. The fifth tweet type achieves the highest accuracy under both token selection models.
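As an illustration of the approach summarized above, the following is a minimal sketch and not the system's actual implementation. It assumes one reading of the two token selection models (keep tokens occurring at least n times; keep the n most frequent tokens), a multinomial Naive Bayes with Laplace smoothing, and a toy two-sender dataset. All names (`select_min_count`, `select_top_n`, `NaiveBayes`) and all example tweets are hypothetical.

```python
import math
from collections import Counter, defaultdict


def select_min_count(token_counts, n):
    """Model 1 (assumed reading): keep tokens that occur at least n times."""
    return {t for t, c in token_counts.items() if c >= n}


def select_top_n(token_counts, n):
    """Model 2 (assumed reading): keep the n most frequent tokens."""
    return {t for t, _ in token_counts.most_common(n)}


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (an assumption)."""

    def fit(self, docs, labels, vocab):
        self.vocab = vocab
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            # Only tokens that survived selection contribute to the model.
            self.token_counts[label].update(t for t in tokens if t in vocab)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                if t in self.vocab:
                    score += math.log((counts[t] + 1) / denom)  # log likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: tokenized tweets from two senders.
docs = [
    ["good", "morning", "everyone"],
    ["good", "night", "everyone"],
    ["yo", "lunch", "time", "yo"],
    ["yo", "nap", "time"],
]
labels = ["alice", "alice", "bob", "bob"]

all_counts = Counter(t for tokens in docs for t in tokens)
vocab = select_min_count(all_counts, 2)
model = NaiveBayes().fit(docs, labels, vocab)
predicted = model.predict(["yo", "time"])
```

In this sketch, a tweet whose predicted sender differs from the account owner would be flagged as possibly written by someone else. Evaluating such a classifier with tenfold cross-validation would repeat the fit and predict steps over ten train/test splits and aggregate accuracy, precision, recall, and F-score.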

