Annals of Emerging Technologies in Computing (AETiC)

 
Paper #8                                                                             

Codeword Detection, Focusing on Differences in Similar Words Between Two Corpora of Microblogs

Takuro Hada, Yuichi Sei, Yasuyuki Tahara and Akihiko Ohsuga


Abstract: Recently, the use of microblogs in drug trafficking has surged and become a social problem. A common method used by cyber patrols to suppress crimes such as drug trafficking is to search for crime-related keywords. However, criminals who post crime-inducing messages make heavy use of “codewords” in place of explicit keywords such as enjo kosai, marijuana, and methamphetamine, to camouflage their criminal intentions. Research suggests that these codewords change once they become widely known; thus, effective codeword detection requires significant effort to keep track of the latest codewords. In this study, we focused on how codewords appear in posts in order to detect words with a high likelihood of inclusion in incriminating posts. We proposed new methods for detecting codewords based on differences in word usage and conducted codeword-detection experiments to evaluate their effectiveness. The results showed that the proposed method could detect codewords other than those in the initial list, and did so better than the baseline methods. These findings demonstrate that the proposed method can rapidly and automatically detect codewords that change over time and microblog posts that instigate crimes, thereby potentially reducing the burden of continuous codeword surveillance.


Keywords: Codeword Detection; Microblog; Twitter; Word Embedding.
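
The abstract describes detecting codewords from differences in the words that are similar to a given word in two microblog corpora. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes each corpus has already been turned into word embeddings, and it scores each shared word by how little its nearest-neighbor sets overlap between the two corpora. The function names, the Jaccard-based score, the parameter k, and the toy two-dimensional vectors are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(word, embeddings, k):
    """Return the set of the k words most similar to `word` in one corpus."""
    target = embeddings[word]
    scored = [
        (cosine(target, vec), other)
        for other, vec in embeddings.items()
        if other != word
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return {other for _, other in scored[:k]}


def neighbor_divergence(word, emb_a, emb_b, k=2):
    """Jaccard distance between the word's neighbor sets in the two corpora.

    0.0 means identical neighbors; 1.0 means no neighbor in common.
    """
    neighbors_a = nearest_neighbors(word, emb_a, k)
    neighbors_b = nearest_neighbors(word, emb_b, k)
    union = neighbors_a | neighbors_b
    if not union:
        return 0.0
    return 1.0 - len(neighbors_a & neighbors_b) / len(union)


def rank_candidates(emb_a, emb_b, k=2):
    """Rank words present in both corpora by neighbor divergence, highest first."""
    shared = sorted(set(emb_a) & set(emb_b))
    scores = {w: neighbor_divergence(w, emb_a, emb_b, k) for w in shared}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy embeddings (hypothetical values, not data from the paper).
# In the general corpus "ice" sits near weather words; in the flagged
# corpus it sits near trade words.
general = {
    "ice": (1.0, 0.0),
    "snow": (0.9, 0.1),
    "cold": (0.8, 0.2),
    "deal": (0.0, 1.0),
    "price": (0.1, 0.9),
}
flagged = {
    "ice": (0.0, 1.0),
    "snow": (1.0, 0.0),
    "cold": (0.9, 0.1),
    "deal": (0.1, 0.9),
    "price": (0.2, 0.8),
}

ranked = rank_candidates(general, flagged, k=2)
for word, score in ranked:
    print(f"{word}: {score:.3f}")
```

In this toy data, "ice" ranks first because its two nearest neighbors in the general corpus share nothing with its two nearest neighbors in the flagged corpus. A real pipeline would train embeddings on each corpus and would need a much larger k and vocabulary; those choices are outside what the abstract specifies.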


 
Full Text

This work is licensed under a Creative Commons Attribution 4.0 International License.



 
International Association for Educators and Researchers (IAER), registered in England and Wales - Reg #OC418009. Copyright © IAER 2021