JANUARY-DECEMBER 2020 - Volume: 7 - Pages: [11 p.]
ABSTRACT: This article presents an intelligent system that detects Cybercrime lexicon on Web sites, extracting knowledge from large amounts of information on the Internet within an acceptable response time. The proposed architecture uses a Web Scraper to locate and download information from the Internet. To obtain the linguistic corpus of Cybercrime, a parallel genetic strategy is executed. It distributes the cleaning of Web pages and the Natural Language Processing techniques (tokenization, stop-word removal, term frequency, and term frequency with inverse document frequency), together with lemmatization and synonym methods. To obtain knowledge, a dataset was generated using a semantic ontology that captures the general characteristics of Cybercrime. To evaluate the efficiency of the model, three supervised learning algorithms were run in parallel: Boosting, Neural Network and Random Forests. The results show 97.64% accuracy in the detection of Cybercrime vocabulary, verified with leave-one-out cross-validation (LOOCV). In addition, parallel processing yielded time savings of 292% in data recovery and 1220% in knowledge search.
Keywords: Cybercrime, Big Data Analytics, Web Mining, Semantic Web, Machine Learning, Intelligent Systems, Parallel Processing.
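As an illustration of the text-processing steps named in the abstract (tokenization, stop-word removal, term frequency, and term frequency with inverse document frequency), the following is a minimal Python sketch. It is not the authors' implementation: the stop-word list, the sample pages, the function names, and the specific weighting (relative term frequency times the natural log of N/df) are all assumptions chosen for the example, and the abstract does not say which variant was used.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "on", "is", "about"}


def tokenize(text):
    """Lowercase the text, split it into alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: (count / tokens in document) * ln(N / document frequency).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        scores.append(
            {term: (c / total) * math.log(n_docs / df[term]) for term, c in counts.items()}
        )
    return scores


# Made-up sample pages, standing in for cleaned Web page text.
pages = [
    "Phishing emails steal banking passwords",
    "Ransomware encrypts files and demands payment",
    "The bank sends emails about payment",
]
scores = tf_idf(pages)

# Print the terms of the first page, highest weight first.
for term, weight in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

Under this weighting, terms that occur in only one page (such as "phishing") rank above terms shared across pages (such as "emails"), which is the property that makes TF-IDF useful for picking out domain vocabulary.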
© DYNA New Technologies Journal
EDITORIAL: Publicaciones DYNA SL
Address: Alameda Mazarredo 69 - 2º, 48009-Bilbao SPAIN
Email: info@dyna-newtech.com - Web: http://www.dyna-newtech.com