Introduction
The growing popularity of social media has changed the way information is perceived and handled. Because social networks are so accessible, most users now follow news during critical events through online sources. Consequently, the amount of misleading, fake, controversial, or emotionally charged content on the internet keeps growing. Furthermore, the convenience and openness of these networks are exploited to deliberately convey misinformation to the public for various purposes, including political or social manipulation.
The problem of fake news is approached in different ways that reflect various aspects of the production, delivery, and personal perception of false information. Over the last years, scholars have investigated fake news with respect to the content of the data, the structure of social media, and user demographics. One framework, Multi-source Multi-class Fake news Detection, combines automated feature extraction, multi-source fusion, and automated assessment of degrees of fakeness (Karimi et al., 2018). However, user information, the peculiarities of social media, and temporal data have remained underexplored. Thus, this review analyzes works that investigate the personal perception of fake news, the user as the central category, and the field of social media.
Methods of Naive Bayes Classifier and Web Data Extraction
The first analyzed method deals with the text of fake news messages, using artificial intelligence and mathematical algorithms to detect false information. The techniques provided by Jain and Kasbe (2018) may serve as a foundation for further, more complex research. The Naive Bayes algorithm serves as the primary mechanism for detecting whether or not a news item is fake, based on data collected from different sources. Web Data Extraction, or Web Scraping, is a technique that extracts large amounts of information from reliable platforms and creates datasets in real time.
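Jain and Kasbe (2018) do not publish their scraping code; a minimal sketch of the idea in Python, using the requests and BeautifulSoup libraries, could look as follows. The URL and the CSS selector here are illustrative assumptions, not details from the paper.

```python
# Minimal web-scraping sketch: collect headlines from a (hypothetical)
# trusted news page to build a reference dataset. The URL and the CSS
# class are placeholders, not taken from Jain and Kasbe (2018).
import requests
from bs4 import BeautifulSoup

def scrape_headlines(url: str) -> list[str]:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    # Assumes headlines are marked with <h2 class="headline"> tags.
    return [h.get_text(strip=True) for h in soup.find_all("h2", class_="headline")]

if __name__ == "__main__":
    headlines = scrape_headlines("https://example.com/news")
    print(headlines[:5])
```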
Such an approach proves useful only when the two methods are applied together. The Bayes theorem operates on the words found in genuine and fake articles and claims: it compares how often the same words occur in the analyzed text and in the existing database. Therefore, applying this method without a prearranged dataset is impossible.
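A minimal sketch of such a classifier, assuming scikit-learn's MultinomialNB as the implementation and an invented toy dataset, illustrates the principle:

```python
# Minimal Naive Bayes sketch with scikit-learn: the classifier estimates,
# from word counts, how likely each word is in "real" versus "fake" texts
# and applies Bayes' theorem to score new headlines. The tiny training
# set below is invented purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = [
    "government confirms new infrastructure funding",   # real
    "official report details election results",         # real
    "miracle cure doctors don't want you to know",      # fake
    "shocking secret celebrity scandal exposed",        # fake
]
train_labels = ["real", "real", "fake", "fake"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

print(model.predict(["shocking miracle cure exposed"]))   # -> ['fake']
print(model.predict_proba(["official funding report"]))   # class probabilities
```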
During the research, it is essential to understand how the extracted words are arranged and analyzed. According to the study, the results are more accurate (0.931 against 0.912) when the concept of n-grams is applied (Jain & Kasbe, 2018). A problem may appear if specific words are not present in the dataset: the implementation skips such words and treats them as neither real nor fake (Jain & Kasbe, 2018). This may distort the results or exclude vital context absent from the dataset.
The advantage of this technique is an accessible and relatively simple computerized algorithm that may be used to prove or verify existing results. The web scraping stage must be revised to exclude stop words (common words such as prepositions, pronouns, and conjunctions) and to combine the remaining vocabulary into n-grams according to the context, as sketched below. Thus, the given research provides an overview of methods that may help examine and prove the fakeness of certain news. However, it studies the text as an isolated unit, excluding its social environment and the source of the information.
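A possible sketch of that preprocessing in scikit-learn (the sample sentences are invented) also shows how out-of-vocabulary words contribute nothing at prediction time:

```python
# Sketch of the preprocessing described above: drop English stop words
# and build unigram/bigram features. At prediction time, any word absent
# from the fitted vocabulary is silently ignored, which is exactly the
# "neither real nor fake" behavior the study reports.
from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer(stop_words="english", ngram_range=(1, 2))
vectorizer.fit(["the senate passed the budget bill today"])

print(vectorizer.get_feature_names_out())
# e.g. ['bill' 'bill today' 'budget' 'budget bill' ...] - stop words removed

# "crypto" never appeared in training, so it contributes nothing:
print(vectorizer.transform(["crypto budget bill"]).toarray())
```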
Deep Learning and Usage of Semantic Knowledge
The deep learning approach applied to social media platforms through opinion mining and the trustworthiness of user and event (SPOT) shows better results than the research of Jain and Kasbe because it covers broader characteristics of the studied news. The work of Sabeeh et al. (2020) addresses the failure of earlier methods to incorporate contextual information and the user’s engagement with the news during interpretation. The researchers use the social network Twitter to study data transformation, opinion mining, and credibility analysis. Therefore, this work explores fake news not as an isolated phenomenon but as a social concept that depends on various factors.
Attention to the process of data transformation is crucial for studying fake news shared by social media users. An average user’s tweet includes noise, slang words, emotional modifiers, abbreviations, and folksonomies. According to the SPOT method, the algorithm extracts the essential information from the tweeted news. This approach helps create a neutral research environment out of a personally influenced message.
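The exact transformation rules of SPOT are specific to Sabeeh et al. (2020); a simplified sketch in Python, with an invented abbreviation table, conveys the kind of normalization involved:

```python
# Sketch of tweet normalization in the spirit of SPOT's data
# transformation step (the exact rules in Sabeeh et al., 2020 differ):
# strip URLs, user mentions, and hashtag markers, collapse elongated
# words such as "soooo", and expand a few common abbreviations.
import re

ABBREVIATIONS = {"u": "you", "gr8": "great", "b4": "before"}  # illustrative subset

def normalize_tweet(tweet: str) -> str:
    tweet = tweet.lower()
    tweet = re.sub(r"https?://\S+", "", tweet)   # drop URLs
    tweet = re.sub(r"@\w+", "", tweet)           # drop user mentions
    tweet = re.sub(r"#(\w+)", r"\1", tweet)      # keep hashtag text, drop '#'
    tweet = re.sub(r"(.)\1{2,}", r"\1", tweet)   # collapse "soooo" -> "so"
    tweet = re.sub(r"[^\w\s]", " ", tweet)       # strip remaining punctuation
    words = [ABBREVIATIONS.get(w, w) for w in tweet.split()]
    return " ".join(words)

print(normalize_tweet("@user This is soooo gr8!! #BreakingNews http://t.co/xyz"))
# -> "this is so great breakingnews"
```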
The scholars place the main emphasis of their work on the personal features of news sharing. This is evidenced by the application of opinion mining to user comments, which is part of the credibility algorithm. SPOT analyzes the replies to identify users’ perception of particular messages, which reveals the trustworthiness of the tweet (Sabeeh et al., 2020). In general use, this may help differentiate satire from real news depending on content and feedback. Nevertheless, there is still a risk of misinformation propagating through the tweets of credible users. Therefore, evaluating user credibility on the basis of social relationships alone is insufficient to identify fake news.
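Sabeeh et al. (2020) describe their own opinion-mining pipeline; as a stand-in, a lexicon-based sentiment scorer such as NLTK’s VADER can illustrate the idea of aggregating reply polarity into a rough credibility signal. The replies and the aggregation heuristic below are invented for illustration, not taken from the paper.

```python
# Illustrative stand-in for SPOT's opinion mining: score the sentiment of
# replies to a tweet with NLTK's VADER and treat strongly negative reply
# polarity as a weak signal that readers dispute the claim.
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

replies = [
    "This is completely false, already debunked.",
    "Fake! Stop spreading lies.",
    "Wow, great news, thanks for sharing!",
]

scores = [sia.polarity_scores(r)["compound"] for r in replies]
mean_polarity = sum(scores) / len(scores)
print(f"mean reply polarity: {mean_polarity:.2f}")
# A strongly negative mean hints that readers dispute the tweet.
```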
Besides user credibility and the relevance of the comment section, SPOT applies bi-directional gated recurrent neural networks. This resolves the problem of isolated words that appeared in the previous research. Modeling tweet-level semantic and syntactic relations ensures the detection of horizontal ties between lexemes in the message. Using the softmax function, the scholars distinguish informative words from less contributive ones. Hence, the bi-directional GRU presents a way of investigating the vocabulary of fake news as an informational unity.
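A minimal PyTorch sketch conveys the idea of a bi-directional GRU with softmax word weighting; the layer sizes and the attention form are illustrative assumptions rather than the paper’s exact architecture:

```python
# Sketch of a bi-directional GRU with softmax word attention, in the
# spirit of the SPOT architecture; dimensions are illustrative.
import torch
import torch.nn as nn

class BiGRUAttention(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=100, hidden_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, bidirectional=True, batch_first=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)      # one relevance score per word
        self.out = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):                     # (batch, seq_len)
        h, _ = self.gru(self.embed(token_ids))        # (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)  # softmax over words
        context = (weights * h).sum(dim=1)            # weighted sentence vector
        return self.out(context)

model = BiGRUAttention()
logits = model(torch.randint(0, 5000, (2, 20)))       # two dummy tweets, 20 tokens
print(logits.shape)                                   # torch.Size([2, 2])
```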
Thus, the research of Sabeeh et al. (2020) solves many issues found in previous works, among them the problems of contextual investigation, the role of user trustworthiness, and the social field of news sharing. The SPOT method achieves a higher result of 97% accuracy and keeps expanding along with the evolving semantic field of fake news on Twitter. The researchers thereby present an almost perfect automated algorithm.
Satirical Features in Identification of Fake News
Excluding emotionally charged content from news makes it impossible to study satire as a way of forming misinformation. In the first work, words and phrases appear as units without context; hence, the approach of Jain and Kasbe cannot capture emotional connotation and polysemy, which often occur in satire. The material tested in the second study consists of phrases and basic information that convey the meaning of the news. Therefore, satirical news needs to be explored with special techniques and methods.
In their work, Rubin et al. (2016) develop a mechanism for studying satire along five features: Absurdity, Humor, Grammar, Negative Affect, and Punctuation. These criteria are implemented with Natural Language Processing (NLP) methods combined with machine learning to detect language patterns, sentiment, rhetorical devices, and word occurrences common to satire. To detect humor through joke classification, the researchers use the same algorithm as Jain and Kasbe (naive Bayes), combined with support vector machines (SVM) trained on joke-specific features (Rubin et al., 2016). Grammar, Negative Affect, and Punctuation are represented as sets of normalized term frequencies computed against the Linguistic Inquiry and Word Count (LIWC) dictionaries. The method thus provides a means of studying commas, semicolons, and question marks in the test set, as sketched below.
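A simplified sketch of the Punctuation feature family, with normalization by token count assumed for illustration, could look as follows:

```python
# Sketch of punctuation features: counts of selected punctuation marks
# normalized by document length, roughly in the spirit of Rubin et al.
# (2016); the exact feature definitions in the paper may differ.
import re

MARKS = [",", ";", "?", "!"]

def punctuation_features(text: str) -> dict[str, float]:
    tokens = re.findall(r"\w+", text)
    n = max(len(tokens), 1)                    # avoid division by zero
    return {m: text.count(m) / n for m in MARKS}

print(punctuation_features("Really?! A miracle cure, again; who believes this?"))
# -> {',': 0.125, ';': 0.125, '?': 0.25, '!': 0.125}
```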
The research expands the techniques for understanding satire as misleading news. SVM and NLP help derive empirical cues indicative of deception. Rubin et al. (2016) highlight the under-studied role of emotive language and punctuation in the formation of fake news. Furthermore, the results show that more complex language patterns, such as deep syntax and grammatical patterns, might be detected in fake news through regular-expression pattern matching (a simple example follows). This creates a broad field for further investigation of humor-influenced misinformation.
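As a toy illustration of such pattern matching, with cue patterns invented for this sketch rather than taken from the paper:

```python
# Toy example of regular-expression cue matching: surface patterns that
# are common in satirical or sensational headlines. These particular
# patterns are invented for illustration, not from Rubin et al. (2016).
import re

CUE_PATTERNS = {
    "quoted_mock_speech": re.compile(r'"[^"]+"'),            # embedded scare quotes
    "hyperbolic_number": re.compile(r"\b\d{4,}%"),           # e.g. "5000%"
    "area_man_opener": re.compile(r"^(Area|Local) (Man|Woman)\b"),
}

def matched_cues(headline: str) -> list[str]:
    return [name for name, pat in CUE_PATTERNS.items() if pat.search(headline)]

print(matched_cues('Area Man "Delighted" By 5000% Rise In Nothing Happening'))
# -> ['quoted_mock_speech', 'hyperbolic_number', 'area_man_opener']
```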
Behavioral Approach to Fake News Detection
The previously analyzed studies and the prevailing trends rely on automated approaches and machine algorithms, yet human behavior during interaction with fake news on social media remains poorly understood. Simko et al. (2019) develop research on the eye-tracking method in social media consumption. The scholars explore the features that draw users’ attention to fake news and the strategies of human fact-checkers that may be transferred to machine technology. Observation of participants’ behavior points out the importance of veracity validation.
The study, based on 44 participants, creates a relevant database on human behavior. The results show that participants’ reading habits are influenced by their interests. Explicit online actions, such as reports, indicate fake messages mostly correctly: of the twelve reports made, ten targeted fake articles (Simko et al., 2019). The strength of opinion reflects the participants’ interest in most of the topics. Besides, some topics with lower interest, such as weight loss and vaccination, show a common polarity, suggesting that the external environment influenced opinion-forming mechanisms. Therefore, fake news detection may be distorted by common human beliefs and personal interest in specific topics.
There are still many ways of expanding the research to different groups of people according to demographic and professional characteristics, and the results may be more precise when grounded in a broader dataset. Overall, the human behavioral approach is necessary for understanding the nature of personal activity on social media and for developing more accurate machine algorithms that would reliably detect fake news in the social media space.
Conclusion
In conclusion, scholars have developed a substantial theoretical and practical field of methods and techniques to distinguish fake news. Each study contributes to understanding the phenomenon of misinformation as a whole, yet none of them covers every aspect of online deception. Jain and Kasbe (2018) provide a more straightforward approach to machine algorithms based on a prearranged dataset of words and n-grams, excluding contextual information. The research of Sabeeh et al. (2020) fills the missing field of context and explores the user on social media as a part of fake news, but it loses emotional information, which restricts the investigation of satire. Finally, the independent exploration of human behavior in news consumption is necessary to establish the most effective identification techniques. Therefore, all of the analyzed works should be considered a united study of fake news that contributes to further investigation of the topic.
References
Jain, A., & Kasbe, A. (2018). Fake news detection. International Students’ Conference on Electrical, Electronics and Computer Sciences. IEEE. Web.
Karimi, H., Roy, P., Saba-Sadiya, S., & Tang, J. (2018). Multi-source multi-class fake news detection. Proceedings of the 27th International Conference on Computational Linguistics. Association for Computational Linguistics.
Rubin, V. L., Conroy, N. J., Chen, Y., & Cornwell, S. (2016). Fake news or truth? Using satirical cues to detect potentially misleading news. Proceedings of NAACL-HLT. Association for Computational Linguistics. Web.
Sabeeh, V., Zohdy, M., Mollah, A., & Bashaireh, R. A. (2020). Fake news detection on social media using deep learning and semantic knowledge sources. International Journal of Computer Science and Information Security, 18(2), 45–68.
Simko, J., Hanakova, M., Racsko, P., Tomlein, M., Moro, R., & Bielikova, M. (2019). Fake news reading on social media: An eye-tracking study. HT ’19: Proceedings of the 30th ACM Conference on Hypertext and Social Media. ACM Digital Library. Web.