Fake detection in imbalance dataset by Semi-supervised learning with GAN

Bordbar, Jinus; Ardalan, Saman; Mohammadrezaie, Mohammadreza; Ghasemi, Zahra

(Submitted on 2 Dec 2022 (v1), last revised 20 Dec 2023 (this version, v5))

Abstract: As social media continues to grow rapidly, harassment on these platforms has also become more prevalent, which has drawn researchers' interest to fake account detection. Social media data often forms complex graphs with numerous nodes, which poses several challenges: a large number of irrelevant features in the matrices, high data dispersion, and an imbalanced class distribution within the dataset. To overcome these challenges, researchers have employed auto-encoders and a combination of semi-supervised learning with a GAN algorithm, referred to as SGAN. Our proposed method uses auto-encoders for feature extraction and incorporates SGAN. By leveraging an unlabeled dataset, the unsupervised layer of SGAN compensates for the limited availability of labeled data and makes efficient use of the few labeled instances. Multiple evaluation metrics were employed, including the confusion matrix and the ROC curve. The dataset was divided into training and testing sets, with 100 labeled samples for training and 1,000 samples for testing. The novelty of our research lies in applying SGAN to the problem of imbalanced datasets in fake account detection. By making better use of a small number of labeled instances and reducing the need for extensive computational power, our method offers a more efficient solution. Our study achieves 81% accuracy in detecting fake accounts using only 100 labeled samples, which demonstrates the potential of SGAN as a tool for handling minority classes and addressing big data challenges in fake account detection.
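The abstract names the confusion matrix and the ROC curve as evaluation metrics on an imbalanced dataset. The following is a minimal sketch of how those quantities could be computed, not the authors' evaluation code. The function names, the labels, and the scores are all hypothetical toy values chosen for illustration (label 1 = fake account, 0 = genuine), and the AUC is computed with the pairwise rank formulation rather than by tracing the curve.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 (fake) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fn):
    """Fraction of true fake accounts that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    positive sample scores higher than a randomly chosen negative one
    (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced toy set: 2 fake accounts among 10 samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A trivial classifier that labels every account as genuine.
always_genuine = [0] * len(y_true)
tp, fp, fn, tn = confusion_matrix(y_true, always_genuine)
baseline_accuracy = accuracy(tp, fp, fn, tn)  # 0.8
baseline_recall = recall(tp, fn)              # 0.0

# Hypothetical classifier scores (higher = more likely fake).
scores = [0.9, 0.4, 0.5, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.05]
auc = roc_auc(y_true, scores)                 # 0.9375

print(baseline_accuracy, baseline_recall, auc)
```

In this toy case the trivial classifier reaches 80% accuracy while detecting no fake accounts at all, which is one reason accuracy on an imbalanced dataset is usually read alongside the confusion matrix and the ROC curve.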

Comments:	needed more investigation on final results
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
Cite as:	arXiv:2212.01071 [cs.LG]
	(or arXiv:2212.01071v5 [cs.LG] for this version)

Submission history

From: Jinus Bordbar
[v1] Fri, 2 Dec 2022 10:22:18 GMT (91kb,D)
[v2] Wed, 6 Sep 2023 11:38:00 GMT (0kb,I)
[v3] Mon, 18 Dec 2023 10:47:28 GMT (573kb,D)
[v4] Tue, 19 Dec 2023 08:58:50 GMT (0kb,I)
[v5] Wed, 20 Dec 2023 08:18:14 GMT (0kb,I)
