Social and Information Networks

New submissions

New submissions for Tue, 28 Jun 22

[1]  arXiv:2206.12678 [pdf, ps, other]
Title: Finding Proper Time Intervals for Dynamic Network Extraction
Comments: 19 pages, 12 figures
Journal-ref: J. Stat. Mech. (2021) 033414
Subjects: Social and Information Networks (cs.SI); Probability (math.PR)

Extracting a proper dynamic network for modelling a time-dependent complex system is an important issue. Building a correct model is related to finding the critical time points at which a system exhibits considerable change. In this work, we propose to measure network similarity to detect proper time intervals. We develop three similarity metrics, node, link, and neighborhood similarities, for any consecutive snapshots of a dynamic network. Rather than a label or a user-defined threshold, we use statistically expected values of the proposed similarities under a null model to state whether the system changes critically. We experimented on two data sets with different temporal dynamics: the Wi-Fi access point logs of a university campus and Enron emails. Results show that, first, the proposed similarities reflect signal trends similar to those of network topological properties, but with less noise, and their scores are scale invariant. Second, the proposed similarities generate better signals than adjacency correlation, with optimal noise and diversity. Third, using statistically expected values allows us to find different time intervals for a system, leading to the extraction of non-redundant snapshots for dynamic network modelling.
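The abstract does not define its similarity metrics, so the following is only a minimal sketch of the general idea of comparing consecutive snapshots. It assumes plain Jaccard overlap as a stand-in for the node and link similarities; the function names `jaccard` and `snapshot_similarities` are hypothetical, and the null-model comparison described above is not reproduced.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections; two empty collections count as identical."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def snapshot_similarities(prev_edges, next_edges):
    """Node and link overlap between two consecutive snapshots.

    Each snapshot is an iterable of undirected edges (u, v).
    Assumption: Jaccard overlap stands in for the paper's metrics.
    """
    # Store each undirected edge as a frozenset so (u, v) equals (v, u).
    prev_links = {frozenset(e) for e in prev_edges}
    next_links = {frozenset(e) for e in next_edges}
    prev_nodes = {n for link in prev_links for n in link}
    next_nodes = {n for link in next_links for n in link}
    return {
        "node": jaccard(prev_nodes, next_nodes),
        "link": jaccard(prev_links, next_links),
    }


sims = snapshot_similarities([(1, 2), (2, 3)], [(2, 1), (3, 4)])
print(sims)
```

A low score between two consecutive snapshots would then mark a candidate change point, subject to whatever expected value the chosen null model gives.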

[2]  arXiv:2206.12735 [pdf, other]
Title: Cascading Failures in Smart Grids under Random, Targeted and Adaptive Attacks
Comments: Accepted for publication as a book chapter. arXiv admin note: substantial text overlap with arXiv:1402.6809
Subjects: Social and Information Networks (cs.SI); Discrete Mathematics (cs.DM); Machine Learning (cs.LG); Networking and Internet Architecture (cs.NI)

We study cascading failures in smart grids, where an attacker selectively compromises the nodes with probabilities proportional to their degrees, betweenness, or clustering coefficients. This implies that nodes with high degrees, betweenness, or clustering coefficients are attacked with higher probability. We mathematically and experimentally analyze the sizes of the giant components of the networks under different types of targeted attacks, and compare the results with the corresponding sizes under random attacks. We show that networks disintegrate faster under targeted attacks than under random attacks. A targeted attack on a small fraction of high-degree nodes disintegrates one or both of the networks, whereas both networks retain giant components under a random attack on the same fraction of nodes. An important observation is that an attacker has an advantage if it compromises nodes based on their betweenness, rather than based on degree or clustering coefficient.
We next study adaptive attacks, where an attacker compromises nodes in rounds. Here, some nodes are compromised in each round based on their degree, betweenness, or clustering coefficients, instead of compromising all nodes together. In this case, the degree, betweenness, or clustering coefficient is recalculated before the start of each round, rather than only once at the beginning. We show experimentally that an adversary has an advantage in this adaptive approach, compared to compromising the same number of nodes all at once.
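As a rough illustration of a degree-proportional targeted attack, the sketch below samples nodes with probability proportional to degree and measures the largest surviving component. It is an assumption-laden toy, not the authors' model: it uses a single network rather than the coupled networks of the abstract, ignores cascades, and the names `giant_component_size` and `degree_proportional_attack` are hypothetical.

```python
import random
from collections import deque


def giant_component_size(adj, removed):
    """Size of the largest connected component after deleting `removed` nodes."""
    alive = set(adj) - set(removed)
    seen = set()
    best = 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over surviving nodes
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


def degree_proportional_attack(adj, k, rng):
    """Pick up to k distinct nodes, each draw weighted by the node's degree."""
    candidates = list(adj)
    weights = [len(adj[n]) for n in candidates]
    chosen = []
    for _ in range(min(k, len(candidates))):
        if sum(weights) == 0:
            # Only isolated nodes remain: fall back to a uniform draw.
            idx = rng.randrange(len(candidates))
        else:
            idx = rng.choices(range(len(candidates)), weights=weights)[0]
        chosen.append(candidates.pop(idx))
        weights.pop(idx)
    return chosen


# Star graph: hub 0 connected to leaves 1..5.
star = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
attacked = degree_proportional_attack(star, 2, random.Random(0))
print(attacked, giant_component_size(star, attacked))
```

An adaptive variant in the spirit of the second paragraph would recompute the weights from the surviving graph before each round instead of using the original degrees throughout.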

[3]  arXiv:2206.12904 [pdf, other]
Title: Are We All in a Truman Show? Spotting Instagram Crowdturfing through Self-Training
Subjects: Social and Information Networks (cs.SI)

In 2021, Influencer Marketing generated more than $13 billion. Companies and major brands advertise their products on Social Media, especially Instagram, through Influencers, i.e., people with high popularity and the ability to influence the masses. Usually, more popular and visible influencers are paid more for their collaborations. As a result, many services were born to boost profiles' popularity, engagement, or visibility, mainly through bots or fake accounts. Researchers have focused on recognizing such unnatural activities in different social networks with high success. However, real people recently started participating in such boosting activities using their real accounts for monetary rewards, generating inauthentic content that is very difficult to detect. To date, no work has attempted to detect this new phenomenon, known as crowdturfing (CT), on Instagram.
In this work, we are the first to propose a CT engagement detector on Instagram. Our algorithm leverages profiles' characteristics through semi-supervised learning to spot accounts involved in CT activities. In contrast to the supervised methods employed so far to detect fake accounts, a semi-supervised approach takes advantage of the vast quantities of unlabeled data on social media to yield better results. We purchased and studied 1293 CT profiles from 11 providers to build our self-training classifier, which reached 95% accuracy. Finally, we ran our model in the wild to detect and analyze the CT engagement of 20 mega-influencers (i.e., with more than one million followers), discovering that more than 20% of their engagement was artificial. We analyzed the profiles and comments of people involved in CT engagement, showing how difficult it is to spot these activities using only the generated content.

[4]  arXiv:2206.12915 [pdf, ps, other]
Title: Disambiguating Disinformation: Extending Beyond the Veracity of Online Content
Comments: In Workshop Proceedings of the 15th International AAAI Conference on Web and Social Media (2021)
Subjects: Social and Information Networks (cs.SI)

Following the 2016 US presidential election and the now overwhelming evidence of Russian interference, there has been an explosion of interest in the phenomenon of "fake news". To date, research on false news has centered around detecting content from low-credibility sources and analyzing how this content spreads across online platforms. Misinformation poses clear risks, yet research agendas that overemphasize veracity miss the opportunity to truly understand the Kremlin-led disinformation campaign that shook so many Americans. In this paper, we present a definition for disinformation - a set or sequence of orchestrated, agenda-driven information actions with the intent to deceive - that is useful in contextualizing Russian interference in 2016 and disinformation campaigns more broadly. We expand on our ongoing work to operationalize this definition and demonstrate how detecting disinformation must extend beyond assessing the credibility of a specific publisher, user, or story.

[5]  arXiv:2206.13072 [pdf, other]
Title: Personalized recommendation system based on social relationships and historical behaviors
Comments: 28 pages, 7 figures
Subjects: Social and Information Networks (cs.SI)

Previous studies show that recommendation algorithms based on historical behaviors of users can provide satisfactory recommendation performance. Many of these algorithms pay attention to the interests of users, while ignoring the influence of social relationships on user behaviors. Social relationships not only carry intrinsic information about similar consumption tastes or behaviors, but also imply the influence of an individual on its neighbors. In this paper, we assume that social relationships and historical behaviors of users are related to the same factors. Based on this assumption, we propose an algorithm to focus on social relationships useful for recommendation systems through mutual constraints from both types of information. We test the performance of our algorithm on four types of users, including all users, active users, inactive users and cold-start users. Results show that the proposed algorithm outperforms benchmarks in the four types of scenarios in terms of recommendation accuracy and diversity metrics. We further design a randomization model to explore the contribution of social relationships to recommendation performance, and the result shows that the contribution of social relationships in the proposed algorithm depends on the coupling strength of social relationships and historical behaviors.

[6]  arXiv:2206.13456 [pdf, other]
Title: "Double vaccinated, 5G boosted!": Learning Attitudes towards COVID-19 Vaccination from Social Media
Subjects: Social and Information Networks (cs.SI); Computers and Society (cs.CY); Machine Learning (cs.LG)

To address the vaccine hesitancy which impairs the efforts of the COVID-19 vaccination campaign, it is imperative to understand public vaccination attitudes and grasp their changes in a timely manner. Despite its reliability and trustworthiness, conventional attitude collection based on surveys is time-consuming and expensive, and cannot follow the fast evolution of vaccination attitudes. We leverage the textual posts on social media to extract and track users' vaccination stances in near real time by proposing a deep learning framework. To address the impact of linguistic features such as sarcasm and irony commonly used in vaccine-related discourses, we integrate into the framework the recent posts of a user's social network neighbours to help detect the user's genuine attitude. Based on our annotated dataset from Twitter, the models instantiated from our framework can increase the performance of attitude extraction by up to 23% compared to state-of-the-art text-only models. Using this framework, we successfully validate the feasibility of using social media to track the evolution of vaccination attitudes in real life. We further show one practical use of our framework by validating the possibility of forecasting a user's vaccine hesitancy changes with information perceived from social media.

Cross-lists for Tue, 28 Jun 22

[7]  arXiv:2206.12786 (cross-list from stat.ME) [pdf, other]
Title: Calibrated Nonparametric Scan Statistics for Anomalous Pattern Detection in Graphs
Subjects: Methodology (stat.ME); Artificial Intelligence (cs.AI); Social and Information Networks (cs.SI)

We propose a new approach, the calibrated nonparametric scan statistic (CNSS), for more accurate detection of anomalous patterns in large-scale, real-world graphs. Scan statistics identify connected subgraphs that are interesting or unexpected through maximization of a likelihood ratio statistic; in particular, nonparametric scan statistics (NPSSs) identify subgraphs with a higher than expected proportion of individually significant nodes. However, we show that recently proposed NPSS methods are miscalibrated, failing to account for the maximization of the statistic over the multiplicity of subgraphs. This results in both reduced detection power for subtle signals, and low precision of the detected subgraph even for stronger signals. Thus we develop a new statistical approach to recalibrate NPSSs, correctly adjusting for multiple hypothesis testing and taking the underlying graph structure into account. While the recalibration, based on randomization testing, is computationally expensive, we propose both an efficient (approximate) algorithm and new, closed-form lower bounds (on the expected maximum proportion of significant nodes for subgraphs of a given size, under the null hypothesis of no anomalous patterns). These advances, along with the integration of recent core-tree decomposition methods, enable CNSS to scale to large real-world graphs, with substantial improvement in the accuracy of detected subgraphs. Extensive experiments on both semi-synthetic and real-world datasets validate the effectiveness of our proposed methods, in comparison with state-of-the-art counterparts.

[8]  arXiv:2206.13223 (cross-list from cs.LG) [pdf, other]
Title: MultiSAGE: a multiplex embedding algorithm for inter-layer link prediction
Subjects: Machine Learning (cs.LG); Discrete Mathematics (cs.DM); Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)

Research on graph representation learning has received great attention in recent years. However, most of the studies so far have focused on the embedding of single-layer graphs. The few studies dealing with the problem of representation learning of multilayer structures rely on the strong hypothesis that the inter-layer links are known, and this limits the range of possible applications. Here we propose MultiSAGE, a generalization of the GraphSAGE algorithm that allows the embedding of multiplex networks. We show that MultiSAGE is capable of reconstructing both the intra-layer and the inter-layer connectivity, outperforming GraphSAGE, which has been designed for simple graphs. Next, through a comprehensive experimental analysis, we also shed light on the performance of the embedding, both in simple and in multiplex networks, showing that either the density of the graph or the randomness of the links strongly influences the quality of the embedding.

[9]  arXiv:2206.13416 (cross-list from physics.soc-ph) [pdf, other]
Title: A Majority-Vote Model On Multiplex Networks with Community Structure
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI)

We investigate a majority-vote model on two-layer multiplex networks with community structure. In our majority-vote model, the edges on each layer encode one type of social relationship and an individual changes their opinion based on the majority opinions of their neighbors in each layer. To capture the fact that different relationships often have different levels of importance, we introduce a layer-preference parameter, which determines the probability that a node adopts an opinion when the node's neighborhoods on the two layers have different majority opinions. We construct our networks so that each node is a member of one community on each layer, and we consider situations in which nodes tend to have more connections with nodes from the same community than with nodes from different communities. We study the influence of the layer-preference parameter, the intralayer communities, and interlayer membership correlation on the steady-state behavior of our model using both direct numerical simulations and a mean-field approximation. We find three different types of steady-state behavior: a fully-mixed state, consensus states, and polarized states. We demonstrate that a stronger interlayer community correlation makes polarized steady states reachable for wider ranges of the other model parameters. We also show that different values of the layer-preference parameter result in qualitatively different phase diagrams for the mean opinions at steady states.
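One possible reading of the update rule with a layer-preference parameter can be sketched as follows. Everything beyond the abstract is an assumption: opinions are taken to be +1 or -1, `q` is the probability of siding with layer 1 when the two layer majorities disagree, a tied layer defers to the other layer, and a node keeps its opinion if both layers tie. The names `layer_majority` and `update_opinion` are hypothetical, and any noise term the paper may use is omitted.

```python
import random


def layer_majority(node, opinions, layer_adj):
    """Majority opinion (+1 or -1) among a node's neighbors on one layer; 0 on a tie."""
    total = sum(opinions[n] for n in layer_adj[node])
    return (total > 0) - (total < 0)


def update_opinion(node, opinions, layer1, layer2, q, rng):
    """Return the node's next opinion under the assumed layer-preference rule."""
    m1 = layer_majority(node, opinions, layer1)
    m2 = layer_majority(node, opinions, layer2)
    if m1 == 0 and m2 == 0:
        return opinions[node]  # assumption: no majority anywhere, keep opinion
    if m1 == 0:
        return m2  # assumption: a tied layer defers to the other layer
    if m2 == 0:
        return m1
    if m1 == m2:
        return m1  # both layers agree
    # Layers disagree: follow layer 1 with probability q, else layer 2.
    return m1 if rng.random() < q else m2


opinions = {0: 1, 1: 1, 2: 1, 3: -1, 4: -1}
layer1 = {0: [1, 2]}   # node 0's layer-1 neighbors hold +1
layer2 = {0: [3, 4]}   # node 0's layer-2 neighbors hold -1
print(update_opinion(0, opinions, layer1, layer2, q=0.7, rng=random.Random(1)))
```

Setting `q` to 1 or 0 makes the node always follow layer 1 or layer 2, respectively, which is a convenient sanity check for this sketch.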

Replacements for Tue, 28 Jun 22

[10]  arXiv:2104.04331 (replaced) [pdf, other]
Title: The Burden of Being a Bridge: Analysing Subjective Well-Being of Twitter Users during the COVID-19 Pandemic
Subjects: Social and Information Networks (cs.SI); Computers and Society (cs.CY)