Social and Information Networks

New submissions for Wed, 30 Sep 20

[1]  arXiv:2009.13620 [pdf, other]
Title: The Emergence of Higher-Order Structure in Scientific and Technological Knowledge Networks
Comments: 33 pages, 12 figures
Subjects: Social and Information Networks (cs.SI); Algebraic Topology (math.AT)

The growth of science and technology is primarily a recombinative process, wherein new discoveries and inventions are generally built from prior knowledge. While the recent past has seen rapid growth in scientific and technological knowledge, relatively little is known about the manner in which science and technology develop and coalesce knowledge into larger structures that enable or constrain future breakthroughs. Network science has recently emerged as a framework for measuring the structure and dynamics of knowledge. While helpful, these existing approaches struggle to capture the global structural properties of the underlying networks, leading to conflicting observations about the nature of scientific and technological progress. We bridge this methodological gap using tools from algebraic topology to characterize the higher-order structure of knowledge networks in science and technology across scales. We observe rapid and varied growth in the high-dimensional structure in many fields of science and technology, and find that this high-dimensional growth coincides with a decline in lower-dimensional structure. This higher-order growth in knowledge networks has historically far outpaced the growth in scientific and technological collaboration networks. We also characterize the relationship between higher-order structure and the nature of the science and technology produced within these structural environments and find a positive relationship between the abstractness of language used within fields and increasing high-dimensional structure. We also find a robust relationship between high-dimensional structure and a number of metrics for publication success, implying this high-dimensional structure may be linked to discovery and invention.

[2]  arXiv:2009.13659 [pdf, other]
Title: Network Analysis of the 2016 Presidential Campaign Tweets
Authors: Dmitry Zinoviev
Comments: 6 pages, 4 figures
Subjects: Social and Information Networks (cs.SI)

We applied complex network analysis to ~27,000 tweets posted by the 2016 presidential election's principal participants in the USA. We identified the stages of the election campaigns and the recurring topics addressed by the candidates. Finally, we revealed the leader-follower relationships between the candidates. We conclude that Secretary Hillary Clinton's Twitter performance was subordinate to that of Donald Trump, which may have been one factor that led to her electoral defeat.

[3]  arXiv:2009.13794 [pdf, other]
Title: From Twitter to Traffic Predictor: Next-Day Morning Traffic Prediction Using Social Media Data
Authors: Weiran Yao, Sean Qian
Subjects: Social and Information Networks (cs.SI); Machine Learning (cs.LG); Machine Learning (stat.ML)

The effectiveness of traditional traffic prediction methods is often extremely limited when forecasting traffic dynamics in the early morning. The reason is that traffic can break down drastically during the early morning commute, and the time and duration of this break-down vary substantially from day to day. Early morning traffic forecasts are crucial to inform morning-commute traffic management, but they are generally challenging to produce in advance, particularly by midnight. In this paper, we propose to mine Twitter messages as a probing method to understand the impacts of people's work and rest patterns in the evening/midnight of the previous day on the next-day morning traffic. The model is tested on freeway networks in Pittsburgh. The resulting relationship is surprisingly simple and powerful. We find that, in general, the earlier people rest as indicated by tweets, the more congested roads will be the next morning. The occurrence of big events in the evening before, represented by higher or lower tweet sentiment than normal, often implies lower travel demand the next morning than on normal days. Besides, people's tweeting activities in the night before and early morning are statistically associated with congestion in morning peak hours. We make use of such relationships to build a predictive framework which forecasts morning commute congestion using people's tweeting profiles extracted by 5 am or as late as the midnight prior to the morning. The Pittsburgh study shows that our framework can precisely predict morning congestion, particularly for some road segments upstream of roadway bottlenecks with large day-to-day congestion variation. Our approach considerably outperforms existing methods without Twitter message features, and it can learn a meaningful representation of demand from tweeting profiles that offers managerial insights.

[4]  arXiv:2009.13958 [pdf, other]
Title: A network approach to expertise retrieval based on path similarity and credit allocation
Subjects: Social and Information Networks (cs.SI); Computation (stat.CO)

With the increasing availability of online scholarly databases, publication records can be easily extracted and analysed. Researchers can promptly keep abreast of others' scientific production and, in principle, can select new collaborators and build new research teams. A critical factor one should consider when contemplating new potential collaborations is the possibility of unambiguously defining the expertise of other researchers. While some organisations have established database systems to enable their members to manually produce a profile, maintaining such systems is time-consuming and costly. Therefore, there has been a growing interest in retrieving expertise through automated approaches. Indeed, the identification of researchers' expertise is of great value in many applications, such as identifying qualified experts to supervise new researchers, assigning manuscripts to reviewers, and forming a qualified team. Here, we propose a network-based approach to the construction of authors' expertise profiles. Using the MEDLINE corpus as an example, we show that our method can be applied to a number of widely used data sets and outperforms other methods traditionally used for expertise identification.

[5]  arXiv:2009.14074 [pdf]
Title: Online platforms of public participation -- a deliberative democracy or a delusion?
Comments: 8 pages
Subjects: Social and Information Networks (cs.SI); Human-Computer Interaction (cs.HC)

Trust and confidence in democratic institutions are at an all-time low. At the same time, many of the complex issues faced by city administrators and politicians remain unresolved. To tackle these concerns, many argue that citizens should, through the use of digital platforms, have greater involvement in decision-making processes. This paper describes research into two such platforms, 'Decide Madrid' and 'Better Reykjavik'. Through the use of interviews, questionnaires, ethnographic observation, and analysis of platform data, the study will determine if these platforms provide greater participation or simply replicate what is already offered by numerous other digital tools. The findings so far suggest that, to be successful, platforms must take on a form of deliberative democracy, allowing for knowledge co-production and the emergence of collective intelligence. Based on this, we aim to identify key features of sustainable models of online participation.

Cross-lists for Wed, 30 Sep 20

[6]  arXiv:2009.13566 (cross-list from cs.LG) [pdf, other]
Title: Graph Neural Networks with Heterophily
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI); Machine Learning (stat.ML)

Graph Neural Networks (GNNs) have proven to be useful for many different practical applications. However, most existing GNN models have an implicit assumption of homophily among the nodes connected in the graph, and therefore have largely overlooked the important setting of heterophily. In this work, we propose a novel framework called CPGNN that generalizes GNNs for graphs with either homophily or heterophily. The proposed framework incorporates an interpretable compatibility matrix for modeling the heterophily or homophily level in the graph, which can be learned in an end-to-end fashion, enabling it to go beyond the assumption of strong homophily. Theoretically, we show that replacing the compatibility matrix in our framework with the identity (which represents pure homophily) reduces the framework to GCN. Our extensive experiments demonstrate the effectiveness of our approach in more realistic and challenging experimental settings with significantly less training data compared to previous works: CPGNN variants achieve state-of-the-art results in heterophily settings with or without contextual node features, while maintaining comparable performance in homophily settings.
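
As a rough illustration of what a class compatibility matrix encodes, the sketch below estimates one empirically from a fully labeled, undirected edge list: entry (i, j) is the fraction of edge endpoints at class-i nodes whose other endpoint has class j. This is not the CPGNN implementation, which learns the matrix end to end; the function name `empirical_compatibility` and the toy graphs are assumptions made only for this example.

```python
def empirical_compatibility(edges, labels, num_classes):
    """Row-normalized class-to-class edge counts for an undirected graph.

    edges: list of (u, v) node-index pairs (hypothetical toy input)
    labels: list mapping node index -> class index
    """
    counts = [[0.0] * num_classes for _ in range(num_classes)]
    for u, v in edges:
        # Count each undirected edge in both directions.
        counts[labels[u]][labels[v]] += 1
        counts[labels[v]][labels[u]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Classes with no incident edges get an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


labels = [0, 0, 1, 1]
homophilous = empirical_compatibility([(0, 1), (2, 3)], labels, 2)
heterophilous = empirical_compatibility([(0, 2), (1, 3)], labels, 2)
```

In the first toy graph every edge joins same-class nodes, so the estimate is the identity matrix (pure homophily); in the second every edge joins different classes, so the mass sits off the diagonal.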

[7]  arXiv:2009.13600 (cross-list from math.OC) [pdf, other]
Title: Patterns of Nonlinear Opinion Formation on Networks
Subjects: Optimization and Control (math.OC); Social and Information Networks (cs.SI); Systems and Control (eess.SY); Dynamical Systems (math.DS)

When communicating agents form opinions about a set of possible options, agreement and disagreement are both possible outcomes. Depending on the context, either can be desirable or undesirable. We show that for nonlinear opinion dynamics on networks, and a variety of network structures, the spectral properties of the underlying adjacency matrix fully characterize the occurrence of either agreement or disagreement. We further show how the corresponding eigenvector centrality, as well as any symmetry in the network, informs the resulting patterns of opinion formation and agent sensitivity to input that triggers opinion cascades.

[8]  arXiv:2009.13674 (cross-list from physics.soc-ph) [pdf, other]
Title: Vaccination strategies against COVID-19 and the diffusion of anti-vaccination views
Comments: 13 pages, 3 figures
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI)

Misinformation is usually adjusted to fit distinct narratives and can propagate rapidly through communities of interest, which work as echo chambers, cause reinforcement and foster confirmation bias. False beliefs, once adopted, are rarely corrected. Amidst the COVID-19 crisis, pandemic-deniers and people who oppose wearing face masks or quarantines have already been a substantial aspect of the development of the pandemic. With a potential vaccine for COVID-19, different anti-vaccine narratives will be created and, likely, adopted by large population groups, with critical consequences. Here, we analyse epidemic spreading and optimal vaccination strategies, measured with the average years of life lost, in two network topologies (scale-free and small-world) assuming full adherence to vaccine administration. We consider the spread of anti-vaccine views in the network, using a diffusion model similar to the one used for epidemics, in which views are adopted based on a persuasiveness parameter. Results show that even if an anti-vaccine narrative has a small persuasiveness, a large part of the population will be rapidly exposed to it. Assuming that all individuals are equally likely to adopt anti-vaccine views after being exposed, more central nodes in the network are more exposed and therefore are more likely to adopt them. Comparing years of life lost, anti-vaccine views could have a significant cost not only for those who share them, since the core social benefits of a limited vaccination strategy (reduction of susceptible hosts, network disruptions and slowing the spread of the disease) are substantially reduced.

[9]  arXiv:2009.13734 (cross-list from cs.LG) [pdf, other]
Title: New GCNN-Based Architecture for Semi-Supervised Node Classification
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI); Machine Learning (stat.ML)

Nodes that belong to the same cluster of a graph are more likely to connect to each other than to other nodes in the graph. Hence, once some information about a subset of the nodes is revealed, the structure of the graph (the graph edges) makes it possible to infer information about the other nodes. From this perspective, this paper revisits the node classification task in a semi-supervised scenario using a graph convolutional neural network. The goal is to benefit from the flow of information that circulates around the revealed node labels. To this end, this paper provides a new graph convolutional neural network architecture. This architecture benefits efficiently from the revealed training nodes, the node features, and the graph structure. In addition, in many applications, non-graph observations (side information) exist beside a given graph realization. The non-graph observations are usually independent of the graph structure. This paper shows that the proposed architecture is also powerful in combining a graph realization and independent non-graph observations. For both cases, the experiments on synthetic and real-world datasets demonstrate that our proposed architecture achieves a higher prediction accuracy in comparison to the existing state-of-the-art methods for the node classification task.

[10]  arXiv:2009.13825 (cross-list from physics.soc-ph) [pdf, other]
Title: An information theoretic network approach to socioeconomic correlations
Authors: Alec Kirkley
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI)

Due to its wide-reaching implications for everything from identifying hotspots of income inequality to political redistricting, there is a rich body of literature across the sciences quantifying spatial patterns in socioeconomic data. In particular, the variability of indicators relevant to social and economic well-being between localized populations is of great interest, as it pertains to the spatial manifestations of inequality and segregation. However, heterogeneity in population density, sensitivity of statistical analyses to spatial aggregation, and the importance of pre-drawn political boundaries for policy intervention may decrease the efficacy and relevance of existing methods for analyzing spatial socioeconomic data. Additionally, these measures commonly lack either a framework for comparing results for qualitative and quantitative data on the same scale, or a mechanism for generalization to multi-region correlations. To mitigate these issues associated with traditional spatial measures, here we view local deviations in socioeconomic variables through a topological lens rather than a spatial one, and use a novel information theoretic network approach based on the generalized Jensen-Shannon divergence to distinguish distributional quantities across adjacent regions. We apply our methodology in a series of experiments to study the network of neighboring census tracts in the continental US, quantifying the decay in two-point distributional correlations across the network, examining the county-level socioeconomic disparities induced from the aggregation of tracts, and constructing an algorithm for the division of a city into homogeneous clusters. These results provide a new framework for analyzing the variation of attributes across regional populations, and shed light on new, universal patterns in socioeconomic attributes.
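
For readers unfamiliar with the quantity named above, the following is a minimal sketch of the standard generalized Jensen-Shannon divergence for weighted discrete distributions: the entropy of the weighted mixture minus the weighted average of the individual entropies. It is not the paper's method or code; the function names, the uniform default weights, and the two-region toy distributions are assumptions made for illustration.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in nats of a discrete distribution (list of probabilities)."""
    return -sum(x * math.log(x) for x in p if x > 0)


def generalized_jsd(distributions, weights=None):
    """Generalized Jensen-Shannon divergence: H(sum_i w_i P_i) - sum_i w_i H(P_i).

    distributions: list of equal-length probability lists (hypothetical input)
    weights: non-negative weights summing to 1; uniform if omitted
    """
    n = len(distributions)
    if weights is None:
        weights = [1.0 / n] * n
    size = len(distributions[0])
    # Weighted mixture of the input distributions.
    mixture = [
        sum(w * d[k] for w, d in zip(weights, distributions))
        for k in range(size)
    ]
    mean_entropy = sum(w * shannon_entropy(d) for w, d in zip(weights, distributions))
    return shannon_entropy(mixture) - mean_entropy


# Two toy "regions": identical distributions versus fully disjoint ones.
same = generalized_jsd([[0.5, 0.5], [0.5, 0.5]])
disjoint = generalized_jsd([[1.0, 0.0], [0.0, 1.0]])
```

With uniform weights over two distributions, the divergence is zero when the distributions coincide and reaches its maximum of ln 2 nats when their supports are disjoint.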

[11]  arXiv:2009.14018 (cross-list from physics.soc-ph) [pdf]
Title: Toward the "New Normal": A Surge in Speeding, New Volume Patterns, and Recent Trends in Taxis/For-Hire Vehicles
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI)

Six months into the pandemic and one month after the phase four reopening in New York City (NYC), restrictions are lifting, businesses and schools are reopening, but global infections are still rising. This white paper updates travel trends observed in the aftermath of the COVID-19 outbreak in NYC and highlights some findings toward the "new normal."

Replacements for Wed, 30 Sep 20

[12]  arXiv:2006.07283 (replaced) [pdf, other]
Title: Dutch General Public Reaction on Governmental COVID-19 Measures and Announcements in Twitter Data
Comments: 17 pages, 8 figures
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Information Retrieval (cs.IR)
[13]  arXiv:1912.10419 (replaced) [pdf, other]
Title: Link prediction in dynamic networks using random dot product graphs
Subjects: Applications (stat.AP); Social and Information Networks (cs.SI)
[14]  arXiv:2001.11818 (replaced) [pdf, other]
Title: Community Detection in Bipartite Networks with Stochastic Blockmodels
Comments: 17 pages, 6 figures. Code is available at this https URL and a documentation at this https URL
Journal-ref: Phys. Rev. E 102, 032309 (2020)
Subjects: Physics and Society (physics.soc-ph); Machine Learning (cs.LG); Social and Information Networks (cs.SI); Machine Learning (stat.ML)
