Social and Information Networks

New submissions

[ total of 10 entries: 1-10 ]

New submissions for Thu, 13 May 21

[1]  arXiv:2105.05273 [pdf, other]
Title: An Empirical Study of Compression-friendly Community Detection Methods
Comments: 12 Pages, 4 Figures, 3 Tables
Subjects: Social and Information Networks (cs.SI)

Real-world graphs are massive, and storing them requires a huge amount of space. Graph compression reduces the number of bits per link needed to store a graph. Of the many techniques for compressing a graph, a typical approach is to find clique-like caveman or traditional communities in the graph and encode those cliques. An alternative approach is to treat a graph as a collection of hubs connecting spokes and to exploit this structure to arrange the nodes so that the resulting adjacency matrix can be compressed more efficiently. We perform an empirical comparison of these two approaches and show that both methods can yield good results under favorable conditions. We perform our experiments on ten real-world graphs and define two cost functions to present our findings.

[2]  arXiv:2105.05320 [pdf, ps, other]
Title: Seeing All From a Few: Nodes Selection Using Graph Pooling for Graph Clustering
Subjects: Social and Information Networks (cs.SI); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Graph clustering, which aims to obtain a partition of data using graph information, has received considerable attention in recent years. However, noisy edges and nodes in the graph may degrade the clustering results. In this paper, we propose a novel dual graph embedding network (DGEN) to improve the robustness of graph clustering to noisy nodes and edges. DGEN is designed as a two-step graph encoder connected by a graph pooling layer, which learns the graph embedding of the selected nodes. Based on the assumption that a node and its nearest neighbors should belong to the same cluster, we devise neighbor cluster pooling (NCPool) to select the most informative subset of vertices based on the clustering assignments of nodes and their nearest neighbors. This effectively alleviates the impact of noisy edges on the clustering. After obtaining the clustering assignments of the selected nodes, a classifier is trained on these nodes, and the final clustering assignments for all nodes are obtained from this classifier. Experiments on three benchmark graph datasets demonstrate the superiority of the method over several state-of-the-art algorithms.

[3]  arXiv:2105.05762 [pdf]
Title: Forecasting election results by studying brand importance in online news
Journal-ref: International Journal of Forecasting 36(2), 414-427 (2020)
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Physics and Society (physics.soc-ph)

This study uses the semantic brand score, a novel measure of brand importance in big textual data, to forecast elections based on online news. About 35,000 online news articles were transformed into networks of co-occurring words and analyzed by combining methods and tools from social network analysis and text mining. Forecasts made for four voting events in Italy provided consistent results across different voting systems: a general election, a referendum, and a municipal election in two rounds. This work contributes to the research on electoral forecasting by focusing on predictions based on online big data; it offers new perspectives regarding the textual analysis of online news through a methodology which is relatively fast and easy to apply. This study also suggests the existence of a link between the brand importance of political candidates and parties and electoral results.
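To illustrate what "networks of co-occurring words" can look like in practice, the following is a minimal sketch that counts word pairs appearing close together in a text. The function name `cooccurrence_network`, the whitespace tokenization, and the `window` parameter are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def cooccurrence_network(documents, window=2):
    """Return a Counter mapping unordered word pairs to co-occurrence counts.

    Two words co-occur when they are at most `window` tokens apart within
    the same document (a hypothetical choice, not the paper's setting).
    """
    edges = Counter()
    for doc in documents:
        tokens = doc.lower().split()  # naive tokenization, for illustration only
        for i, word in enumerate(tokens):
            # Pair the current word with the next `window` tokens.
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    edges[tuple(sorted((word, other)))] += 1
    return edges


docs = ["party wins election", "party loses election"]
network = cooccurrence_network(docs, window=2)
```

Each key of `network` is an undirected edge between two words, and its value is the edge weight. Network measures such as degree or centrality could then be computed on this weighted graph.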

[4]  arXiv:2105.05793 [pdf]
Title: Using social network analysis to prevent money laundering
Journal-ref: Expert Systems with Applications 67, 49-58 (2017)
Subjects: Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph); General Finance (q-fin.GN)

This research explores the opportunities for the application of network analytic techniques to prevent money laundering. We worked on real-world data by analyzing the central database of a factoring company, mainly operating in Italy, over a period of 19 months. This database contained the financial operations linked to the factoring business, together with other useful information about the company's clients. We propose a new approach to sort and map relational data and present predictive models, based on network metrics, to assess risk profiles of clients involved in the factoring business. We find that risk profiles can be predicted by using social network metrics. In our dataset, the most dangerous social actors deal with bigger or more frequent financial operations; they are more peripheral in the transactions network; they mediate transactions across different economic sectors and operate in riskier countries or Italian regions. Finally, to spot potential clusters of criminals, we propose a visual analysis of the tacit links existing among different companies that share the same owner or representative. Our findings show the importance of using a network-based approach when looking for suspicious financial operations and potential criminals.

Cross-lists for Thu, 13 May 21

[5]  arXiv:2105.05316 (cross-list from cs.LG) [pdf, other]
Title: A Computational Framework for Modeling Complex Sensor Network Data Using Graph Signal Processing and Graph Neural Networks in Structural Health Monitoring
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI)

Complex networks lend themselves to the modeling of multidimensional data, such as relational and/or temporal data. In particular, when such complex data and their inherent relationships need to be formalized, complex network modeling and its resulting graph representations enable a wide range of powerful options. In this paper, we target this setting in connection with specific machine learning approaches on graphs for structural health monitoring, from an analysis and predictive (maintenance) perspective. Specifically, we present a framework based on Complex Network Modeling, integrating Graph Signal Processing (GSP) and Graph Neural Network (GNN) approaches. We demonstrate this framework in our targeted application domain of Structural Health Monitoring (SHM). In particular, we focus on a prominent real-world structural health monitoring use case, i.e., modeling and analyzing sensor data (strain, vibration) of a large bridge in the Netherlands. In our experiments, we show that GSP enables the identification of the most important sensors, for which we investigate a set of search and optimization approaches. Furthermore, GSP enables the detection of specific graph signal patterns (mode shapes), capturing physical functional properties of the sensors in the applied complex network. In addition, we show the efficacy of applying GNNs for strain prediction on this kind of data.

[6]  arXiv:2105.05682 (cross-list from cs.LG) [pdf, other]
Title: Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning
Comments: 7 pages, 5 figures, 3 tables. Accepted by the 30th International Joint Conference on Artificial Intelligence (IJCAI-21)
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI)

Graph representation learning plays a vital role in processing graph-structured data. However, prior work on graph representation learning relies heavily on label information. To overcome this problem, inspired by the recent success of graph contrastive learning and Siamese networks in visual representation learning, we propose a novel self-supervised approach to learn node representations by enhancing Siamese self-distillation with multi-scale contrastive learning. Specifically, we first generate two augmented views from the input graph based on local and global perspectives. Then, we employ two objectives called cross-view and cross-network contrastiveness to maximize the agreement between node representations across different views and networks. To demonstrate the effectiveness of our approach, we perform empirical experiments on five real-world datasets. Our method not only achieves new state-of-the-art results but also surpasses some semi-supervised counterparts by large margins.

[7]  arXiv:2105.05733 (cross-list from cs.IR) [pdf, other]
Title: Thematic recommendations on knowledge graphs using multilayer networks
Comments: 20 pages, 5 figures
Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG); Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)

We present a framework to generate and evaluate thematic recommendations based on multilayer network representations of knowledge graphs (KGs). In this representation, each layer encodes a different type of relationship in the KG, and directed interlayer couplings connect the same entity in different roles. The relative importance of different types of connections is captured by an intuitive salience matrix that can be estimated from data, tuned to incorporate domain knowledge, address different use cases, or respect business logic.
We apply an adaptation of the personalised PageRank algorithm to multilayer models of KGs to generate item-item recommendations. These recommendations reflect the knowledge we hold about the content and are suitable for thematic and/or cold-start recommendation settings. Evaluating thematic recommendations from user data presents unique challenges that we address by developing a method to evaluate recommendations relying on user-item ratings, yet respecting their thematic nature. We also show that the salience matrix can be estimated from user data. We demonstrate the utility of our methods by significantly improving consumption metrics in an A/B test where collaborative filtering delivered subpar performance. We also apply our approach to movie recommendation using publicly available data to ensure the reproducibility of our results. We demonstrate that our approach outperforms existing thematic recommendation methods and is even competitive with collaborative filtering approaches.
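For readers unfamiliar with personalised PageRank, the following is a minimal sketch of the standard power-iteration form on a small weighted directed graph. It is not the paper's multilayer adaptation: the function name, the toy graph, the `alpha` value, and the handling of dangling nodes are all assumptions made for illustration. In a multilayer setting, edge weights could in principle be scaled by a salience value per relationship type, but that step is not shown here.

```python
def personalized_pagerank(graph, seed, alpha=0.85, iterations=100):
    """Power iteration for personalised PageRank (illustrative sketch).

    graph: dict mapping node -> {neighbor: weight} for directed edges.
    seed:  node at which the random walk restarts.
    alpha: probability of following an edge instead of restarting.
    """
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    rank = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            out = graph.get(u, {})
            total = sum(out.values())
            if total == 0:
                # Dangling node: send its mass back to the seed (an assumption).
                nxt[seed] += alpha * rank[u]
                continue
            for v, w in out.items():
                # Spread rank along outgoing edges in proportion to weight.
                nxt[v] += alpha * rank[u] * w / total
        nxt[seed] += 1.0 - alpha  # restart mass always returns to the seed
        rank = nxt
    return rank


# Hypothetical toy graph: items linked by some thematic relationship.
toy_graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 1.0},
}
scores = personalized_pagerank(toy_graph, seed="a")
```

Sorting the items by `scores` in descending order, excluding the seed, gives item-item recommendations for the seed item in this simplified setting.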

[8]  arXiv:2105.05781 (cross-list from cs.CL) [pdf]
Title: The Semantic Brand Score
Journal-ref: Journal of Business Research 88, 150-160 (2018)
Subjects: Computation and Language (cs.CL); Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)

The Semantic Brand Score (SBS) is a new measure of brand importance calculated on text data, combining methods of social network and semantic analysis. This metric is flexible as it can be used in different contexts and across products, markets and languages. It is applicable not only to brands, but also to multiple sets of words. The SBS, described together with its three dimensions of brand prevalence, diversity and connectivity, represents a contribution to the research on brand equity and on word co-occurrence networks. It can be used to support decision-making processes within companies; for example, it can be applied to forecast a company's stock price or to assess brand importance with respect to competitors. On the one hand, the SBS relates to familiar constructs of brand equity; on the other, it offers new perspectives for effective strategic management of brands in the era of big data.

Replacements for Thu, 13 May 21

[9]  arXiv:2002.03129 (replaced) [pdf, other]
Title: GLSearch: Maximum Common Subgraph Detection via Learning to Search
Comments: Accepted by ICML 2021
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI); Machine Learning (stat.ML)
[10]  arXiv:2105.02570 (replaced) [pdf, other]
Title: Capturing the diversity of multilingual societies
Comments: Main text: 11 pages, 6 figures, 47 references. Supplementary Information: 26 pages, 15 figures, 2 tables
Subjects: Physics and Society (physics.soc-ph); Computation and Language (cs.CL); Social and Information Networks (cs.SI)