We propose a model for the synthetic generation of information cascades in social media. In our model, the information “memes” propagating in the social network are characterized by a probability distribution over a topic space, accompanied by a textual description, i.e., a bag of keywords coherent with the topic distribution. Similarly, every social media user is described by a vector of interests defined over the same topic space. Information cascades are governed by the topic of the meme, its level of virality, the interests of each user, community pressure, and social influence.
The main technical challenge in reaching this goal is the generation of realistic interest vectors, given a known network structure and a tunable level of homophily. We tackle this problem with a method based on non-negative matrix factorization, which is shown experimentally to outperform non-trivial baselines based on label propagation and random-walk-based graph embedding.
As we showcase in our experiments, our model offers a small set of simple and easily interpretable “knobs” that make it possible to study, in vitro, how each set of assumptions affects the resulting propagations. Finally, we show how to generate synthetic cascades whose macro-statistics are similar to those of real-world cascades, using a dataset that contains both the network and the cascades.
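As a rough illustration of the idea of deriving interest vectors from network structure with non-negative matrix factorization, the sketch below factorizes a toy adjacency matrix and normalizes each node's factor row into a distribution over topics. It is not the method of the paper: the abstract does not say which matrix is factorized or how homophily is tuned, so the choice of the adjacency matrix, the Lee-Seung multiplicative updates, the function name `nmf_interest_vectors`, and all parameters are assumptions made for illustration only.

```python
import numpy as np


def nmf_interest_vectors(adjacency, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Hypothetical helper: approximate A ~ W @ H with non-negative factors
    (Lee-Seung multiplicative updates for the squared error), then
    row-normalize W so each node gets a distribution over topics."""
    rng = np.random.default_rng(seed)
    n = adjacency.shape[0]
    W = rng.random((n, n_topics)) + eps
    H = rng.random((n_topics, n)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry of W and H non-negative.
        H *= (W.T @ adjacency) / (W.T @ W @ H + eps)
        W *= (adjacency @ H.T) / (W @ H @ H.T + eps)
    # Normalize each row so it sums to (approximately) one.
    return W / (W.sum(axis=1, keepdims=True) + eps)


# Toy undirected network: two triangles joined by the single edge (2, 3).
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

interests = nmf_interest_vectors(A, n_topics=2)
print(np.round(interests, 3))
```

Each row of `interests` is a non-negative vector summing to one, so it can be read as a user's interest distribution over the assumed topic space. Because the factors are fitted to the link structure, densely connected nodes tend to receive similar rows, which is one simple way structure-driven homophily could arise in such a sketch.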
Publication details
2020, Fourteenth International AAAI Conference on Web and Social Media, Pages 107-118
Generating realistic interest-driven information cascades (04b Conference paper in proceedings volume)
Cinus Federico, Bonchi Francesco, Monti Corrado, Panisson André
ISBN: 978-1-57735-823-7