Curious Rhythms: Temporal Regularities of Wikipedia Consumption

Read original: arXiv:2305.09497 - Published 4/23/2024 by Tiziano Piccardi, Martin Gerlach, Robert West

Overview

  • This paper investigates how people use Wikipedia, the world's largest online encyclopedia, throughout the day.
  • The researchers analyzed billions of page requests on the English Wikipedia site to understand the temporal patterns and information needs of Wikipedia users.
  • The study found clear daily rhythms in the types of articles people access, with certain topics being more popular during the day versus at night.
  • The findings shed light on how people search for and consume information online, with implications for designing better information systems.

Plain English Explanation

Wikipedia is a vast resource, containing information on everything from history and science to current events and pop culture. But until now there has been no large-scale, quantitative study of how people's use of Wikipedia varies over the course of the day.

The researchers in this paper looked at billions of page requests on the English Wikipedia site to see if there were any patterns or rhythms in the types of articles people accessed. They found that even after removing the global day-night pattern in overall traffic, individual articles had their own distinct daily patterns: some were more popular during working hours, while others were preferred in the evening and at night.

For example, articles on current events or work-related topics tended to be accessed more during typical working hours, while articles on entertainment or personal interest topics saw more traffic in the evenings. The researchers also found that the reader's country and the access device (mobile or desktop) influenced article access patterns.

These findings provide important insights into how people search for and consume information online. They suggest that Wikipedia, as one of the largest public knowledge repositories, fulfills a diverse range of information needs throughout the day. This has implications for designing better information systems that can adapt to people's changing informational needs over the course of a day.

Technical Explanation

The researchers analyzed billions of page requests mined from English Wikipedia's server logs, correcting each request's timestamp for the reader's time zone. They found that even after removing the global day-night cycle, individual Wikipedia articles still exhibited strong daily patterns in terms of when they were accessed.
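The two preprocessing steps described above (time zone correction and removal of the global cycle) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the integer UTC offsets, and the choice to subtract the global hourly share from each article's share are all assumptions made for illustration, and the toy request data is invented.

```python
HOURS = 24

def to_local_hour(utc_hour, utc_offset_hours):
    """Shift a UTC hour (0-23) to local time. Integer offsets are an assumption;
    real time zones can have fractional offsets."""
    return (utc_hour + utc_offset_hours) % HOURS

def hourly_profile(requests):
    """Turn (utc_hour, utc_offset_hours) pairs into a 24-bin share-of-traffic profile."""
    counts = [0] * HOURS
    for utc_hour, offset in requests:
        counts[to_local_hour(utc_hour, offset)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one request")
    return [c / total for c in counts]

def remove_global_cycle(article_profile, global_profile):
    """Per-hour deviation of an article's share from the global share.
    Subtraction is one possible normalization, assumed here."""
    return [a - g for a, g in zip(article_profile, global_profile)]

# Toy data (hypothetical): one article's requests plus uniform background traffic.
article_requests = [(10, 0), (10, 0), (22, 0), (23, 1)]
global_requests = article_requests + [(h, 0) for h in range(HOURS)]

article_profile = hourly_profile(article_requests)
global_profile = hourly_profile(global_requests)
residual = remove_global_cycle(article_profile, global_profile)
```

Because both profiles sum to one, the residual sums to zero: positive entries mark hours in which the article attracts a larger share of its traffic than Wikipedia as a whole, and negative entries mark the opposite.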

By characterizing the typical "shapes" of these daily consumption patterns, the researchers identified a clear distinction between articles that were more popular during the day versus those that saw more traffic in the evenings and overnight. Day-oriented articles tended to cover topics like current events, work, and education, while night-oriented articles were more likely to be about entertainment, personal interests, and leisure activities.
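To make the day versus evening/night distinction concrete, here is a small sketch that scores a 24-hour residual profile (an article's hourly share minus the global share). The hour windows, the function names, and the scoring rule are assumptions chosen for illustration; this summary does not specify how the paper derives its prototypical shapes.

```python
WORK_HOURS = list(range(9, 18))                               # 09:00-17:59, assumed window
EVENING_NIGHT_HOURS = list(range(18, 24)) + list(range(0, 3))  # 18:00-02:59, assumed window

def day_night_score(residual):
    """Mean residual in working hours minus mean residual in evening/night hours."""
    if len(residual) != 24:
        raise ValueError("expected 24 hourly values")
    work = sum(residual[h] for h in WORK_HOURS) / len(WORK_HOURS)
    night = sum(residual[h] for h in EVENING_NIGHT_HOURS) / len(EVENING_NIGHT_HOURS)
    return work - night

def day_night_label(residual, tol=1e-9):
    """Label a profile by the sign of its score; tol guards against float noise."""
    score = day_night_score(residual)
    if score > tol:
        return "working hours"
    if score < -tol:
        return "evening/night"
    return "neutral"

# Toy profile (hypothetical): above-average share at work, below-average in the evening.
toy_residual = [0.0] * 24
for h in WORK_HOURS:
    toy_residual[h] = 0.01
for h in EVENING_NIGHT_HOURS:
    toy_residual[h] = -0.01

toy_label = day_night_label(toy_residual)
```

A single score like this collapses a whole curve into one number, so it only captures the coarsest contrast; describing full shapes would need a richer method such as clustering or dimensionality reduction.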

The researchers also found that other contextual factors, such as the reader's country and whether they were using a mobile device or desktop computer, were significant predictors of an article's daily access pattern. For example, articles accessed more by mobile users showed different rhythms than those accessed primarily on desktops.
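One simple way to see whether a contextual factor such as access device goes along with a different rhythm is to build one hourly profile per group and measure how far apart the profiles are. The sketch below does this with total variation distance. The function names, the distance measure, and the toy data are assumptions for illustration; the summary does not state which statistical models the paper uses to assess predictors.

```python
from collections import defaultdict

def profiles_by_group(requests):
    """Build a normalized 24-bin profile per group from (group, local_hour) pairs."""
    counts = defaultdict(lambda: [0] * 24)
    for group, local_hour in requests:
        counts[group][local_hour % 24] += 1
    return {g: [c / sum(bins) for c in bins] for g, bins in counts.items()}

def total_variation(p, q):
    """Total variation distance between two profiles: 0 = identical, 1 = disjoint."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Toy data (hypothetical): mobile requests late in the evening, desktop mid-morning.
toy_requests = [("mobile", 22), ("mobile", 23), ("desktop", 10), ("desktop", 10)]
profiles = profiles_by_group(toy_requests)
distance = total_variation(profiles["mobile"], profiles["desktop"])
```

In this toy case the two groups never share an hour, so the distance reaches its maximum. With real logs the same comparison could be repeated per country or per topic, though a distance alone does not establish that a factor is a statistically significant predictor.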

Overall, this large-scale analysis of Wikipedia usage provides new insights into how people search for and consume information online. The findings suggest that Wikipedia plays a diverse role in fulfilling a wide range of information needs spread throughout the day, with implications for designing more adaptive and responsive information systems.

Critical Analysis

The researchers provided a robust and comprehensive analysis of Wikipedia usage patterns, leveraging a massive dataset of page view logs to uncover novel insights. By focusing on temporal rhythms rather than just overall popularity, the study reveals new dimensions of how people interact with online information.

That said, the research is limited to English Wikipedia, which, while an enormously important platform, covers only part of overall online information-seeking behavior. It would be valuable to see if similar daily patterns emerge in other language editions and in other large-scale web usage datasets, such as search engine queries or browsing on news sites.

Additionally, the analysis relies on correlational data, so it cannot definitively establish the causal mechanisms underlying the observed patterns. Further research, perhaps incorporating user surveys or controlled experiments, could shed more light on the specific motivations and contexts driving these daily rhythms in information consumption.

Overall, though, this study represents an important step forward in understanding the temporal dynamics of online information seeking. By highlighting how the type and context of information people access varies throughout the day, it encourages us to think more critically about how we design and deliver information to users in an increasingly 24/7 digital world.

Conclusion

This large-scale analysis of Wikipedia usage patterns reveals clear daily rhythms in the types of information people access online. Certain topics and article types are more popular during the day, while others see more traffic in the evenings and overnight.

These findings provide valuable insights into how people search for and consume information on the web, with implications for designing more responsive and adaptive information systems. As one of the world's largest online knowledge repositories, Wikipedia serves a diverse range of informational needs spread throughout the day, underscoring its central role in global knowledge sharing and lifelong learning.

The researchers' innovative approach to studying temporal patterns in web usage data opens up new avenues for understanding human information-seeking behavior at scale. Continued research in this area can lead to better tools and platforms that meet people's evolving informational needs over the course of a day, week, or lifetime.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers

Curious Rhythms: Temporal Regularities of Wikipedia Consumption

Tiziano Piccardi, Martin Gerlach, Robert West

Wikipedia, in its role as the world's largest encyclopedia, serves a broad range of information needs. Although previous studies have noted that Wikipedia users' information needs vary throughout the day, there is to date no large-scale, quantitative study of the underlying dynamics. The present paper fills this gap by investigating temporal regularities in daily consumption patterns in a large-scale analysis of billions of timezone-corrected page requests mined from English Wikipedia's server logs, with the goal of investigating how context and time relate to the kind of information consumed. First, we show that even after removing the global pattern of day-night alternation, the consumption habits of individual articles maintain strong diurnal regularities. Then, we characterize the prototypical shapes of consumption patterns, finding a particularly strong distinction between articles preferred during the evening/night and articles preferred during working hours. Finally, we investigate topical and contextual correlates of Wikipedia articles' access rhythms, finding that article topic, reader country, and access device (mobile vs. desktop) are all important predictors of daily attention patterns. These findings shed new light on how humans seek information on the Web by focusing on Wikipedia as one of the largest open platforms for knowledge and learning, emphasizing Wikipedia's role as a rich knowledge base that fulfills information needs spread throughout the day, with implications for understanding information seeking across the globe and for designing appropriate information systems.

4/23/2024

Talking Wikidata: Communication patterns and their impact on community engagement in collaborative knowledge graphs

Elisavet Koutsiana, Ioannis Reklos, Kholoud Saad Alghamdi, Nitisha Jain, Albert Meroño-Peñuela, Elena Simperl

We study collaboration patterns of Wikidata, one of the world's largest collaborative knowledge graph communities. Wikidata lacks long-term engagement with a small group of priceless members, 0.8%, to be responsible for 80% of contributions. Therefore, it is essential to investigate their behavioural patterns and find ways to enhance their contributions and participation. Previous studies have highlighted the importance of discussions among contributors in understanding these patterns. To investigate this, we analyzed all the discussions on Wikidata and used a mixed methods approach, including statistical tests, network analysis, and text and graph embedding representations. Our research showed that the interactions between Wikidata editors form a small world network where the content of a post influences the continuity of conversations. We also found that the account age of Wikidata members and their conversations are significant factors in their long-term engagement with the project. Our findings can benefit the Wikidata community by helping them improve their practices to increase contributions and enhance long-term participation.

7/29/2024

The Web unpacked: a quantitative analysis of global Web usage

Henrique S. Xavier

This paper presents a comprehensive analysis of global web usage patterns based on data from SimilarWeb, a leading source for estimating web traffic. Leveraging a dataset comprising over 250,000 websites, we estimate the total web traffic and investigate its distribution among domains and industry sectors. We detail the characteristics of the top 116 domains, which comprise an estimated one-third of all web traffic. Our analysis scrutinizes various attributes of these domains, including their content sources and types, access requirements, offline presence, and ownership features. Our analysis reveals a significant concentration of web traffic, with a diminutive number of top websites capturing the majority of visits. Search engines, news and media, social networks, streaming, and adult content emerge as primary attractors of web traffic, which is also highly concentrated on platforms and USA-owned websites. Much of the traffic goes to for-profit but mostly free-of-charge websites, highlighting the dominance of business models not based on paywalls.

4/29/2024

Tracking Patterns in Toxicity and Antisocial Behavior Over User Lifetimes on Large Social Media Platforms

Katy Blumer, Jon Kleinberg

An increasing amount of attention has been devoted to the problem of toxic or antisocial behavior on social media. In this paper we analyze such behavior at very large scales: we analyze toxicity over a 14-year time span on nearly 500 million comments from Reddit and Wikipedia, grounded in two different proxies for toxicity. At the individual level, we analyze users' toxicity levels over the course of their time on the site, and find a striking reversal in trends: both Reddit and Wikipedia users tended to become less toxic over their life cycles on the site in the early (pre-2013) history of the site, but more toxic over their life cycles in the later (post-2013) history of the site. We also find that toxicity on Reddit and Wikipedia differ in a key way, with the most toxic behavior on Reddit exhibited in aggregate by the most active users, and the most toxic behavior on Wikipedia exhibited in aggregate by the least active users. Finally, we consider the toxicity of discussion around widely-shared pieces of content, and find that the trends for toxicity in discussion about content bear interesting similarities with the trends for toxicity in discussion by users.

7/15/2024