Hierarchical characterization and generation of blogosphere workloads

Mariela Josefina Curiel Huérfano, Azer Bestavros, Fernando Duarte, Bernardo Mattos, Jussara Almeida, Virgilio Almeida

Research output: Contribution to journal (Article)

Abstract

We present a thorough characterization of the access patterns in blogspace, which comprises a rich interconnected web of blog postings and comments by an increasingly prominent user community that collectively define what has become known as the blogosphere. Our characterization of over 35 million read, write, and management requests spanning a 28-day period is done at three different levels. The user view characterizes how individual users interact with blogosphere objects (blogs); the object view characterizes how individual blogs are accessed; the server view characterizes the aggregate access patterns of all users to all blogs. The more-interactive nature of the blogosphere leads to interesting traffic and communication patterns, which are different from those observed for traditional web content. We identify and characterize novel features of the blogosphere workload, and we show the similarities and differences between typical web server workloads and blogosphere server workloads. Finally, based on our main characterization results, we build a new synthetic blogosphere workload generator called GBLOT, which aims at mimicking closely a stream of requests originating from a population of blog users. Given the increasing share of blogspace traffic, realistic workload models and tools are important for capacity planning and traffic engineering purposes.
Original language: English
Pages (from-to): 1-34
Number of pages: 34
Journal: Computer Science, Boston University, Tech. Rep
State: Published - 2008
Externally published: Yes
