Measuring controversy in Wikipedia via counting reply chains

In this post we present another measure of controversy, based on chains of mutual responses between users. The metric was introduced by Laniado et al. (2011) in an article on conversations in Wikipedia.

Each Wikipedia article can have an associated talk page, i.e. a space for discussing how to improve its content. Talk pages are ordinary wiki pages, but they are used in a forum-like way, as can be seen in a screenshot of the talk page for the article Presidency of Barack Obama.

Talk page for the Wikipedia article "Presidency of Barack Obama"

The discussion related to each article can be visualized as a tree, in which the red root node represents the article itself and gray nodes represent structural elements such as subpages or thread headlines. Comments are shown as orange nodes (cyan if they are unsigned); the parent of each comment is the comment it replies to or, if it replies to none, the structural node for the thread or subpage it is placed under.

Discussion tree for the article "Presidency of Barack Obama"

Chains of mutual replies between a pair of users can be studied as one indicator of controversy in the discussions, following the intuition that this behavioral pattern tends to emerge in case of conflict.

Laniado et al. (2011) define a chain as any subthread composed of at least three consecutive comments involving only two users who reply to each other. For example, if user B replies to a comment by user A, and user A replies back, we have a chain of length 3: A ← B ← A. The following figure shows a thread from the discussion about the article Global Warming that contains a chain of length 5 (involving the users James S. and Kim D. Petersen).

A discussion thread from the article "Global warming". The thread contains a chain of length 5 involving users James S. and Kim D. Petersen.
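The chain definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used by Laniado et al.: the `Comment` record and the `chain_lengths` function are hypothetical names, unsigned comments are assumed to break a chain, and when a thread branches, each maximal alternating path is counted as a separate chain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Comment:
    """Hypothetical record for one talk-page comment."""
    id: int
    author: Optional[str]  # None for an unsigned comment
    parent: Optional[int]  # id of the comment replied to; None if the
                           # comment hangs directly under a structural node


def chain_lengths(comments, min_length=3):
    """Return the lengths of all maximal two-user reply chains."""
    by_id = {c.id: c for c in comments}
    replies = {}
    for c in comments:
        if c.parent is not None:
            replies.setdefault(c.parent, []).append(c)

    def run_length(c):
        # Walk up from c while the authors keep alternating between two users.
        length, below, cur = 1, None, c
        while cur.parent is not None:
            parent = by_id.get(cur.parent)
            if parent is None or None in (cur.author, parent.author):
                break  # missing parent or unsigned comment (assumption)
            if parent.author == cur.author:
                break  # a user replying to themselves
            if below is not None and parent.author != below.author:
                break  # a third user breaks the alternation
            length += 1
            below, cur = cur, parent
        return length

    runs = {c.id: run_length(c) for c in comments}
    lengths = []
    for c in comments:
        # A chain is maximal at c if no reply to c makes it one comment longer.
        extended = any(runs[d.id] == runs[c.id] + 1
                       for d in replies.get(c.id, []))
        if runs[c.id] >= min_length and not extended:
            lengths.append(runs[c.id])
    return lengths


# Made-up thread: A ← B ← A ← B ← A, followed by a reply from a third user C.
thread = [
    Comment(1, "A", None), Comment(2, "B", 1), Comment(3, "A", 2),
    Comment(4, "B", 3), Comment(5, "A", 4), Comment(6, "C", 5),
]
print(chain_lengths(thread))  # → [5]
```

Under these assumptions, the number of chains for a discussion is simply `len(chain_lengths(comments))`.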

While the total number of comments is the basic measure of the size of a discussion, the number of chains can be used to quantify contention. Applying this metric to the whole English Wikipedia makes it possible to identify articles characterized by conflictive discussions. Here are the top 20 controversial Wikipedia articles according to their number of discussion chains:

Top 20 Wikipedia articles by number of discussion chains. Other indicators are also reported, with the rank of each article according to the corresponding indicator in parentheses. These results are based on a complete dump of the English Wikipedia dated March 2010.

For each article, other metrics are also reported (with the corresponding rank in parentheses): the total number of comments, the number of distinct users participating in the discussion, the depth of the longest thread (max. depth) and the h-index of the discussion tree (see previous post). The last column shows the number of edits received by each article.
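As a rough sketch, the three simplest of these indicators could be computed from a list of comments as follows. The `Comment` tuple and `discussion_metrics` function are hypothetical names, and two conventions are assumptions rather than the paper's: depth is counted within a thread, with a top-level comment at depth 1, and unsigned comments are not counted as users. The h-index is left out because its definition is given in the previous post.

```python
from collections import namedtuple

# Hypothetical record: parent is the id of the comment replied to, or None
# when the comment hangs directly under a thread headline or subpage.
Comment = namedtuple("Comment", "id author parent")


def discussion_metrics(comments):
    """Compute comment count, distinct signed users and maximum depth."""
    by_id = {c.id: c for c in comments}

    def depth(c):
        # Count the comments on the path from c up to its top-level comment.
        d = 1
        while c.parent is not None and c.parent in by_id:
            c = by_id[c.parent]
            d += 1
        return d

    return {
        "comments": len(comments),
        "users": len({c.author for c in comments if c.author is not None}),
        "max_depth": max((depth(c) for c in comments), default=0),
    }


# Made-up discussion: one thread of depth 3 and one unsigned top-level comment.
discussion = [
    Comment(1, "A", None), Comment(2, "B", 1),
    Comment(3, "A", 2), Comment(4, None, None),
]
print(discussion_metrics(discussion))
```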

As can be observed, the most disputed articles include topics that aroused wide discussion, such as Barack Obama or Gaza War, but also lesser-known issues like Chiropractic, where a considerably lower number of users generated a very large number of discussion chains. With regard to the EMAPS project, it is also interesting to observe the presence of topics like Global Warming and Climatic Research Unit hacking incident in this list.


Laniado D., Tasso R., Volkovich Y. and Kaltenbrunner A. (2011).
When the Wikipedians Talk: Network and Tree Structure of Wikipedia Discussion Pages.
ICWSM 2011 – 5th International AAAI Conference on Weblogs and Social Media, Barcelona, Spain.

3 Responses to “Measuring controversy in Wikipedia via counting reply chains”

  1. Counting reply-chains is an extremely interesting method to measure how controversial different Wikipedia pages are.
    The measure is, however, vulnerable to ‘flame wars’: long squabbles between a few (often just two) users who get involved in personal quarrels. Although flame wars are discouraged by the Wikipedia administrators, they may still distort the results for some articles.
    I have had a look at the Laniado et al. article cited in the post and (if I understand it correctly) the authors have also taken into account the number of editors of the page. It is not clear to me whether they used this information to ‘correct’ the above measure.
    The number of chains divided by the number of discussants would be a more solid indicator of controversy.

  2. As discussed in our last Skype meeting, it would be very interesting to have the measure discussed in this post for the pages concerning our controversies.
    Below is a list of them.
    If possible, it would also be nice to expand this list by scraping all the links on these pages that point to other Wikipedia pages. We could then add to the list every page that is cited by at least two pages of the original list. This would make sure that we have considered all the relevant pages.


    Related controversies



  3. Thank you for your comments.
    Concerning reply-chains, we have actually analysed two measures: the first one is the total number of messages belonging to reply-chains. We found this measure to be potentially sensitive to the presence of single long discussion threads, so we preferred to just count the number of chains as a more robust indicator. We believe that dividing this indicator by the number of users involved in the discussion would be counterproductive, as for example a page with just two users replying to each other would have a very high value.

