In this post we present another measure of controversy, based on chains of mutual responses between users. The metric was introduced by Laniado et al. (2011) in an article on conversations in Wikipedia.
Each Wikipedia article can have an associated talk page, i.e. a space for discussing how to improve its content. Talk pages are ordinary wiki pages, but they are used in a forum-like way, as can be seen in a screenshot from the talk page of the article Presidency of Barack Obama.
The discussion related to each article can be visualized as a tree, where the red root node symbolizes the article itself, and gray nodes represent structural elements such as subpages or thread headlines. Comments are represented as orange nodes (cyan if they are unsigned); the parent of each comment is the comment it replies to, or the structural node representing the thread or subpage it is placed under.
Chains of mutual replies between a pair of users can be studied as one indicator of controversy in the discussions, following the intuition that this behavioral pattern tends to emerge in case of conflict.
Laniado et al. (2011) define a chain as any subthread composed of at least three consecutive comments involving only two users who reply to each other. For example, if user B replies to a comment by user A, and user A replies back, we have a chain of length 3: A ← B ← A. As an example, the following figure shows a thread from the discussion about the article Global Warming, containing a chain of length 5 (involving the users James S. and Kim D. Petersen).
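The chain definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by Laniado et al. (2011): the input format (a dictionary mapping a comment id to its author and parent comment id), the function names, and the choice to count only maximal chains (chains that no further reply extends) are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical input format: comment id -> (author, parent comment id).
# A parent of None means the comment hangs from a structural node
# (thread headline, subpage, or the article root).
Comment = Tuple[str, Optional[str]]


def chain_length_ending_at(comments: Dict[str, Comment], cid: str) -> int:
    """Length of the longest two-user alternating reply path ending at cid."""
    author, parent = comments[cid]
    if parent is None:
        return 1
    other = comments[parent][0]
    if other == author:  # a self-reply is not a mutual reply
        return 1
    length, expected, node = 1, other, parent
    # Walk up the tree while the authors keep alternating between the two users.
    while node is not None and comments[node][0] == expected:
        length += 1
        expected = author if expected == other else other
        node = comments[node][1]
    return length


def find_chains(comments: Dict[str, Comment], min_length: int = 3) -> List[Tuple[str, int]]:
    """Return (last comment id, length) for each maximal chain (an assumption)."""
    children = defaultdict(list)
    for cid, (_, parent) in comments.items():
        if parent is not None:
            children[parent].append(cid)
    chains = []
    for cid, (_, parent) in comments.items():
        length = chain_length_ending_at(comments, cid)
        if length < min_length:
            continue
        partner = comments[parent][0]
        # The chain is maximal if no reply to cid comes from the other user.
        if not any(comments[child][0] == partner for child in children.get(cid, [])):
            chains.append((cid, length))
    return chains


# Toy thread: users A and B alternate five times, then C joins; D replies to c1.
example_thread = {
    "c1": ("A", None),
    "c2": ("B", "c1"),
    "c3": ("A", "c2"),
    "c4": ("B", "c3"),
    "c5": ("A", "c4"),
    "c6": ("C", "c5"),
    "c7": ("D", "c1"),
}
example_chains = find_chains(example_thread)
print(example_chains)  # [('c5', 5)]
```

In the toy thread, the five alternating comments by A and B form a single chain of length 5, while the replies by C and D do not form or extend any chain.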
While the total number of comments is the basic measure of the size of a discussion, the number of chains can be used to quantify contention. Applying this metric to the whole English Wikipedia, it is possible to identify articles characterized by contentious discussions. Here is the list of the top 20 controversial Wikipedia articles according to their number of discussion chains:
For each article, other metrics are also reported (with the corresponding rank in parentheses): the total number of comments, the number of distinct users participating in the discussion, the depth of the longest thread (max. depth) and the h-index of the discussion tree (see previous post). The last column shows the number of edits received by each article.
As can be observed, the most disputed articles include topics that aroused wide discussion, such as Barack Obama or Gaza War, but also lesser-known issues like Chiropractic, where a considerably lower number of users generated a large number of discussion chains. With regard to the EMAPS project, it is also interesting to observe the presence of topics like Global Warming and Climatic Research Unit hacking incident in this list.
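As a rough illustration, the simpler size metrics can be computed from a comment tree as follows. The input format and the function name are hypothetical, and the depth convention (a comment placed directly under a structural node has depth 1) is an assumption that may differ from the one used for the reported numbers. The h-index is left out because its definition is given in the previous post.

```python
from typing import Dict, Optional, Tuple

# Hypothetical input format: comment id -> (author, parent comment id),
# with None as parent for comments placed directly under a structural node.
Comment = Tuple[str, Optional[str]]


def discussion_metrics(comments: Dict[str, Comment]) -> Dict[str, int]:
    """Total comments, distinct users and maximum depth of a discussion."""

    def depth(cid: Optional[str]) -> int:
        # Count the comments on the path from cid up to the structural node.
        d = 0
        while cid is not None:
            d += 1
            cid = comments[cid][1]
        return d

    return {
        "comments": len(comments),
        "users": len({author for author, _ in comments.values()}),
        "max_depth": max((depth(cid) for cid in comments), default=0),
    }


# Toy discussion: one thread of three nested comments plus one top-level comment.
toy_discussion = {
    "c1": ("A", None),
    "c2": ("B", "c1"),
    "c3": ("A", "c2"),
    "c4": ("C", None),
}
toy_metrics = discussion_metrics(toy_discussion)
print(toy_metrics)  # {'comments': 4, 'users': 3, 'max_depth': 3}
```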
Laniado D., Tasso R., Volkovich Y. and Kaltenbrunner A. (2011).
When the Wikipedians Talk: Network and Tree Structure of Wikipedia Discussion Pages.
ICWSM 2011 – 5th International AAAI Conference on Weblogs and Social Media, Barcelona, Spain.