Wikipedia:WikiProject Vandalism studies

WP:WPVS (talk)
Study 1 (talk) (finished)
Study 2 (talk)
Obama Study (talk)
Related Projects/Pages
Wikipedia:Researching Wikipedia
Wikipedia:WikiProject Wikidemia
Wikipedia:Statistics
Wikipedia:The Motivation of a Vandal
Shortcut:
WP:WPVS
Goals
  • To conduct research related to unconstructive edits on Wikipedia.
Scope
  • The project covers all vandalism on Wikipedia. (Vandalism is any addition, removal, or change of content made in a deliberate attempt to compromise the integrity of Wikipedia. - Wikipedia:Vandalism)

Members

(Please remove your name if you are not actively participating in the project)

  1. Remember (talk contribs count) - interested in everything about vandalism (former member)
  2. JoeSmack (talk · contribs) - interest in anti-vandalism and have a specialization in linkspam reverting
  3. John Broughton (talk · contribs) - believe that Wikipedia policies and practices should be based at least partly on hard data, not just people's opinions
  4. AJackl (talk · contribs) - interested in the balance between subjectivity and objectivity and fact, and communities enforcing good faith edits in Wikipedia.
  5. res2216firestar (talk · contribs) - Believes that intentional vandalism should be strictly punished, and has a talent in telling intentional vandalism from good faith or test edits.
  6. JackSparrow Ninja (talk contribs count) - interest in anti-vandalism, and has a talent in telling intentional vandalism from good faith or test edits.
  7. Jonathan Stokes (talk · contribs) - interested in hard data on the contributions of anonymous users
  8. Jayron32 (talk · contribs) - Interested in fighting vandalism, but also interested in maintaining the Good Faith that most anons are still interested in improving Wikipedia.
  9. Deletion Quality (talk · contribs) - Interested in anti-vandalism and tracking down vandals
  10. Flubeca (talk · contribs) - The more we know about vandals, the more we can stop them.
  11. hmwith (talk · contribs)
  12. A.Ou (talk · contribs)
  13. E (talk contribs count) - Vandalism fighter, interested in statistics of vandalism on Wikipedia.
  14. HisSpaceResearch (talk · contribs) - interested in everything about vandalism
  15. Katsuhagi (talk · contribs) - interested in anti-vandalism, understanding more about vandals, learning how to fight it
  16. DGG 01:47, 26 June 2007 (UTC) How to reform vandals.
  17. --Arceus fan 23:15, 25 September 2007 (UTC) I want to stop vandalism because I am SICK AND TIRED of it!
  18. Phgao (talk · contribs)
  19. Gawaxay (talk contribs count) - Totally obsessed with vandal fighting
  20. Pumpmeup (talk contribs count) - wow, surprised this project existed without my knowledge. I don't get angry with vandals, I welcome them :-)
  21. Voice of All (talk contribs count) - I can try to answer questions of how long vandalism lasts and revert statistics per page or per page samples.
  22. 224jeff6 (talk contribs count)
  23. Desalvionjr (talk contribs count) - HATES vandalism with a passion!
  24. Gribeco (talk contribs count) - I run the French anti-vandalism bot, Salebot, which reverts 100+ IP changes daily
  25. Steve Crossin (talk contribs count) - Specifically interested in reforming the "zero tolerance" policy that is starting to be implemented by RC patrollers, and to educate editors as to what is an acceptable way to deal with vandalism. Steve Crossin (talk) (anon talk) 13:26, 17 April 2008 (UTC)
  26. Richard0612 (talk contribs count) - I fight vandalism [when I can] and have been toying with the idea of starting my own study for a while.

Open tasks

Research questions

These are some preliminary questions that may stimulate future studies. Not all of them may be answerable, so think of this as a brainstorming section.

Analysis of vandalism

  • Who is responsible for vandalism? What are the demographics of the vandal population?
  • What proportion of vandals are on dynamic IP addresses, and hence very hard to block?
  • Are IP edits ever responsible for improving a featured article while it is on the Main Page?
  • What motivates people to vandalize articles (See Wikipedia:The_Motivation_of_a_Vandal)? How can we minimize the satisfaction they get from doing it?
  • Do vandals just choose another article to edit instead if an article is semi-protected? How can we test this?
  • Why do certain articles attract more vandalism than others?
  • What types of vandalism are there? What message are they trying to get across? Why do vandals not fully realise that their actions are futile?
  • What sort of financial gains can be made from using Wikipedia to advertise - are spammers just wasting their time, or can it actually be profitable? Are our anti-spam measures adequate?
  • What is the overall contribution from schools and universities? Are they worth having? Do universities contribute less vandalism than schools, or are all ages equally immature?
  • How does the rate of vandalism vary throughout the day?
  • Angela suggests there would still be problems with vandalism if anonymous editing was blocked. How can we test this hypothesis? Certain categories could be experimentally altered to block anonymous editors, but then vandals could just choose an article that wasn't protected. We would have to block all IP editing, which would certainly be controversial, even just to gather a small sample of data. The blocks would also have to allow newly registered users to edit, otherwise there wouldn't be time to create an account and then wait 4 days. Perhaps we could use a comparative method by doing the experiments on another wiki instead?
  • Quantitatively, how are levels of vandalism affected (both in terms of percentage of edits and number of edits) when external attention is drawn to an article (e.g. by Slashdot or The Colbert Report)? Do levels of vandalism return to normal in all cases (e.g. in elephant)? How quickly?
  • How much of vandalism is self-reverted?
  • How do the levels of reverted edits compare between articles of different quality (e.g. GA vs. start class)?
  • How often are good faith edits labeled as vandalism, either a) mistakenly and through misinterpretation of policy or b) maliciously?

Impact

  • How long does vandalism typically remain visible?
  • What level of vandalism is considered acceptable before semi-protection or some other measure is needed? How should the 'level of vandalism' be measured? (See Wikipedia talk:Protection policy#A more explicit semi-protection policy for articles subject to vandalism)
  • What impact does vandalism have on the reputation of Wikipedia?
  • How often are good faith editors driven away after getting mislabeled as vandals?
  • How often are good faith editors driven away because an article is vandalized?
  • How much time do editors waste cleaning up vandalism?

Counter measures

  • How effective are bots in curtailing vandalism?
  • Warnings:
    • Are editors any more likely to continue or desist vandalizing if warned by a bot instead of a person?
    • How often are vandals warned on their talk page after committing an offense?
    • What are the costs and benefits, and hence overall utility, of warning users? How do users respond to warnings?
  • Who is responsible for reverting vandalism?
  • What effects does semi-protection have on the level of vandalism of protected articles?
  • What strategies can we employ to catch vandalism quickly?
    • How can we catch most of it at recent changes?
    • How can we establish a situation where almost every article has someone responsible for maintaining it? Is this even a good idea? (See WP:OWN)
  • How good are editors at reverting vandalism? That is, is it reverted properly, or is it often dealt with poorly, e.g. by removing a whole paragraph when the vandal has only altered its meaning?
  • What happens to vandalism levels when edits do not show up in the current version of the article? A trial of something like stable versions, where the vandal cannot vandalize the article that readers actually see, or something functionally similar, is needed. Perhaps a small section (e.g. all articles in a certain category) could be tested. (See also: flagged versions.)
  • How well does Wikipedia:Flagged revisions work in practice?

Current studies

Finished studies by Vandalism studies wikiproject

Previous studies of vandalism on Wikipedia by others

Summary of findings

Conclusions from study 1

Study 1 analyzed a sample pool of 174 random articles for edits made during November of 2004, 2005, and 2006. Of these articles, 100 contained at least one edit. A total of 668 edits were observed, of which 31 (or 4.6%) were vandalism of some type (defined below). Because articles, not edits, were randomly sampled, a ratio estimate must be used to calculate the percentage of edits that are vandalism. The estimated percentage of edits that were vandalism is 4.6%, with a standard error of 1.7%. There is no discernible time trend in the data, as can be seen in the table below.

Year   Sampled articles   Mean (vandalism edits per edit)   Std. err.
All    100                0.046                             0.017
2004   25                 0.035                             0.023
2005   42                 0.050                             0.027
2006   78                 0.046                             0.022
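
The ratio estimate mentioned above can be sketched in a few lines. This is only an illustration: the page does not say which variance formula the study used, so the standard linearization (Taylor series) approximation without a finite population correction is assumed here, and the per-article counts are hypothetical, not the study's data. The function name `ratio_estimate` is also made up for this sketch.

```python
import math


def ratio_estimate(vandal_counts, edit_counts):
    """Estimate vandalism edits per edit when articles are the sampling unit.

    Returns (ratio, standard_error). The standard error uses the usual
    linearization approximation for a ratio of totals (an assumption; the
    study does not state its formula) and ignores any finite population
    correction.
    """
    n = len(edit_counts)
    if n != len(vandal_counts) or n < 2:
        raise ValueError("need matching lists with at least two articles")
    total_edits = sum(edit_counts)
    if total_edits == 0:
        raise ValueError("no edits in the sample")

    ratio = sum(vandal_counts) / total_edits
    mean_edits = total_edits / n
    # Residual of each article's vandalism count around ratio * its edit count.
    residual_ss = sum(
        (v - ratio * e) ** 2 for v, e in zip(vandal_counts, edit_counts)
    )
    variance = residual_ss / (n * (n - 1) * mean_edits ** 2)
    return ratio, math.sqrt(variance)


# Hypothetical per-article counts for illustration only (NOT the study's data).
edits = [2, 10, 1, 5, 7]
vandalism = [0, 1, 0, 0, 1]
ratio, se = ratio_estimate(vandalism, edits)
print(f"ratio = {ratio:.3f}, standard error = {se:.3f}")
```

Dividing total vandalism by total edits, rather than averaging a per-article percentage, keeps articles with many edits from being weighted the same as articles with a single edit.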

In addition, 97% of the vandalism observed was done by anonymous editors. Obvious vandalism accounted for the vast majority. Roughly 25% of vandalism reverting was done by anonymous editors and roughly 75% by Wikipedians with user accounts. The mean time between vandalism and the edit being reverted was 758.35 minutes (12.64 hours), a figure that is skewed by outliers. The median time before reversion was 14 minutes.


Detailed conclusions

Study 1 compiled 174 random articles from the 'Random article' sidebar tool. Of the 174 articles compiled, 100 articles had at least one edit during the months of November 2004, 2005, or 2006 (74 articles compiled had no data and thus were omitted from the sample pool of data points). There was a total of 668 edits during November 2004, 2005, or 2006 for those 100 data points of the sample pool. Of those 668 edits, 31 (or 4.64%) were vandalism of some type.

There did not seem to be any trend towards vandalism growing or shrinking as a percentage of total edits. Rather, the percentage of total edits that were vandalism seemed stable, at 3-6% of the edits in each period reviewed. In Nov. 2004 vandalism edits represented 3.49% of the total (3 cases of vandalism out of 86 edits), in Nov. 2005 they were 5.04% (14 cases of vandalism out of 278 edits), and in Nov. 2006 they were 4.61% (14 cases of vandalism out of 304 edits).
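
The per-period percentages can be rechecked directly from the counts quoted above. A minimal sketch; the variable names are illustrative:

```python
# (vandalism edits, total edits) for each sampled month, as reported above.
counts = {
    "Nov. 2004": (3, 86),
    "Nov. 2005": (14, 278),
    "Nov. 2006": (14, 304),
}

# Vandalism as a percentage of edits in each period.
rates = {period: 100 * v / e for period, (v, e) in counts.items()}

# Pooled figures across all three periods.
total_vandalism = sum(v for v, _ in counts.values())
total_edits = sum(e for _, e in counts.values())
overall = 100 * total_vandalism / total_edits

for period, rate in rates.items():
    print(f"{period}: {rate:.2f}%")
print(f"All periods: {total_vandalism}/{total_edits} = {overall:.2f}%")
```

The pooled figure reproduces the 31 out of 668 edits (4.64%) reported for the whole sample.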

As for the type of vandalism, by far the most common was obvious vandalism, which accounted for 83.87% of all vandalism in the study (26 out of 31 cases). Deletion vandalism was next with 9.68% (3 out of 31 cases), and linkspam was the least common at 6.45% (2 out of 31 cases).

Those who vandalized pages were overwhelmingly anonymous editors, who accounted for 96.77% of all vandalism edits (30 out of 31). Although anonymous editors vandalized Wikipedia pages far more often, they also performed 25.81% of the vandalism reverts (8 out of 31); all other reverts were done by Wikipedians using their accounts (74.19%, or 23 out of 31 reverts).

The mean time it took to revert vandalism was 758.35 minutes, or 12.64 hours. This was much higher than previously estimated by an earlier IBM study (See pdf here). However, the median time for reverting was 14 minutes. The 31 observed revert times, in minutes, were: 0, 0, 0, 1, 1, 3, 4, 4, 6, 7, 8, 10, 11, 11, 13, 14, 18, 23, 29, 51, 104, 222, 452, 490, 895, 898, 963, 1903, 2561, 6816, 7991.
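
The gap between the mean and the median can be reproduced from the revert times listed above. A short sketch using Python's standard `statistics` module; the variable names are illustrative:

```python
from statistics import mean, median

# Revert times in minutes, copied from the list above (31 cases).
revert_minutes = [
    0, 0, 0, 1, 1, 3, 4, 4, 6, 7, 8, 10, 11, 11, 13, 14,
    18, 23, 29, 51, 104, 222, 452, 490, 895, 898, 963,
    1903, 2561, 6816, 7991,
]

mean_minutes = mean(revert_minutes)
median_minutes = median(revert_minutes)

# Share of the total revert time contributed by the two slowest reverts.
top_two_share = sum(sorted(revert_minutes)[-2:]) / sum(revert_minutes)

print(f"mean:   {mean_minutes:.2f} minutes ({mean_minutes / 60:.2f} hours)")
print(f"median: {median_minutes} minutes")
print(f"share of total time from the two slowest reverts: {top_two_share:.0%}")
```

Because a handful of very slow reverts dominate the total, the median is the better description of a typical revert time in this sample.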

User box

WVS This user is interested in studying vandalism.




Categories

Internet coverage of wikiproject's studies

Coverage of first study

Related projects / pages

Pages

Projects