Wikipedia:WikiProject Vandalism studies
| WP:WPVS (talk) |
|---|
| Study 1 (talk) (finished) Study 2 (talk) Obama Study (talk) |
| Related Projects/Pages |
| Wikipedia:Researching Wikipedia Wikipedia:WikiProject Wikidemia Wikipedia:Statistics Wikipedia:The Motivation of a Vandal |
- Goals
- To conduct research related to unconstructive edits on Wikipedia.
- Scope
- The project covers all vandalism on Wikipedia. (Vandalism is any addition, removal, or change of content made in a deliberate attempt to compromise the integrity of Wikipedia. - Wikipedia:Vandalism)
Members
(Please remove your name if you are not actively participating in the project)
- Remember (talk • contribs • count) - interested in everything about vandalism. Former member.
- JoeSmack (talk · contribs) - interested in anti-vandalism, with a specialization in linkspam reverting
- John Broughton (talk · contribs) - believe that Wikipedia policies and practices should be based at least partly on hard data, not just people's opinions
- AJackl (talk · contribs) - interested in the balance between subjectivity and objectivity and fact, and communities enforcing good faith edits in Wikipedia.
- res2216firestar (talk · contribs) - Believes that intentional vandalism should be strictly punished, and has a talent in telling intentional vandalism from good faith or test edits.
- JackSparrow Ninja (talk • contribs • count) - interest in anti-vandalism, and has a talent in telling intentional vandalism from good faith or test edits.
- Jonathan Stokes (talk · contribs) - interested in hard data on the contributions of anonymous users
- Jayron32 (talk · contribs) - Interested in fighting vandalism, but also interested in maintaining the Good Faith that most anons are still interested in improving Wikipedia.
- Deletion Quality (talk · contribs) - Interested in anti-vandalism and tracking down vandals
- Flubeca (talk · contribs) - The more we know about vandals, the more we can stop them.
- hmwith (talk · contribs)
- A.Ou (talk · contribs)
- E (talk • contribs • count) - Vandalism fighter, interested in statistics of vandalism on Wikipedia.
- HisSpaceResearch (talk · contribs) - interested in everything about vandalism
- Katsuhagi (talk · contribs) - interested in anti-vandalism, understanding more about vandals, learning how to fight it
- DGG 01:47, 26 June 2007 (UTC) How to reform vandals.
- --Arceus fan 23:15, 25 September 2007 (UTC) I want to stop vandalism because I am SICK AND TIRED of it!
- Phgao (talk · contribs)
- Gawaxay (talk • contribs • count) - Totally obsessed with vandal fighting
- Pumpmeup (talk • contribs • count) - wow, surprised this project existed without my knowledge. I don't get angry with vandals, I welcome them :-)
- Voice of All (talk • contribs • count) - I can try to answer questions of how long vandalism lasts and revert statistics per page or per page samples.
- 224jeff6 (talk • contribs • count)
- Desalvionjr (talk • contribs • count) - HATES vandalism with a passion!
- Gribeco (talk • contribs • count) - I run the French anti-vandalism bot, Salebot, it reverts 100+ IP changes daily
- Steve Crossin (talk • contribs • count) - Specifically interested in reforming the "zero tolerance" policy that is starting to be implemented by RC patrollers, and to educate editors as to what is an acceptable way to deal with vandalism. Steve Crossin (talk) (anon talk) 13:26, 17 April 2008 (UTC)
- Richard0612 (talk • contribs • count) - I fight vandalism [when I can] and have been toying with the idea of starting my own study for a while.
Open tasks
- Come up with a defensible definition of "vandalism", including a list of types. (See /Types of vandalism and Wikipedia:Vandalism#Types of vandalism.)
- Discuss what research needs to be conducted and what would be good metrics to gather.
- Create a central place that lists all empirical studies related to vandalism on Wikipedia.
- Add thoughts or contribute to any of the studies that are completed or currently running (Study 1, Study 2, Obama article study).
- Wikipedia:Error management
Research questions
These are some preliminary questions that may stimulate future studies. Not all of them may be answerable, so think of this as a brainstorming section.
Analysis of vandalism
- Who is responsible for vandalism? What are the demographics of the vandal population?
- What proportion of vandals are on dynamic IP addresses, and hence very hard to block?
- Are IP edits ever responsible for improving a featured article while it is on the Main Page?
- What motivates people to vandalize articles (See Wikipedia:The_Motivation_of_a_Vandal)? How can we minimize the satisfaction they get from doing it?
- Do vandals just choose another article to edit instead if an article is semi-protected? How can we test this?
- Why do certain articles attract more vandalism than others?
- What types of vandalism are there? What message are they trying to get across? Why do vandals not fully realise that their actions are futile?
- What sort of financial gains can be made from using Wikipedia to advertise - are spammers just wasting their time, or can it actually be profitable? Are our anti-spam measures adequate?
- What is the overall contribution from schools and universities? Are they worth having? Do universities contribute less vandalism than schools, or are all ages equally immature?
- How does the rate of vandalism vary throughout the day?
- Angela suggests there would still be problems with vandalism if anonymous editing was blocked. How can we test this hypothesis? Certain categories could be experimentally altered to block anonymous editors, but then vandals could just choose an article that wasn't protected. We would have to block all IP editing, which would certainly be controversial, even just to gather a small sample of data. The blocks would also have to allow newly registered users to edit, otherwise there wouldn't be time to create an account and then wait 4 days. Perhaps we could use a comparative method by doing the experiments on another wiki instead?
- Quantitatively, how are levels of vandalism affected (both in terms of percentage of edits and number of edits) when external attention is drawn to an article (e.g. by Slashdot or The Colbert Report)? Do levels of vandalism return to normal (e.g. in elephant) in all cases? How quickly?
- How much of vandalism is self-reverted?
- How do the levels of reverted edits compare between articles of different quality (e.g. GA vs. start class)?
- How often are good faith edits labeled as vandalism, either a) mistakenly and through misinterpretation of policy or b) maliciously?
Impact
- How long does vandalism typically remain visible?
- What level of vandalism is considered acceptable before semi-protection or some other measure is needed? How should the 'level of vandalism' be measured? (See Wikipedia talk:Protection policy#A more explicit semi-protection policy for articles subject to vandalism)
- What impact does vandalism have on the reputation of Wikipedia?
- How often are good faith editors driven away after getting mislabeled as vandals?
- How often are good faith editors driven away because an article is vandalized?
- How much time do editors waste cleaning up vandalism?
Counter measures
- How effective are bots in curtailing vandalism?
- Warnings:
- Are editors any more likely to continue or desist vandalizing if warned by a bot instead of a person?
- How often are vandals warned on their talk page after committing an offense?
- What are the costs and benefits, and hence overall utility, of warning users? How do users respond to warnings?
- Who is responsible for reverting vandalism?
- What effects does semi-protection have on the level of vandalism of protected articles?
- What strategies can we employ to catch vandalism quickly?
- How can we catch most of it at recent changes?
- How can we establish a situation where almost every article has someone responsible for maintaining it? Is this even a good idea? (See WP:OWN)
- How good are editors at reverting vandalism? That is, is it reverted properly, or is it often dealt with poorly, e.g. by removing a whole paragraph in which the vandal has simply altered the meaning?
- What happens to vandalism levels when edits do not show up in the current version of the article? A trial of something like stable versions, where the vandal cannot vandalize the actual article people see, or something functionally similar, is needed. Perhaps a small section (e.g. all articles in a certain category) could be tested. (See also: flagged versions.)
- How well does Wikipedia:Flagged revisions work in practice?
Current studies
- Wikipedia:WikiProject Vandalism studies/Obama article study, (talk page)
- Wikipedia:WikiProject Vandalism studies/Study2, (talk page)
Finished studies by Vandalism studies wikiproject
Previous studies of vandalism on Wikipedia by others
- University of Minnesota vandalism study (pdf)
- IBM study of Wikipedia, April 2004 (pdf)
- Wikipedia talk:Don't protect Main Page featured articles/December Main Page FA analysis
- Study by Jayron32 (warning: results found in a rather long Village Pump dif...)
- Study by Opabinia regalis
- Study of vandalism on individual's user page
- 2006 study by Buriol et al. - figure 5 shows "rv" and "revert" edits have steadily increased over time; these were 6% of all edits as of January 2006
- User:Colonel Chaos/study
- User:Cool3/Analysis
- Automatic Vandalism Detection in Wikipedia, Martin Potthast, Benno Stein, and Robert Gerling, Bauhaus University Weimar (2008)
Summary of findings
Conclusions from study 1
Study 1 analyzed a sample pool of 174 random articles for edits during November of 2004, 2005, and 2006. Of these articles, 100 contained at least one edit in those months. A total of 668 edits were observed, of which 31 (or 4.6%) were vandalism of some type (defined below). Because articles, not edits, were randomly sampled, a ratio estimate must be used to calculate the percentage of edits that are vandalism. The estimated percentage of edits that were vandalism is 4.6%, with a standard error of 1.7%. There is no discernible time trend in the data, as can be seen in the table below.
| Year | Sampled articles | Mean (vandalism edits per edit) | Std. err. |
|---|---|---|---|
| All | 100 | 0.046 | 0.017 |
| 2004 | 25 | 0.035 | 0.023 |
| 2005 | 42 | 0.050 | 0.027 |
| 2006 | 78 | 0.046 | 0.022 |
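The ratio estimate described above can be sketched in code. This is a minimal sketch of one common textbook form of the estimator (total vandalism edits divided by total edits, with a linearized standard error computed from article-level residuals). It is an assumption that the study used this form, since its exact calculation is not given on this page. The function name and the per-article counts in the example are hypothetical and are not study data.

```python
import math


def ratio_estimate(edits_per_article, vandal_edits_per_article):
    """Return (ratio, standard_error) for vandalism edits per edit.

    Articles are the sampled units, so the estimate is the total number of
    vandalism edits divided by the total number of edits. The standard error
    uses article-level residuals (Taylor linearization) and ignores any
    finite population correction.
    """
    n = len(edits_per_article)
    if n < 2 or n != len(vandal_edits_per_article):
        raise ValueError("need two or more articles and equal-length inputs")

    total_edits = sum(edits_per_article)
    ratio = sum(vandal_edits_per_article) / total_edits
    mean_edits = total_edits / n

    # Sample variance of each article's residual around the pooled ratio.
    residual_var = sum(
        (v - ratio * e) ** 2
        for e, v in zip(edits_per_article, vandal_edits_per_article)
    ) / (n - 1)

    standard_error = math.sqrt(residual_var / n) / mean_edits
    return ratio, standard_error


# Hypothetical per-article counts, for illustration only (not study data).
example_edits = [10, 5, 20, 1]
example_vandal_edits = [1, 0, 1, 0]
example_ratio, example_se = ratio_estimate(example_edits, example_vandal_edits)
print(example_ratio, example_se)
```

Reproducing the 0.017 standard error in the table would require the per-article edit counts from the study data, which are not listed on this page.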
In addition, 97% of the vandalism observed was done by anonymous editors. Obvious vandalism accounts for the vast majority of cases. Roughly 25% of vandalism reverting is done by anonymous editors and roughly 75% is done by Wikipedians with user accounts. The mean time between vandalism and the edit being reverted is 758.35 minutes (12.64 hours), a figure that is skewed by outliers. The median time before reversion is 14 minutes.
Detailed conclusions
Study 1 compiled 174 random articles using the 'Random article' sidebar tool. Of the 174 articles compiled, 100 had at least one edit during November 2004, 2005, or 2006 (the other 74 had no edits in those months and were omitted from the sample pool). There were 668 edits in total during those months for the 100 articles in the sample pool. Of those 668 edits, 31 (or 4.64%) were vandalism of some type.
There did not seem to be any trend towards vandalism growing or shrinking as a percentage of total edits. Rather, the percentage of total edits that were vandalism seemed stable, at 3-6% of the edits in each period reviewed. In Nov. 2004 vandalism edits represented 3.49% of edits (3 cases of vandalism out of 86 edits), in Nov. 2005 they were 5.04% (14 cases of vandalism out of 278 edits), and in Nov. 2006 they were 4.61% (14 cases of vandalism out of 304 edits).
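The per-year percentages can be checked directly from the counts quoted above. The short sketch below uses only the numbers stated in the text; the variable and function names are illustrative.

```python
# (vandalism edits, total edits) for each sampled November, as reported above.
counts = {2004: (3, 86), 2005: (14, 278), 2006: (14, 304)}


def percent(part, whole):
    """Percentage of `whole` made up by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


yearly_percent = {year: percent(v, e) for year, (v, e) in counts.items()}

# Pooling the three Novembers gives the overall figure for the study.
total_vandalism = sum(v for v, _ in counts.values())
total_edits = sum(e for _, e in counts.values())
overall_percent = percent(total_vandalism, total_edits)

print(yearly_percent)
print(total_vandalism, total_edits, overall_percent)
```

The pooled totals come to 31 vandalism edits out of 668 edits, matching the overall 4.64% reported for the study.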
As for the type of vandalism, by far the most common was obvious vandalism, which accounted for 83.87% of all vandalism in the study (26 out of 31 cases). Deletion vandalism was next with 9.68% (3 out of 31), and linkspam was the least common at 6.45% (2 out of 31 cases).
Those who vandalized pages were overwhelmingly anonymous editors, who accounted for 96.77% of all vandalism edits (30 out of 31). While anonymous editors vandalized Wikipedia pages much more often, they also reverted vandalism 25.81% of the time (8 out of 31 reverts), while all other reverts were done by Wikipedians using their accounts (74.19%, or 23 out of 31 reverts).
The mean time it took to revert vandalism was 758.35 minutes, or 12.64 hours. This was much higher than the estimate from the earlier IBM study (listed under previous studies above). However, the median time for reverting was 14 minutes. The 31 observed reversion times, in minutes, were: 0, 0, 0, 1, 1, 3, 4, 4, 6, 7, 8, 10, 11, 11, 13, 14, 18, 23, 29, 51, 104, 222, 452, 490, 895, 898, 963, 1903, 2561, 6816, 7991.
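The gap between the mean and the median can be reproduced from the reversion times listed above. The sketch below uses Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

# Minutes between each vandalism edit and its revert, as listed above.
revert_minutes = [
    0, 0, 0, 1, 1, 3, 4, 4, 6, 7, 8, 10, 11, 11, 13, 14,
    18, 23, 29, 51, 104, 222, 452, 490, 895, 898, 963,
    1903, 2561, 6816, 7991,
]

mean_minutes = statistics.mean(revert_minutes)
median_minutes = statistics.median(revert_minutes)

print(round(mean_minutes, 2))       # mean in minutes
print(round(mean_minutes / 60, 2))  # mean in hours
print(median_minutes)               # median in minutes
```

Because a handful of reverts took more than a day, the mean is pulled far above the median, which is why the median is the more representative figure for a typical revert.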
User box
| WVS | This user is interested in studying vandalism. |
Categories
Internet coverage of wikiproject's studies
Coverage of first study
- Digg posting
- Blogs about the study: WikiAngela, Original Research blog, Valuewiki Blog
- Wikizine, Year: 2007 Week: 15 Number: 67
- Wikipedia Signpost
- Wikipedia Weekly podcast, Wikipedia:WikiProject WikipediaWeekly/Episode19 (discussion of the study starts at minute 17)
Related projects / pages
Pages
- Wikipedia:Vandalism
- Wikipedia:Most vandalized pages
- Wikipedia:Long term abuse
- Wikipedia:RC patrol
- Wikipedia:Error management
- Wikipedia:Cleaning up vandalism
- Wikipedia:Administrator intervention against vandalism
- Wikipedia:The Motivation of a Vandal
- User:Dragons flight/Log analysis - various data, including some on vandalism.
Projects
- Wikipedia:Counter-Vandalism Unit
- Wikipedia:WikiProject Wikidemia (parent project)

