Let me start by saying that I am a supporter of Wikipedia. I contribute articles and information wherever I think I have sufficient knowledge, and I donate to Wikipedia every year. Having said that, it does hurt sometimes when people rubbish you for quoting Wikipedia or for giving them a Wikipedia link in an attempt to prove your point. People who don’t know how Wikipedia works, or who have only surface knowledge of it, seem to dismiss it with much ease. I read an article somewhere about how teachers in most schools discredit Wikipedia as a research source. Yes, they dislike it because in many cases it contradicts their textbooks.
In reality, Wikipedia is a mighty flattener of the world: it gives the general public free access to information and the ability to author it. Let me give an example. Have you heard the famous saying, “History is written by conquerors”? Not anymore. With the rising popularity of Wikipedia, every historical article is subjected to views from all directions. One such example is the role of the “Aryan Invasion Theory” in Indian history. For more than a century we heard the Aryan Invasion Theory and took it as established history, until now. Without going into the details, you will notice that the Wikipedia article on the subject tries to stay neutral by presenting both sides of the argument.
Now, coming to the original intention of this article: I propose to write first an algorithm and then a practical implementation of it as a web service/site that other applications can use. Yes, everything will be open source and free. The purpose of the algorithm is to present the reader with the version of a Wikipedia page (or, for that matter, any wiki page) that the algorithm thinks is the most stable/reliable. The algorithm works through the set of steps detailed next.
- Access the History page of the article
- Fetch a list of all the authors
- Loop through all edits made by non-registered users, i.e. random edits
- Check each of these edits against the article lifecycle, i.e. how far into the stable life of the article the edit was made
- If no registered user has edited the article since the edit was made, remove it
- Mark every other random edit as “Candidate for Removal”
- Fetch a list of newly registered users who have recently modified the page
- Check whether each of these authors has made edits to other pages; if yes, look at the activity interval. If there are rapid edits, the author could be a spammer. If the edit was made very recently, mark it as “Recent Edit” and “Candidate for Removal”.
- Mark every content line that carries a marking as “Candidate for Removal”
- Find trustworthy authors, i.e. every author who has been editing on Wikipedia for quite a long time
- Promote their edits to “Trustworthy Info”
- Find any “Candidates for Removal” that fall within the “Trustworthy Info” and let “Trustworthy Info” override the “Candidate for Removal” marking
- Based on the stringency of user settings, curate the “Candidates for Removal” in the final rendering of the article
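To make the steps above a little more concrete, here is a minimal Python sketch of the labelling logic. It is only an illustration under several assumptions of mine: the `Edit` record, the `Label` names, the `classify` and `render` functions, and the thresholds (one year of editing for "trustworthy", seven days for "recent") are all hypothetical, and the history is assumed to be already fetched into memory. It treats each edit as a single unit and leaves out the article lifecycle check and the rapid-edit spammer check.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class Label(Enum):
    REMOVE = "Remove"
    CANDIDATE = "Candidate for Removal"
    TRUSTED = "Trustworthy Info"
    UNMARKED = "Unmarked"


@dataclass
class Edit:
    """Hypothetical record for one entry of an article's history page."""
    author: str
    registered: bool             # False for anonymous ("random") edits
    timestamp: datetime          # when this edit was made
    author_first_edit: datetime  # author's earliest known edit on the wiki


def classify(edits, now, trusted_after=timedelta(days=365), recent=timedelta(days=7)):
    """Return (edit, label) pairs in chronological order.

    The thresholds are assumed defaults, not values from the proposal.
    """
    ordered = sorted(edits, key=lambda e: e.timestamp)
    # Timestamp of the latest edit by any registered user, if there is one.
    last_registered = max((e.timestamp for e in ordered if e.registered), default=None)
    labelled = []
    for e in ordered:
        if not e.registered:
            # An anonymous edit that no registered user has edited after is removed;
            # every other anonymous edit becomes a candidate for removal.
            followed = last_registered is not None and last_registered > e.timestamp
            label = Label.CANDIDATE if followed else Label.REMOVE
        elif now - e.author_first_edit >= trusted_after:
            # Long-standing authors are checked before recency, so their edits
            # are promoted rather than marked as candidates.
            label = Label.TRUSTED
        elif now - e.timestamp <= recent:
            # Recent edit by a newly registered user.
            label = Label.CANDIDATE
        else:
            label = Label.UNMARKED
        labelled.append((e, label))
    return labelled


def render(labelled, strict):
    """Keep the edits that survive the reader's stringency setting."""
    dropped = {Label.REMOVE, Label.CANDIDATE} if strict else {Label.REMOVE}
    return [e for e, label in labelled if label not in dropped]
```

With `strict=True` the reader sees only unmarked and trustworthy edits; with `strict=False` the candidates for removal stay in and only the outright removals are dropped. A real implementation would work on content lines rather than whole edits, which this sketch does not attempt.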
This could just turn out to be the quick moderator you need while browsing the excellent and superb Wikipedia! And this doesn’t just apply to Wikipedia; it also applies to the technical wikis we use at work, where many people write and modify wiki pages. In a big organization, I bet there are many new joiners and interns who are not necessarily the most trusted people to edit wikis. However, the best use of it is on public wiki sites, where the trustworthiness of an article becomes a big question for some readers.