Fellas! My blog is back after being dead for more than a year. It took some maintenance and technical fixing to get the website back up and running. I intend to be more active with my blogging from now on, and to talk about subjects that I care about from the bottom of my heart. The latest addition to my loved subjects is Artificial Intelligence (AI) and my work with a new startup called Karmaloop AI (www.karmaloop.ai).
Post your comments!
A few years back, when I used to be big on finding out about the origins of the Indo-European peoples, I learned that the gypsies trace their roots back to India. Yesterday I stumbled upon the history of the Romani people, who are a part of the gypsy tribe and have lived in Europe for centuries without really knowing their place of origin. It will surprise a lot of readers that genetic studies have traced the Romani people to the north-central Indian region, even though they have lived across Europe for at least 1,000 years. Another startling fact is that the language they speak is very closely related to Hindustani, or the Hindi language of India.
What probably remains a big mystery is how the Romanis, or the Gypsies, got to Europe from India. What made them forget the land they came from, and why did they never make an effort to go back? It is a mystery even the Romani people may not know the answer to. Someday their history and their historic connection to India will be lost in the annals of time, and it is unlikely that anyone will be interested in finding out the real reasons.
Do the Romani people consider themselves Indians, and would they ever want to come back to India? Hmm, interesting question, and only a Romani connected to the roots can answer…
Let me start by saying that I am a supporter of Wikipedia. I contribute articles and information wherever I think I have sufficient knowledge, and I also donate a certain amount to Wikipedia every year. Having said that, it does hurt me sometimes when people rubbish you for quoting something from Wikipedia or for giving them a Wikipedia link in an attempt to prove your point. People who don’t know how Wikipedia works, or who have only surface knowledge of it, seem to disregard it with much ease. I read an article somewhere about how teachers in most schools discredit any Wikipedia source used for research. Yes, they dislike it because in many cases it contradicts their textbooks. In reality, Wikipedia is a mighty flattener of the world: it gives the general public free access to information and the ability to author it. Let me quote an example. Have you heard the famous saying, “History is written by conquerors”? Not anymore. With the rising popularity of Wikipedia, every historical article is being subjected to views from all directions. One such example is the role of the “Aryan Invasion Theory” in Indian history. For more than a century we have heard the Aryan Invasion Theory and taken it as settled history, until now. Without going into the details, you will notice that the Wikipedia article on the subject stays neutral by presenting both sides of the argument.
Now, coming to the original intention of writing this article: I propose to write first an algorithm and then a practical implementation of it as a web service/site that other applications can use. Yes, everything will be open source and free. The purpose of the algorithm is to present the reader with the version of a Wikipedia page (or, for that matter, any wiki page) that the algorithm thinks is the most stable/reliable. The algorithm works as a set of steps, which I detail next.
- Access the History page of the article
- Fetch a list of all the authors
- Loop through all edits made by non-registered-users i.e. random edits
- Check these edits against the article lifecycle, i.e. how far into the stable life of the article the edit was made
- If no registered user has edited the article after the random edit, remove it
- Mark every other random edit as “Candidate for Removal”
- Fetch a list of newly registered users who have recently modified the page
- Check if the author has made edits to other pages; if yes, look at the activity interval. If there are rapid edits, the author could be a spammer. If the edit was made very recently, mark it as “Recent Edits” and “Candidate for Removal”.
- Every content line that has a  marking, mark it as “Candidate for Removal”
- Find trustworthy authors, i.e. every author who has been editing on Wikipedia for quite a long time
- Promote their edits to “Trustworthy Info”
- Find any “Candidates for Removal” in the “Trustworthy Info” and let “Trustworthy Info” suppress the “Candidate for Removal” marking
- Based on the stringency of user settings, curate the “Candidates for Removal” in the final rendering of the article
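To make the steps above a little more concrete, here is a rough Python sketch of the labelling part. It is only a sketch under my own assumptions, not a finished implementation: the `Edit` record, the label names, the thresholds (one year for a trusted account, 30 days for a new account, 7 days for a recent edit) and the `pick_stable` helper are all hypothetical, and it does not talk to the real Wikipedia history page at all. It also works on whole revisions rather than individual content lines, so the rapid-edit spam check and the line-level suppression step are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical labels; the names follow the steps in the list above.
REMOVE = "remove"
CANDIDATE = "candidate for removal"
TRUSTED = "trustworthy info"
NEUTRAL = "neutral"


@dataclass
class Edit:
    """One row of an article's history (assumed shape, not a real API)."""
    rev_id: int
    author: str
    registered: bool        # False for anonymous ("random") edits
    timestamp: datetime     # when this edit was made
    author_since: datetime  # first known edit by this author anywhere


def classify(edits, now,
             trusted_age=timedelta(days=365),
             new_age=timedelta(days=30),
             recent=timedelta(days=7)):
    """Return {rev_id: label} for every edit, oldest first."""
    edits = sorted(edits, key=lambda e: e.timestamp)
    labels = {}
    for i, edit in enumerate(edits):
        account_age = now - edit.author_since
        if not edit.registered:
            # A random edit that no registered user has touched since is
            # removed; every other random edit is only a candidate.
            followed = any(later.registered for later in edits[i + 1:])
            labels[edit.rev_id] = CANDIDATE if followed else REMOVE
        elif account_age >= trusted_age:
            # Long-standing authors are promoted to trustworthy.
            labels[edit.rev_id] = TRUSTED
        elif account_age < new_age and now - edit.timestamp < recent:
            # Newly registered author with a very recent edit.
            labels[edit.rev_id] = CANDIDATE
        else:
            labels[edit.rev_id] = NEUTRAL
    return labels


def pick_stable(edits, labels, strict=True):
    """Return the rev_id of the latest acceptable revision, or None.

    strict=True accepts only trustworthy revisions; strict=False accepts
    anything that was not marked for outright removal.
    """
    for edit in sorted(edits, key=lambda e: e.timestamp, reverse=True):
        label = labels[edit.rev_id]
        if (strict and label == TRUSTED) or (not strict and label != REMOVE):
            return edit.rev_id
    return None
```

You would call `classify(edits, now=datetime.now())` on the parsed history and then `pick_stable(edits, labels, strict=True)` to get the revision to render. The `strict` flag is my stand-in for the user's stringency setting from the last step.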
This could just turn out to be the quick moderator you need while browsing the excellent and superb Wikipedia! And this doesn’t just apply to Wikipedia; it also applies to the technical wikis we use at work, where many people write and modify wiki pages. If it’s a big organization, I bet there are many new joiners and interns who are not necessarily the most trusted people to edit wikis. However, the best use of it is on public wiki sites, where the trustworthiness of an article becomes a big question for some readers.
You might have noticed that my blog was down for a few weeks. Yes, my last host had crashed. Not wanting to rely on them further, I have shifted my website over to ByetHost. So far I am enjoying their very generous hosting offer. Nothing better for hosting a WordPress blog.