User:FearBot/EvalFunc

From Wikipedia, the free encyclopedia

The evaluation function is a Java function that assigns each article a score; lower scores are better. Scores are interpreted as follows:

  • Smaller than 0: Great
  • 0-10: Ok
  • 10-16: Shows Template:IdentifiedSpam
  • 16-20: WP:PROD
  • Greater than 20: WP:SPEEDY
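The bands above can be sketched as a small Java helper. This is an illustration only, not FearBot's actual code: the class name `ScoreBands` and the method `classify` are hypothetical. The list gives overlapping endpoints (10 and 16 each appear in two bands), so the sketch assumes a boundary value belongs to the lower band, which matches "Greater than 20" being the trigger for WP:SPEEDY.

```java
class ScoreBands {
    // Hypothetical helper: maps a score to the outcome named in the list above.
    // Assumption: boundary values (10, 16, 20) fall into the lower band.
    static String classify(int score) {
        if (score < 0) return "Great";
        if (score <= 10) return "Ok";
        if (score <= 16) return "Template:IdentifiedSpam";
        if (score <= 20) return "WP:PROD";
        return "WP:SPEEDY";
    }

    public static void main(String[] args) {
        // Print the band for one sample score from each range.
        for (int s : new int[] {-3, 5, 12, 18, 25}) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```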

Information

The parameters are as follows:

  • MediaWikiBot mwb: The bot functions from JWBF, not used in this function
  • SimpleArticle article: The article in question. article.getText() is the text, article.getLabel() is the label.

The variables are as follows:

  • String[] lines: An array of the lines in the article's source text
  • String text: The article text, converted to lowercase
  • int p: The score
  • int len: The article length
  • int numtemplates, int numlinks, int numimages, int numcomments: Self-explanatory (num = number of)
  • int numelinks: Number of external links

The data arrays are as follows:

  • String[] badwords, String[] goodwords: Arrays of words (from User:FearBot/Wordlists)
  • String[] lupinBadwords: Bad words taken from User:Lupin/badwords (regex entries are not included)
  • String[] langs: List of language codes used on Wikipedia

The functions are as follows:

  • int sOccC(String str, String substr): Returns the number of times substr occurs in str (sOccC = String Occurrence Count)
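The body of sOccC is not shown on this page. A minimal sketch of how such a counter might look is given here; the class name `OccurrenceCount` is hypothetical, and two behaviours are assumptions rather than documented facts: matches are counted without overlap, and an empty substr counts as zero.

```java
class OccurrenceCount {
    // Sketch of sOccC: counts occurrences of substr in str.
    // Assumptions: matches do not overlap; an empty substr yields 0.
    static int sOccC(String str, String substr) {
        if (substr.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        // indexOf returns -1 once no further match exists.
        while ((from = str.indexOf(substr, from)) != -1) {
            count++;
            from += substr.length(); // skip past this match
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(sOccC("[[a]] and [[b]]", "[["));
    }
}
```

The comparison is case-sensitive, so when the text has been lowercased first (as the evaluation function does), the substr argument has to be lowercase as well.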

The Code

String text = article.getText().toLowerCase();
                if(text.contains("{{db")){
                        return 0;
                }
                String[] lines = text.split("\n");
                // text is already lowercase, so compare against the lowercase keyword
                if(lines.length > 0 && lines.length <= 2 && lines[0].startsWith("#redirect")){
                        return 0;
                }
                int p = 0;
                int len = text.length();
                if(len < 100){
                        p += 15;
                }
                if(len < 300 && len >= 100){
                        p += 5;
                }
                if(len < 500 && len >= 300){
                        p += 2;
                }
                if(len < 1000 && len >= 500){
                        p += 1;
                }
                if(len >= 1000){
                        p -= 5;
                }
                if(len > 2000){
                        p -= 5;
                }
                int numlinks = 0;
                int numelinks = 0;
                int numtemplates = 0;
                int numimages = 0;
                int numcomments = 0;
                numlinks = sOccC(text, "[[");
                numelinks = sOccC(text, "[http:");
                numimages = sOccC(text, "[[image:"); // lowercase: text was lowercased above
                numtemplates = sOccC(text, "{{");
                numcomments = sOccC(text, "<!--");
                p -= (sOccC(text, "class=wikitable") * 5);
                p += (sOccC(text, "--[[user:") * 2);
                p -= (sOccC(text, "<ref>") * 3);
                p -= (sOccC(text, "{{cite") * 3);
                p -= (sOccC(text, "infobox") * 5);
                p -= (sOccC(text, "stub") * 10);
                for(int i = 0; i < langs.length; i++){
                        p -= (sOccC(text, "[["+langs[i]+":") * 5);
                }
                p -= (sOccC(text, "[[category:") * 5);
                p -= (sOccC(text, "{{reflinks}}") * 10);
                p -= (sOccC(text, "redirect") * 20);
                p += (sOccC(text, "'''bold text'''") * 5);
                p += (sOccC(text, "== headline text ==") * 5);
                p += (sOccC(text, "!") * 3);
                p -= (sOccC(text, "'''") * 3);
                p -= (sOccC(text, "disambig") * 15);
                p -= (sOccC(text, "|") * 3);
                p -= (sOccC(text, "==") * 5);
                p -= (sOccC(text, "<") * 2);
                for(int i = 0; i < badwords.length; i++){
                        p += (sOccC(text, " "+badwords[i]+" ") * 3);
                }
                for(String bw : lupinBadwords){
                        p += (sOccC(text, bw) * 5);
                }
                for(int i = 0; i < goodwords.length; i++){
                        p -= (sOccC(text, goodwords[i]) * 3);
                }
                if(numtemplates < 5){
                        p += 2;
                }
                if(numtemplates > 5){
                        p -= numtemplates / 2;
                }
                if(numlinks < 3){
                        p += 10;
                }
                if(numelinks > 1){
                        p -= numelinks;
                }
                if(numimages == 0){
                        p += 2;
                }
                p -= numcomments;
                if(article.getLabel().equals(article.getLabel().toUpperCase())){
                        p += 5;
                }
                System.out.println("Article "+article.getLabel()+" scored "+p);
                return p;
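
The long run of sOccC calls above applies a fixed weight per marker. As an alternative layout, not the form FearBot uses, the same idea could be written table-driven. In this sketch the class `MarkerScore`, the methods `score` and `sampleWeights`, and the local sOccC body are hypothetical; only four of the weights from the code above are included, and all keys are lowercase because the text is lowercased before matching.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MarkerScore {
    // Assumed non-overlapping occurrence counter (stand-in for sOccC).
    static int sOccC(String str, String substr) {
        if (substr.isEmpty()) return 0;
        int count = 0;
        int from = 0;
        while ((from = str.indexOf(substr, from)) != -1) {
            count++;
            from += substr.length();
        }
        return count;
    }

    // A subset of the marker weights from the code above.
    // Negative weights lower the score (better); positive weights raise it.
    static Map<String, Integer> sampleWeights() {
        Map<String, Integer> w = new LinkedHashMap<>();
        w.put("<ref>", -3);
        w.put("{{cite", -3);
        w.put("==", -5);
        w.put("!", 3);
        return w;
    }

    // Sums count * weight over every marker in the table.
    static int score(String text, Map<String, Integer> weights) {
        String lower = text.toLowerCase();
        int p = 0;
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            p += sOccC(lower, e.getKey()) * e.getValue();
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(score("== History ==\nText<ref>src</ref>", sampleWeights()));
    }
}
```

Keeping the weights in one table makes it easier to see that every marker is lowercase and to adjust a weight without touching the scoring loop.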