Wikipedia:AutoWikiBrowser/Regular expression


AutoWikiBrowser - 4.3.2.0


This is the regular expressions subsection of the user manual for AutoWikiBrowser.

Please note, the documentation contained in this manual may be out of date.
Features of the latest version of AWB may have been changed, added or removed and may not be reflected correctly in this version of the manual.


Shortcuts:
WP:AWB/UM
WP:AWB/MAN

Regular expression definitions

Regular expressions
Anchors
^ Start of string
\A Start of string
$ End of string
\Z End of string
\b Word boundary
\B Not word boundary
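\< Start of word (GNU-style syntax; in the .NET regular expressions that AWB uses, \< matches a literal <, so use \b instead)
\> End of word (GNU-style syntax; in .NET, \> matches a literal >, so use \b instead)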
   
Character Classes
\cX Control character (for example, \cC matches Ctrl-C)
\s White space
\S Not white space
\d Digit
\D Not digit
\w Word
\W Not word
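[0-9A-Fa-f] Hexadecimal digit (\x is not a character class; in .NET, \xhh matches the character with hexadecimal code hh)
[0-7] Octal digit (\O is not a valid escape in .NET)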
   
Quantifiers
* 0 or more
+ 1 or more
? 0 or 1
{3} Exactly 3
{3,} 3 or more
{2,4} 2, 3 or 4
   
Escape Character
\ Escape Character
   
Metacharacters (must be escaped)
Metacharacter Metacharacter escaped
^ \^
$ \$
( \(
) \)
< \<
. \.
* \*
+ \+
? \?
[ \[
{ \{
\ \\
| \|
> \>
   
Special Characters
\n New line
   
Groups and Ranges
Note: Ranges are inclusive
. Any character except new line (\n)
(a|z) a or z
( ) Capture group (captures whatever the pattern between "(" and ")" matches)
[def] Range d or e or f
[^abc] Range not a or b or c
[a-q] Letter between a and q
[A-Q] Upper case letter between A and Q
[0-7] Digit between 0 and 7
   
String Replacement
Given the capture groups (sam) (max) (pete):
$1 - returns sam
$2 - returns max
$3 - returns pete
   
   
   
Sample Patterns
Regex pattern Will Match
([A-Za-z0-9-]+) Letters, numbers and hyphens
(\d{1,2}\/\d{1,2}\/\d{4}) Date, e.g. 3/24/2008 or 03/24/2008
\[\[\d{4}\]\] Four-digit number wiki link, e.g. [[2008]]
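
The sample patterns and the $1, $2, $3 replacements above can be tried outside AWB. The following is a minimal sketch in Python, used here only as a stand-in: AWB itself runs on the .NET regular expression engine, and Python writes the replacement references as \1, \2, \3 rather than $1, $2, $3. The names TOKEN, DATE, YEAR_LINK and reverse_names are made up for this illustration.

```python
import re

# Sample patterns from the table, written as Python raw strings.
TOKEN = re.compile(r"([A-Za-z0-9-]+)")          # letters, numbers and hyphens
DATE = re.compile(r"(\d{1,2}\/\d{1,2}\/\d{4})")  # 3/24/2008 or 03/24/2008
YEAR_LINK = re.compile(r"\[\[\d{4}\]\]")         # [[2008]]


def reverse_names(text):
    """Reorder three captured groups.

    In an AWB rule the replacement would be written "$3 $2 $1";
    Python spells the same references as \\3 \\2 \\1.
    """
    return re.sub(r"(sam) (max) (pete)", r"\3 \2 \1", text)


print(TOKEN.findall("well-known, AWB 4"))
print(DATE.findall("Dates: 3/24/2008 and 03/24/2008"))
print(YEAR_LINK.findall("In [[2008]] and [[12 May]]"))
print(reverse_names("sam max pete"))
```

Only constructs that behave the same way in both engines are used here (character ranges, \d, counted quantifiers, escaped brackets and numbered groups).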
   
   
   
   
   
   
   
   

Regular expression examples

Regular expression examples (Regex)
Description: Search for a flagicon template and remove it
Find: \{\{.*?flagicon.*?\|.*?\}\}
Replace With: (nothing)
Example of text to search: {{flagicon|USA}} [[United States]]
Result: [[United States]] (the space that followed the template is left in front of the link)
Comments:
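
As a rough check of the rule above, here is a minimal Python sketch (Python standing in for AWB's .NET engine; the constructs used behave the same in both). It also shows one caveat: because .*? may start at any earlier {{ on the same line, the pattern can swallow an unrelated template that precedes the flagicon. STRICT_FLAGICON is a tighter alternative suggested for this illustration only, not a pattern from the manual, and the function names are hypothetical.

```python
import re

# The Find pattern from the example above.
FLAGICON = re.compile(r"\{\{.*?flagicon.*?\|.*?\}\}")

# Assumed alternative: only matches a template whose name is flagicon
# and whose arguments contain no braces.
STRICT_FLAGICON = re.compile(r"\{\{\s*flagicon\s*\|[^{}]*\}\}")


def remove_flagicon(text):
    """Apply the rule from the manual: replace the match with nothing."""
    return FLAGICON.sub("", text)


def remove_flagicon_strict(text):
    """Same replacement, using the tighter illustrative pattern."""
    return STRICT_FLAGICON.sub("", text)


# The manual's example: the template goes, the following space stays.
print(repr(remove_flagicon("{{flagicon|USA}} [[United States]]")))

# Caveat: an earlier template on the same line is removed as well.
print(repr(remove_flagicon("{{cn}} {{flagicon|USA}} x")))
print(repr(remove_flagicon_strict("{{cn}} {{flagicon|USA}} x")))
```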

Tips and tricks

User-made shortcut editing macros

You can make your own shortcut editing macros. When you edit a page, you can enter your short-cut macro keys anywhere in the page where you want AWB to act upon them.

For example, suppose you are examining a page in the AWB edit box and see numerous places that need the same kinds of edits: adding {{fact}}, inserting line breaks (<br>), commenting out entire lines (<!-- comment -->), inserting state names, adding <ref>Insert footnote text here</ref>, inserting level 2, 3 or even 4 headlines, and so on. All of these can be done by creating your own short-cut macro keys.

  • The process
  1. Create a rule. See Find and replace, Advanced.
  2. Edit your page in the edit box. Insert your short-cut editing macro key(s) anywhere in the page you want AWB to make the change(s) for you.
  3. Re-parse the page. Right-click on the edit box and select Re-parse from the context pop-up menu. AWB will then re-examine your page, find your short-cut macro key(s) and perform the action you specified in the rule.

A short-cut macro key can have any name, but it is best to make it unique so that it does not interfere with anything else that AWB may find and suggest. For that reason, /// followed by a set of lowercase characters that you can easily remember works best (lowercase so that you do not have to use the Shift key). You can then enter the short-cut macro keys you create into the page manually or with the paste more function of the edit box context menu. Three '/' characters are used so that AWB will not confuse a key with web addresses (URLs) in a page when re-parsing.
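
The reason for the third slash can be seen in a small experiment. The sketch below uses Python as a stand-in for AWB's .NET engine ($1 becomes \1 in Python). The page text, the example.org address and the helper apply_rule are all made up for this illustration.

```python
import re

# Hypothetical page text: a URL on the first line, a macro key on the second.
PAGE = "See http://college.example.org\n///colThis line needs a source"


def apply_rule(find, replace, text):
    """Apply one find-and-replace rule to the whole text."""
    return re.sub(find, replace, text)


# A two-slash key also matches the "//col" inside the web address.
two_slashes = apply_rule(r"//col(.*)", r"<!-- \1 -->", PAGE)

# The three-slash key leaves the address alone.
three_slashes = apply_rule(r"///col(.*)", r"<!-- \1 -->", PAGE)

print(two_slashes)
print(three_slashes)
```

With two slashes the URL on the first line is mangled and a stray / is left on the second line; with three slashes only the intended line is commented out.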

Examples:

Create a rule as a regular expression.

User-made short-cut editing macro examples
Short-cut key: ///col
Name: Comment out entire line
Find: ///col(.*)
Replace With: <!-- $1 -->
Example before re-parsing: ///colThe quick brown fox jumps over the lazy dog
Result after re-parsing: <!-- The quick brown fox jumps over the lazy dog -->
Comments:
Short-cut key: ///br
Name: Insert line break
Find: ///br
Replace With: <br />
Example before re-parsing: Eat some more///br of these soft French buns///br and drink some tea
Result after re-parsing: Eat some more<br /> of these soft French buns<br /> and drink some tea
Comments:
Short-cut key: ///fac
Name: Insert {{fact}} with current date
Find: ///fac
Replace With: {{fact|date={{subst:CURRENTMONTHNAME}} {{subst:CURRENTYEAR}}}}
Example before re-parsing: The quick brown fox jumps over the lazy dog///fac
Result after re-parsing: The quick brown fox jumps over the lazy dog{{fact|date={{subst:CURRENTMONTHNAME}} {{subst:CURRENTYEAR}}}}
Comments:
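
The three rules above can be checked together with a short script. This is only a sketch of what a re-parse does to these keys, written in Python rather than AWB's .NET engine, so $1 is spelled \1. The RULES list, the reparse function and the order in which the rules run are assumptions made for this illustration; AWB's own Re-parse does considerably more.

```python
import re

# (Find, Replace With) pairs copied from the table above.
RULES = [
    (r"///col(.*)", r"<!-- \1 -->"),
    (r"///br", "<br />"),
    (r"///fac",
     "{{fact|date={{subst:CURRENTMONTHNAME}} {{subst:CURRENTYEAR}}}}"),
]


def reparse(text):
    """Apply every rule in turn, as a stand-in for AWB's Re-parse."""
    for find, replace in RULES:
        text = re.sub(find, replace, text)
    return text


print(reparse("///colThe quick brown fox jumps over the lazy dog"))
print(reparse("Eat some more///br of these soft French buns///br and drink some tea"))
print(reparse("The quick brown fox jumps over the lazy dog///fac"))
```

Each printed line should correspond to the "Result after re-parsing" entry of the matching rule.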