Writing a Wiki Bot

As part of the wiki re-fuckulation process a month back I had to move and reformat a bunch of pages. So I went about writing a wiki bot.

If you already have good knowledge of PHP this is about a billion times easier than it sounds, thanks to a useful little library called Wikimate. Its GitHub page speaks for itself: it’s incredibly painless to read page text, set page text, delete pages etc.

So as an example: one of the fucked up things about the old wiki is that it didn’t use categories. All of the functions were painstakingly listed on an index page, so if you removed or added a function you had to edit that index page too. That’s not how a wiki is meant to be indexed. So I went about getting all the listings on those pages and converting them to a new template, so they would be categorised properly and also be editable via a form.

When I’m posting this code I’m not showing you great coding – I’m showing you code that got shit done.

image

So this function looks on the Libraries page and splits it by newlines – so that $lines is an array of lines. Then if a line contains “{{Lib” it trims shit off, splits it by | and calls ConvertLib.
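
A minimal sketch of that line-splitting step, not the original code. It works on a plain string instead of a live wiki page, and the function name findLibEntries and the shape of the {{Lib|...}} parameters are assumptions made up for illustration.

```php
<?php
// Hypothetical helper: pull the parameters out of every "{{Lib|...}}" line.
function findLibEntries(string $text): array
{
    $entries = [];
    foreach (explode("\n", $text) as $line) {
        $line = trim($line);
        if (strpos($line, '{{Lib') === false) {
            continue; // not a library listing
        }
        // Trim the braces off, then split the parameters on "|".
        $inner = trim($line, "{} \t\r");
        $parts = array_map('trim', explode('|', $inner));
        array_shift($parts); // drop the template name ("Lib")
        $entries[] = $parts;
    }
    return $entries;
}

// Made-up page text; the real Libraries page will look different.
$page = "Some intro text\n{{Lib|math|Maths functions}}\n{{Lib|string|String functions}}\n";
print_r(findLibEntries($page));
```

Each entry comes back as an array of parameters, which is what you would hand on to something like ConvertLib.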

image

ConvertLib does pretty much the same thing. It gets the library’s index page, goes line by line looking for LibraryFunc. Then it splits by | and calls Process.

image

Process is kind of a big boring function, so I’m splitting it into sections here. This part gets the target page. If it already exists then we already did our work, so it bails. If not, it gets the source page. This is the page we’re gonna be reading and converting to the target page.

image

It reads the page contents into $text.

image

Then it tries to find the information we need for the new page. This looks more complicated than it is. It’s basically trying to find text between two other pieces of text. There’s a bit of fucking about to get this to work properly. This is probably where regex works well – but who wants to learn how to use regex – right?
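
Finding text between two markers can be sketched roughly like this. The helper names and the Description/Returns fields are hypothetical, and the second function is just the regex equivalent for comparison, not something the original script used.

```php
<?php
// Hypothetical helper: return the text between $start and $end,
// or null if either marker is missing.
function textBetween(string $text, string $start, string $end): ?string
{
    $from = strpos($text, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start); // skip past the opening marker
    $to = strpos($text, $end, $from);
    if ($to === false) {
        return null;
    }
    return trim(substr($text, $from, $to - $from));
}

// The same lookup as a regex. preg_quote escapes the markers so they
// are matched literally; the "s" flag lets "." span newlines.
function textBetweenRegex(string $text, string $start, string $end): ?string
{
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';
    return preg_match($pattern, $text, $m) === 1 ? trim($m[1]) : null;
}

// Made-up page text for illustration.
$text = "{{LibraryFunc\n|Description=Returns the absolute value.\n|Returns=number\n}}";
echo textBetween($text, '|Description=', "\n|"), "\n";
```

Returning null when a marker is missing means the caller can skip a page that doesn’t match instead of writing out garbage.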

image

Once we have the info we can build $out – in the format that we want.
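
As a rough sketch of building $out: the template name Func and the field names here are placeholders, since the real template’s parameters aren’t spelled out in this post.

```php
<?php
// Hypothetical builder: turn an array of fields into a template call,
// one parameter per line.
function buildTemplate(string $name, array $fields): string
{
    $out = '{{' . $name . "\n";
    foreach ($fields as $key => $value) {
        $out .= '|' . $key . '=' . trim($value) . "\n";
    }
    return $out . "}}\n";
}

// Placeholder template and field names.
$out = buildTemplate('Func', [
    'Description' => 'Returns the absolute value.',
    'Returns'     => 'number',
]);
echo $out;
```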

image

The arguments were stored in a really shitty way under a single template (Arg1Name, Arg2Name, Arg1Type, Arg2Type blah blah). In the new system we store the arguments unnumbered under the Arg template – and fuck about adding numbers in JavaScript after the page is rendered.
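
The argument conversion could look something like this. Arg1Name and Arg1Type come from the old template as described above; the type= and name= parameters on the new Arg template are guesses for the sake of the example.

```php
<?php
// Hypothetical converter: walk Arg1Name/Arg1Type, Arg2Name/Arg2Type, ...
// and emit one unnumbered {{Arg}} call per argument.
// The type= and name= parameter names are assumptions.
function convertArgs(array $params): string
{
    $out = '';
    for ($i = 1; isset($params["Arg{$i}Name"]); $i++) {
        $name = trim($params["Arg{$i}Name"]);
        $type = trim($params["Arg{$i}Type"] ?? ''); // type may be missing
        $out .= sprintf("{{Arg|type=%s|name=%s}}\n", $type, $name);
    }
    return $out;
}

echo convertArgs([
    'Arg1Name' => 'x', 'Arg1Type' => 'number',
    'Arg2Name' => 'y', 'Arg2Type' => 'string',
]);
```

The loop stops at the first missing ArgNName, so a gap in the numbering would cut the list short. That’s fine for a one-off conversion, but worth knowing.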

image

The final thing is to set the text of the new page, and delete the old one. We set the delete reason on the old page so future generations can figure out why we did what we did.
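
Here’s a sketch of that last step, plus the “bail if the target already exists” check from earlier. FakePage is an in-memory stand-in so this runs without a wiki. Its method names (exists, getText, setText, delete) are modelled on what I’d expect from Wikimate’s page object, but treat that as an assumption and check the library’s README for the real signatures.

```php
<?php
// In-memory stand-in for a wiki page (hypothetical, not Wikimate itself).
final class FakePage
{
    public ?string $text;
    public ?string $deleteReason = null;

    public function __construct(?string $text = null)
    {
        $this->text = $text;
    }

    public function exists(): bool
    {
        return $this->text !== null;
    }

    public function getText(): string
    {
        return (string) $this->text;
    }

    public function setText(string $text): bool
    {
        $this->text = $text;
        return true;
    }

    public function delete(string $reason): bool
    {
        $this->text = null;
        $this->deleteReason = $reason;
        return true;
    }
}

// Skip finished pages, write the converted text, then delete the old
// page with a reason. Returns false if nothing was moved.
function movePage(FakePage $source, FakePage $target, callable $convert, string $reason): bool
{
    if ($target->exists()) {
        return false; // already converted on an earlier run
    }
    if (!$target->setText($convert($source->getText()))) {
        return false; // keep the old page if the write failed
    }
    return $source->delete($reason);
}

$old = new FakePage('old markup');
$new = new FakePage();
movePage($old, $new, 'strtoupper', 'Converted to the new function template');
echo $new->getText(), "\n";
```

Writing the new page first and only deleting the old one if the write worked means a failed run leaves the original intact, and the exists() check makes the script safe to run again.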

image

How was I running this? Well you just install PHP and then you can open a command line and run “php.exe convert_libraries.php”. You don’t need to go through the trouble of uploading to a host and running it in the browser.

I had lots of fun playing with this stuff and I think it has made the wiki much friendlier to edit! So everyone wins!

11 thoughts on “Writing a Wiki Bot”

    1. 100% agreed with Brad. If you write a lot of code, the time investment to learn regexes will pay for itself many times over.

      Regexes can turn some coding problems from tedious or daunting tasks into simple regex matching/substitution one-liners. And most regex implementations allow regular (non-regex) functions to be called to perform custom in-place substitutions/transformations against regex-matched substrings. This combination of regex and can be very powerful.

      IMHO regular expressions are something that should be in every programmer’s toolbox.

      1. That should have read “This combination of regex and (your chosen language) can be very powerful.” Angle brackets derp.

        Or as a JS regex:
        > text.replace(/(and) (can)/, '$1 (your chosen language) $2')
