The software is organized as an extension to MediaWiki. During initialization of a page it tests whether the page is a Lua module, and if so whether it is a lib page (the normal code page) or a test page. If it is a lib page, it automatically runs the tests, extracts the result and shows it through page indicators, adds tracking categories, and possibly logs the result. If it is a test page, it does nearly the same, but the roles are slightly changed. The actions are triggered from Pickle\Hooks::onContentAlterParserOutput() (Hooks.php). During editing, tests can be run from the edit page, but only if either the lib or the test is saved. In the current version it is not possible to do interactive testing with both lib and test in an unsaved state, although this could be implemented.

On each test page a small template with {{#invoke:pickle|tap}} (it can be slightly different) is evaluated during rendering. The actual call can be configured in the pickle-testspec-invoke message in en.json. This message is exempt from translation. The message is then used for an automatic call, and the outcome is parsed according to the configuration. Most of this happens in Pickle\Hooks::onContentAlterParserOutput() (Hooks.php) or is called from this hook handler.

As rendering can be triggered by unrelated edits, all logging is done by a system user, the Observer. The actual user can be configured in extension.json, entry ObserverID. In some cases it is possible to identify a likely user, that is, a user editing either tester or testee within a few minutes, but the editing could be unrelated to an observed failure because of changes in included libraries. Code for identifying the editing user has not been written, as it is assumed that identifying the wrong user would create too much on-wiki discussion and blame game.

Initial rendering

The initial processing continues up to a point where it is clear whether the page is a test module. During this phase the page can be given additional items.

It will then

  • figure out whether the page should be included or excluded,
  • find a strategy for invoking the page,
  • find a strategy for extracting the status, and
  • TODO: Missing steps

Include or exclude the page

The page is first checked for the correct content model. This initial filtering is on content model, not on namespace: pages can be moved around, and namespaces are not a good indicator of how to handle a page. A wrong content model terminates further processing.

Next the page is checked for exclusion, as some common pages are explicitly excluded. These are found by preg patterns, that is, regexes, and if a pattern is found in the page title then the page is excluded. This is usually employed for exclusion of subpages. A partial match is enough to terminate further processing.

The default exclusion set contains subpages for doc, conf, data, i18n, and l10n. This set should probably be locally configured at each site, but then the actual impact would be higher.
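The exclusion check can be sketched as follows. This is an illustration in Python, not the extension's PHP code; the pattern list and the name is_excluded are assumptions, and only the subpage names are taken from the default set described above.

```python
import re

# Hypothetical patterns, one per default excluded subpage name.
EXCLUDE_PATTERNS = [r"/doc$", r"/conf$", r"/data$", r"/i18n$", r"/l10n$"]

def is_excluded(title, patterns=EXCLUDE_PATTERNS):
    """Return True if any pattern is found anywhere in the title."""
    # re.search looks for a partial match, much like preg_match in PHP.
    return any(re.search(pattern, title) for pattern in patterns)

print(is_excluded("Module:Example/doc"))        # True
print(is_excluded("Module:Example/testcases"))  # False
```

Because the match is partial, how each pattern is anchored decides whether it hits only subpages or any title containing the word.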

Invoke the subpage

Next up is an attempt to find a strategy for invoking the page. This makes it possible to make a more informed decision about whether the page can be called. The actual strategy can be configured in extension.json, entry InvokeSubpage. If no strategy is found, then further processing is terminated. Otherwise we have verified at this point that the page is callable.

InvokeSubpage should probably be renamed, as it is not a strategy only for subpages but for all pages. If necessary, the strategy can be extended with additional tests. The first strategy that accepts the page is used, and the search for further strategies is terminated.
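A minimal sketch of a first-match strategy lookup, written in Python for illustration. The strategy names, the acceptance rules, and find_strategy are all hypothetical; the real strategies are whatever is configured under InvokeSubpage.

```python
from typing import Callable, List, Optional, Tuple

# Each hypothetical strategy is a (name, accepts) pair; accepts() inspects the title.
Strategy = Tuple[str, Callable[[str], bool]]

STRATEGIES: List[Strategy] = [
    ("testspec", lambda title: title.endswith("/testspec")),
    ("lib", lambda title: "/" not in title),
]

def find_strategy(title: str, strategies: List[Strategy] = STRATEGIES) -> Optional[str]:
    """Return the name of the first strategy that accepts the title, else None."""
    for name, accepts in strategies:
        if accepts(title):
            return name  # first match wins; later strategies are not consulted
    return None  # no strategy found: the caller stops processing the page
```

The order of the list matters in this scheme, so more specific strategies would be placed before more general ones.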

If the page is accepted, then it is checked to see if it is a valid subpage, that is a test page or a normal lib. It will then try to invoke the page, possibly squashing the page to get rid of noise if it is a TAP report (TAP).

Extract the status

The status is then extracted (ExtractStatus) from the end result. This will then be passed on to the hooks SpecTesterGadgets or SpecTesteeGadgets. Bindings for the handlers for these hooks can be found in extension.json, under the section Hook.
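As an illustration of what status extraction could look like, the sketch below reduces a TAP report to a single state. It is written in Python and is not the ExtractStatus code; the state names 'good', 'failing', 'pending', and 'unknown' are assumptions made for the example.

```python
import re

# TAP test lines start with "ok" or "not ok"; directives follow a "#".
TEST_LINE = re.compile(r"^(not ok|ok)\b(.*)$")

def extract_status(report):
    """Reduce a TAP report to 'good', 'failing', 'pending', or 'unknown'."""
    seen = failed = todo = 0
    for line in report.splitlines():
        match = TEST_LINE.match(line.strip())
        if not match:
            continue  # plan lines, comments, and noise are ignored
        seen += 1
        directive = match.group(2).partition("#")[2].strip().upper()
        if directive.startswith("TODO"):
            todo += 1  # TODO tests do not count as failures in TAP
        elif directive.startswith("SKIP"):
            pass       # skipped tests are neither passed nor failed here
        elif match.group(1) == "not ok":
            failed += 1
    if seen == 0:
        return "unknown"
    if failed:
        return "failing"
    return "pending" if todo else "good"
```

A single state like this is enough to choose an indicator or a tracking category, which is why the trace can be dropped at this stage.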

If the page is a tester page, then a help link (Help) will be inserted. This only wraps up the standardized help link.

After the handlers are run, the initial processing will terminate.

Additional hooks (should be moved)

Both the hooks SpecTesterGadgets and SpecTesteeGadgets have handlers that add the page indicators (Indicator) for page status. In addition, the hook SpecTesteeGadgets has handlers for adding tracking categories (Category) and a test log (LogEntry).

Extended processing

When a page is detected as a test page, it is processed to get its return status. This happens in several steps.


The tests are implemented as plain Lua functions and follow a spec-like layout, although with some variations. A very thin shim is called initially as describe, and this is used to bootstrap the rest of the system. By default it tries to set up the system to use the superpage as the initial test subject, which avoids some boilerplate code. The assumption is that the outer describe is used in a single call, and that the return value from this call is used as the returned table from the module. After the initial call the functions in this table can be used to access the various states and values.

Without running describe the bootstrapped functions are not available, which means they do not exist in the console. It is important to know this, as you will not have access to them from the console before this call. It is also important because without this initial call the library does not interfere with normal Lua modules. It also means that the environment created inside the test does not leak out into other code. Unfortunately it also makes testing of the Pickle library itself a bit cumbersome.


Initially some configuration needs to be passed from extension.json into the Pickle library. This is done inside LuaLibPickle.php.


The access point for the library is Pickle.lua. This bootstraps the full harness through the call describe. After the call the engine is available, and other parts will be included as necessary. In particular Adapt.lua, Frame.lua, and Spy.lua will be bootstrapped. Before the initial process is done, no reports will be available.


The functions describe(), context(), and it() are test harnesses at various levels (Frame.lua), that is, functions with closures holding references to frames. The functions can take a table as a subject, strings that contain data for examples, and a function acting as the test fixture.

An alternate interpretation is that the function (the fixture) is a representation of the world, and within this world a table (the subject) should be a valid instance when it is instantiated with the example data. For now, we assume universal quantification (∀x) as the paradigm: all tests in a set of test cases must hold, so any interpretation that fails triggers a return from the world. That is, we assume the test to be an approximation to a tautology.
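The universal reading can be sketched in a few lines. This is a Python illustration under assumed names (holds_for_all, double, is_even_after), not code from Frame.lua.

```python
def holds_for_all(fixture, subject, examples):
    """Universal quantification: every example must satisfy the fixture.

    Returns (True, None) when all examples hold, otherwise
    (False, example) for the first contradicting example.
    """
    for example in examples:
        if not fixture(subject, example):
            return False, example  # leave the "world" at the first failure
    return True, None

# Hypothetical subject and fixture: doubling must always give an even number.
def double(x):
    return 2 * x

def is_even_after(subject, example):
    return subject(example) % 2 == 0

print(holds_for_all(is_even_after, double, [1, 2, 3]))  # (True, None)
```

Returning the contradicting example together with the verdict keeps enough information to report why the test set failed.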

Later on it might be interesting to add testing with existential quantification (∃x), that is some test in a set of test cases must hold.

When describe returns, it will return the unevaluated outer test case. It must be explicitly evaluated by calling eval. By using access functions it can be further manipulated, in particular reports according to the Test Anything Protocol can be made. That report is used as the rendered result that is shown by the indicators and so forth. It can also be prodded by clicking the run button in the test console.
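The deferred evaluation described here can be pictured with a small Python sketch. The Frame class and this describe are hypothetical stand-ins that only mimic the idea of returning an unevaluated test case; they do not reproduce the Pickle API.

```python
class Frame:
    """A hypothetical stand-in for an unevaluated test case."""

    def __init__(self, description, fixture):
        self.description = description
        self.fixture = fixture
        self.outcome = None  # nothing is known before eval()

    def eval(self):
        """Run the fixture once and remember whether it passed."""
        try:
            self.fixture()
            self.outcome = True
        except AssertionError:
            self.outcome = False
        return self

def describe(description, fixture):
    # Only captures the fixture; running it is deferred to eval().
    return Frame(description, fixture)
```

Separating construction from evaluation is what allows the same test case to be evaluated on demand, for example from a console, and only then queried for a report.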


An extractor is a mechanism to get the actual arguments out of the strings that are used as examples for a test case. There are several types of extractors for various types of arguments (extractor). The collection of all arguments for a single example is then provided to the fixture.

NOTE: The remaining string from the example is collected as a key for identifying a translator, and also for reuse if a translator is missing.
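A minimal sketch of the extraction idea, in Python. The two extractor types (double-quoted strings and plain numbers), the $N placeholder convention, and the function name extract are assumptions made for the example; the actual extractors may recognize other forms.

```python
import re

# Hypothetical extractors: double-quoted strings and plain numbers.
TOKEN = re.compile(r'"([^"]*)"|(-?\d+(?:\.\d+)?)')

def extract(example):
    """Split an example into (key, args); the key keeps $N placeholders."""
    args = []

    def replace(match):
        if match.group(1) is not None:
            args.append(match.group(1))  # quoted string argument
        else:
            text = match.group(2)
            args.append(float(text) if "." in text else int(text))
        return "$%d" % len(args)

    key = TOKEN.sub(replace, example)
    return key, args

print(extract('joining "foo" and "bar" gives 6 characters'))
# ('joining $1 and $2 gives $3 characters', ['foo', 'bar', 6])
```

The arguments go to the fixture, while the remaining key is what could be used to look up a translation.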


The functions subject and expect are assertions that test conditions in different directions (Adapt.lua), that is, functions with closures holding references to adaptations. Each of them adjusts the implicit or explicit argument and then compares it to the other party. They are similar, but describe assumes a table to be a subject, and thus pushes it on a stack of subjects. This makes subject the access point for the test subject, and expect the access point for the test expectation.

An alternate interpretation is that this forms conditions that should be valid and true for the instance within the world. If it is not valid and true, then we have found a contradicting example and the instance can not be part of the given world.

Testing that depends on existential quantification (∃x) is a rather difficult topic, and whether such testing should be available, and whether it would be useful, is debatable.


Usually code is tested with as little additional code as possible, but sometimes it is necessary to check what is going on inside the code. For this we can use spies like carp, cluck, croak, and confess. A spy does something unconditionally, and it is also a function closure that carries a reference to the replaced function (Spy.lua).

NOTE: Spies should not be called in the tested code, only as part of the test fixture.

Spies can only be injected on provided code. That means code hidden away inside a module can't be tested. By including references only during testing, otherwise hidden functionality can be exposed to testing without making the code available for general use.
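The closure idea behind a spy, and why it needs an exposed reference to attach to, can be sketched as follows. This is a generic Python illustration; make_spy is a hypothetical helper and does not model carp, cluck, croak, or confess specifically.

```python
def make_spy(original, log):
    """Wrap `original` in a closure that always records the call first."""
    def spy(*args, **kwargs):
        log.append(("called", args, kwargs))  # unconditional side effect
        return original(*args, **kwargs)      # then defer to the replaced function
    spy.original = original                   # keep a reference for restoring
    return spy

# A hypothetical module that exposes its functions in a table.
module = {"add": lambda a, b: a + b}

calls = []
module["add"] = make_spy(module["add"], calls)  # inject the spy
result = module["add"](2, 3)
module["add"] = module["add"].original          # restore after the test

print(result, calls)  # 5 [('called', (2, 3), {})]
```

The injection only works because add is reachable through the table; a function that is purely local to the module would offer nothing to replace.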

NOTE: It seems the injection mechanism is lost; is this a merge gone wrong?

Final rendering

The results will be used during automated processing and will also be provided to the end user.

Reports and renders

Reports hold a trace of what goes on inside the fixtures, and the final outcome of each interpretation. The reports are rendered and can be styled as one of vivid, full, or compact. Compact is sufficient for creation of the indicator, and does not include the trace. Full is used as default, and includes the trace. Vivid is for the console and is strictly speaking not valid TAP.
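The difference between the compact and full styles can be sketched like this. It is a Python illustration with an assumed data layout of (ok, description, trace) tuples and a hypothetical render function; the vivid style is left out because its format is not specified here.

```python
def render(results, style="full"):
    """Render (ok, description, trace) tuples as a TAP-like report.

    'compact' omits the trace lines, 'full' adds them as TAP comments.
    """
    if style not in ("compact", "full"):
        raise ValueError("unsupported style: %s" % style)
    lines = ["1..%d" % len(results)]
    for number, (ok, description, trace) in enumerate(results, start=1):
        status = "ok" if ok else "not ok"
        lines.append("%s %d - %s" % (status, number, description))
        if style == "full":
            lines.extend("# " + entry for entry in trace)
    return "\n".join(lines)
```

Emitting the trace as comment lines keeps the full report parseable by anything that only looks at the test lines.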

Reports and renders are not stable and will be changed.


The test results can be translated into other languages. To facilitate this the parsed examples are split into a reusable string and its arguments. The string can be used as a key for lookup in a table with translations, and the arguments can then be reinserted into the translated string. If the arguments can be adjusted to the new language it works quite well, but for a lot of languages there is no way to translate the arguments properly.
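The lookup and reinsertion step can be sketched as below. This is a Python illustration; the table layout, the $N placeholders, the sample Norwegian entry, and the function translate are assumptions made for the example.

```python
import re

# Hypothetical translation table keyed by the reusable string.
TRANSLATIONS = {
    "nb": {"adding $1 and $2 gives $3": "å legge sammen $1 og $2 gir $3"},
}

def translate(key, args, lang, table=TRANSLATIONS):
    """Look up the key for `lang` and reinsert the arguments.

    Falls back to the untranslated key when no translation exists.
    """
    template = table.get(lang, {}).get(key, key)

    # Replace $1, $2, ... with the matching argument; leave unknown ones alone.
    def fill(match):
        index = int(match.group(1)) - 1
        return str(args[index]) if 0 <= index < len(args) else match.group(0)

    return re.sub(r"\$(\d+)", fill, template)

print(translate("adding $1 and $2 gives $3", [3, 4, 7], "nb"))
# å legge sammen 3 og 4 gir 7
```

The arguments are reinserted verbatim in this sketch, which is exactly the limitation noted above: numbers carry over well, while words inside arguments stay in the source language.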

Translations are only provided as a help for the end users, the non-programmers, so they can get a grasp of why something is failing. The translations are not meant as a tool for the programmers, and they are not meant to be used in the tests. In particular, the translations are not used for automated processing.

TODO: This is only rudimentarily implemented.
