
Version 1 (modified by Christian Boos, 14 years ago)

explain what I meant with vertical/horizontal wiki parsing

Vertical vs. Horizontal Parsing in Trac Wiki

When one looks at the source of Trac wiki markup, the primary structure one sees runs along the "vertical" direction.

Of course, nearly any text contains a vertical sequence of groups of lines ("paragraphs"), vertically isolated lines corresponding to section titles, etc.

But in Trac or other wiki markups, this goes further:

  • wiki processors use "blocks" of lines ({{{}}}), possibly nested,
  • lists, blockquotes, and definition lists all rely on consistent indentation,
  • citation quotes ('> …') also stand out first in the vertical direction.

Yet wiki parsing up to Trac 0.11 is heavily line-oriented. The detection of code blocks is one exception to this: matching {{{ / }}} pairs of lines are detected first, then their content is processed, recursively if needed.
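To make the code block exception concrete, here is a minimal sketch of detecting matching {{{ / }}} line pairs before looking at their content. It is not Trac's actual implementation: the function name `split_code_blocks` is hypothetical, and the sketch assumes that only lines consisting solely of `{{{` or `}}}` act as delimiters.

```python
def split_code_blocks(lines):
    """Split lines into ('text', [...]) and ('code', [...]) chunks.

    Assumption: a delimiter is a line containing only '{{{' or '}}}'.
    Nested pairs stay inside the outer block's content, so the caller
    can process that content recursively if needed.
    """
    chunks, text = [], []
    i = 0
    while i < len(lines):
        if lines[i].strip() == '{{{':
            # Scan forward for the line that closes this block.
            depth, j = 1, i + 1
            while j < len(lines) and depth > 0:
                marker = lines[j].strip()
                if marker == '{{{':
                    depth += 1
                elif marker == '}}}':
                    depth -= 1
                j += 1
            if depth == 0:
                if text:
                    chunks.append(('text', text))
                    text = []
                # Content excludes the opening and closing delimiter lines.
                chunks.append(('code', lines[i + 1:j - 1]))
                i = j
                continue
        # Plain line, or an unmatched '{{{' treated as ordinary text.
        text.append(lines[i])
        i += 1
    if text:
        chunks.append(('text', text))
    return chunks
```

Calling `split_code_blocks` again on the content of a `'code'` chunk would handle the nested case.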

When the structure is analyzed one line after another, a lot of state has to be maintained in order to keep a correct sense of context. The code is also often fragile, as it is not always obvious which HTML elements have to be closed before proceeding.

The idea with VH parsing is that the existing vertical structure in the markup should be exploited first before tackling the horizontal parsing, which is better suited to intra-paragraph markup.
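The two-phase idea could be sketched roughly as follows. This is an illustration only, not the Trac formatter: the names `vertical_pass`, `horizontal_pass` and `render` are hypothetical, and the sketch assumes a tiny subset of the markup (`= Heading =` lines, blank-line separated paragraphs, `'''bold'''` and `''italic''`).

```python
import re
from html import escape

# Assumption: a heading is a line of the form "== Title ==".
HEADING = re.compile(r'(=+)\s+(.*?)\s+=+')

def vertical_pass(text):
    """Group lines into ('heading', (level, title)) and ('paragraph', lines)."""
    blocks, para = [], []
    for line in text.splitlines():
        stripped = line.strip()
        heading = HEADING.fullmatch(stripped)
        if heading or not stripped:
            # A heading or a blank line ends the current paragraph.
            if para:
                blocks.append(('paragraph', para))
                para = []
            if heading:
                level = len(heading.group(1))
                blocks.append(('heading', (level, heading.group(2))))
        else:
            para.append(stripped)
    if para:
        blocks.append(('paragraph', para))
    return blocks

def horizontal_pass(text):
    """Apply intra-paragraph markup to a single run of text."""
    text = escape(text, quote=False)  # keep quotes, they carry markup
    text = re.sub(r"'''(.+?)'''", r'<strong>\1</strong>', text)
    text = re.sub(r"''(.+?)''", r'<em>\1</em>', text)
    return text

def render(text):
    """Vertical structure first, then horizontal parsing inside each block."""
    out = []
    for kind, content in vertical_pass(text):
        if kind == 'heading':
            level, title = content
            out.append('<h%d>%s</h%d>' % (level, horizontal_pass(title), level))
        else:
            out.append('<p>%s</p>' % horizontal_pass(' '.join(content)))
    return '\n'.join(out)
```

Because each block is already delimited when the horizontal pass runs, the inline code never has to decide which block-level elements are open.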

In 0.12, the parsing of citation quotes ('> …') uses this approach, which made it possible to address issue #4235 comprehensively and relatively painlessly.
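As an illustration of how citation quotes lend themselves to a vertical-first treatment, the sketch below groups consecutive '>' lines into a block, strips one quote level, and recurses on the remainder. It is not the code used in Trac 0.12; the function name `parse_quotes` and the nested-list representation are assumptions made for the example.

```python
import re

def parse_quotes(lines):
    """Return a tree: plain lines are strings, quoted blocks are nested lists."""
    tree = []
    i = 0
    while i < len(lines):
        if lines[i].startswith('>'):
            # Vertical step: find the extent of the quoted block.
            j = i
            while j < len(lines) and lines[j].startswith('>'):
                j += 1
            # Strip exactly one '>' level (and one optional space), then recurse.
            inner = [re.sub(r'^> ?', '', line) for line in lines[i:j]]
            tree.append(parse_quotes(inner))
            i = j
        else:
            tree.append(lines[i])
            i += 1
    return tree
```

The nesting depth falls out of the recursion, so no per-line counter of open blockquote elements has to be kept.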

In future versions, lists and other kinds of markup could also benefit from this approach.
