
Changes between Version 1 and Version 2 of TracDev/Proposals/VerticalHorizontalParsing


Timestamp:
Jun 20, 2010, 12:48:59 PM (14 years ago)
Author:
Christian Boos
Comment:

#ParsingOverview write down some notes

  • TracDev/Proposals/VerticalHorizontalParsing

    v1 v2  
    1919
    2020In future versions, the lists and other kinds of markup could also benefit from this approach.
     21
     22== Parsing Overview
     23
     24Here's a very rough outline:
     25 * `parse_vertical`
     26   - prepares !WikiDocument (W)
     27   - preprocess (r9868), split text into lines
     28   - `parse_blocks` - get a tree of the `{{{` / `}}}` delimited blocks (B) and the spans between them (Raw); at this stage, the root document (W) and each (B) contain a list of (B|Raw) nodes
     29   - for each wiki block (i.e. (W) and each (B) containing wiki text)
     30     - `parse_raw_text` - each top-level (Raw) node will be scanned for structural ("vertical") markup; for each line:
     31       * detect verbatim text ({{{`}}}...{{{`}}} and `{{{`...`}}}` sequences); remember the verbatim spans for that line, then escape them in the line (replaced by 'X')
     32       * match vertical patterns which can result in:
     33         - (I)tem node   (`- * 1.` etc.)
     34         - (D)efinition list node (`... :: ...`)
     35         - (Q)uote node (leading space)
     36         - (C)itation node (`>+`)
     37         - (Row) node (`|| ... ||`)
     38         - ...
     39         - if nothing matches, this is a plain (T)ext node
     40       * at the end, this collection of (S)tructural nodes replaces the (Raw) node
     41     - `assemble_nodes` - each node has an indentation level, the position of the first non-space character in its starting line; this information will enable us to rearrange the list of (B) and (S) nodes according to the logical nesting determined by the indentation
     42 * `parse_horizontal` - each node in the previous tree will be split further, according to inline ("horizontal") markup
     43   - some nodes won't have any text content to process
     44   - some will have two pieces of text content (D) or more (Row)
     45   - it may well be that some markup will need to be processed recursively (e.g. `[=#anchor ''this was already explained above'']`).
     46   - macros could at this stage expand the tree as it's being built (e.g. via a new `IWikiMacroProvider.parse_macro` method)
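The `parse_blocks` step in the outline above could look roughly like the following. This is only a minimal sketch, not the proposed implementation: the `Raw` and `Block` classes are hypothetical names, and it assumes for simplicity that `{{{` and `}}}` each sit alone on their line (real wiki text also allows a processor line such as `#!python` and inline uses, which are ignored here).

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Raw:
    """A run of consecutive lines outside any nested block (hypothetical name)."""
    lines: List[str] = field(default_factory=list)


@dataclass
class Block:
    """A `{{{` ... `}}}` delimited block; the root instance stands for (W)."""
    children: List[Union["Block", Raw]] = field(default_factory=list)


def parse_blocks(lines):
    """Build a tree in which every Block holds a list of Block or Raw nodes."""
    root = Block()
    stack = [root]  # innermost open block is stack[-1]
    for line in lines:
        stripped = line.strip()
        current = stack[-1]
        if stripped == "{{{":
            # Open a nested block and descend into it.
            block = Block()
            current.children.append(block)
            stack.append(block)
        elif stripped == "}}}" and len(stack) > 1:
            # Close the innermost block; a stray `}}}` at top level
            # falls through to the else branch and is kept as raw text.
            stack.pop()
        else:
            # Extend the trailing Raw node, or start a new one.
            if not current.children or not isinstance(current.children[-1], Raw):
                current.children.append(Raw())
            current.children[-1].lines.append(line)
    return root
```

A block left unclosed at the end of the input simply stays open in this sketch; how the real parser should report or repair that case is not covered by the outline.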
     47
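The `parse_raw_text` and `assemble_nodes` steps can be illustrated in the same spirit. The sketch below is an assumption-laden toy: the `Node` class, the `VERBATIM` and `VERTICAL` tables and the exact regular expressions are all made up for illustration and are much cruder than the real wiki patterns. It shows the two ideas from the outline: verbatim spans are masked with 'X' before the vertical patterns are tried, and the indentation of each node then drives the nesting.

```python
import re
from dataclasses import dataclass, field

# Inline verbatim spans: {{{...}}} or `...` (simplified, hypothetical pattern).
VERBATIM = re.compile(r"\{\{\{.*?\}\}\}|`[^`]*`")

# Ordered (kind, pattern) pairs tried against the escaped line.
# Order matters: an indented list item would otherwise match the quote rule.
VERTICAL = [
    ("I", re.compile(r"\s*(?:[-*]|\d+\.)\s+")),   # item: - * 1.
    ("Row", re.compile(r"\s*\|\|.*\|\|\s*$")),    # || ... ||
    ("C", re.compile(r"\s*>+")),                  # citation
    ("D", re.compile(r"\s*\S.*?::(?:\s|$)")),     # ... :: ...
    ("Q", re.compile(r"\s+\S")),                  # quote: leading space
]


@dataclass
class Node:
    kind: str      # one of I, D, Q, C, Row, T (or W for the root)
    text: str      # original, unescaped line
    indent: int    # column of the first non-space character
    spans: list    # (start, end) of the verbatim spans in the line
    children: list = field(default_factory=list)


def escape_verbatim(line):
    """Return the line with verbatim spans masked by 'X', plus the spans."""
    spans = [m.span() for m in VERBATIM.finditer(line)]
    escaped = VERBATIM.sub(lambda m: "X" * len(m.group()), line)
    return escaped, spans


def parse_raw_text(lines):
    """Classify each line; anything unmatched is a plain (T)ext node."""
    nodes = []
    for line in lines:
        escaped, spans = escape_verbatim(line)
        kind = next((k for k, p in VERTICAL if p.match(escaped)), "T")
        indent = len(escaped) - len(escaped.lstrip(" "))
        nodes.append(Node(kind, line, indent, spans))
    return nodes


def assemble_nodes(nodes):
    """Nest each node under the closest preceding node with a smaller indent."""
    root = Node("W", "", -1, [])
    stack = [root]
    for node in nodes:
        while stack[-1].indent >= node.indent:
            stack.pop()
        stack[-1].children.append(node)
        stack.append(node)
    return root
```

Because the masking preserves the length of the line, the remembered spans still index into the original text, which is what a later `parse_horizontal` pass would need to restore the verbatim content.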