
Changes between Version 2 and Version 3 of TracDev/JournalingProposal


Timestamp:
Jun 26, 2006, 8:11:51 PM (18 years ago)
Author:
Christian Boos
Comment:

Redirect to TracDev/Proposals/Journaling

  • TracDev/JournalingProposal

    v2 v3  
    11= Journaling Proposal =
    22
    3 == The Problem ==
     3See the full version in TracDev/Proposals/Journaling.
     4
     5----
     6=== In brief: ===
     7==== The Problem ====
    48Trac maintains coherency upon data changes by using various `I...Listener`
    59extension points.
     
    913scenario, with the widespread use of the Apache/prefork web front-end.
    1014
    11 === Some examples ===
    12 
    13 ==== Reacting to Wiki content change ====
    14 
    15 Several Wiki pages are used to facilitate interactive configuration by the users.
    16 This is the case of the InterMapTxt, for maintaining the list of InterWiki prefixes,
    17 the BadContent page for maintaining a list of regexps used to filter out SPAM,
    18 and probably more in the future.
    19 See my original explanation about what's going on with
    20 [http://trac-hacks.org/ticket/456#comment:3 updating InterMapTxt].
    21 
    22 ==== Reacting to Wiki page creation/deletion ====
    23 
    24 In order to not have to check in the DB for the existence of a Wiki page
    25 every time a WikiPageNames is seen in wiki text, we maintain a cache of
    26 the existing wiki pages.
    27 This list could be easily maintained using the change listener, but
    28 this would ''not'' work if a creation or deletion were done
    29 by another process. A workaround for this is currently implemented:
    30 every once in a while, the cache is cleared and updated
    31 (see source:trunk/trac/wiki/api.py@3362#L114).
    32 This is a very ad-hoc solution. It should be possible to do this
    33 better and in a more generic way.
    34 
    35 
    36 == A solution ==
     15==== A solution ====
    3716
    3817Every ''change'' event could be journaled.
     
    4120to other processes, in a generic way.
    4221
    43 After all, this journaling is already done in some cases.
    44 For example, all the ticket changes are journaled, in the `ticket_change` table:
    45 {{{
    46 #!sql
    47 CREATE TABLE ticket_change (
    48     ticket integer,
    49     time integer,
    50     author text,
    51     field text,
    52     oldvalue text,
    53     newvalue text,
    54     UNIQUE (ticket,time,field)
    55 );
    56 }}}
    57 
    58 There's currently some discussion about adding to the above
    59 the `ipnr` and `authenticated` columns, to better track
    60 who did what (see #1890 for details).
    61 
    62 This would lead to even more duplication of data than what we have now.
    63 Granted, currently this duplication (of the ticket/time/author values)
    64 is used to group related changes.
    65 
    66 A cleaner approach, for #1890, would be:
    67 {{{
    68 #!sql
    69 CREATE TABLE ticket_change (
    70     tid integer,
    71     field text,
    72     oldvalue text,
    73     newvalue text
    74 );
    75 
    76 CREATE TABLE ticket_transaction (
    77     tid integer PRIMARY KEY,
    78     ticket integer,
    79     time integer,
    80     author text,
    81     ipnr text,
    82     authenticated boolean
    83 );
    84 }}}
    85 
    86 Now, with this proposal, this could be extended to:
    87 {{{
    88 #!sql
    89 CREATE TABLE ticket_change (
    90     tid integer,
    91     field text,
    92     oldvalue text,
    93     newvalue text
    94 );
    95 
    96 CREATE TABLE journal (
    97     tid integer PRIMARY KEY,
    98     type text,
    99     id text,
    100     change text,
    101     time integer,
    102     author text,
    103     ipnr text,
    104     authenticated boolean
    105 );
    106 }}}
    107 
    108 And `ticket_change` could even be generalized to `property_change`
    109 and go in the direction of a generalization of properties to
    110 all Trac objects (remember the TracObjectModelProposal?)
    111 
    112 The `change` column in `journal` could contain some keyword about
    113 the nature of the change: `CREATE`, `DELETE`, `MODIFY`, etc.
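
As a rough illustration of how the two tables would fit together, here is a minimal sketch using Python's `sqlite3` module with an in-memory database. The sample values (ticket `42`, author `alice`, the timestamp) are made up for the example, and the real Trac database layer is not used here.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE journal (
    tid integer PRIMARY KEY,
    type text, id text, change text,
    time integer, author text, ipnr text, authenticated boolean
);
CREATE TABLE ticket_change (
    tid integer, field text, oldvalue text, newvalue text
);
""")

# One journal row describes the transaction as a whole...
cur = db.execute(
    "INSERT INTO journal (type, id, change, time, author, ipnr, authenticated) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("ticket", "42", "MODIFY", 1151345511, "alice", "127.0.0.1", 1),
)
tid = cur.lastrowid

# ...and each modified field only refers to it through tid,
# so author/time/ipnr are stored once instead of once per field.
db.executemany(
    "INSERT INTO ticket_change (tid, field, oldvalue, newvalue) "
    "VALUES (?, ?, ?, ?)",
    [(tid, "status", "new", "assigned"), (tid, "owner", "", "alice")],
)
db.commit()

# Related field changes are grouped by joining on tid.
rows = db.execute(
    "SELECT j.id, j.author, c.field, c.oldvalue, c.newvalue "
    "FROM journal j JOIN ticket_change c ON c.tid = j.tid "
    "WHERE j.type = 'ticket' ORDER BY c.field"
).fetchall()
```

Grouping by `tid` replaces the current grouping on the duplicated ticket/time/author values.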
    114 
    115 Now, how to use this information?
    116 
    117 Each process would write into the `journal` table during the same
    118 transaction that modifies the object model tables themselves.
    119 This will mostly be something along the lines of:
    120 {{{
    121 #!python
    122     tid = record_in_journal(req, db, 'wiki', page, 'CREATE')
    123 }}}
    124 and:
    125 {{{
    126 #!python
    127     tid = record_in_journal(req, db, 'ticket', id, 'MODIFY')
    128 }}}
    129 
    130 Each process will also have to keep track of the last `tid` known.
    131 
    132 If this happens to have changed (details to be finalized:
    133 the detection could be done either during `record_in_journal` itself,
    134 or before request dispatching, or ...), there could be a ''replay''
    135 of those events, triggering the appropriate change listeners.
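
The write-then-replay idea could be sketched as follows. This is only an assumption about how it might look: `record_in_journal` takes an `author` instead of `req` to keep the example self-contained, the `journal` table is reduced to the columns needed, and `JournalReplayer` is a hypothetical name, not an existing Trac class.

```python
import sqlite3
import time


def record_in_journal(db, author, type_, id_, change):
    """Append one journal row; the caller commits it together with
    the object model change, in the same transaction."""
    cur = db.execute(
        "INSERT INTO journal (type, id, change, time, author) "
        "VALUES (?, ?, ?, ?, ?)",
        (type_, str(id_), change, int(time.time()), author),
    )
    return cur.lastrowid


class JournalReplayer:
    """Hypothetical per-process helper remembering the last tid seen."""

    def __init__(self, db):
        self.db = db
        self.last_tid = 0
        self.listeners = []  # callables taking (type, id, change)

    def replay(self):
        """Trigger the listeners for every entry newer than last_tid."""
        rows = self.db.execute(
            "SELECT tid, type, id, change FROM journal "
            "WHERE tid > ? ORDER BY tid",
            (self.last_tid,),
        ).fetchall()
        for tid, type_, id_, change in rows:
            for listener in self.listeners:
                listener(type_, id_, change)
            self.last_tid = tid
        return len(rows)


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE journal (tid integer PRIMARY KEY, type text, id text, "
    "change text, time integer, author text)"
)

seen = []
replayer = JournalReplayer(db)
replayer.listeners.append(lambda t, i, c: seen.append((t, i, c)))

record_in_journal(db, "alice", "wiki", "WikiStart", "CREATE")
record_in_journal(db, "bob", "ticket", 42, "MODIFY")
db.commit()

first = replayer.replay()   # picks up both new entries
second = replayer.replay()  # nothing new since the last replay
```

Where `replay()` gets called (inside `record_in_journal`, before request dispatching, or elsewhere) is left open here, as in the text above.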
    136 
    137 The change listeners would benefit from being refactored in a more
    138 generic way anyway (merging IWikiChangeListener and ITicketChangeListener,
    139 and getting IMilestoneChangeListener, IChangesetChangeListener etc. for free;
    140 the usual TracObjectModelProposal blurb ;) ).
    141 
    142 Last but not least, there would be a need to differentiate between
    143 '''primary''' change and '''secondary''' change.
    144  primary change:: the change originated from the same process;
    145   there's only one process which ever sees a change as being a primary change
    146  secondary change:: the change originated from another process.
    147 
    148 This distinction is quite important w.r.t. side-effects.
    149 
    150 Only ''primary'' changes should ever generate side-effects, such as e-mail
    151 notifications (a related topic: e-mail notifications should also be based
    152 on change listeners, see #1660). That way, one can be sure that the side-effects
    153 will be triggered only once, independently of the number of server processes.
    154 
    155 Then, ''secondary'' changes could be used for all the informational stuff,
    156 for refreshing all sorts of internal caches (the use cases listed
    157 [TracDev/JournalingProposal#Someexamples above]).
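
One possible way to tell primary from secondary changes is for each process to remember the `tid`s it wrote itself. The sketch below assumes exactly that; `ChangeDispatcher`, `note_own_change` and the in-memory stand-ins for mail and the page cache are hypothetical names used only for illustration.

```python
class ChangeDispatcher:
    """Hypothetical per-process dispatcher for replayed journal entries."""

    def __init__(self):
        self.own_tids = set()    # tids written by this very process
        self.sent_mail = []      # stand-in for e-mail notifications
        self.page_cache = set()  # stand-in for the wiki page name cache

    def note_own_change(self, tid):
        """Remember that this process originated the change."""
        self.own_tids.add(tid)

    def dispatch(self, tid, type_, id_, change):
        primary = tid in self.own_tids
        if primary:
            # Side-effects happen only in the originating process,
            # hence only once overall.
            self.sent_mail.append((type_, id_, change))
        # Informational updates are applied in every process here,
        # whether the change is primary or secondary.
        if type_ == "wiki":
            if change == "CREATE":
                self.page_cache.add(id_)
            elif change == "DELETE":
                self.page_cache.discard(id_)
        return primary


# Two server processes see the same journal entry (tid 1),
# but only process_a originated it.
process_a = ChangeDispatcher()
process_b = ChangeDispatcher()
process_a.note_own_change(1)

entry = (1, "wiki", "NewPage", "CREATE")
a_primary = process_a.dispatch(*entry)
b_primary = process_b.dispatch(*entry)
```

Both processes end up with `NewPage` in their cache, while only `process_a` records a notification.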
    158  
     22''(and various other advantages)''