Edgewall Software

Changes between Version 6 and Version 7 of GenericTrac


Ignore:
Timestamp:
Sep 3, 2009, 5:32:27 PM (15 years ago)
Author:
Christian Boos
Comment:

updated with ideas developed in #6466

Legend:

Unmodified
Added
Removed
Modified
  • GenericTrac

    v6 v7  
    11= GenericTrac Data Model =
    22
    3 This page tries to define a new data model that could be suitable
    4 for most Trac resources. The main benefits expected from the new model are:
     3This page attempts to define a new data model for Trac that could be suitable
     4for most of its resources. The main benefits expected from the new model are:
    55 - simplification of the internals of Trac, especially for the ticket model,
    66   in which the storage of changes is quite cumbersome (see #454, #6466)
    7  - solve several desing problems with the current data model (#1890)
    8  - allow better code reuse
     7 - solve a few design problems with the current data model (like #1890, #4582)
     8 - allow better code reuse and share of the base features
     9   among different kinds of resources (numerous examples for that,
     10   see [#RelatedTickets] below)
    911
    1012This stems from the following former proposals:
     
    1214 - TracDev/Proposals/DataModel
    1315 - TracDev/Proposals/Journaling
    14  - WikiContext are used as ''resource descriptors'' and have a `.resource` field
    15    which enables one to fetch the corresponding data model instance
    1616
    1717See also [googlegroups:trac-dev:8cf3f5fe0e476ce5 this mail].
     
    2020it could also be a good opportunity to take the
    2121''[TracMultipleProjects multiple project]'' considerations into account (#130).
    22 Each resource related table should probably get a `project` identifier field.
     22Each resource related table could get a `project` identifier field.
    2323
    2424Working on the generic aspect of Trac should also make it possible to implement various ''generic'' operations on Trac resources as plugins, mainly being able to (re-)implement TracCrossReferences as a plugin (see also #6543).
    2525
    2626
    27 === Possible Implementation Plan ===
    28 ==== Milestone First ====
     27== Design Discussion ==
     28
     29Requirements for the new model:
     30 1. it has to be ''simple'';
     31 2. it must be ''flexible'', in order to accommodate different kinds of resources and   
     32    allow for dynamic evolution;
     33 3. it should remain ''fast'', if not faster than what we currently have;
     34 4. it should lead to a more ''compact'' representation of data
     35
     36=== Resource Content ===
     37
     38The ticket model is by far richer data model we would have to support,
     39so we could take this as a basis to lay out the foundations of the new model.
     40For ticket, we currently have a fixed set of properties
     41(as columns in the `ticket` table)
     42and a flexible set of properties
     43(as name/value columns in a `ticket_custom` table).
     44
     45Both styles have advantages and disadvantages:
     46 1. properties as columns:
     47   - (-) only flexibility is to not use some fields (e.g. severity)
     48   - (-) no multiple values per field possible 
     49   - (+) faster
     50   - (+) straightforward code (`for field1,field2, ... in cursor: ...`)
     51 2. properties in name/value columns
     52   - (+) highest flexibility, add or remove fields at will
     53   - (+) allow for multiple values per name, provided we don't use a primary key
     54     as we currently do for the `ticket_custom` table (#918)
     55   - (-) slower, less memory efficient (?)
     56   - (-) more complex code (?)
     57
     58In order to reduce the overall complexity, the idea would be to pick only one approach, instead of having to support both.
     59By using the second style, we could also have our "fixed" set of properties,
     60while obviously the first style can't support the second.
     61
     62It remains to be seen whether the second approach is really less efficient than the first, but this doesn't really matter as we anyway have already to pay the price for
     63that flexibility.
     64
     65So the new model could be simply:
     66
     67'''ticket'''
     68|| ''id'' || ''name'' || ''value'' ||
     69
     70or even:
     71
     72'''resource_prop'''
     73|| ''realm'' || ''id'' || ''name'' || ''value'' ||
     74(if we use one mega table for all resources)
     75
     76We could also keep the metadata associated to the properties in the database,
     77instead of being hard-coded and present in the TracIni file.
     78
     79'''resource_schema'''
     80|| ''realm'' || ''prop'' || ''name'' || ''value'' ||
     81
     82Here, possible values for ''name'' could be 'label', 'default', 'order', 'type', etc.
     83
     84Example.
     85|| ticket || description || type || wiki ||
     86|| ticket || priority || type || enum ||
     87|| ticket || priority || enum || priority ||
     88|| ticket || priority || default || normal ||
     89|| ticket || need_review || type || checkbox ||
     90|| ticket || need_review || default || 0 ||
     91
     92
     93As a possible refining, it could be possible to have specialized tables,
     94one for each different value column type we want to support:
     95 - '''resource_prop''' for text values
     96 - '''resource_prop_int''' for integer values
     97 - ('''resource_prop_float''' for float values, if really needed)
     98And we could even differentiate between short and long text values (requirement 4):
     99 - '''resource_prop''' for short text values
     100 - '''resource_prop_text''' for long text values
     101(see #6986).
     102
     103
     104Along the same lines there's also the question of what should be the ''id'':
     105a natural or a surrogate key?
     106 - natural keys: (''id'' would be 123 for ticket !#123, id would be 'milestone1' for milestone1, etc.)
     107   - we have to support different type of keys (text for milestone, int for ticket).
     108     - not a problem for separate tables
     109     - would require ''resource_int_prop'' style for resources having an ''int''
     110       id ... cumbersome
     111   - less compact but easier to "understand"
     112   - renaming is more difficult
     113 - surrogate keys: (''id'' would be a number in a sequence, never shown as such in the interface)
     114   - only one type of keys (int) - faster, simpler,
     115     the unique ''resource_prop'' table approach is possible
     116   - more compact, not that difficult to read either (there would always be a
     117     ''name=id'', ''value=the natural key'' entry
     118   - renaming is easy (relations preserved)
     119
     120This suggests that using surrogate keys would be preferable.
     121Now if this is the case, the '''resource_prop''' table could as well become:
     122|| ''id'' || ''name'' || ''value'' ||
     123and the ''realm'' information could simply be store as another name/value entry.
     124
     125
     126=== Resource History ===
     127
     128We need to differentiate between the changes to the data, and the metadata about the change. The metadata is about who did the change, when, why the change was made, etc.
     129We can adopt the same flexible strategy as the one for resource properties and
     130store arbitrary name/value pairs of "revision properties".
     131
     132'''resource_revprop'''
     133|| ''changeid'' || ''name'' || ''value'' ||
     134
     135Typical example:
     136|| 101001 || author  || cboos         ||
     137|| 101001 || auth    || 1             ||
     138|| 101001 || date    || 1231232114.12 ||
     139|| 101001 || comment || random change ||
     140
     141A given ''changeid'' is usually related to a specific change in one resource,
     142but there could be other situations:
     143 - one change affecting lots of resources (typically #4582 and #5658)
     144 - changes affecting changes (typically #454)
     145
     146
     147The property changes themselves are stored in other tables.
     148
     149Several possibilities here:
     150'''ticket_change''
     151|| ''id'' || ''changeid'' || ''name'' || ''value'' ||
     152'''milestone_change''
     153|| ''id'' || ''changeid'' || ''name'' || ''value'' ||
     154or:
     155'''resource_change''
     156|| ''id'' || ''changeid'' || ''name'' || ''value'' ||
     157(surrogate key approach)
     158
     159The latter has the advantage that it would make easy to relate a given ``changeid``
     160to the resource(s) that were affected by the change, without having to go through
     161each resource table.
     162
     163We could also keep all property changes as text values
     164or have extra `..._int` (`..._float`) tables for more compact
     165representation.
     166
     167See also ticket:6466#comment:10 and follow-ups for a discussion about how ticket changes and in particular ticket change edits, could be handled using this approach.
     168
     169
     170== Possible Implementation Plan ==
     171
     172=== Milestone First ===
    29173 - modify the Milestone module so that it uses the new proposed datamodel. See [#TheMilestoneExample].
    30174 - experiment new tabbed view for the milestone (''View'', ''Discussion'', ''History''). See TracProject/UiGuidelines.
     
    35179Once this is complete, validate the genericity by promoting the components to be first class resources as well (#1233).
    36180
    37 ==== Ticket First ====
     181=== Ticket First ===
    38182
    39183As the ticket module is by far the most complex, it might be worth to
     
    47191
    48192
    49 == The Model ==
    50 
    51 === The Milestone Example ===
    52 
    53 The proposed data model would be:
    54 {{{
    55 #!sql
    56 -- record Milestone current properties
    57 --
    58 create table milestone_prop (
    59  project text,
    60  id    text,
    61  --
    62  name  text,
    63  value text
    64 );
    65 
    66 create index milestone_idx on milestone_prop (id, name);
    67 
    68 -- record Milestone change metadata
    69 --
    70 create table milestone_revision (
    71  tid            int primary key,
    72  --
    73  date           int,
    74  authname       text,
    75  author         text,
    76  ip             text,
    77  authenticated  int
    78 );
    79 
    80 create index milestone_date_idx on milestone_revision ( date );
    81 create index milestone_authname_idx on milestone_revision ( authname, authenticated );
    82 
    83 -- Track changes of Milestone properties
    84 --
    85 create table milestone_change (
    86  tid   int,
    87  project text,
    88  id    text,
    89  --
    90  name  text,
    91  value text,
    92  unique (tid, project, id)
    93 );
    94 
    95 -- record Milestone metadata
    96 --
    97 create table milestone_schema (
    98  project  text,
    99  name     text,
    100  --
    101  revprop  char,
    102  type     text,
    103  detail   text,
    104  value    text,
    105  order    int,
    106  unique (project, name)
    107 );
    108 }}}
    109 
    110 The existing `milestone` table can be kept, it will simply not be used anymore.
    111 This will allow to test the branch within existing environments.
    112 
    113 The `name` is not unique in `milestone_change`, to allow multiple values (#918)
    114 
    115193== Related Tickets ==
    116194