Changes between Version 6 and Version 7 of GenericTrac

Timestamp:: Sep 3, 2009, 5:32:27 PM (15 years ago)
Author:: Christian Boos
Comment:: updated with ideas developed in #6466

GenericTrac

-              v6
+              v7
 = GenericTrac Data Model =
 This page tries to define a new data model that could be suitable
 for most Trac resources. The main benefits expected from the new model are:
+This page attempts to define a new data model for Trac that could be suitable
+for most of its resources. The main benefits expected from the new model are:
  - simplification of the internals of Trac, especially for the ticket model,
    in which the storage of changes is quite cumbersome (see #454, #6466)
+ - solve several design problems with the current data model (#1890)
+ - allow better code reuse
+ - solve a few design problems with the current data model (like #1890, #4582)
+ - allow better code reuse and sharing of the base features
+   among different kinds of resources (numerous examples for that,
+   see [#RelatedTickets] below)
 This stems from the following former proposals:
 …
  - TracDev/Proposals/DataModel
  - TracDev/Proposals/Journaling
- - WikiContext are used as ''resource descriptors'' and have a `.resource` field
-   which enables one to fetch the corresponding data model instance
 See also [googlegroups:trac-dev:8cf3f5fe0e476ce5 this mail].
 …
 it could also be a good opportunity to take the
 ''[TracMultipleProjects multiple project]'' considerations into account (#130).
 Each resource related table should probably get a `project` identifier field.
+Each resource related table could get a `project` identifier field.
 Working on the generic aspect of Trac should also make it possible to implement various ''generic'' operations on Trac resources as plugins, mainly being able to (re-)implement TracCrossReferences as a plugin (see also #6543).
+=== Possible Implementation Plan ===
+==== Milestone First ====
+== Design Discussion ==
+Requirements for the new model:
+. it has to be ''simple'';
+. it must be ''flexible'', in order to accommodate different kinds of resources and
+    allow for dynamic evolution;
+. it should remain ''fast'', if not faster than what we currently have;
+. it should lead to a more ''compact'' representation of data
+=== Resource Content ===
+The ticket model is by far the richest data model we would have to support,
+so we could take this as a basis to lay out the foundations of the new model.
+For ticket, we currently have a fixed set of properties
+(as columns in the `ticket` table)
+and a flexible set of properties
+(as name/value columns in a `ticket_custom` table).
+Both styles have advantages and disadvantages:
+. properties as columns:
+   - (-) only flexibility is to not use some fields (e.g. severity)
+   - (-) no multiple values per field possible
+   - (+) faster
+   - (+) straightforward code (`for field1,field2, ... in cursor: ...`)
+. properties in name/value columns
+   - (+) highest flexibility, add or remove fields at will
+   - (+) allow for multiple values per name, provided we don't use a primary key
+     as we currently do for the `ticket_custom` table (#918)
+   - (-) slower, less memory efficient (?)
+   - (-) more complex code (?)
+In order to reduce the overall complexity, the idea would be to pick only one approach, instead of having to support both.
+By using the second style, we could also have our "fixed" set of properties,
+while obviously the first style can't support the second.
+It remains to be seen whether the second approach is really less efficient than the first, but this doesn't really matter, as we already have to pay the price for
+that flexibility anyway.
+So the new model could be simply:
+'''ticket'''
+|| ''id'' || ''name'' || ''value'' ||
+or even:
+'''resource_prop'''
+|| ''realm'' || ''id'' || ''name'' || ''value'' ||
+(if we use one mega table for all resources)
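
The single-table variant can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses an in-memory SQLite database, and the helper names (`create_store`, `set_props`, `get_props`), the index name and the sample values are hypothetical, not part of Trac. The table deliberately has no primary key, so a name can carry several values (#918).

```python
import sqlite3
from collections import defaultdict

def create_store():
    """Create an in-memory database holding the generic property table."""
    db = sqlite3.connect(":memory:")
    # No primary key on (realm, id, name): a name may carry several values.
    db.execute("CREATE TABLE resource_prop (realm TEXT, id TEXT, name TEXT, value TEXT)")
    db.execute("CREATE INDEX resource_prop_idx ON resource_prop (realm, id, name)")
    return db

def set_props(db, realm, id_, props):
    """Store a list of (name, value) pairs for one resource."""
    db.executemany(
        "INSERT INTO resource_prop (realm, id, name, value) VALUES (?, ?, ?, ?)",
        [(realm, str(id_), name, value) for name, value in props],
    )

def get_props(db, realm, id_):
    """Return {name: [values, ...]} for one resource, in insertion order."""
    props = defaultdict(list)
    cursor = db.execute(
        "SELECT name, value FROM resource_prop WHERE realm=? AND id=? ORDER BY rowid",
        (realm, str(id_)),
    )
    for name, value in cursor:
        props[name].append(value)
    return dict(props)

# Hypothetical sample data: one ticket and one milestone share the same table.
db = create_store()
set_props(db, "ticket", 123, [("summary", "Crash on save"), ("priority", "major"),
                              ("cc", "alice"), ("cc", "bob")])
set_props(db, "milestone", "milestone1", [("due", "2009-10-01")])
ticket = get_props(db, "ticket", 123)
```

Reading a resource stays a simple loop over the cursor; the only extra work compared to fixed columns is grouping the rows by name.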
+We could also keep the metadata associated to the properties in the database,
+instead of being hard-coded and present in the TracIni file.
+'''resource_schema'''
+|| ''realm'' || ''prop'' || ''name'' || ''value'' ||
+Here, possible values for ''name'' could be 'label', 'default', 'order', 'type', etc.
+Example.
+|| ticket || description || type || wiki ||
+|| ticket || priority || type || enum ||
+|| ticket || priority || enum || priority ||
+|| ticket || priority || default || normal ||
+|| ticket || need_review || type || checkbox ||
+|| ticket || need_review || default || 0 ||
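
As a rough sketch of how such schema rows might be consumed, the snippet below loads the example rows above into a nested dictionary and extracts the declared defaults. The functions `load_schema` and `default_values` are hypothetical helpers introduced for illustration; only the table layout and the rows come from this page.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE resource_schema (realm TEXT, prop TEXT, name TEXT, value TEXT)")
# The example rows from the table above.
db.executemany(
    "INSERT INTO resource_schema VALUES (?, ?, ?, ?)",
    [
        ("ticket", "description", "type", "wiki"),
        ("ticket", "priority", "type", "enum"),
        ("ticket", "priority", "enum", "priority"),
        ("ticket", "priority", "default", "normal"),
        ("ticket", "need_review", "type", "checkbox"),
        ("ticket", "need_review", "default", "0"),
    ],
)

def load_schema(db, realm):
    """Return {prop: {metadata name: value}} for one realm."""
    schema = {}
    for prop, name, value in db.execute(
        "SELECT prop, name, value FROM resource_schema WHERE realm=?", (realm,)
    ):
        schema.setdefault(prop, {})[name] = value
    return schema

def default_values(schema):
    """Collect the default value of every property that declares one."""
    return {prop: meta["default"] for prop, meta in schema.items() if "default" in meta}

schema = load_schema(db, "ticket")
defaults = default_values(schema)
```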
+As a possible refinement, we could have specialized tables,
+one for each different value column type we want to support:
+ - '''resource_prop''' for text values
+ - '''resource_prop_int''' for integer values
+ - ('''resource_prop_float''' for float values, if really needed)
+And we could even differentiate between short and long text values (requirement 4):
+ - '''resource_prop''' for short text values
+ - '''resource_prop_text''' for long text values
+(see #6986).
+Along the same lines there's also the question of what should be the ''id'':
+a natural or a surrogate key?
+ - natural keys: (''id'' would be 123 for ticket !#123, id would be 'milestone1' for milestone1, etc.)
+   - we have to support different types of keys (text for milestone, int for ticket).
+     - not a problem for separate tables
+     - would require ''resource_int_prop'' style for resources having an ''int''
+       id ... cumbersome
+   - less compact but easier to "understand"
+   - renaming is more difficult
+ - surrogate keys: (''id'' would be a number in a sequence, never shown as such in the interface)
+   - only one type of keys (int) - faster, simpler,
+     the unique ''resource_prop'' table approach is possible
+   - more compact, not that difficult to read either (there would always be a
+     ''name=id'', ''value=the natural key'' entry)
+   - renaming is easy (relations preserved)
+This suggests that using surrogate keys would be preferable.
+Now if this is the case, the '''resource_prop''' table could as well become:
+|| ''id'' || ''name'' || ''value'' ||
+and the ''realm'' information could simply be stored as another name/value entry.
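
The surrogate key variant might look like the following sketch, which stores the realm and the natural key as ordinary name/value entries and shows why renaming becomes a one-row update. Everything here is an assumption made for illustration: the helper names, the way the next id is allocated (`MAX(id) + 1`), and the sample milestone and ticket values.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE resource_prop (id INTEGER, name TEXT, value TEXT)")

def create_resource(db, realm, natural_key, **props):
    """Allocate the next surrogate id; store realm and natural key as plain entries."""
    (next_id,) = db.execute("SELECT COALESCE(MAX(id), 0) + 1 FROM resource_prop").fetchone()
    rows = [(next_id, "realm", realm), (next_id, "id", str(natural_key))]
    rows += [(next_id, name, value) for name, value in props.items()]
    db.executemany("INSERT INTO resource_prop VALUES (?, ?, ?)", rows)
    return next_id

def find_resource(db, realm, natural_key):
    """Map (realm, natural key) back to the surrogate id, or None if unknown."""
    row = db.execute(
        "SELECT r.id FROM resource_prop r JOIN resource_prop k ON k.id = r.id "
        "WHERE r.name='realm' AND r.value=? AND k.name='id' AND k.value=?",
        (realm, str(natural_key)),
    ).fetchone()
    return row[0] if row else None

def rename_resource(db, surrogate_id, new_key):
    """Renaming touches a single row; the surrogate id stays the same."""
    db.execute("UPDATE resource_prop SET value=? WHERE id=? AND name='id'",
               (str(new_key), surrogate_id))

# Hypothetical data: the ticket refers to its milestone by surrogate id,
# so the relation survives the rename below.
m = create_resource(db, "milestone", "milestone1", due="2009-10-01")
t = create_resource(db, "ticket", 123, milestone=str(m))
rename_resource(db, m, "0.12")
```

The price of this layout is the self-join in `find_resource`: resolving a natural key needs both the ''realm'' entry and the ''id'' entry of the same resource.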
+=== Resource History ===
+We need to differentiate between the changes to the data, and the metadata about the change. The metadata is about who did the change, when, why the change was made, etc.
+We can adopt the same flexible strategy as the one for resource properties and
+store arbitrary name/value pairs of "revision properties".
+'''resource_revprop'''
+|| ''changeid'' || ''name'' || ''value'' ||
+Typical example:
+|| 101001 || author  || cboos         ||
+|| 101001 || auth    || 1             ||
+|| 101001 || date    || 1231232114.12 ||
+|| 101001 || comment || random change ||
+A given ''changeid'' is usually related to a specific change in one resource,
+but there could be other situations:
+ - one change affecting lots of resources (typically #4582 and #5658)
+ - changes affecting changes (typically #454)
+The property changes themselves are stored in other tables.
+Several possibilities here:
+'''ticket_change'''
+|| ''id'' || ''changeid'' || ''name'' || ''value'' ||
+'''milestone_change'''
+|| ''id'' || ''changeid'' || ''name'' || ''value'' ||
+or:
+'''resource_change'''
+|| ''id'' || ''changeid'' || ''name'' || ''value'' ||
+(surrogate key approach)
+The latter has the advantage that it would make it easy to relate a given `changeid`
+to the resource(s) that were affected by the change, without having to go through
+each resource table.
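
To make the single-table history variant more concrete, here is a minimal sketch combining '''resource_revprop''' and '''resource_change'''. The helper functions, the second change (101002) and the property values are hypothetical; the first change reuses the revision properties from the typical example above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE resource_revprop (changeid INTEGER, name TEXT, value TEXT);
    CREATE TABLE resource_change (id INTEGER, changeid INTEGER, name TEXT, value TEXT);
""")

def record_change(db, changeid, revprops, changes):
    """Store the metadata of one change and the property values it set.

    `changes` is a list of (resource id, property name, new value) tuples.
    """
    db.executemany("INSERT INTO resource_revprop VALUES (?, ?, ?)",
                   [(changeid, n, v) for n, v in revprops.items()])
    db.executemany("INSERT INTO resource_change VALUES (?, ?, ?, ?)",
                   [(rid, changeid, n, v) for rid, n, v in changes])

def affected_resources(db, changeid):
    """List the ids of all resources touched by one change, in a single query."""
    rows = db.execute(
        "SELECT DISTINCT id FROM resource_change WHERE changeid=? ORDER BY id",
        (changeid,))
    return [rid for (rid,) in rows]

def history(db, resource_id):
    """Return (changeid, author, name, value) rows for one resource, oldest first."""
    return db.execute(
        "SELECT c.changeid, p.value, c.name, c.value FROM resource_change c "
        "JOIN resource_revprop p ON p.changeid = c.changeid AND p.name = 'author' "
        "WHERE c.id=? ORDER BY c.changeid, c.name", (resource_id,)).fetchall()

# One change affecting two resources, then a follow-up change on one of them.
record_change(db, 101001,
              {"author": "cboos", "auth": "1", "date": "1231232114.12",
               "comment": "random change"},
              [(1, "milestone", "0.12"), (2, "milestone", "0.12")])
record_change(db, 101002, {"author": "cboos", "comment": "retarget"},
              [(2, "milestone", "0.13")])
```

With per-realm tables such as '''ticket_change''' and '''milestone_change''', `affected_resources` would instead need one query per table.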
+We could also keep all property changes as text values
+or have extra `..._int` (`..._float`) tables for more compact
+representation.
+See also ticket:6466#comment:10 and follow-ups for a discussion about how ticket changes, and in particular ticket change edits, could be handled using this approach.
+== Possible Implementation Plan ==
+=== Milestone First ===
  - modify the Milestone module so that it uses the new proposed datamodel. See [#TheMilestoneExample].
  - experiment with a new tabbed view for the milestone (''View'', ''Discussion'', ''History''). See TracProject/UiGuidelines.
 …
 Once this is complete, validate the genericity by promoting the components to be first class resources as well (#1233).
 ==== Ticket First ====
+=== Ticket First ===
 As the ticket module is by far the most complex, it might be worth to
 …
-== The Model ==
-=== The Milestone Example ===
-The proposed data model would be:
-{{{
-#!sql
--- record Milestone current properties
---
-create table milestone_prop (
- project text,
- id    text,
- --
- name  text,
- value text
-);
-create index milestone_idx on milestone_prop (id, name);
--- record Milestone change metadata
---
-create table milestone_revision (
- tid            int primary key,
- --
- date           int,
- authname       text,
- author         text,
- ip             text,
- authenticated  int
-);
-create index milestone_date_idx on milestone_revision ( date );
-create index milestone_authname_idx on milestone_revision ( authname, authenticated );
--- Track changes of Milestone properties
---
-create table milestone_change (
- tid   int,
- project text,
- id    text,
- --
- name  text,
- value text,
- unique (tid, project, id)
-);
--- record Milestone metadata
---
-create table milestone_schema (
- project  text,
- name     text,
- --
- revprop  char,
- type     text,
- detail   text,
- value    text,
- order    int,
- unique (project, name)
-);
-}}}
-The existing `milestone` table can be kept; it will simply not be used anymore.
-This will allow testing the branch within existing environments.
-The `name` is not unique in `milestone_change`, to allow multiple values (#918)
 == Related Tickets ==