Edgewall Software

Changes between Version 46 and Version 47 of GenericTrac


Ignore:
Timestamp:
May 31, 2016, 6:07:35 PM (6 years ago)
Author:
figaro
Comment:

Cosmetic changes

Legend:

Unmodified
Added
Removed
Modified
  • GenericTrac

    v46 v47  
    1 = GenericTrac Data Model =
    2 
    3 This page defines a new data model for Trac that should be suitable
    4 for manipulating and storing most of the data from resources,
    5 along with their change history.
     1[[PageOutline(2-5,Contents,pullout)]]
     2
     3= GenericTrac Data Model
     4
     5This page defines a new data model for Trac that should be suitable for manipulating and storing most of the data from resources, along with their change history.
    66
    77There are effectively two parts to this proposal:
     
    99 - the storage level, how the resource data are stored on persistent memory
    1010
    11 The high level API is intended to unify and take the best aspects of the different model used so far (!WikiPage, Ticket, Milestone, etc.). Actually, only the Ticket is really flexible and nice, the others are just simpler and much more limited models, so it's rather an "embrace and extend" move from the Ticket model to the others...
    12 
    13 While the high level API tends to unify things, the low level API will be there to introduce some diversity in the storage, at best transparent to the "user" of the higher level API. Those different storage models could be different table layouts in the database, but also in the longer term, different or complementary storage backend. It would be more appropriate to keep the change history in a dedicated system, like a version control backend, even a [DistributedTrac distributed] one).
     11The high level API is intended to unify and take the best aspects of the different model used so far (!WikiPage, Ticket, Milestone, etc.). Actually, only the Ticket is really flexible and nice, the others are just simpler and much more limited models, so it's rather an "embrace and extend" move from the Ticket model to the others.
     12
     13While the high level API tends to unify things, the low level API will be there to introduce some diversity in the storage, at best transparent to the "user" of the higher level API. Those different storage models could be different table layouts in the database, but also in the longer term, different or complementary storage backend. It would be more appropriate to keep the change history in a dedicated system, like a version control backend, even a [DistributedTrac distributed] one.
    1414
    1515The main benefits expected from the new API are:
    16  - simplification of the internals of Trac, especially for the ticket model,
    17    in which the storage of changes is quite cumbersome (see #454, #6466)
     16 - simplification of the internals of Trac, especially for the ticket model, in which the storage of changes is quite cumbersome (see #454, #6466)
    1817 - solve a few design problems with the current data model (like #1890, #4582)
    19  - allow better code reuse and sharing of the base features
    20    among different kinds of resources (like #695, #1113, and a lot more,
    21    see [#RelatedTickets] below)
    22 
    23 == Related Proposals ==
    24 
    25 The idea is not new, it has its root in the TracObjectModelProposal, and was refined several times (TracDev/Proposals/DataModel, TracDev/Proposals/Journaling, and [googlegroups:trac-dev:8cf3f5fe0e476ce5 this mail]).
    26 
    27 As this will be a major redesign of the data model,
    28 it could also be a good opportunity to take the
    29 **MultipleProjectSupport** considerations into account (#11025).
    30 
    31 Somehow related to the generic data model, but not strictly depending on it,
    32 Trac should also make it possible to implement new kinds of plugins that would
    33 perform ''generic'' operations on Trac resources.
    34 This could allow the (re-)implementation of the TracCrossReferences idea
    35 as a plugin, for example.
     18 - allow better code reuse and sharing of the base features among different kinds of resources (like #695, #1113, and a lot more, see [#RelatedTickets] below)
     19
     20== Related Proposals
     21
     22The idea is not new, it has its root in the TracObjectModelProposal, and was refined several times in TracDev/Proposals/DataModel, TracDev/Proposals/Journaling and [googlegroups:trac-dev:8cf3f5fe0e476ce5 this mail].
     23
     24As this will be a major redesign of the data model, it could also be a good opportunity to take the **MultipleProjectSupport** considerations into account (#11025).
     25
     26Somehow related to the generic data model, but not strictly depending on it, Trac should also make it possible to implement new kinds of plugins that would perform ''generic'' operations on Trac resources.
     27This could allow the (re-)implementation of the TracCrossReferences idea as a plugin, for example.
    3628See #31, #886, #6543 and TracDev/Proposals/TracRelations.
    3729
    38 
    39 == Design Discussion ==
     30== Design Discussion
    4031
    4132Requirements for the new model:
    4233 1. it has to be ''simple'' (//easy to understand, easy to code with, when looking at the raw data in the database one should be able to intuitively understand what it means//)
    43  2. it must be ''flexible'' ((//accommodate different kinds of resources, allow for dynamic evolution like addition or removal of fields by plugin or via the web admin//)
    44  3. it should remain ''fast'', if not become faster than what we currently have;
    45  4. it should lead to a more ''compact'' representation of data
    46  5. all the existing constraints about maintaining resource history and associated metadata should be taken into account
    47 
    48 Note that the persistence constraints imposed by the Trac data model are not necessarily only (or even best) approached using the RelationalModel, one could imagine that a future version could use more document-oriented persistence layers (e.g. [http://www.mongodb.org/ MongoDB]), or object-oriented databases (e.g. [http://www.zodb.org/ ZODB]). Also, as said above, the versioning of resources should be delegated to a version control backend, with a default, simple, in-database VCS backend.
    49 
    50 === The ticket data model ===
    51 
    52 The ticket model is by far richer data model we would have to support,
    53 so we could use it as a basis to lay out the foundations of the new model.
     34 1. it must be ''flexible'' (//accommodate different kinds of resources, allow for dynamic evolution like addition or removal of fields by plugin or via the web admin//)
     35 1. it should remain ''fast'', if not become faster than what we currently have;
     36 1. it should lead to a more ''compact'' representation of data
     37 1. all the existing constraints about maintaining resource history and associated metadata should be taken into account
     38
     39Note that the persistence constraints imposed by the Trac data model are not necessarily only (or even best) approached using the RelationalModel, one could imagine that a future version could use more document-oriented persistence layers (such as [http://www.mongodb.org/ MongoDB]), or object-oriented databases (such as [http://www.zodb.org/ ZODB]). Also, as said above, the versioning of resources should be delegated to a version control backend, with a default, simple, in-database VCS backend.
     40
     41=== The ticket data model
     42
     43The ticket model is by far richer data model we would have to support, so we could use it as a basis to lay out the foundations of the new model.
    5444
    5545==== Trac =< 1.1.3 model #legacy
    56 For ticket, we currently have a fixed set of properties
    57 (as columns in the `ticket` table)
    58 and a flexible set of properties
    59 (as prop/value columns in a `ticket_custom` table).
     46
     47For ticket, we currently have a fixed set of properties (as columns in the `ticket` table) and a flexible set of properties (as prop/value columns in a `ticket_custom` table).
    6048
    6149Note that we'd like to get ''relations'' (or ''links'') as well, see #31.
     
    6654   - (-) no multiple values per field possible 
    6755   - (+) faster especially for multicriteria queries (?)
    68    - (+) straightforward code (`for field1,field2, ... in cursor: ...`) (?)
     56   - (+) straightforward code (`for field1, field2, ... in cursor: ...`) (?)
    6957 2. properties stored in prop/value columns of a generic table
    7058   - (+) highest flexibility, add or remove fields at will
    71    - (+) allow for multiple values per property, provided we don't use a primary key
    72      as we currently do for the `ticket_custom` table (#918)
     59   - (+) allow for multiple values per property, provided we don't use a primary key as we currently do for the `ticket_custom` table (#918)
    7360     - (+) a slight extension would allow ordered multiple values (sequences), otherwise we have no control over the order in which those multiple values are retrieved, which might introduce confusion as this order will vary when new values are added;
    7461       while primarily useful for `int` fields, this could also be useful to implement `text` fields for Oracle (working around the limits on varchars and not having to use CLOBs)
    75    - (-) slower, less memory efficient, especially for multicriteria queries(?)
     62   - (-) slower, less memory efficient, especially for multi-criteria queries (?)
    7663   - (-) more complex code (?)
    77    - (-) cannot use DB features to assure data integrity (not NULL constraints, foreign key constraints, etc.)
     64   - (-) cannot use database features to assure data integrity (not NULL constraints, foreign key constraints, etc.)
    7865   - (-) very difficult to integrate against the database (e.g. import/export, analysis)
    7966
     
    10592
    10693CREATE TABLE ticket_custom (
    107        ticket               integer,
     94       ticket           integer,
    10895       name             text,
    10996       value            text,
     
    129116}}}
    130117
    131 Squeezed in the above are the ticket comments, complete with "reply-to" information and their own versioning scheme ;-)
    132 
     118Squeezed in the above are the ticket comments, complete with "reply-to" information and their own versioning scheme.
    133119
    134120==== Refactored model
    135121
    136 The new model is heavily relational, as opposed to the earlier EAV-like approach (see [./Brainstorm#EAV]).
     122The new model is heavily relational, as opposed to the earlier [wikipedia:Entity%E2%80%93attribute%E2%80%93value_model EAV-like approach], see [./Brainstorm#EAV].
    137123
    138124We keep a ''primary'' table for the resource, for all 1-to-1 properties and relations (table named `Ticket` here).
     
    230216}}}
    231217
    232 Adding the Project to mix becomes a simple matter of yet another relation table:
     218Adding the Project to the mix becomes a matter of yet another relation table:
    233219{{{#!sql
    234220CREATE TABLE Ticket_Project (
     
    241227CREATE INDEX Ticket_Project_Project_idx ON Ticket_Project (Project);
    242228}}}
     229
    243230A word about naming conventions:
    244231 * for table names:
     
    275262CREATE INDEX TicketChange_TEvent_idx ON TicketChange (TEvent);
    276263}}}
     264
    277265The `<Resource>Change` table is clearly inspired of the existing `ticket_change`: we don't want to replicate all the complexity of the ticket relational model in dedicated versioning tables, because it would be clearly overkill. We're not interested in //querying// these values anyway, only in storing them for displaying them whenever we're going to show the history of the resource.
    278266
     
    321309In addition, even though the relational information given by the foreign keys can be enlightening for understanding the data model, we'll need an additional descriptive layer, metadata describing the precise relationships between the tables (e.g. list of fields for a given resource/type?).
    322310
    323 This metadata is a natural continuation of the configuration of fields one can find in `[ticket-custom]` section of TracIni. Of course this time it is generalized to the "standard" resources and fields, and it is stored in the DB. Speaking of custom fields, one can see how custom fields fit in the above approach: they will simply correspond to additional secondary tables. The presence of a field in the primary table or in one secondary table becomes an implementation detail abstracted away by the metadata description of the resource. This will be also used to ease the transition to the new model by first describing the "current" state (all 1-to-1 fields in the primary table) and allowing for progressive upgrade steps.
     311This metadata is a natural continuation of the configuration of fields one can find in `[ticket-custom]` section of TracIni. Of course this time it is generalized to the "standard" resources and fields, and it is stored in the database. Speaking of custom fields, one can see how custom fields fit in the above approach: they will simply correspond to additional secondary tables. The presence of a field in the primary table or in one secondary table becomes an implementation detail abstracted away by the metadata description of the resource. This will be also used to ease the transition to the new model by first describing the "current" state (all 1-to-1 fields in the primary table) and allowing for progressive upgrade steps.
    324312
    325313Finally, for the code that would handle the resources in a generic way, instead of dealing with the same tables for every resources (the EAV approach), they would deal with the same API and use the metadata to know how to handle each particular kind of resources.
     
    327315By the way, we could also easily graft the idea of ''sub-types'' (cf. TicketTypes) that would use the same primary table as their main type but perhaps a subset (or a slightly different set) of secondary tables (e.g. no 'Version' for tasks).
    328316
    329 
    330 == Implementation Plan ==
     317== Implementation Plan
    331318
    332319The idea is to introduce the "right" model at the occasion of the implementation of the MultipleProjectSupport. So the new `project` resource will use this generic model, but as we're primarily concerned with `ticket`s in association with the `project`s, we'll have to address the ticket module early as well.
    333320
    334 === Applied on Ticket module ===
    335 
    336 As the ticket module is by far the most complex, it might be worth to
    337 try out the new model there first:
     321=== Applied on Ticket module
     322
     323As the ticket module is by far the most complex, it might be worth to try out the new model there first:
    338324 - we could verify that we meet the expectations in terms of code simplification,
    339325   solving open issues, etc.
     
    343329   custom fields, queries, etc.) to other resources (milestone, wiki, component, ...)
    344330
    345 === Applied on new Component and Version resources ===
    346 
    347 The not yet existing resource types like 'component' and 'version' will be implemented using this generic approach (#1233). Initially the corresponding fields would remain in the primary  Ticket table, they will be replaced from being properties (`component`, `version`) to relations (`Component`, `Version`). In a second step we could move these fields from the primary Ticket table to new secondary relation tables (`Ticket_Version`) and (`Ticket_Component`), still keeping the 1-to-1 relationship. The last step would be to allow them to be used as multi-valued relations (1 ticket to m versions and components).
    348 
    349 === Applied on Milestone module ===
     331=== Applied on new Component and Version resources
     332
     333The not yet existing resource types like 'component' and 'version' will be implemented using this generic approach (#1233). Initially the corresponding fields would remain in the primary Ticket table, they will be replaced from being properties (`component`, `version`) to relations (`Component`, `Version`). In a second step we could move these fields from the primary Ticket table to new secondary relation tables (`Ticket_Version`) and (`Ticket_Component`), still keeping the 1-to-1 relationship. The last step would be to allow them to be used as multi-valued relations (1 ticket to m versions and components).
     334
     335=== Applied on Milestone module
    350336
    351337 - modify the Milestone module so that it uses the new proposed datamodel.
    352338 - adapt the Roadmap module to the new model
    353339 - adapt the Milestone admin component to the new model
    354  - for a first approach, the generic scheme would be enough,
    355    no need for versioning as we currently don't support that
     340 - for a first approach, the generic scheme would be enough, no need for versioning as we currently don't support that
    356341
    357342Later:
     
    359344 - migrate to generic versioned scheme
    360345
    361 === Applied on Wiki module  ===
     346=== Applied on Wiki module
    362347
    363348Milestone and components are closer in spirit to a wiki page than to a ticket (we have the long standing #3776 ticket).
     
    367352If we do this, we can also at the same time think about the storage model, in particular how past versions could be stored in a VCS (#1132). We could then think about a serialization model that would play nicely with the genericity of the Trac resources on one hand side, and with the external editors on the other hand.
    368353
    369 See how a VCS backend could be written for the versioned generic scheme...
    370 
    371 
    372 == Related Tickets ==
     354See how a VCS backend could be written for the versioned generic scheme.
     355
     356== Related Tickets
    373357
    374358 - Data model issues: [[TicketQuery(status=!closed&keywords=~model)]]
    375359 - Resource related: [[TicketQuery(status=!closed&keywords=~tracobject)]]
    376360
    377 See also: [./Brainstorm] for older iterations of the idea and discussion.
     361See also: [./Brainstorm] and TracDev/ScratchPad/DataModels for older iterations of the idea and discussion.