wiki:GenericTrac

Context Navigation

Version 9 (modified by Christian Boos, 15 years ago) ( diff )
added #IntermediateModel and #CompleteModel, to complement the #MinimalModel

GenericTrac Data Model

This page attempts to define a new data model for Trac that could be suitable for most of its resources. The main benefits expected from the new model are:

simplification of the internals of Trac, especially for the ticket model, in which the storage of changes is quite cumbersome (see #454, #6466)
solve a few design problems with the current data model (like #1890, #4582)
allow better code reuse and share of the base features among different kinds of resources (numerous examples for that, see #RelatedTickets below)

This stems from the following former proposals:

Design Discussion

Requirements for the new model:

it has to be simple;
it must be flexible, in order to accommodate different kinds of resources and allow for dynamic evolution;
it should remain fast, if not faster than what we currently have;
it should lead to a more compact representation of data

Resource Content

The ticket model is by far richer data model we would have to support, so we could take this as a basis to lay out the foundations of the new model. For ticket, we currently have a fixed set of properties (as columns in the ticket table) and a flexible set of properties (as name/value columns in a ticket_custom table).

Both styles have advantages and disadvantages:

properties as columns:
- (-) only flexibility is to not use some fields (e.g. severity)
- (-) no multiple values per field possible
- (+) faster
- (+) straightforward code (for field1,field2, ... in cursor: ...)
properties in name/value columns
- (+) highest flexibility, add or remove fields at will
- (+) allow for multiple values per name, provided we don't use a primary key as we currently do for the ticket_custom table (#918)
- (-) slower, less memory efficient (?)
- (-) more complex code (?)

In order to reduce the overall complexity, the idea would be to pick only one approach, instead of having to support both. By using the second style, we could also have our "fixed" set of properties, while obviously the first style can't support the second.

It remains to be seen whether the second approach is really less efficient than the first, but this doesn't really matter as we anyway have already to pay the price for that flexibility.

So the new model could be simply:

ticket

name

value

or even:

resource_prop

realm

name

value

(if we use one mega table for all resources)

We could also keep the metadata associated to the properties in the database, instead of being hard-coded and present in the TracIni file.

resource_schema

realm

prop

name

value

Here, possible values for name could be 'label', 'default', 'order', 'type', etc.

Example.

ticket	description	type	wiki
ticket	priority	type	enum
ticket	priority	enum	priority
ticket	priority	default	normal
ticket	need_review	type	checkbox
ticket	need_review	default	0

As a possible refining, it could be possible to have specialized tables, one for each different value column type we want to support:

resource_prop for text values
resource_prop_int for integer values
(resource_prop_float for float values, if really needed)

And we could even differentiate between short and long text values (requirement 4):

resource_prop for short text values
resource_prop_text for long text values

(see #6986).

Along the same lines there's also the question of what should be the id: a natural or a surrogate key?

natural keys: (id would be 123 for ticket #123, id would be 'milestone1' for milestone1, etc.)
- we have to support different type of keys (text for milestone, int for ticket).
  - not a problem for separate tables
  - would require resource_int_prop style for resources having an int id … cumbersome
- less compact but easier to "understand"
- renaming is more difficult
surrogate keys: (id would be a number in a sequence, never shown as such in the interface)
- only one type of keys (int) - faster, simpler, the unique resource_prop table approach is possible
- more compact, not that difficult to read either (there would always be a name=id, value=the natural key entry
- renaming is easy (relations preserved)

This suggests that using surrogate keys would be preferable. Now if this is the case, the resource_prop table could as well become:

name

value

and the realm information could simply be store as another name/value entry.

Resource History

We need to differentiate between the changes to the data, and the metadata about the change. The metadata is about who did the change, when, why the change was made, etc. We can adopt the same flexible strategy as the one for resource properties and store arbitrary name/value pairs of "revision properties".

resource_revprop

changeid

name

value

Typical example:

101001	author	cboos
101001	auth	1
101001	date	1231232114.12
101001	comment	random change

A given changeid is usually related to a specific change in one resource, but there could be other situations:

one change affecting lots of resources (typically #4582 and #5658)
changes affecting changes (typically #454)

The property changes themselves are stored in other tables. We only need the changed properties here, no need to store the old/new values for each change, as this can be deduced from the past changes. Deletions of fields should be represented by setting a field to the NULL value.

Several possibilities here:

ticket_change

changeid

name

value

milestone_change

changeid

name

value

or:

resource_change

changeid

name

value

(surrogate key approach)

The latter has the advantage that it would make easy to relate a given changeid to the resource(s) that were affected by the change, without having to go through each resource table.

We could also keep all property changes as text values or have extra ..._int (..._float) tables for more compact representation.

See also ticket:6466#comment:10 and follow-ups for a discussion about how ticket changes and in particular ticket change edits, could be handled using this approach.

The Model

To summarize the above discussion, here's what could be the new model.

Minimal Model

surrogate keys for all resources
only text fields

Schema

resource_schema

realm

prop

name

value

resource_prop

name

value

resource_revprop

changeid

name

value

resource_change

changeid

name

value

Example Dataset

resource_schema

realm	prop	name	value
ticket	summary	type	text
ticket	description	type	wiki
ticket	reporter	type	text

resource_prop

id	name	value
0	id	130
0	summary	Multiple Project Support
0	description	One day…
0	reporter	joe

resource_revprop

changeid	name	value
1	author	joe
1	date	5 years ago
2	author	joe
2	date	2 years ago
2	comment	come on…
3	author	cboos
3	date	1 year ago
3	comment	sure…

resource_change

id	changeid	name	value
0	1	id	130
0	1	summary	Multiple Project Support
0	1	description	Should be easy…
0	1	reporter	joe
0	2	description	Should be easy… Redmine has it\|\|
0	3	description	One day…

Intermediate Model

surrogate keys for all resources
text and int fields (same as Complete Model, without *_text tables)

The minimal model above is handy for showing the essence of the new model, but it's too simple in practice.

I think we need at the very least to support 'integer' type columns, useful for storing dates efficiently, boolean values, and relations to other resources (as the surrogate id will be an integer).

Complete Model

surrogate keys for all resources
int, short and long text fields

Not absolutely necessary to go that far, this could nevertheless help a lot for the MySQL backend (#6986), possibly also for a future Oracle backend. Don't know about PostgreSQL, but for SQLite this should be indifferent.

Schema

resource_schema

realm

prop

name

value

resource_prop

name

value

resource_revprop

changeid

name

value

resource_change

changeid

name

value

resource_prop_text

name

value

resource_revprop_text

changeid

name

value

resource_change_text

changeid

name

value

resource_prop_int

name

value

resource_revprop_int

changeid

name

value

resource_change_int

changeid

name

value

Example Dataset

resource_schema

realm	prop	name	value
ticket	summary	type	text
ticket	description	type	wiki
ticket	reporter	type	string

resource_prop

id	name	value
0	reporter	joe

resource_revprop

changeid	name	value
1	author	joe
2	author	joe
3	author	cboos

resource_change

id	changeid	name	value
0	1	reporter	joe

resource_prop_text

id	name	value
0	summary	Multiple Project Support
0	description	One day…

resource_revprop_text

changeid	name	value
2	comment	come on…
3	comment	sure…

resource_change_text

id	changeid	name	value
0	1	summary	Multiple Project Support
0	1	description	Should be easy…
0	2	description	Should be easy… Redmine has it\|\|
0	3	description	One day…

resource_prop_int

id	name	value
0	id	130

resource_revprop_int

changeid	name	value
1	date	5 years ago
2	date	2 years ago
3	date	1 year ago

resource_change_int

id	changeid	name	value
0	1	id	130

Possible Implementation Plan

Milestone First

modify the Milestone module so that it uses the new proposed datamodel. See #TheMilestoneExample.
experiment new tabbed view for the milestone (View, Discussion, History). See TracProject/UiGuidelines.
milestone should be able to have attachments, too (#3068)
adapt the Roadmap module to the new model
adapt the Milestone admin component to the new model

Once this is complete, validate the genericity by promoting the components to be first class resources as well (#1233).

Ticket First

As the ticket module is by far the most complex, it might be worth to try out the new model there first:

we could verify that we meet the expectations in terms of code simplification, solving open issues, etc.
we could detect early if there are no regressions or risk of losing current features
by redeploying the ticket infrastructure to the other components, we could spread the most benefits of tickets (comments, custom fields, queries, etc.) to other resources (milestone, wiki, component, …)

Related Tickets

Data model issues:
#4130
Ticket Component Interface Fails to Properly Check for Existing Component
#11301
intermittent failure with notification test
Resource related:
#150
User-centric storage. Last-viewed information and 'intelligent' followup
#221
Creating TR for multiple components
#695
Keywords for wiki pages
#787
Change attachment description
#918
[patch] Custom Ticket Fields should support the multiple selection type
#1113
Show milestone changes in timeline
#1386
Adding properties to Wiki Pages
#1395
Text box for duplicate when a bug is a duplicate
#1678
Show component view in timeline view for checkins
#1835
Add a discussion tab for Trac resources
#2035
Changeset commenting
#2344
sub milestone
#2464
Conditional fields in tickets
#2465
Add "custom query" functionality for changesets
#2467
Link user name in reports to custom query showing that user's open tickets
#2662
assign tickets to multiple users
#2961
custom comment fields
#3003
milestone could be a ticket
#3080
Custom field sorts only as text
#3718
Trac should use HTTP 301 Moved when milestones get renamed
#3911
implement an object system (xwiki, roundup)
#4588
User Page: trac links to these pages
#5211
Ticket - Wiki Integration enhancement/suggestion
#7871
Add start date to a milestone
#8335
Create People page with user profiles and status of who's doing what.
#9263
if SVN is used, milestone should manage/allowtoassign links for branch and tag

Note: See TracWiki for help on using the wiki.

Download in other formats:

Plain Text