27 | | === Possible Implementation Plan === |
28 | | ==== Milestone First ==== |
| 27 | == Design Discussion == |
| 28 | |
| 29 | Requirements for the new model: |
| 30 | 1. it has to be ''simple''; |
| 31 | 2. it must be ''flexible'', in order to accommodate different kinds of resources and |
| 32 | allow for dynamic evolution; |
| 33 | 3. it should remain ''fast'', if not faster than what we currently have; |
| 34 | 4. it should lead to a more ''compact'' representation of data |
| 35 | |
| 36 | === Resource Content === |
| 37 | |
| 38 | The ticket model is by far richer data model we would have to support, |
| 39 | so we could take this as a basis to lay out the foundations of the new model. |
| 40 | For ticket, we currently have a fixed set of properties |
| 41 | (as columns in the `ticket` table) |
| 42 | and a flexible set of properties |
| 43 | (as name/value columns in a `ticket_custom` table). |
| 44 | |
| 45 | Both styles have advantages and disadvantages: |
| 46 | 1. properties as columns: |
| 47 | - (-) only flexibility is to not use some fields (e.g. severity) |
| 48 | - (-) no multiple values per field possible |
| 49 | - (+) faster |
| 50 | - (+) straightforward code (`for field1,field2, ... in cursor: ...`) |
| 51 | 2. properties in name/value columns |
| 52 | - (+) highest flexibility, add or remove fields at will |
| 53 | - (+) allow for multiple values per name, provided we don't use a primary key |
| 54 | as we currently do for the `ticket_custom` table (#918) |
| 55 | - (-) slower, less memory efficient (?) |
| 56 | - (-) more complex code (?) |
| 57 | |
| 58 | In order to reduce the overall complexity, the idea would be to pick only one approach, instead of having to support both. |
| 59 | By using the second style, we could also have our "fixed" set of properties, |
| 60 | while obviously the first style can't support the second. |
| 61 | |
| 62 | It remains to be seen whether the second approach is really less efficient than the first, but this doesn't really matter as we anyway have already to pay the price for |
| 63 | that flexibility. |
| 64 | |
| 65 | So the new model could be simply: |
| 66 | |
| 67 | '''ticket''' |
| 68 | || ''id'' || ''name'' || ''value'' || |
| 69 | |
| 70 | or even: |
| 71 | |
| 72 | '''resource_prop''' |
| 73 | || ''realm'' || ''id'' || ''name'' || ''value'' || |
| 74 | (if we use one mega table for all resources) |
| 75 | |
| 76 | We could also keep the metadata associated to the properties in the database, |
| 77 | instead of being hard-coded and present in the TracIni file. |
| 78 | |
| 79 | '''resource_schema''' |
| 80 | || ''realm'' || ''prop'' || ''name'' || ''value'' || |
| 81 | |
| 82 | Here, possible values for ''name'' could be 'label', 'default', 'order', 'type', etc. |
| 83 | |
| 84 | Example. |
| 85 | || ticket || description || type || wiki || |
| 86 | || ticket || priority || type || enum || |
| 87 | || ticket || priority || enum || priority || |
| 88 | || ticket || priority || default || normal || |
| 89 | || ticket || need_review || type || checkbox || |
| 90 | || ticket || need_review || default || 0 || |
| 91 | |
| 92 | |
| 93 | As a possible refining, it could be possible to have specialized tables, |
| 94 | one for each different value column type we want to support: |
| 95 | - '''resource_prop''' for text values |
| 96 | - '''resource_prop_int''' for integer values |
| 97 | - ('''resource_prop_float''' for float values, if really needed) |
| 98 | And we could even differentiate between short and long text values (requirement 4): |
| 99 | - '''resource_prop''' for short text values |
| 100 | - '''resource_prop_text''' for long text values |
| 101 | (see #6986). |
| 102 | |
| 103 | |
| 104 | Along the same lines there's also the question of what should be the ''id'': |
| 105 | a natural or a surrogate key? |
| 106 | - natural keys: (''id'' would be 123 for ticket !#123, id would be 'milestone1' for milestone1, etc.) |
| 107 | - we have to support different type of keys (text for milestone, int for ticket). |
| 108 | - not a problem for separate tables |
| 109 | - would require ''resource_int_prop'' style for resources having an ''int'' |
| 110 | id ... cumbersome |
| 111 | - less compact but easier to "understand" |
| 112 | - renaming is more difficult |
| 113 | - surrogate keys: (''id'' would be a number in a sequence, never shown as such in the interface) |
| 114 | - only one type of keys (int) - faster, simpler, |
| 115 | the unique ''resource_prop'' table approach is possible |
| 116 | - more compact, not that difficult to read either (there would always be a |
| 117 | ''name=id'', ''value=the natural key'' entry |
| 118 | - renaming is easy (relations preserved) |
| 119 | |
| 120 | This suggests that using surrogate keys would be preferable. |
| 121 | Now if this is the case, the '''resource_prop''' table could as well become: |
| 122 | || ''id'' || ''name'' || ''value'' || |
| 123 | and the ''realm'' information could simply be store as another name/value entry. |
| 124 | |
| 125 | |
| 126 | === Resource History === |
| 127 | |
| 128 | We need to differentiate between the changes to the data, and the metadata about the change. The metadata is about who did the change, when, why the change was made, etc. |
| 129 | We can adopt the same flexible strategy as the one for resource properties and |
| 130 | store arbitrary name/value pairs of "revision properties". |
| 131 | |
| 132 | '''resource_revprop''' |
| 133 | || ''changeid'' || ''name'' || ''value'' || |
| 134 | |
| 135 | Typical example: |
| 136 | || 101001 || author || cboos || |
| 137 | || 101001 || auth || 1 || |
| 138 | || 101001 || date || 1231232114.12 || |
| 139 | || 101001 || comment || random change || |
| 140 | |
| 141 | A given ''changeid'' is usually related to a specific change in one resource, |
| 142 | but there could be other situations: |
| 143 | - one change affecting lots of resources (typically #4582 and #5658) |
| 144 | - changes affecting changes (typically #454) |
| 145 | |
| 146 | |
| 147 | The property changes themselves are stored in other tables. |
| 148 | |
| 149 | Several possibilities here: |
| 150 | '''ticket_change'' |
| 151 | || ''id'' || ''changeid'' || ''name'' || ''value'' || |
| 152 | '''milestone_change'' |
| 153 | || ''id'' || ''changeid'' || ''name'' || ''value'' || |
| 154 | or: |
| 155 | '''resource_change'' |
| 156 | || ''id'' || ''changeid'' || ''name'' || ''value'' || |
| 157 | (surrogate key approach) |
| 158 | |
| 159 | The latter has the advantage that it would make easy to relate a given ``changeid`` |
| 160 | to the resource(s) that were affected by the change, without having to go through |
| 161 | each resource table. |
| 162 | |
| 163 | We could also keep all property changes as text values |
| 164 | or have extra `..._int` (`..._float`) tables for more compact |
| 165 | representation. |
| 166 | |
| 167 | See also ticket:6466#comment:10 and follow-ups for a discussion about how ticket changes and in particular ticket change edits, could be handled using this approach. |
| 168 | |
| 169 | |
| 170 | == Possible Implementation Plan == |
| 171 | |
| 172 | === Milestone First === |