Version Control Refactoring
This recurring topic is the place to discuss the design of new features for the version control subsystem of Trac.
Some of the major goals are:
- support for multiple repositories
- support for an SCM-neutral cache
- ideally, if the GenericTrac approach is finalized, arbitrary properties, comments and attachments for changeset and path Trac resources (useful for code reviews)
The forthcoming changes aim to better support some advanced version control system backends, like Mercurial and git, while not (completely) forgetting Subversion (yet). To that effect, the changes added to the core will be exercised by jointly developing the TracMercurial plugin and probably the GitPlugin as well (see in plugins).
New Repository Cache
A new cache should be designed with the primary focus on achieving maximal performance for the tasks needed by the TracBrowser. The current cache contains the required information, but it is not organized in a way that lets indexes speed up the common tasks.
The other constraints would be:
- Repository.get_node(path, rev) should work using the cached information only, which would be a dramatic improvement for Mercurial, which stores no information about the folders themselves.
- Repository.get_path_history could also be implemented in a generic and efficient way using that caching scheme. This would also help address #10183 and #7744.
- The next/prev history navigation between revisions (or rather, their extended children/parents versions) could also be implemented on top of the cache (#8639; #7254; ticket:8813#comment:10)
- Probably Node.get_history as well, not to mention the possibility of finding out the copy_to information (#1445).
- Repository.get_changes(from,to) should also be implemented using the cached information in the revision table (that would solve the #2353 issue).
node_change + directories
One idea would be to keep the node_change table as it is, but also add the paths of files and folders that were not modified themselves but sit in the same folder as a file or folder that was modified.
[Diagram: the repository tree at revisions (1) through (5). Entries shown: trunk/, dir1/, dir2/, README, tags/, v1/ (copied from trunk), dir3/, A and B. README and A each carry a trailing * in one revision.]
This would result in node_change rows in which:
- change_type will be empty when the path didn't actually change in that revision, but is simply included in the cache for the sake of get_nodes
- base_path should be left empty when it doesn't change, even for regular edits. This will save some space.
- base_rev will point to the last changed rev for that path, i.e. the latest revision in which its change_type was not null.
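The scheme above can be sketched with an in-memory SQLite table. This is only an illustration under stated assumptions: the schema is simplified, the `parent` column and the helper names `make_cache`, `record_folder` and `list_folder` are hypothetical, and the one-letter type codes are illustrative. It also does not propagate a change to the ancestor folders. The point it shows is that a folder listing at any revision needs only the most recent snapshot of that folder.

```python
import sqlite3


def make_cache():
    """Create an in-memory stand-in for the cache (simplified, hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE node_change (
            rev         INTEGER NOT NULL,
            parent      TEXT NOT NULL,   -- assumed helper column: containing folder
            name        TEXT NOT NULL,
            node_type   TEXT NOT NULL,   -- illustrative codes: 'D' directory, 'F' file
            change_type TEXT,            -- NULL: row is present only for listings
            base_path   TEXT,            -- NULL when the path did not change
            base_rev    INTEGER,         -- for unchanged rows: last changed rev
            PRIMARY KEY (rev, parent, name)
        )""")
    # Index chosen for the "latest snapshot of a folder" lookup below.
    conn.execute("CREATE INDEX node_change_dir_idx ON node_change (parent, rev)")
    return conn


def record_folder(conn, rev, parent, entries):
    """Store the full content of `parent` as of `rev`.

    `entries` is a list of (name, node_type, change_type, base_rev) tuples;
    change_type is None for siblings that did not change in `rev`.
    """
    conn.executemany(
        "INSERT INTO node_change VALUES (?, ?, ?, ?, ?, NULL, ?)",
        [(rev, parent, name, kind, change, base_rev)
         for name, kind, change, base_rev in entries])


def list_folder(conn, parent, rev):
    """Return [(name, node_type, last_changed_rev)] for `parent` as of `rev`."""
    (snapshot,) = conn.execute(
        "SELECT MAX(rev) FROM node_change WHERE parent = ? AND rev <= ?",
        (parent, rev)).fetchone()
    if snapshot is None:
        return []  # the folder has no recorded content at or before `rev`
    rows = conn.execute(
        "SELECT name, node_type, change_type, base_rev FROM node_change "
        "WHERE parent = ? AND rev = ? ORDER BY name", (parent, snapshot))
    # Changed rows were last changed in the snapshot itself; unchanged rows
    # carry their last changed revision in base_rev. Deleted entries are skipped.
    return [(name, kind, snapshot if change else base_rev)
            for name, kind, change, base_rev in rows
            if change != 'D']


conn = make_cache()
record_folder(conn, 2, "trunk", [("README", "F", "A", None),
                                 ("dir1", "D", "A", None),
                                 ("dir2", "D", "A", None)])
record_folder(conn, 3, "trunk", [("README", "F", "E", 2),
                                 ("dir1", "D", None, 2),
                                 ("dir2", "D", None, 2)])
listing = list_folder(conn, "trunk", 3)
```

The cost of this layout is one extra row per unchanged sibling for every revision that touches a folder, which is the space trade-off the empty base_path is meant to soften.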
The paths could maybe be represented by hashes of their dirname, and only the filename part would be in clear text (#3676?).
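A minimal sketch of that path representation follows. The choice of SHA-1 and the helper name `path_key` are assumptions made for illustration; the text above does not name a hash function or a storage format.

```python
import hashlib
import posixpath


def path_key(path):
    """Split a repository path into (hash of its dirname, clear-text filename)."""
    # Repository paths use '/' separators; leading and trailing slashes are ignored.
    dirname, filename = posixpath.split(path.strip("/"))
    # SHA-1 is an arbitrary choice here; any stable hash would do.
    digest = hashlib.sha1(dirname.encode("utf-8")).hexdigest()
    return digest, filename
```

With such a key, all entries of one folder share the same fixed-length prefix, so a folder listing becomes an equality lookup on the hash rather than a prefix match on a long path.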
The most flexible approach for storing extra node fields is certainly to let each backend create and maintain an additional table, e.g. node_changes_hg (see also #2733).
In addition, there will most certainly be the need for a kind of revision_link table in the general case, listing the prev/next relations between revisions.
Of course, a backend which only needs a sequential ordering for its revisions should be able to bypass that table.
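As a rough sketch of both cases, assuming a hypothetical two-column layout for revision_link and hypothetical helper names (`make_links`, `child_revs`, `parent_revs`, `sequential_next`), the prev/next lookups could look like this:

```python
import sqlite3


def make_links(edges):
    """Build an in-memory revision_link table from (parent, child) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE revision_link (
            parent TEXT NOT NULL,
            child  TEXT NOT NULL,
            PRIMARY KEY (parent, child)
        )""")
    # The primary key serves parent -> child lookups; this index serves the reverse.
    conn.execute("CREATE INDEX revision_link_child_idx ON revision_link (child)")
    conn.executemany("INSERT INTO revision_link VALUES (?, ?)", edges)
    return conn


def child_revs(conn, rev):
    """Revisions that directly follow `rev` (several after a branch point)."""
    rows = conn.execute(
        "SELECT child FROM revision_link WHERE parent = ? ORDER BY child", (rev,))
    return [child for (child,) in rows]


def parent_revs(conn, rev):
    """Revisions that directly precede `rev` (several for a merge)."""
    rows = conn.execute(
        "SELECT parent FROM revision_link WHERE child = ? ORDER BY parent", (rev,))
    return [parent for (parent,) in rows]


def sequential_next(rev, youngest):
    """Bypass for a backend with strictly sequential integer revisions."""
    return rev + 1 if rev < youngest else None


# A small history with a branch at 'a' and a merge at 'd'.
links = make_links([("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")])
```

Here `child_revs(links, "a")` yields both branches and `parent_revs(links, "d")` yields both merge parents, while a sequential backend would answer the same questions with integer arithmetic and never touch the table.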