
Changes between Version 15 and Version 16 of VcRefactoring

Timestamp:: Jan 28, 2012, 12:35:12 PM (12 years ago)
Author:: Christian Boos
Comment:: start to update the page, remove now obsolete information about done tasks and gather constraints for future developments

VcRefactoring

-              v15
+              v16
 = Version Control Refactoring =
+This recurring branch is a sandbox for introducing new features
+that are potentially disruptive.
+Note that there's currently no source:sandbox/vc-refactoring branch, but
+there's a source:sandbox/multirepos one for the [#SupportforMultipleRepositories].
+This recurring topic is the place to discuss the design of new features
+for the **version control** subsystem of Trac.
 [[PageOutline]]
 The major goals for [milestone:0.12] in the versioncontrol area will be:
 . support for multiple repositories
+Some of the major goals are:
+. ~~support for multiple repositories~~
 . support for scm neutral cache
 . ideally, if the GenericTrac approach is finalized, arbitrary properties,
 …
 The forthcoming changes aim to better support some
 advanced version control system backends, like
+[http://www.selenic.com/mercurial Mercurial].
+[http://www.selenic.com/mercurial Mercurial] and [http://git-scm.org git],
+while not (completely) forgetting [http://subversion.apache.org Subversion] (yet).
 To that effect, the changes added to the core will be
+exercised by jointly developing the TracMercurial plugin.
+exercised by jointly developing the TracMercurial plugin
+and probably the GitPlugin as well (see in [source:plugins]).
 Sub topics:
 [[TitleIndex(VcRefactoring/)]]
+ - Initial Mercurial support - [./@15#SupportforMercurial-likeVersionControlSystem]
+ - MultiRepos - see source:sandbox/multirepos and [./@15#SupportforMultipleRepositories]
-The current controller changes are simple refactorings which are not changing the versioncontrol API, but cleaning up the internals of the versioncontrol related web ui.
-Those changes could eventually go in 0.11.
-== Support for Multiple scopes within a Repository ==
-This is mostly done in trunk, now.
- * This basically means fixing #1830. Initial work on this was done in r2992.
-   This has been reworked more in-depth in [3174/trunk/trac/versioncontrol/svn_fs.py].
- * There's also the idea to "scope" a repository using multiple paths.
-   This will probably be done for milestone:0.11.
-   Actually, it should be possible to achieve the above using FineGrainedPermissions,
-   thanks to r3174, but I haven't verified this, up to now.
-== Support for Multiple Repositories ==
-A first implementation of this feature is now available in the MultipleRepositorySupport branch (for the next release, i.e. Trac [milestone:0.12]).
-The cache issue is sidestepped for now: this multiple repository support only covers the non-cached repositories, i.e. `hg` (Mercurial) and `direct-svnfs` (Subversion). You can happily mix ''both'' types of repositories, if needed.
-The cache needs to be modified/extended as well,
-in order to accommodate multiple repositories.
-There are several options:
-. use the cache as it is, merging all the repositories in a kind of virtual repository;
-    the first component of the path would be the name of the repository.
-. use a separate pair of tables for each repository
-. use a dedicated db to cache each repository
-Option 1. seems the best way to go.
-Its efficiency depends mainly on how the new cache will be implemented.
-If we go with path ids, then using one table would be practical, I think.
-See also: #2086, trac-dev:340 and, more recently, [googlegroups:trac-users:14ca95377e4a53b5 this mail] where I explain how TracLinks will support multiple repositories.
-Another important interdependency which comes to mind is the support for multiple projects in a single environment (see this
-[TracMultipleProjects/SingleEnvironment#ProposedImplementation proposal]).
-In this scenario, each project would have one or more repositories.
-Those repositories could eventually be ''shared'' between projects.
-Take the following example:
- - Project A
-   - repository /srv/svn/repo1 (trunk, branches, etc.)
- - Project B
-   - repository /srv/svn/repo1 (trunk, branches, etc.)
-   - repository /srv/svn/repo2 (trunk, branches, etc.)
-Within a wiki page of project A, `[123]` or `source:trunk/` would have the usual 1-to-1 meaning. The same resources, referenced from within a page belonging to project B could be accessed using InterTrac links: `[A123]` or `A:source:trunk/`.
-Now within project B, referring to `[123]` or `source:trunk/` would be ambiguous,
-unless a ''default'' repository would be specified (say /srv/svn/repo2).
-But in general, ''path restriction'' should be used to properly identify the resource:
-`[123/repo1]`, `source:repo1/trunk/` and `[123/repo2]`, `source:repo2/trunk/`.
-''How about `[123@repo2]`, `source:@repo2/trunk` and let `[source:trunk]` go to the default repository of the project so that when a new repository is added to the project, all existing links won't break?''[[br]]''-- Kenneth Xu''
-The ''only'' problem with this approach would be to risk some confusion if a repository name is also used as a toplevel folder name of some other repository in the same project. I don't think it's a showstopper though, as:
- - this shouldn't happen often in the first place
- - if it happens nevertheless, a simple disambiguation rule could be adopted,
-   like always consider that if the first element in the path restriction
-   corresponds to a repository name when multiple repositories are present,
-   then it's used as a repository selector.
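The disambiguation rule above can be sketched as follows. This is only an illustration of the rule as stated: `split_repository`, its parameters and the notion of a `default` repository name are hypothetical and are not part of the Trac API.

```python
def split_repository(path, repository_names, default=None):
    """Split a path restriction into (repository name, path inside it).

    Hypothetical helper: when several repositories are configured and the
    first path element matches one of their names, that element is used
    as the repository selector. Otherwise the whole path is looked up in
    the default repository.
    """
    stripped = path.strip('/')
    parts = stripped.split('/', 1)
    first = parts[0]
    rest = parts[1] if len(parts) > 1 else ''
    if len(repository_names) > 1 and first in repository_names:
        return first, rest
    return default, stripped
```

With repositories `repo1` and `repo2` and `repo2` as the default, `source:repo1/trunk/` would select `repo1`, while `source:trunk/` would fall back to `repo2`.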
-On the data model level, the cache for /srv/svn/repo1 will be shared for projects A and B. We simply need an additional relation table, pairing projects with repositories.
-== Support for Mercurial-like Version Control System ==
-===  Basic Level ===
- * DONE:
-   * support for non-numerical changesets (start with hexadecimal digit support)
-   * support for extra changeset properties
-     * basic infrastructure in trunk
-     * support for SVN: see #2545
-Those are the minimal changes needed so that the TracMercurial plugin can work at all.
-=== Advanced Level ===
- * TracRevisionLog should show the branches (a la
-   [http://www.flickr.com/photos/search/tags:mercurial%2Chgk/tagmode:all/ hgk]).
-   See also #1492.
- * DONE:
-   * Support for arbitrary changeset names (e.g. `[tip]` or `[head]`)
-   * Support for direct jump to a tag or a branch.
-     Done on the branch (r3017); re-done for 0.11 (now in trunk)
-=== Support for Big Repositories ===
-This means extending cache support.
-Support for multiple repositories would also require some changes to
-the caching anyway.
-This is material for Trac [milestone:0.12]...
 == New Repository Cache ==
+I think I've come up with a new caching scheme that would
+be able to handle this. The idea is to replicate the tree
+changes information that svn stores. This should also work
+for Mercurial or other backends.
+A new cache should be designed with the primary focus on achieving maximal performance for the tasks needed by the TracBrowser. The current cache contains the required information, but it is not organized in a way that lets indexes speed up the common tasks.
+The `node_changes` table could even be kept as it is, I think.
+The main difference would be that we should also add the paths
+for files and folders that were not modified themselves, but
+happen to be in the same folder as one of the files or folders
+that have been modified.
+The primary example for this is the prev_next_rev queries. The second example would be the get_entries() call which is expensive for TracMercurial and TracGit.
 That way, we could implement:
+The other constraints would be:
  - `Repository.get_node(path, rev)` using the cached information only,
    which would be a dramatic improvement for Mercurial, which has no
    information about the folders themselves.
  - Likewise, `Repository.get_path_history` could also be implemented
+ - `Repository.get_path_history` could also be implemented
    in a generic and efficient way using that caching scheme.
+   This would also help address #10183 and #7744.
  - The next/prev history navigation between revisions (or rather,
    their extended children/parents versions) could also be implemented
    on top of the cache.
+   on top of the cache (#8639; #7254; ticket:8813#comment:10)
  - Probably `Node.get_history` as well, not to mention the possibility
    to find out the ''copy_to'' information (#1445).
 …
    (that would solve the #2353 issue).
+=== `node_change` + directories
+One idea would be to keep the `node_change` table as it is
+but we should also add the paths for files and folders that
+were not modified themselves, but happen to be in the same
+folder as one of the files or folders that have been modified.
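A minimal sketch of which paths such a scheme would record for one revision, assuming the full list of paths in that revision is available. The function name `paths_to_cache` and the boolean "modified" flag are hypothetical; how unmodified entries would actually be marked in `node_change` is not decided here.

```python
import posixpath


def paths_to_cache(tree_paths, modified_paths):
    """Return (path, was_modified) pairs to record for one revision.

    tree_paths: every path existing in the revision (hypothetical input).
    modified_paths: paths changed by the revision, deletions included.
    Unmodified paths are kept only when they share a folder with at
    least one modified path.
    """
    modified = set(modified_paths)
    touched_dirs = {posixpath.dirname(p) for p in modified}
    result = []
    for path in sorted(set(tree_paths) | modified):
        if path in modified:
            result.append((path, True))
        elif posixpath.dirname(path) in touched_dirs:
            result.append((path, False))
    return result
```

Listing a folder at a given revision could then be answered from the rows of the latest revision that touched that folder, without asking the backend.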
 Example:
 …
 Would result in:
 || '''rev''' || '''path''' || '''node_type''' || '''change_type''' || '''base_path''' || '''base_rev''' ||
+||= rev ||= path ||= node_type ||= change_type ||= base_path ||= base_rev ||
 || 1 || trunk || D || A || || -1 ||
 || 1 || trunk/dir1 || D || A || || -1 ||
 …
 and only the filename part would be in clear text (#3676?).
-Mercurial would need to store additional information per node,
-in particular the file size. Using Mercurial:RevlogNG, that
-information would be cheap to get, but revlogng is not yet
-widely used. -- ''update: source:plugins/0.11/mercurial-plugin uses the new API''
 The most flexible approach for storing extra node fields is certainly
 to let each backend create and maintain an additional table,
 e.g. `node_changes_hg` (see also #2733).
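As an illustration of the per-backend table idea, here is a small sketch using SQLite. The simplified `node_change` schema, the `size` column and the helper names are assumptions made for the example, not the actual Trac schema.

```python
import sqlite3


def create_tables(conn):
    """Create a simplified node_change table and a hypothetical hg side table."""
    conn.execute("""CREATE TABLE node_change (
        rev TEXT, path TEXT, node_type TEXT, change_type TEXT,
        base_path TEXT, base_rev TEXT,
        PRIMARY KEY (rev, path))""")
    # Backend-specific extras live in their own table, keyed the same way.
    conn.execute("""CREATE TABLE node_changes_hg (
        rev TEXT, path TEXT, size INTEGER,
        PRIMARY KEY (rev, path))""")


def node_with_extras(conn, rev, path):
    """Fetch the generic fields plus the backend extras, if any were stored."""
    return conn.execute("""SELECT n.node_type, n.change_type, x.size
        FROM node_change n
        LEFT JOIN node_changes_hg x ON x.rev = n.rev AND x.path = n.path
        WHERE n.rev = ? AND n.path = ?""", (rev, path)).fetchone()
```

The `LEFT JOIN` keeps the generic query working for rows where the backend stored no extra fields; those simply come back with `size` set to `None`.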
 In addition, there will most certainly be the need for a kind of