Edgewall Software
Modify

Opened 12 years ago

Closed 12 years ago

Last modified 10 years ago

#2591 closed enhancement (fixed)

Option to limit the number of diff'ed files in the changeset view

Reported by: Emmanuel Blot Owned by: Christian Boos
Priority: high Milestone: 0.10
Component: version control/changeset view Version: devel
Severity: major Keywords: diff tracdiff
Cc:
Release Notes:
API Changes:

Description

It would be nice if the changeset view provided an option to limit the maximum number of files diff'ed:
When Trac attempts to show the changed files of a very large changeset (hundreds of files), it can take several minutes to complete the diff computation and send the result back to the web browser. This delay is quite unusual, and not predictable for the user as he/she does know before selecting a changeset the number of changed files. The maximum number of files value should therefore be persistent.

Attachments (2)

detect_big_changesets.patch (3.9 KB ) - added by Christian Boos 12 years ago.
Detect big changesets and in this case don't show the diffs in the same page. Instead, add links to more specific diff views.
detect_big_changesets.2.patch (4.2 KB ) - added by Christian Boos 12 years ago.
Improved patch

Download all attachments as: .zip

Change History (18)

comment:1 Changed 12 years ago by Christian Boos

Keywords: tracdiff added
Milestone: 0.10
Owner: changed from Jonas Borgström to Christian Boos
Priority: lowhigh

That's an important feature to have for TracDiff too.

I'd suggest the following solution (copied from the ML thread):

[changeset]
max_diff_shown = 100

(suggestion for a better name welcomed)

If there are more than max_diff_shown actual diffs to show (Added/Deleted or Edits on Binary files wouldn't count), then no diffs would be inlined. Rather, the (n diff) link at the right of the file entry would link to the restricted changeset for that file.

comment:2 Changed 12 years ago by Christopher Lenz

I think it'd be more important to set a maximum size of files that are diffed… which are what actually takes up memory.

Also, we can improve the transcoding… currently we convert both versions to UTF-8 before performing the diff, which loads the complete file contents in memory. However, in 99% of the cases, the encoding of the two versions will match, so we'd only need to transcode to a common encoding when it differs.

These options should be investigated before limiting the number of files IMO.

comment:3 Changed 12 years ago by Christian Boos

See also #2343.

comment:4 Changed 12 years ago by Emmanuel Blot

See also #2636, similar request for enhancement about the browser view

comment:5 Changed 12 years ago by Christian Boos

Marked #2700 as a duplicate (a too large HTML diff can even kill tracd).

comment:6 Changed 12 years ago by Emmanuel Blot

Severity: normalmajor

I just ran once more into troubles with this issue:

Some users use the RSS feed to get the news about the SVN repository changes. When a changeset is "large enough", the Apache server takes ages to generate the changeset diff, which apparently ends up into deadlocking some of the threads of the Apache server.

This is not the most annoying part: as a side effect, the SVN repository (also run with the same Apache server) stops working: "svn copy URL URL" commands fail with the following cryptic message:

svn: MERGE request failed on '/svn/project'
svn: Failed to run '/local/svn/project/hooks/pre-commit' hook

The Apache server needs to be restarted to "fix up" this issue.

I'm increasing the severity of this ticket, as this issue may cause important side effects.

comment:7 Changed 12 years ago by Christian Boos

You mean the timeline RSS? There shouldn't be any diff generation in this situation. Do you use the changeset_show_file option? If so, try turning it off to see if this is the cause of the problem.

Perhaps it's a bug in Repository.get_changesets, try to monitor the behavior of that method.

comment:8 Changed 12 years ago by Emmanuel Blot

About the timeline RSS:

Used from Thunderbird for example, when the user selects a news the preview pane shows the linked document. In this case (changeset event) the preview pane shows http://server/trac/project/changeset/N i.e. the changeset view, which implies the full diff of the changeset.

comment:9 Changed 12 years ago by Christian Boos

Status: newassigned

Please try out attachment:detect_big_changesets.patch

This make use of the following configuration variables

[changeset]
max_diff_size = 100000
max_diff_files = 1000

This gives two hints for detecting big changesets. If a changeset is determined to be big, then for each diff, only the link to the restricted changeset for that file will be shown.

In the future, that link could be changed to an AJAX query for retrieving only that diff dynamically.

Changed 12 years ago by Christian Boos

Attachment: detect_big_changesets.patch added

Detect big changesets and in this case don't show the diffs in the same page. Instead, add links to more specific diff views.

comment:10 Changed 12 years ago by Emmanuel Blot

I havn't extensively tested the patch, but AFAICT, it works great.

Thanks a lot !

comment:11 Changed 12 years ago by Emmanuel Blot

Ooops, there is an issue with the patch:

When the change is about an added file or a removed file, the "full diff" URL is wrong: One of the path is correct, the other path is invalid (most of the time, it is a path to an unrelated file of the same changeset…)

comment:12 Changed 12 years ago by Christian Boos

Yep. I realized that too :)

There's now a second iteration of the patch available, which displays the full diff link only for modified files and also takes care of linking to the restricted changeset if the big changeset is really a changeset and not a more general diff.

Changed 12 years ago by Christian Boos

Improved patch

comment:13 Changed 12 years ago by Emmanuel Blot

I've applied the second patch. It seems to work ok. Thanks.

comment:14 Changed 12 years ago by Christian Boos

There's another alternative, which is to use max_preview_size (see #2343) to filter out big files, instead of relying on a highly approximative diff size estimate: that would give a mix of inlined changes for the files that fall below the limit and links to individual diffs for the big files above that limit. The global max_diff_files would still be used for avoiding huge changesets made of a lot a small changes.

Thoughts?

comment:15 Changed 12 years ago by Christian Boos

Resolution: fixed
Status: assignedclosed

The above modification has been added to the attachment:detect_big_changesets.2.patch and checked in as r2968.

I keep in mind doing the optimization about not converting the files beforehand to UTF8 in case they share the same encoding, but the main problem should be fixed.

comment:16 Changed 11 years ago by Christian Boos

doing the optimization about not converting the files beforehand to UTF8 in case they share the same encoding,

in r3092 I decided to do the diffs in unicode, even if both contents share the same encoding. If we would do a diff keeping the original encodings, this could reintroduce problems like #2363, as different characters may share an identical starting sequence of bytes…

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Christian Boos.
The resolution will be deleted.
to The owner will be changed from Christian Boos to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .

 
Note: See TracTickets for help on using tickets.