Opened 18 years ago
Last modified 14 months ago
#4474 new defect
diffing two large trees results in a massive list with a lot of empty links
Reported by: | Owned by: | ||
---|---|---|---|
Priority: | normal | Milestone: | next-stable-1.6.x |
Component: | version control/changeset view | Version: | devel |
Severity: | normal | Keywords: | svn diff |
Cc: | nick-trac@… | Branch: | |
Release Notes: | |||
API Changes: | |||
Internal Changes: |
Description (last modified by )
If I run a manual diff of two large directories in different branches (600 files, 15MB), instead of directly showing all the diffs, it displays a list of links to the diffs of individual files.
That wouldn't be too bad (it's probably intended to reduce browser load but I'd like to be able to customize the behavior somehow - help!), but in the above case, many of the links lead to empty diffs, which is obviously wrong. Please fix it so it only links to non-empty diffs.
Attachments (0)
Change History (29)
comment:1 by , 18 years ago
comment:2 by , 18 years ago
Yes I noticed that one shortly after, and it works for that particular subdirectory. (although when I ran it on the root dir it rattled for a while and then gave up with no error messages or anything :)
comment:3 by , 18 years ago
Keywords: | svn diff added |
---|---|
Milestone: | → 0.11 |
Yes, this is a real issue in SubversionRepository.get_changes()
that I still need to investigate.
follow-up: 5 comment:4 by , 18 years ago
This was fixed sometime in january, probably by r4548 (but not sure).
follow-up: 12 comment:5 by , 18 years ago
Replying to theultramage@gmail.com:
This was fixed sometime in january, probably by r4548 (but not sure).
Ah, then we were not speaking of the same thing. I noticed that this happened when comparing different branches, and that spurious diffs were appearing for binaries, like in, e.g.
diff:trunk/htdocs@4700//sandbox/security/htdocs@4700
(those are actually the same pictures)
comment:6 by , 18 years ago
Hm, in my case it was probably caused by the tree-diffing engine not skipping empty diffs (see example, the first file). This doesn't happen anymore on trunk. Maybe a case needs to be added for binary files?
comment:8 by , 18 years ago
Update; the situation still happens, but it seems that only for larger diffs. I needed to do a diff of two entire branches (with almost the same content) today, and it still gave me a huge list with "(No files)" inside. I hope it is reproducible.
comment:9 by , 18 years ago
Milestone: | 0.11 → 0.12 |
---|
The bug will probably stay until after the 0.11 release, but could probably be then backported to 0.11-stable.
comment:10 by , 18 years ago
Cc: | added |
---|
comment:11 by , 16 years ago
Cc: | added |
---|
comment:12 by , 16 years ago
Replying to cboos:
I noticed that this happened when comparing different branches, and that spurious diffs were appearing for binaries
I'm having this issue. svn_repos_dir_delta
returns a difference for binary files that are not straight copies, even if the contents are identical.
My local Trac (0.11) has a patch for changeset.py:
old_content = old_node.get_content().read() new_content = new_node.get_content().read() if is_binary(old_content) or is_binary(new_content): if old_content == new_content: return [] return None
comment:13 by , 16 years ago
Milestone: | 0.13 → 0.12 |
---|
Great, thanks for sharing this information. Ideally we'd need to be able to only compare checksums here (e.g. svn_fs_file_checksum for Subversion).
comment:14 by , 16 years ago
I re-tested this with latest trunk (r7757). Inputs:
- a svk mirror of http://svn.eathena.ws/svn/ea/
- repo paths /trunk/npc and /branches/stable/npc
- trunk has 604 versioned files, stable has 598 versioned files
Outputs:
- A huge list of links <file path (view diffs)>
- Files: 8 added, 2 removed, 571 modified
- 402 files are binary identical (according to 'find . | xargs md5')
So trac skipped over some identical files (as it should have), but still reported hundreds of them. A lot of the 'view diffs' links lead to a page that only says '(No files)'. Therefore the issue still exists.
One potential workaround might be increasing the 'max_diff_bytes' config option to a very high value; I faintly recall this helping in cases of sufficiently small dirs.
comment:15 by , 16 years ago
While investigating an unrelated issue, I had an idea… could the following be the cause?
Trac starts to diff the target files one by one, eliminating identical files. In the middle of processing, it detects that it has already consumed max_diff_bytes of input. Then it doesn't continue, and instead just builds a list of all remaining files with links to individual files, for manual diffing. This would be what I was mentioning above.
Of course, such output that just lists (almost) all files has zero informational value. Not sure what to do about it though, except for bumping max_diff_bytes to 2 * sizeof(repository) …
follow-up: 19 comment:16 by , 16 years ago
We've just hit this problem (using 0.11.2.1). The scenario is as follows:
- We're trying to compare the differences between a branch and the trunk.
- Compared with the base of the branch, there are 618 files modified (all text files, not binary files).
- Compared with the latest version of the trunk there are only 7 files modified. ie. The branch contains only 6 modified files but there were lots of changes made to the trunk after the branch was created which have now been merged in.
- If I try to display the differences between the branch and the latest version of the trunk in Trac then:
- If I set
max_diff_files
to 618 (and setmax_diff_bytes
to be large) then the 7 modified files are shown correctly. - If I reduce
max_diff_files
to 617 then I get 618 modified files reported (with 611 of the "view diffs" links pointing to a page saying "No files").
- If I set
I don't understand why the number of files modified on the branch is relevant here. Subversion happily reports 7 modified files when you compare the 2 URLs - isn't Trac using this?
follow-ups: 18 23 comment:17 by , 16 years ago
Trac doesn't use the revision history for diffing. Whenever you request a diff of two files, it must manually compute the diff. If you have 1000 files in each directory you're comparing, trac has to diff each of those. Even if the difference between them is minimal, trac doesn't know that unless it actually diffs the files.
The max diff size and max diff files settings are there to limit the complexity of requests users can make. If there are more files to process than the limit allows, trac will just give up when it hits the limit and will dump the rest as unprocessed links that you can click on to request further diffs (on a per-file basis).
The way how trac displays the partial results is very awkward. Nonetheless, these settings provide at least some form of prevention against overloading the server. In our case, displaying of multi-file diff pages was disabled entirely. It's an inconvenience, but it lets us work with trac without long page loading delays.
comment:18 by , 16 years ago
Replying to theultramage@…:
Trac doesn't use the revision history for diffing. Whenever you request a diff of two files, it must manually compute the diff. If you have 1000 files in each directory you're comparing, trac has to diff each of those. Even if the difference between them is minimal, trac doesn't know that unless it actually diffs the files.
The max diff size and max diff files settings are there to limit the complexity of requests users can make. If there are more files to process than the limit allows, trac will just give up when it hits the limit and will dump the rest as unprocessed links that you can click on to request further diffs (on a per-file basis).
I don't understand this. Our project tree contains several thousand files. We frequently show the differences between the trunk and a branch and this has always worked fine. It is only in this case, where a large number of files have been modified on the branch, that we have had a problem. Also, it is this number of files modified which seems important, not the number of files in the tree.
comment:19 by , 16 years ago
A bit of further info …
Replying to Dave Matthews:
- Compared with the latest version of the trunk there are only 7 files modified. ie. The branch contains only 6 modified files but there were lots of changes made to the trunk after the branch was created which have now been merged in.
- If I try to display the differences between the branch and the latest version of the trunk in Trac then:
- If I set
max_diff_files
to 618 (and setmax_diff_bytes
to be large) then the 7 modified files are shown correctly.
I was getting confused here as to whether there are 6 or 7 files modified and I now realise why. There are, in fact, only 6 files modified. Even after increasing max_diff_files
Trac still shows 7 files modified. The extra file is a binary file which ties in with earlier comments about spurious diffs for binaries. So, are there 2 separate issues here or are they related?
comment:20 by , 15 years ago
Milestone: | 0.12 → next-minor-0.12.x |
---|
comment:21 by , 14 years ago
Cc: | removed |
---|
comment:22 by , 13 years ago
#10662 was closed as duplicate of this bug. It contains a reproduction recipe involving text files (i.e. not binaries).
comment:23 by , 13 years ago
Replying to theultramage@…:
Trac doesn't use the revision history for diffing. Whenever you request a diff of two files, it must manually compute the diff. If you have 1000 files in each directory you're comparing, trac has to diff each of those. Even if the difference between them is minimal, trac doesn't know that unless it actually diffs the files.
Why doesn't trac to do something like "svn diff --summarize <old_path>@<old> <new_path>@<new>
" to satisfy the diff query?
comment:24 by , 10 years ago
Milestone: | next-minor-0.12.x → next-stable-1.0.x |
---|
comment:25 by , 10 years ago
Description: | modified (diff) |
---|
Removed dead url; at this point I think the behavior is understood well enough to reproduce.
comment:26 by , 9 years ago
Owner: | removed |
---|
comment:27 by , 8 years ago
Milestone: | next-stable-1.0.x → next-stable-1.2.x |
---|
Moved ticket assigned to next-stable-1.0.x since maintenance of 1.0.x is coming to a close. Please move the ticket back if it's critical to fix on 1.0.x.
comment:28 by , 4 years ago
Milestone: | next-stable-1.2.x → next-stable-1.4.x |
---|
Replying to theultramage@gmail.com:
About this peculiar point: did you set up the Trac parameters as described in TracIni#changeset-section (
max_diff_*
)?