Edgewall Software

Ticket #4985 (closed defect: worksforme)

Opened 18 months ago

Last modified 18 months ago

Blame (annotate) issue

Reported by: eblot Owned by: cboos
Priority: normal Milestone:
Component: version control Version: devel
Severity: major Keywords: svn blame
Cc:

Description

The newly introduced "annotate" feature (svn blame) is in trouble on my server.

Error message:

Can't use blame annotator:

svn blame failed: ("Can't read file '/tmp/tmp.tmp': End of file found", 70014)

The debug log contains the following lines:

2007-03-20 18:53:34,475 Trac[__init__] INFO: opening ra_local session to file:///local/svn/project/.../file.cpp
2007-03-20 18:53:35,268 Trac[__init__] WARNING: Can't use annotator 'blame': svn blame failed: ("Can't read file '/tmp/tmp.tmp': End of file
 found", 70014)

When I use the following code snippet (copied from the svn_fs.py source), the annotate feature returns the list of revisions as expected and does not trigger any exception, using the same file and same revision as the blame2 arguments. I also tried to run the code snippet with the same user as Apache (www-data on Debian) and it also works:

from svn import core, client

url = 'file:///local/svn/project/.../file.cpp'

def svn_rev(num):
    value = core.svn_opt_revision_value_t()
    value.number = num
    revision = core.svn_opt_revision_t()
    revision.kind = core.svn_opt_revision_number
    revision.value = value
    return revision

annotations = []
def blame_receiver(line_no, revision, author, date, line, pool):
    annotations.append(revision)

rev = svn_rev(2309)
start = svn_rev(0)

client.blame2(url, rev, start, rev, blame_receiver, client.create_context())

print annotations

I also run the strace tool with this script and /tmp/tmp.tmp file is also used (successfully).

As an additional note, the same Trac development version works ok with tracd on Mac OS X.

Attachments

Change History

  Changed 18 months ago by eblot

I forgot to describe the environment:

Trac: 0.11dev-r5096
Python: 2.4.4 (#1, Jan 13 2007, 17:48:09)
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)]
setuptools: 0.6c3
SQLite: 3.3.8
pysqlite: 2.3.3
Genshi: 0.4dev-r505
Pygments: 0.6
Subversion: 1.4.2 (r22196)

follow-up: ↓ 3   Changed 18 months ago by cboos

  • severity changed from normal to major
  • milestone set to 0.11

On the server where you noticed the problem, does that issue happen systematically? or you just found the problem in the logs?

Also, is it possible that there were two simultaneous annotate requests? (the full logs should eventually tell).

It might be related or not, but I was able to trigger again a freeze using tracd, and it looked like the blame could be involved, as the last output was "INFO: opening ra_local session"...

in reply to: ↑ 2   Changed 18 months ago by eblot

Replying to cboos:

On the server where you noticed the problem, does that issue happen systematically? or you just found the problem in the logs?

It always fails and I tried on different files: I've never been able to produce any annotation on this server - for about a week since I've started to try this new feature.

Also, is it possible that there were two simultaneous annotate requests? (the full logs should eventually tell).

No, I've double-checked that as well, with a tail -f trac.log session while running the annotation feature.

It might be related or not, but I was able to trigger again a freeze using tracd, and it looked like the blame could be involved, as the last output was "INFO: opening ra_local session"...

For now, it has never deadlocked. Execution seems to resume normally after the exception is triggered.

I've added some trace in the svn_fs.py file: the blame_receiver function is never called, client.blame2 fails before invoking the receiver.

  Changed 18 months ago by eblot

Update: mod_python strikes back?

I've run tracd on the same server, same user (www-data), same repository, same environment, same URL (which include the file path and the revision) but the port: the annotation feature seems to work seamlessly.

This seems to be related either to Apache or ModPython...

It seems that the only call to the svn.client library in Trac is this blame2 method.
Could there be any issue with invoking this "upper" API from within ModPython?

follow-up: ↓ 6   Changed 18 months ago by cboos

Well, I would try: strace httpd -X to see what kind of problem svn has with this file while running under apache.

in reply to: ↑ 5   Changed 18 months ago by eblot

Replying to cboos:

Well, I would try: strace httpd -X to see what kind of problem svn has with this file while running under apache.

Ok, so here it is, I don't know how to interpret what SVN does, as it seems to be the one in trouble

  1. It opens "/tmp/tmp.tmp" for writing, fd=16
  2. It dumps diff'ed data from the various revisions
    • It writes two 4kB buffers + one 2746b buffer, that is 10938 bytes
  3. It closes the file (fd=16)
    [pid  6477] open("/tmp/tmp.tmp", O_RDWR|O_CREAT|O_EXCL, 0666) = 16
    [pid  6477] open("/local/svn/project/db/revs/2032", O_RDONLY) = 17
    [pid  6477] lseek(17, 15659, SEEK_SET)  = 15659
    [pid  6477] read(17, "id: 66g.6xi.r2032/15659\ntype: fi"..., 4096) = 4096
    [pid  6477] close(17)                   = 0
    [pid  6477] open("/local/svn/project/db/revs/2032", O_RDONLY) = 17
    [pid  6477] lseek(17, 5309, SEEK_SET)   = 5309
    [pid  6477] read(17, "DELTA\nSVN\1\0\0\325:\222~\212=\226,x^\215VyWTG\26Wp\305"..., 4096) = 4096
    [pid  6477] mmap2(NULL, 212992, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xaf5c1000
    [pid  6477] brk(0x880e000)              = 0x880e000
    [pid  6477] brk(0x880b000)              = 0x880b000
    [pid  6477] write(16, "/**\n * "..., 4096) = 4096
    [pid  6477] write(16, "deo           = 0x2,\n      EComp"..., 4096) = 4096
    [pid  6477] write(16, "                  aCallBack TBoo"..., 2746) = 2746
    [pid  6477] close(16)                   = 0
    
  4. It opens other SVN revision files, so that the file handle 16 is recycled several times
    [pid  6477] open("/local/svn/project/db/revs/2046", O_RDONLY) = 16
    [pid  6477] lseek(16, 13320, SEEK_SET)  = 13320
    [pid  6477] read(16, "id: 66g.71l.r2046/13320\ntype: fi"..., 4096) = 4096
    [pid  6477] close(16)                   = 0
    
  5. It re-opens "/tmp/tmp.tmp" for reading, fd=16
  6. It attemps to open another temporary file (?) for writing, fails using the same name, then eventually successes opening the temp file as another name linked to fd=17
    • It duplicates the content of the first temp file to the secondary temp file
  7. It closes both temp files
    [pid  6477] open("/tmp/tmp.tmp", O_RDONLY) = 16
    [pid  6477] open("/tmp/tmp.tmp", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 EEXIST (File exists)
    [pid  6477] open("/tmp/tmp.2.tmp", O_RDWR|O_CREAT|O_EXCL, 0666) = 17
    ...
    [pid  6477] read(16, "/**\n * "..., 10938) = 10938
    [pid  6477] write(17, "/**\n * "..., 4096) = 4096
    [pid  6477] write(17, "put      = 0x0,\n      EComposite"..., 4096) = 4096
    [pid  6477] close(16)                   = 0
    [pid  6477] write(17, "onType\n    * @param  aCallBack  "..., 2826) = 2826
    [pid  6477] close(17)                   = 0
    
  8. Now the weird part
    The first temp file is open for reading once more, and stated. As expected, the file is 10938-byte long.
  9. It attempts to read much more bytes out of the file (131072 bytes, that is a 128kB buffer)
    • The system only returns the full content of the file, that is 10938 bytes
  10. Another attempt to read the "remaining" bytes from the file?
    • The system returns no byte, as expected as the file pointer has reached the end of file
  11. This seems to be the actual error: eof of file reached on /tmp/tmp.tmp
    [pid  6477] open("/tmp/tmp.tmp", O_RDONLY) = 16
    [pid  6477] fstat64(16, {st_mode=S_IFREG|0644, st_size=10938, ...}) = 0
    [pid  6477] lseek(16, 0, SEEK_SET)      = 0
    [pid  6477] read(16, "/**\n * "..., 131072) = 10938
    [pid  6477] read(16, "", 120134)        = 0
    
  12. Next system call are attempts to find a localized message of the error:
    [pid  6477] open("/usr/share/locale/en_US/LC_MESSAGES/subversion.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
    [pid  6477] open("/usr/share/locale/en/LC_MESSAGES/subversion.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
    [pid  6477] open("/usr/share/locale/en_GB/LC_MESSAGES/subversion.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
    
  13. All the temporary files are eventually deleted
    [pid  6477] close(18)                   = 0
    [pid  6477] unlink("/tmp/tmp.tmp")      = 0
    [pid  6477] close(16)                   = 0
    [pid  6477] unlink("/tmp/tmp.2.tmp")    = 0
    
  14. The next system call is the error trace emitted by Trac about the error
    [pid  6477] gettimeofday({1174482775, 377599}, NULL) = 0
    [pid  6477] stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1082, ...}) = 0
    [pid  6477] write(10, "2007-03-21 14:12:55,377 Trac[__i"..., 156) = 156
    

Now the question is: why SVN interprets the EOF as an error?

follow-up: ↓ 8   Changed 18 months ago by cboos

Those temporary files are closed when the pool is cleaned-up (svn_io_open_unique_file2), so I wonder if this is not related to some messing up in pools:

  • either due to apr incompatibilities between the one expected by Subversion at compile time and the one from apache effectively used at run time
  • or simply due to some bad interaction at run-time between Apache and Subversion pool usage

In the latter case, it might eventually help to not rely on the automatic pool management and provide an explicit pool to the blame2 function, as we usually do (oops...)

  • svn_fs.py

     
    758758            from svn import client 
    759759            try: 
    760760                client.blame2(repo_url, rev, start, rev, blame_receiver, 
    761                               client.create_context()) 
     761                              client.create_context(), self.pool()) 
    762762            except core.SubversionException, e: # svn thinks file is a binary 
    763763                raise TracError('svn blame failed: '+to_unicode(e)) 
    764764        return annotations 

in reply to: ↑ 7   Changed 18 months ago by eblot

Replying to cboos:

In the latter case, it might eventually help to not rely on the automatic pool management and provide an explicit pool to the blame2 function, as we usually do (oops...)

Unfortunately, it did not help.

I've run the same sequence with tracd (same everything but Apache ModPython), and the read sequence differs (4th line):

[pid  8192] open("/tmp/tmp.tmp", O_RDONLY|O_LARGEFILE) = 12
[pid  8192] fstat64(12, {st_mode=S_IFREG|0644, st_size=10938, ...}) = 0
[pid  8192] _llseek(12, 0, [0], SEEK_SET) = 0
[pid  8192] read(12, "/**\n * "..., 10938) = 10938
[pid  8192] open("/tmp/tmp.2.tmp", O_RDONLY|O_LARGEFILE) = 13
[pid  8192] fstat64(13, {st_mode=S_IFREG|0644, st_size=11018, ...}) = 0
[pid  8192] _llseek(13, 0, [0], SEEK_SET) = 0
[pid  8192] read(13, "/**\n * "..., 11018) = 11018
[pid  8192] close(13)                   = 0
[pid  8192] close(12)                   = 0
[pid  8192] close(14)                   = 0

Here, the read call uses the actual file size as the count of bytes to read, not the size of the destination buffer. No 'EOF' reached, no error.

There might be some collision with SVN APR and Apache APR as you were suggesting: I'm still explicitly using Apache 2.0.x while the 'unstable' version I'm using on the server suggests Apache 2.2.x. Maybe it's time to upgrade... I'm a bit scared to bump into a new series of issues with ModPython and the like. Let's try anyway.

  Changed 18 months ago by eblot

  • status changed from new to closed
  • resolution set to worksforme

After upgrading Apache (2.2.3) and ModPython (3.2.10), it works smoothly.

Thanks a lot for the support (and sorry, my mistake).

  Changed 18 months ago by eblot

  • milestone 0.11 deleted

  Changed 18 months ago by eblot

It is not probably worth opening a new ticket for it, but I just discover that the URL for annotation contains "annotate=False" when annotation is selected.

Whatever the value (i.e. False or True annotation is always rendered, it seems that as long as annotate is part of the URL, annotation is done.

  Changed 18 months ago by cboos

See r5111.

Add/Change #4985 (Blame (annotate) issue)

Author



Change Properties
<Author field>
Action
as closed
Next status will be 'reopened'
 
Note: See TracTickets for help on using tickets.