Edgewall Software
Modify

Opened 5 years ago

Closed 4 years ago

#8947 closed defect (fixed)

Use the timestamp as an identifier when editing a comment

Reported by: rblank Owned by: rblank
Priority: normal Milestone: 0.12
Component: ticket system Version: 0.12dev
Severity: normal Keywords:
Cc: leho@…, michaelc@…
Release Notes:
API Changes:

Description

Currently, Ticket.modify_comment() uses the comment number stored in the oldvalue column of ticket_change to identify the comment that is edited. Unfortunately, it seems that some "comment adders" don't set this field correctly (e.g. XmlRpcPlugin). Also, databases from old versions of Trac (prior to the introduction of the threading feature) also don't have the oldvalue column set correctly. This prevents such comments from being edited.

The comment editing feature should be changed to use the comment timestamp as an identifier instead of the comment number.

Attachments (1)

8947-auto-cnum-r9153.patch (5.0 KB) - added by rblank 5 years ago.
Automatically add cnum if not provided in save_changes().

Download all attachments as: .zip

Change History (23)

comment:1 Changed 5 years ago by lkraav <leho@…>

  • Cc leho@… added

comment:2 Changed 5 years ago by rblank

Mmh, using the comment time as an identifier is more difficult that I thought, because the comment number is used just about everywhere.

So another option would be to continue using the comment number, but to "interpolate" comment numbers for comments that don't have the oldvalue column set. Additionally, a database upgrade could be used to set the missing oldvalue fields. Finally, Ticket.save_change() could determine the comment number itself if the argument isn't provided.

comment:3 Changed 5 years ago by rblank

Fixed editing of comments where the comment number is not available in the oldvalue column of ticket_change in [9043]. The signature of Ticket.modify_comment() was also changed to take the comment date as an identifier instead of the comment number.

comment:4 Changed 5 years ago by rblank

[9043] introduced more robustness in the case where the comment number is not available in the oldvalue column. We could also make Ticket.save_changes() more robust, by having it find the cnum when it isn't provided:

  • trac/ticket/model.py

    diff --git a/trac/ticket/model.py b/trac/ticket/model.py
    a b  
    301301                           "VALUES (%s, %s, %s, %s, %s, %s)",
    302302                           (self.id, when_ts, author, name, self._old[name],
    303303                            self[name]))
     304
     305        # find cnum if it isn't provided
     306        if not cnum:
     307            cursor.execute("SELECT oldvalue FROM ticket_change "
     308                           "WHERE ticket=%s AND field='comment' "
     309                           "ORDER BY time DESC", (self.id,))
     310            num = 0
     311            for row in cursor:
     312                try:
     313                    num = int((row[0] or '').rsplit('.', 1)[-1]) + num
     314                    break
     315                except ValueError:
     316                    num += 1
     317            cnum = str(num + 1)
     318
    304319        # always save comment, even if empty (numbering support for timeline)
    305320        cursor.execute("INSERT INTO ticket_change "
    306321                       "(ticket,time,author,field,oldvalue,newvalue) "

This would ensure that the cnum is also inserted when plugins "forget" the argument. Thoughts?

comment:5 Changed 5 years ago by cboos

[9043] also introduced a bug apparently, see #8993.

comment:6 Changed 5 years ago by rblank

What about the proposed change in comment:4? Is it a good idea to generate a correct cnum if it isn't passed to save_changes()? Obviously, I'll have to adapt the patch after [9107].

Changed 5 years ago by rblank

Automatically add cnum if not provided in save_changes().

comment:7 Changed 5 years ago by rblank

The patch above was refreshed for current trunk, and calculates cnum in save_changes() if it's not provided.

Thoughts?

comment:8 follow-up: Changed 5 years ago by cboos

Seems fine. What about starting to use unique multiline """ ... """ strings for the SQL?

comment:9 in reply to: ↑ 8 Changed 5 years ago by rblank

Replying to cboos:

What about starting to use unique multiline """ ... """ strings for the SQL?

Ok, I'll do that one and then we can argue endlessly about the correct style (indentation, etc) to use ;-)

comment:10 Changed 5 years ago by rblank

  • Resolution set to fixed
  • Status changed from new to closed

Patch applied in [9158]. Ok with the SQL style?

comment:11 follow-up: Changed 5 years ago by cboos

Nice! But as you expected, I'd have done it slightly differently ;-)

  • trac/ticket/model.py

     
    283283        # find cnum if it isn't provided
    284284        if not cnum:
    285285            num = 0
    286             cursor.execute("""\
     286            cursor.execute("""
    287287                SELECT DISTINCT tc1.time,COALESCE(tc2.oldvalue,'')
    288288                FROM ticket_change AS tc1
    289289                  LEFT OUTER JOIN
    290290                    (SELECT time,oldvalue FROM ticket_change
    291291                     WHERE field='comment') AS tc2
    292292                  ON (tc1.time = tc2.time)
    293                 WHERE ticket=%s ORDER BY tc1.time DESC""",
    294                 (self.id,))
     293                WHERE ticket=%s ORDER BY tc1.time DESC
     294                """, (self.id,))
    295295            for ts, old in cursor:
    296296                # Use oldvalue if available, else count edits
    297297                try:
  • I don't think we gain anything by avoiding the initial line break,
  • if we don't move the """ at the beginning of the line, the vertical separation between the last line of the SQL and the query parameters is not immediately obvious

comment:12 in reply to: ↑ 11 ; follow-up: Changed 5 years ago by rblank

Replying to cboos:

Nice! But as you expected, I'd have done it slightly differently ;-)

I would have been surprised if that wasn't the case. Or, to paraphrase the last book I read: "If two programmers in a group agree, that's a majority".

  • if we don't move the """ at the beginning of the line, the vertical separation between the last line of the SQL and the query parameters is not immediately obvious

It is if your editor does syntax highlighting ;-)

Feel free to apply, I don't really mind. I'll adapt to anything (as long as I don't have to write Perl, that is).

comment:13 in reply to: ↑ 12 Changed 5 years ago by cboos

Replying to rblank:

Replying to cboos:

Nice! But as you expected, I'd have done it slightly differently ;-)

I would have been surprised if that wasn't the case. Or, to paraphrase the last book I read: "If two programmers in a group agree, that's a majority".

And in our group, it's even unanimity ;-)

  • if we don't move the """ at the beginning of the line, the vertical separation between the last line of the SQL and the query parameters is not immediately obvious

It is if your editor does syntax highlighting ;-)

Feel free to apply, I don't really mind. I'll adapt to anything (as long as I don't have to write Perl, that is).

Oh yes, here-docs, that would be even better!

  • trac/ticket/model.py

     
    283283        # find cnum if it isn't provided
    284284        if not cnum:
    285285            num = 0
    286             cursor.execute("""\
     286            cursor.execute(<<EOSQL, (self.id,))
    287287                SELECT DISTINCT tc1.time,COALESCE(tc2.oldvalue,'')
    288288                FROM ticket_change AS tc1
    289289                  LEFT OUTER JOIN
     
    291291                     WHERE field='comment') AS tc2
    292292                  ON (tc1.time = tc2.time)
    293293                WHERE ticket=%s ORDER BY tc1.time DESC""",
    294                 (self.id,))
     294            EOSQL
    295295            for ts, old in cursor:
    296296                # Use oldvalue if available, else count edits
    297297                try:

(we should switch to this as soon as the Grand-Unification-of-Scripting-Languages happen, I'll leave a reminder note for my kids ;-) ).

More seriously, the suggested style in comment:11 is very close to the one already adopted by the TracHacks:AnnouncerPlugin (example), which is the source of inspiration for this.

Committed in r9161.

comment:14 Changed 4 years ago by Michael Ching <michaelc@…>

I am trying to understand the SQL used to determine the cnum in the patch:

SELECT DISTINCT tc1.time,COALESCE(tc2.oldvalue,'')
                    FROM ticket_change AS tc1
                      LEFT OUTER JOIN
                        (SELECT time,oldvalue FROM ticket_change
                         WHERE field='comment') AS tc2
                      ON (tc1.time = tc2.time)
                    WHERE ticket=? ORDER BY tc1.time DESC

If there are two transactions with the same timestamp but affecting different tickets, would this not be liable to return the comment values for all of the tickets? This in turn could result in the cnum being pulled from the incorrect ticket.

In addition, when ticket_change gets large, the outer join becomes very slow.

Adding "ticket=?" as a clause to the inner select seems like it would resolve both of these problems, but the statement could then be further simplified to something like the one in comment:4, e.g.:

SELECT time,COALESCE(oldvalue,'') FROM ticket_change
                         WHERE field='comment' AND ticket=? ORDER BY time DESC;

Please let me know if there is something which I am overlooking.

comment:15 Changed 4 years ago by cboos

  • Resolution fixed deleted
  • Status changed from closed to reopened

Well spotted! For correction this could probably be fixed. Even if we now use microsecond timestamps, we might deal with older tickets with a second only time resolution (I wonder if this couldn't explain some strange numbering we've seen recently on some tickets … should have kept track of which).

If additionally the fix improves the performance, all the better! Thanks, we'll look into it.

comment:16 Changed 4 years ago by rblank

The current code looks… just horribly wrong. I can't for the life of me remember why I did it that way.

/me continues trying to understand

comment:17 Changed 4 years ago by rblank

Oh, right, the outer join is necessary to handle the case where a change doesn't have the 'comment' field. But yes, a ticket=? is missing (the same in _find_change()).

comment:18 Changed 4 years ago by rblank

So the simple query from comment:14 won't work in the case where a ticket change doesn't have a comment field (which shouldn't normally happen, but does in practice).

The query should return one row (time,oldvalue) for every ticket comment, where oldvalue is taken from the comment field if it exists, or an empty string otherwise. Adding the ticket=? to the inner query would make the current query correct, but performance is probably not stellar. I'm of course open to suggestions how this could be improved.

We could at least avoid the sub-query but I don't know if it would make a difference in performance:

SELECT DISTINCT tc1.time,COALESCE(tc2.oldvalue,'')
FROM ticket_change AS tc1
LEFT OUTER JOIN ticket_change AS tc2
    ON tc2.ticket = ? AND tc1.time = tc2.time AND tc2.field = 'comment'
WHERE tc1.ticket = ? ORDER BY tc1.time DESC

comment:19 Changed 4 years ago by rblank

Suggested fix:

  • trac/ticket/model.py

    diff --git a/trac/ticket/model.py b/trac/ticket/model.py
    a b class Ticket(object): 
    297297                cursor.execute("""
    298298                    SELECT DISTINCT tc1.time,COALESCE(tc2.oldvalue,'')
    299299                    FROM ticket_change AS tc1
    300                       LEFT OUTER JOIN
    301                         (SELECT time,oldvalue FROM ticket_change
    302                          WHERE field='comment') AS tc2
    303                       ON (tc1.time = tc2.time)
    304                     WHERE ticket=%s ORDER BY tc1.time DESC
    305                     """, (self.id,))
     300                    LEFT OUTER JOIN ticket_change AS tc2
     301                    ON tc2.ticket=%s AND tc2.time=tc1.time
     302                       AND tc2.field='comment'
     303                    WHERE tc1.ticket=%s ORDER BY tc1.time DESC
     304                    """, (self.id, self.id))
    306305                for ts, old in cursor:
    307306                    # Use oldvalue if available, else count edits
    308307                    try:
    class Ticket(object): 
    604603        # Fallback when comment number is not available in oldvalue
    605604        num = 0
    606605        cursor.execute("""
    607             SELECT DISTINCT tc1.time,COALESCE(tc2.oldvalue,''),
    608                tc2.author,COALESCE(tc2.newvalue,'')
    609                FROM ticket_change AS tc1
    610                    LEFT OUTER JOIN
    611                           (SELECT time,author,oldvalue,newvalue
    612                            FROM ticket_change
    613                            WHERE field='comment') AS tc2
    614                         ON (tc1.time = tc2.time)
    615                WHERE ticket=%s
    616                ORDER BY tc1.time
    617                """, (self.id,))
     606            SELECT DISTINCT tc1.time,COALESCE(tc2.oldvalue,''),
     607                            tc2.author,COALESCE(tc2.newvalue,'')
     608            FROM ticket_change AS tc1
     609            LEFT OUTER JOIN ticket_change AS tc2
     610            ON tc2.ticket=%s AND tc2.time=tc1.time AND tc2.field='comment'
     611            WHERE tc1.ticket=%s ORDER BY tc1.time
     612            """, (self.id, self.id))
    618613        for ts, old, author, comment in cursor:
    619614            # Use oldvalue if available, else count edits
    620615            try:

Performance should not be too bad, because the outer join should be using the (ticket, time, field) index.

Last edited 4 years ago by rblank (previous) (diff)

comment:20 Changed 4 years ago by Michael Ching <michaelc@…>

  • Cc michaelc@… added

Thank you for the fix, we are testing it out.

I can confirm that in my initial testing, simply having the join use the ticket index resolves the performance issue. In our table (110k ticket, 1.6m ticket_change), the initial statement took up to 30 seconds to complete, with the new statement returning in less than 0.01s.

comment:21 Changed 4 years ago by rblank

I have applied the patch in [10630:10631] as a first step. If the performance is still bad, we can optimize from there.

The patch above is slightly wrong. If you would like to test it (i.e. if you can't update to current 0.12-stable), then please add a DISTINCT to the second SELECT. Or re-get the patch from comment:19, as I'll fix it shortly.

Last edited 4 years ago by rblank (previous) (diff)

comment:22 Changed 4 years ago by rblank

  • Resolution set to fixed
  • Status changed from reopened to closed

This seems to be working fine now. Closing again.

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain rblank.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from rblank to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.