Edgewall Software
Modify

Opened 19 years ago

Closed 18 years ago

Last modified 18 years ago

#1153 closed defect (fixed)

Broken display of changeset numbers when using URL-style format

Reported by: haui at haumacher.de Owned by: Christian Boos
Priority: normal Milestone: 0.9
Component: wiki system Version: 0.8
Severity: normal Keywords: changeset, link, question mark.
Cc: Branch:
Release Notes:
API Changes:
Internal Changes:

Description

The following log message

Add-on to changeset:123:
 * Some change.
 * Yet another change.

is displayed as:

Add-on to changeset:123:

  • Some change.
  • Yet another change.

In wiki mode, the reference to the changeset is rendered as

changeset:123:?

with an additional (magic?) question mark and with the link expanded around the closing colon.

When displaying the log message of a changeset, the link is rendered as

changeset:123:[[BR]]

May this problem be introduced in changeset:776?

Attachments (2)

shref_fixes_r2399.patch (3.8 KB ) - added by Christian Boos 18 years ago.
Combining fixes for this ticket and the related #2240
dont_escape_wiki_text_beforehand.patch (7.8 KB ) - added by Christian Boos 18 years ago.
Keep "<", ">" and "&" in the Wiki text being parsed, and escape them during the parsing. This enables an easier manipulation of those characters in regexps (instead of having to handle the corresponding substituted strings). Patch made on r2403.

Download all attachments as: .zip

Change History (17)

comment:1 by Christian Boos, 19 years ago

The behavior is now slightly better, as the question mark is not displayed anymore. However, the ":" will be still part of the link.

One way to correctly write such content would be:

Add-on to changeset:"123":
 * Some change.
 * Yet another change.

which displays as:

Add-on to changeset:"123":

  • Some change.
  • Yet another change.

I'm not sure if we can close this as a "worksforme" or try to improve the situation, to make the "123" quoting unecessary.

However, as I'd rather not exclude the ":" or any other character from the target part in the shref/lhref regexps, I don't see how this could be done.

comment:2 by moschny at ipd uni-karlsruhe de, 18 years ago

What about this patch (against 0.9b2):

  • trac/wiki/formatter.py

    old new  
    151151                  r"(?P<htmlescapeentity>!?&#\d+;)"]
    152152    _post_rules = [(r"(?P<shref>!?((?P<sns>%s):" % LINK_SCHEME +
    153153                    r"(?P<stgt>'[^']+'|\"[^\"]+\"|"
    154                     r"((\|(?=[^| ])|[^| ])*[^|'~_\., \)]))))"),
     154                    r"((\|(?=[^| ])|[^| ])*[^|'~_\., \)" "\n" r":]))))"),
    155155                   (r"(?P<lhref>!?\[(?:(?P<lns>%s):" % LINK_SCHEME +
    156156                    r"(?P<ltgt>'[^']+'|\"[^\"]+\"|[^\] ]*)"
    157157                    r"|(?P<rel>[/.][^ [\]]*))"

It fixes the problem of Bernhards original bug report + one additional problem:

The formatting of the said changelog entry is even more broken in the timeline view, because newlines are not stripped by the one-line flavor of the wiki parser.

The patch addresses both problems by disallowing ':' and '\n' as the last character in the target part.

comment:3 by Christian Boos, 18 years ago

Milestone: 0.9
Owner: changed from Jonas Borgström to Christian Boos

This is somehow related to both #2172 and #2240.

comment:4 by Christian Boos, 18 years ago

You might be interested to test the following attachment:ticket:2172:oneliner_fixes_and_tests.patch

comment:5 by moschny at ipd uni-karlsruhe de, 18 years ago

The patch can be applied to r2391 (and not to 0.9b2). While it seems to fix the main problem described in this ticket, some problems remain. See for example the last sentence of the main ticket text:

May this problem be introduced in changeset:776?

The question mark after the changeset number becomes part of the (thus broken) link. Not sure, but can a qm really be the last character of the target part?

comment:6 by Christian Boos, 18 years ago

Resolution: fixed
Status: newclosed

Fixed in r2395, which also exclude "?" and "!" from the allowed SHREF_TARGET_LAST_CHAR.

comment:7 by moschny at ipd uni-karlsruhe de, 18 years ago

Resolution: fixed
Status: closedreopened

Well, I hate to nit-pick, but there are still some problems left:

  • monospace changeset:123 or changeset:123 don't work (no link generated) - is "monospace font-style" to be understood as "verbatim", i.e. without further processing?
  • subscript changeset:123: The link is broken, it contains a caret (and a colon) at the end
  • backslash, ampersand, gt, double quote as last characters of the target:
    • changeset:123\
    • changeset:123>
    • changeset:123&
    • … there might be more problems with non-alphanumerical last characters, so maybe instead of a black-list, a white-list regexp should be used (utilizing \w of course)?

comment:8 by Christian Boos, 18 years ago

No problem :)

Yes, "monospace font-style" is to be understood as "verbatim", i.e. without further processing. It provides exactly the same quoting as {{{...}}} does.

For the rest, please test the following:

Index: formatter.py
===================================================================
--- formatter.py        (revision 2396)
+++ formatter.py        (working copy)
@@ -138,7 +138,7 @@
     QUOTED_STRING = r"'[^']+'|\"[^\"]+\""
 
     SHREF_TARGET_MIDDLE = r"(?:\|(?=[^|\s])|&(?!lt;)|[^|&\s])"
-    SHREF_TARGET_LAST_CHAR = r"[^|'~_\.,&\s\)\]:?!]"
+    SHREF_TARGET_LAST_CHAR = r"[a-zA-Z0-9/=]"
 
     LHREF_RELATIVE_TARGET = r"[/.][^\s[\]]*"

That supports all the current unit-tests, but I'm sure that we miss something with that…

Also, "\w" can't be used, as it contains "_" and therefore will make fail the following: __ticket:1__.

We're going to need something similar for the first char (see #2240).

comment:9 by moschny at ipd uni-karlsruhe de, 18 years ago

Almost working now, but > and & make problems:

are rendered wrongly: The links contain a trailing ">" resp. "&", and additionally a semicolon is printed after the link. The link itself (the href attribute) contains a trailing "&gt" resp. "&amp" (without semicolon).

It seems to me that the raw wiki text contains &gt; and &amp;, which is wrong (?), but otoh parsed correctly by the shref rule.

Regarding {{{..}}}: I would consider this at least a documentation bug. The one-liner version is clearly placed in the font-style section.

by Christian Boos, 18 years ago

Attachment: shref_fixes_r2399.patch added

Combining fixes for this ticket and the related #2240

comment:10 by Christian Boos, 18 years ago

The attachment:shref_fixes_r2399.patch should now also handle the case of changeset:123>,

changeset:123& can't be easily fixed as this would break the query: and search: syntax, where we need to have "&" in the target.

I think the way to go would be to not HTML escape the wiki text beforehand, but do that while doing the parsing.

I'll have a try, but I suspect this would be a too big change for 0.9, now.

comment:11 by moschny at ipd uni-karlsruhe de, 18 years ago

The patch looks good, thanks.

comment:12 by Christian Boos, 18 years ago

Resolution: fixed
Status: reopenedclosed

Ok, I applied attachment:shref_fixes_r2399.patch in r2401.

I also experimented with the idea of not escaping the "<", ">" and "&" before doing the Wiki parsing. This allows to handle the changeset:123& situation easily.

While I succeeded in having all the unit-tests pass, this is quite a big change and I suspect it would make the Wiki engine less robust than it currently is. At that point near the 0.9 release, it's not the right moment to introduce that change.

However, one might consider the idea later, so I'll drop the patch here (attachment:dont_escape_wiki_text_beforehand.patch).

by Christian Boos, 18 years ago

Keep "<", ">" and "&" in the Wiki text being parsed, and escape them during the parsing. This enables an easier manipulation of those characters in regexps (instead of having to handle the corresponding substituted strings). Patch made on r2403.

comment:13 by moschny at ipd uni-karlsruhe de, 18 years ago

Shall I create a new ticket for the changeset:123& case so it doesn't get lost?

comment:14 by Christian Boos, 18 years ago

Yes, please. That would be an ER for milestone:1.0.

comment:15 by Christian Boos, 18 years ago

See the ticket #2455 for the changeset:123& case.

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Christian Boos.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from Christian Boos to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.