Edgewall Software
Modify

Ticket #1153 (closed defect: fixed)

Opened 7 years ago

Last modified 6 years ago

Broken display of changeset numbers when using URL-style format

Reported by: haui at haumacher.de Owned by: cboos
Priority: normal Milestone: 0.9
Component: wiki system Version: 0.8
Severity: normal Keywords: changeset, link, question mark.
Cc:
Release Notes:
API Changes:

Description

The following log message

Add-on to changeset:123:
 * Some change.
 * Yet another change.

is displayed as:

Add-on to changeset:123:

  • Some change.
  • Yet another change.

In wiki mode, the reference to the changeset is rendered as

changeset:123:?

with an additional (magic?) question mark and with the link expanded around the closing colon.

When displaying the log message of a changeset, the link is rendered as

changeset:123:[[BR]]

May this problem be introduced in changeset:776?

Attachments

shref_fixes_r2399.patch (3.8 KB) - added by cboos 6 years ago.
Combining fixes for this ticket and the related #2240
dont_escape_wiki_text_beforehand.patch (7.8 KB) - added by cboos 6 years ago.
Keep "<", ">" and "&" in the Wiki text being parsed, and escape them during the parsing. This enables an easier manipulation of those characters in regexps (instead of having to handle the corresponding substituted strings). Patch made on r2403.

Download all attachments as: .zip

Change History

comment:1 Changed 6 years ago by cboos

The behavior is now slightly better, as the question mark is not displayed anymore.
However, the ":" will be still part of the link.

One way to correctly write such content would be:

Add-on to changeset:"123":
 * Some change.
 * Yet another change.

which displays as:

Add-on to changeset:"123":

  • Some change.
  • Yet another change.

I'm not sure if we can close this as a "worksforme"
or try to improve the situation, to make the "123"
quoting unecessary.

However, as I'd rather not exclude the ":" or any other character
from the target part in the shref/lhref regexps,
I don't see how this could be done.

comment:2 Changed 6 years ago by moschny at ipd uni-karlsruhe de

What about this patch (against 0.9b2):

  • trac/wiki/formatter.py

    old new  
    151151                  r"(?P<htmlescapeentity>!?&#\d+;)"] 
    152152    _post_rules = [(r"(?P<shref>!?((?P<sns>%s):" % LINK_SCHEME + 
    153153                    r"(?P<stgt>'[^']+'|\"[^\"]+\"|" 
    154                     r"((\|(?=[^| ])|[^| ])*[^|'~_\., \)]))))"), 
     154                    r"((\|(?=[^| ])|[^| ])*[^|'~_\., \)" "\n" r":]))))"), 
    155155                   (r"(?P<lhref>!?\[(?:(?P<lns>%s):" % LINK_SCHEME + 
    156156                    r"(?P<ltgt>'[^']+'|\"[^\"]+\"|[^\] ]*)" 
    157157                    r"|(?P<rel>[/.][^ [\]]*))" 

It fixes the problem of Bernhards original bug report + one additional problem:

The formatting of the said changelog entry is even more broken in the timeline view, because newlines are not stripped by the one-line flavor of the wiki parser.

The patch addresses both problems by disallowing ':' and '\n' as the last character in the target part.

comment:3 Changed 6 years ago by cboos

  • Milestone set to 0.9
  • Owner changed from jonas to cboos

This is somehow related to both #2172 and #2240.

comment:4 Changed 6 years ago by cboos

You might be interested to test the following attachment:ticket:2172:oneliner_fixes_and_tests.patch

comment:5 Changed 6 years ago by moschny at ipd uni-karlsruhe de

The patch can be applied to r2391 (and not to 0.9b2). While it seems to fix the main problem described in this ticket, some problems remain. See for example the last sentence of the main ticket text:

May this problem be introduced in changeset:776?

The question mark after the changeset number becomes part of the (thus broken) link.
Not sure, but can a qm really be the last character of the target part?

comment:6 Changed 6 years ago by cboos

  • Resolution set to fixed
  • Status changed from new to closed

Fixed in r2395, which also exclude "?" and "!" from the allowed
SHREF_TARGET_LAST_CHAR.

comment:7 Changed 6 years ago by moschny at ipd uni-karlsruhe de

  • Resolution fixed deleted
  • Status changed from closed to reopened

Well, I hate to nit-pick, but there are still some problems left:

  • monospace changeset:123 or changeset:123 don't work (no link generated) - is "monospace font-style" to be understood as "verbatim", i.e. without further processing?
  • subscript changeset:123: The link is broken, it contains a caret (and a colon) at the end
  • backslash, ampersand, gt, double quote as last characters of the target:
    • changeset:123\
    • changeset:123>
    • changeset:123&
    • ... there might be more problems with non-alphanumerical last characters, so maybe instead of a black-list, a white-list regexp should be used (utilizing \w of course)?

comment:8 Changed 6 years ago by cboos

No problem :)

Yes, "monospace font-style" is to be understood as "verbatim",
i.e. without further processing.
It provides exactly the same quoting as {{{...}}} does.

For the rest, please test the following:

Index: formatter.py
===================================================================
--- formatter.py        (revision 2396)
+++ formatter.py        (working copy)
@@ -138,7 +138,7 @@
     QUOTED_STRING = r"'[^']+'|\"[^\"]+\""
 
     SHREF_TARGET_MIDDLE = r"(?:\|(?=[^|\s])|&(?!lt;)|[^|&\s])"
-    SHREF_TARGET_LAST_CHAR = r"[^|'~_\.,&\s\)\]:?!]"
+    SHREF_TARGET_LAST_CHAR = r"[a-zA-Z0-9/=]"
 
     LHREF_RELATIVE_TARGET = r"[/.][^\s[\]]*"

That supports all the current unit-tests, but I'm sure that
we miss something with that...

Also, "\w" can't be used, as it contains "_" and therefore
will make fail the following: __ticket:1__.

We're going to need something similar for the first char (see #2240).

comment:9 Changed 6 years ago by moschny at ipd uni-karlsruhe de

Almost working now, but > and & make problems:

are rendered wrongly: The links contain a trailing ">" resp. "&", and additionally a semicolon is printed after the link. The link itself (the href attribute) contains a trailing "&gt" resp. "&amp" (without semicolon).

It seems to me that the raw wiki text contains &gt; and &amp;, which is wrong (?), but otoh parsed correctly by the shref rule.

Regarding {{{..}}}: I would consider this at least a documentation bug. The one-liner version is clearly placed in the font-style section.

Changed 6 years ago by cboos

Combining fixes for this ticket and the related #2240

comment:10 Changed 6 years ago by cboos

The attachment:shref_fixes_r2399.patch should now also handle the
case of changeset:123>,

changeset:123& can't be easily fixed as this would break
the query: and search: syntax, where we need to have "&" in the target.

I think the way to go would be to not HTML escape the wiki text beforehand,
but do that while doing the parsing.

I'll have a try, but I suspect this would be a too big change for 0.9, now.

comment:11 Changed 6 years ago by moschny at ipd uni-karlsruhe de

The patch looks good, thanks.

comment:12 Changed 6 years ago by cboos

  • Resolution set to fixed
  • Status changed from reopened to closed

Ok, I applied attachment:shref_fixes_r2399.patch in r2401.

I also experimented with the idea of not escaping
the "<", ">" and "&" before doing the Wiki parsing.
This allows to handle the changeset:123& situation easily.

While I succeeded in having all the unit-tests pass, this is quite a big
change and I suspect it would make the Wiki engine less robust than it currently is.
At that point near the 0.9 release, it's not the right moment to introduce
that change.

However, one might consider the idea later, so I'll drop
the patch here (attachment:dont_escape_wiki_text_beforehand.patch).

Changed 6 years ago by cboos

Keep "<", ">" and "&" in the Wiki text being parsed, and escape them during the parsing. This enables an easier manipulation of those characters in regexps (instead of having to handle the corresponding substituted strings). Patch made on r2403.

comment:13 Changed 6 years ago by moschny at ipd uni-karlsruhe de

Shall I create a new ticket for the changeset:123& case so it doesn't get lost?

comment:14 Changed 6 years ago by cboos

Yes, please. That would be an ER for milestone:1.0.

comment:15 Changed 6 years ago by cboos

See the ticket #2455 for the changeset:123& case.

View

Add a comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
The resolution will be deleted. Next status will be 'reopened'
to The owner will be changed from cboos. Next status will be 'closed'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.