#4246 closed enhancement (fixed)
Adopt Pygments for syntax highlighting
Reported by: | Christopher Lenz | Owned by: | Christopher Lenz |
---|---|---|---|
Priority: | normal | Milestone: | 0.11 |
Component: | general | Version: | 0.10.2 |
Severity: | major | Keywords: | mimeview highlighting |
Cc: | Branch: | ||
Release Notes: | |||
API Changes: | |||
Internal Changes: |
Description
Pygments provides very nice—and pure Python—syntax highlighting. We should strongly consider adopting it as the default highlighter for future Trac releases, probably starting with 0.11, seeing how the current Enscript/Silvercity/PHP methods all suck.
Matt has already cooked up a Trac plugin for Pygments here.
I'm attaching some patches for both the Trac Mimeview component and the TracPygments plugin, accomplishing the following:
- the Pygments highlighter generates a Genshi event stream instead of a big string,
- the Mimeview component handles Genshi streams correctly, and supports putting individual lines in table rows,
- the Pygments support comes with a preference panel that allows the user to choose the highlighting theme, including live preview.
Some of this requires recent Genshi trunk.
The patches are more of a preview of what I'd like to see integrated for 0.11, not really a patch ready to be integrated. For example, the TracPygments patch is intended to be applied to the plugin. Also, a lot of details need to be refined.
Attachments (7)
Change History (19)
by , 18 years ago
Attachment: | trac-pygments.diff added |
---|
follow-up: 2 comment:1 by , 18 years ago
Keywords: | mimeview highlighting added |
---|---|
Severity: | normal → major |
probably starting with 0.11, seeing how the current Enscript/Silvercity/PHP methods all suck.
Oh yes, big +1 from me.
follow-up: 3 comment:2 by , 18 years ago
Replying to cboos:
Oh yes, big +1 from me.
+1 as well, this would be great. Does this mean we could drop support for other highlighting engines in the future?
comment:3 by , 18 years ago
Replying to eblot:
Does this mean we could drop support for other highlighting engines in the future?
We should keep them in for 0.11, but after that I guess they could be moved to trac-hacks.
comment:4 by , 18 years ago
I've cleaned up and reworked both patches. The Trac patch is a candidate for merging into trunk, so I'd appreciate some testing. The Pygments part would go on trac-hacks first, and be merged into Trac trunk later on.
The one thing I don't like all that much is the added returns_source
class variable on IHTMLPreviewRenderer
. It is intended to detect the difference between renderers such as ImageRenderer
and PatchRenderer
, and the various syntax highlighters. The output by the former should not be messed with, while the syntax highlighter output should be put in a <pre>
element or a table with annotations. Previously, this difference in behavior was dependent on whether the renderer returns a list or a string. Now I want syntax highlighters to be able to returns Genshi streams, so that method doesn't work any more.
follow-up: 6 comment:5 by , 18 years ago
I've tried the latest patches. The Pygments highlighter and its Trac plugin look great.
The patch for Trac also look fine, it contains a few changes I wanted to do myself (the use of the builder). The PHP highlighter doesn't work anymore though, but that's probably a trivial thing.
The only thing I'm a bit concerned with is the direct use of Genshi streams… the code is a bit hard to read and it seems like we're using the internals of Genshi. I would have preferred the usage of the builder there too, if possible. Is there really a big difference in term of speed between the two approaches?
In any case, I'm very much in favor that you commit those changes.
comment:6 by , 18 years ago
Replying to cboos:
The only thing I'm a bit concerned with is the direct use of Genshi streams… the code is a bit hard to read and it seems like we're using the internals of Genshi. I would have preferred the usage of the builder there too, if possible. Is there really a big difference in term of speed between the two approaches?
- Which code exactly is hard to read? Assuming you mean the
GenshiHtmlFormatter
in the Pygments plugin, I don't think that's particularly hard to follow. - Streams aren't really internals, they are an important part of the public API, although admittedly low-level.
- The difference between the builder and direct stream generation isn't so much about speed but about memory usage (and garbage collection). Content highlighted by Pygments will normally have a ton of little
<span class="foo">
elements. You'd be creating/deletingElement
objects for every one of them, holding them all in a tree until you callgenerate()
and let theFragment
be garbage collected. And later the containing fragment has to do a number of runtime checks (isinstance
etc) to determine what to do with all the child nodes. That overhead can be completely avoided by just generating the stream directly.
I general I'd advise against using builder
to generate anything larger than small snippets of markup. In those cases, using a template or generating a stream directly is going to provide much better performance and memory usage.
follow-up: 9 comment:7 by , 18 years ago
No, I was talking about _group_lines(stream)
in trac/mimeview/api.py.
Ok, thanks for the clarifications, if think I should start to getting used to it…
Still, I wonder if the Genshi Stream API couldn't be "humanized" a bit, like by using small objects (with __slots__
) instead of tuples.
by , 18 years ago
Attachment: | trac-pygments.3.diff added |
---|
New patch: moves Pygments support into Trac proper (replacing the plugin)
follow-up: 10 comment:8 by , 18 years ago
Above patch adds the PygmentsRenderer
to the Trac code base, so that the plugin isn't necessary anymore. Unfortunately, we cannot name the module pygments
because that would conflict with the name of the Pygments package itself. For the lack of a better idea, I chose pygment
, but maybe something like trac.mimeview.pygments_renderer
would be more appropriate — not sure.
follow-up: 11 comment:9 by , 18 years ago
Status: | new → assigned |
---|
Replying to cboos:
No, I was talking about
_group_lines(stream)
in trac/mimeview/api.py.
Ah right, that one's a bitch, and would definitely deserve some comments. It's the replacement for the old _html_splitlines()
, but now works on Genshi streams instead of on flat strings.
Still, I wonder if the Genshi Stream API couldn't be "humanized" a bit, like by using small objects (with
__slots__
) instead of tuples.
That does unfortunately still have quite significant overhead compared to raw tuples.
One option for a more convenient API (on top of the current tuple stuff) would be to provide event creation functions, for example:
yield Stream.start('div', class_='foobar') yield Stream.end('div')
But I don't think anything in that direction would actually help improving the _group_lines()
function.
comment:10 by , 18 years ago
I've tested the attachment:trac-pygments.3.diff, it works fine.
I've sorted out the PHP issue, it's simply that it goes through the _group_lines(stream)
. The following would fix it:
Index: php.py =================================================================== --- php.py (revision 4376) +++ php.py (working copy) @@ -102,7 +104,6 @@ break html = PhpDeuglifier().format(odata.decode('utf-8')) - for line in html.split('<br />'): - # PHP generates _way_ too many non-breaking spaces... - # We don't need them anyway, so replace them by normal spaces - yield line.replace(' ', ' ') + return [line.replace(' ', ' ') for line in html.split('<br />')] + # PHP generates _way_ too many non-breaking spaces... + # We don't need them anyway, so replace them by normal spaces
by , 18 years ago
Attachment: | generate1.py added |
---|
Clarify a bit the _generate()
helper function within _group_lines()
by , 18 years ago
Attachment: | generate2.py added |
---|
Alternate API for the _generate()
helper function within _group_lines()
comment:11 by , 18 years ago
Replying to cmlenz:
Replying to cboos:
Still, I wonder if the Genshi Stream API couldn't be "humanized" a bit,…
… One option for a more convenient API (on top of the current tuple stuff) would be to provide event creation functions, for example: …
Yes, that was something like that I had in mind. More specifically, those helper function could do the right thing to produce a new event from an old one, overriding the adequate parts. See attachment:generate2.py for an illustration.
Even without that, I think it's possible to improve a little bit the readability of this function, see attachment:generate1.py.
But I don't think anything in that direction would actually help improving the
_group_lines()
function.
What was the most problematic part I think was that event[1][0]
expression, which gives you the feeling that you have to really know the internals to understand what's going on ;)
Patch for Trac