= Improving Trac Performance =

This is the developer-oriented side of TracPerformance.
While in the latter we try to analyse the different factors that 
come into play, here we'll try to discuss about the practical solutions
we can imagine.

For a start, here's a raw list of the tickets tagged with the ''performance'' keyword:
[[TicketQuery(status=!closed,keywords=~performance,order=severity,format=table)]]

== Performance Analysis ==
=== Load Testing ===
 - using JMeter; see TracPerformance/LoadTestingWithJmeter
 - using `ab` (see [./0.11.5#HowItimed])
     ''For timing template generation I used ApacheBench and tossed out "warmup" requests, sometimes testing with keep alive (getting slightly better req/s)''
{{{
ab [-k] -c 1 -n 10 url
}}}

=== Profiling ===
 - #7490 contains some profiling data.
 - Shane Caraveo gave some instructions about profiling Trac ([./0.11.5#Profilingissues]); see also #8507 which contains his scripts.
 - here's some particularly interesting profiling data from ticket:7490#comment:106, typical of a request triggering an environment reload. Edited to make it more readable:
{{{
         935179 function calls (910454 primitive calls) in 6.699 CPU seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    6.699    6.699 ...trac/web/trac_profile.py:155(inspected)
        1    0.000    0.000    6.699    6.699 ...trac/web/wsgi.py:90(run)
        1    0.000    0.000    6.699    6.699 ...trac/web/main.py:314(dispatch_request)
        1    0.000    0.000    4.782    4.782 ...trac/web/main.py:432(_dispatch_request)
        1    0.000    0.000    4.782    4.782 ...trac/web/main.py:139(dispatch)
    43/41    0.003    0.000    2.894    0.071 ...trac/core.py:65(extensions)
  396/382    0.006    0.000    2.891    0.008 ...trac/core.py:194(__getitem__)
      105    0.157    0.001    2.786    0.027 ...trac/env.py:222(is_component_enabled)
        1    0.000    0.000    2.290    2.290 ...trac/web/chrome.py:706(render_template)
      2/1    0.000    0.000    1.867    1.867 ...trac/env.py:546(open_environment)
      6/2    0.001    0.000    1.733    0.867 ...genshi/template/loader.py:134(load)
      4/2    0.000    0.000    1.732    0.866 ...genshi/template/loader.py:242(_instantiate)
      4/2    0.001    0.000    1.732    0.866 ...genshi/template/base.py:345(__init__)
        1    0.000    0.000    1.476    1.476 ...trac/web/chrome.py:689(load_template)
    11699    0.098    0.000    1.423    0.000 ...trac/config.py:394(options)
        1    0.000    0.000    1.390    1.390 ...trac/env.py:175(__init__)
      189    0.007    0.000    1.349    0.007 ...genshi/template/eval.py:63(__init__)
        1    0.001    0.001    1.316    1.316 ...trac/loader.py:101(load_components)
        1    0.004    0.004    1.312    1.312 ...trac/loader.py:38(_load_eggs)
    11722    0.249    0.000    1.141    0.000 ...trac/config.py:296(get)
      382    0.163    0.000    1.102    0.003 {method 'sort' of 'list' objects}
      189    0.012    0.000    1.010    0.005 ...genshi/template/eval.py:423(_compile)
   806/67    0.010    0.000    1.006    0.015 ...genshi/template/base.py:412(_prepare)
        4    0.025    0.006    1.004    0.251 ...genshi/template/markup.py:72(_parse)
        1    0.000    0.000    0.995    0.995 ...trac/ticket/report.py:81(process_request)
       67    0.001    0.000    0.988    0.015 build/bdist.freebsd-6.1-RELEASE-i386/egg/pkg_resources.py:1911(load)
    63000    0.572    0.000    0.936    0.000 ...trac/env.py:236(<lambda>)
 2748/830    0.013    0.000    0.793    0.001 ...genshi/core.py:264(_ensure)
      940    0.020    0.000    0.762    0.001 ...genshi/template/interpolation.py:39(interpolate)
       99    0.001    0.000    0.705    0.007 ...genshi/template/directives.py:69 (attach)
        1    0.000    0.000    0.674    0.674 ...genshi/core.py:152(render)
        1    0.002    0.002    0.674    0.674 ...genshi/output.py:33(encode)
      231    0.004    0.000    0.667    0.003 ...genshi/output.py:302(__call__)
      234    0.001    0.000    0.662    0.003 ...genshi/output.py:751(__call__)
}}}
   - trac corresponds to Trac-0.11.5rc2-py2.5.egg and genshi to Genshi-0.5.1-py2.5-freebsd-6.1-RELEASE-i386.egg
   - as Shane found out, config is really taking more time than expected. Only 0.1ms per call to get/options, but as this happens 11000 times... 1.1s.
   - `Environment.is_component_enabled` is also unexpectedly slow, partly due to the following (line 236)
     {{{
#!python
rules.sort(lambda a, b: -cmp(len(a[0]), len(b[0])))
}}}
     There is some potential for caching `rules` here.


== Improvement Opportunities ==
=== Genshi ===
The impact of Genshi is especially important for requests generating a big amount of data, like the changeset view or the file browser view, especially when compared to ClearSilver.

From [./0.11.5#Genshi], we can get the following ideas:
 - revert to `out.write(_encode(u''.join(list(iterator))))` instead of using the StringIO (that was a change done during #6614, but we could have a `favor_speed_over_memory` setting) - 
 - avoid the whitespace filter (same setting)

Additionally, there's the idea to increase the size of the template cache for Genshi (#7842), also something that could be done if a  `favor_speed_over_memory` configuration is active.

We could also possibly gain some speed by not passing all the content through all the filters, but pre-render some parts in a faster way, then wrap them in `Markup` objects. That would make them opaque to the filters, but that would be most of the time OK (or the plugins that really need that could somehow selectively turn this optimization off). See #5499 (browser) and #7975 (changeset).

Also, the generated XHTML itself could certainly be improved: for example, using <pre> instead of tables rows for rendering lines in the browser view (#7055).

=== gc.collect ===
According to Shane's analysis, the systematic `gc.collect()` call after every request is one of the most critical performance killer for the average request ([./0.11.5#gc.collect]).
See proposed implementation of a secondary thread taking care of this, in #8507.

=== database level optimizations ===
 - #4425
 - Some MySQL specific changes have been suggested, see #6986.
 - #6654 - pad revision numbers with leading zeroes

Also, we should also simply fix the current behavior when it's known to be problematic:
 - timeout when trying to get a db connection (happens again on t.e.o, could be related to #8443)
 - prefer short-lived transactions as much as possible (#3446).
   We should fetch all data in memory instead of iterating over cursors and doing some work while iterating. This is a memory vs. speed (and concurrency) trade-off.

On the topic of transactions, we should probably use the transaction idea from the WikiRename branch, as refined by rblank on Trac-dev (googlegroups:trac-dev:21d21ad9866fc12b). This would help to visualize the span of write transactions in the code and better handle query failures (#8379).
 

=== misc. ===
 - some more modules could use paginated results (#6128 - timeline, #6101 - browser) or on-demand display (#515 - changeset view), or... no display at all (#5170)
 - there was once the suspicion that DEBUG logging level could significantly impact performance (get some numbers)