= Trac Load Testing

While not necessarily definitive, some load testing has been done, and the results are provided here for anyone interested in some insight into how Trac scales and performs under load.

== Rationale

This testing was prompted primarily by the investigation of issue #7490: an attempt to reproduce the issue, or at least to determine the impact of moving from Trac-0.10x to Trac-0.11x as well as the impact of various configurations (tracd vs. mod_python). Additional testing may be done in response to these results, but I figured that while I was testing I should capture the information somewhere, as someone may find it of interest (so here it is).

== Test Hardware and Services Configuration

The server that Trac was installed on is a Linux server (Gentoo flavored) with the following characteristics:
 * AMD Athlon(tm) 64 X2 Dual Core Processor 4200+
 * Linux 2.6.28-gentoo-!r5 !#6 SMP Tue May 26 19:03:14 PDT 2009 i686
 * !MemTotal: 3633084 kB
 * !SwapTotal: 2096472 kB
 * nVidia Corporation MCP55 Ethernet (rev a2)
 * IDE interface: nVidia Corporation MCP55 IDE (rev a1)
 * IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2)
 * Disk (from dmesg)
   * ata8.00: ATA-7: ST3320620AS, 3.AAD, max UDMA/133
   * ata8.00: 625142448 sectors, multi 1: LBA48 NCQ (depth 31/32)
   * ata8.00: configured for UDMA/133
   * scsi 7:0:0:0: Direct-Access ATA ST3320620AS 3.AA PQ: 0 ANSI: 5
   * sd 7:0:0:0: [sda] 625142448 512-byte hardware sectors: (320 GB/298 GiB)
   * sd 7:0:0:0: [sda] Write Protect is off
   * sd 7:0:0:0: [sda] Mode Sense: 00 3a 00 00
   * sd 7:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
   * sd 7:0:0:0: [sda] 625142448 512-byte hardware sectors: (320 GB/298 GiB)
   * sd 7:0:0:0: [sda] Write Protect is off
   * sd 7:0:0:0: [sda] Mode Sense: 00 3a 00 00
   * sd 7:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 * GCC info:
   * Using built-in specs.
   * Thread model: posix
   * gcc version 4.3.2 (Gentoo 4.3.2-!r3 p1.6, pie-10.1.5)
   * CFLAGS="-march=native -O2 -pipe -fomit-frame-pointer"

The server was attached to a 100 Mbit/s managed switch. The load driver (client) had a 1 Gbit/s network card attached to a 1 Gbit/s unmanaged switch, which in turn was attached to the 100 Mbit/s managed switch. The load was generated from a separate client rather than on the server itself, since generating it on the server could skew the results.

== Software Configurations

Testing was conducted using three different versions of Trac:
 * Trac Branch 0.11-stable
 * Trac Trunk (r8253)
 * Trac Branch 0.10-stable

''Get output from /about on the site being tested as suggested in an email thread -- lance''

All the tests were conducted on a "bare" installation of Trac, that is, a base install with only one ticket created (required because one of the read tests is for ticket #1, which will fail unless/until the first ticket is created by the clients creating tickets); however, it was also "attached" to a real (but empty) svn repo. Obviously, the amount of change per day (tickets/day and change sets/day) has an impact on a number of the reports (the timeline most directly) and will vary significantly from installation to installation. These specifics are difficult to factor into the testing and can also significantly reduce the reproducibility of the results. Since the focus of this testing is to determine the performance impact from version to version of Trac, as well as the performance of various configurations of Trac, I thought it more important to maintain reproducibility (for the sake of others) and to better enable others to compare the performance of their specific hardware and software configurations with the results presented here than to contrive "the most realistic" test possible (which, in my (hopefully) humble opinion, while an interesting goal, is generally not possible).

''Testing should also probably be done against a test environment that has a little more history/data, but that is also reproducible (archived in some way so that it can be reused over and over for the testing, but from a known state other than "empty")... -- lance''

''Testing did not include authentication, which might have an impact. Probably should go back and set up an environment with authentication (as I would assume any "real" or useful scenario would require this) and see if that has an impact on the numbers. -- lance''

''Expand test scope to include backends other than sqlite -- lance''

Testing also included two different configurations for accessing Trac:
 * Tracd (run from the command line, non-daemonized)
   - ''you could perhaps also try with the new --http11 option (#8020)? Would be nice to get some numbers, I expect something like a 3x improvement, i.e. at least match mod_python performance if not outperform it ;-) -- cboos''
     - ''will do and thanks for the pointer! -- lance''
 * Mod_Python - version 3.3.1-!r1 (Gentoo calendar)

For Mod_Python, Apache version 2.2.11 was used and was configured to use the "worker" MPM (more details on Apache config)
 - ''impact of other MPMs?? I do know that the choice of MPM has an impact on memory usage and potentially thread/process swapping as well as the cost of processing a request, depending on MPM -- lance''

All configurations used the default Sqlite backend (3.6.13). Python was version 2.5.4-!r2 (Gentoo calendar).

== Testing Methodology

Initial testing was done with a proprietary tool; however, I have switched to the Apache JMeter project for load testing Trac for several reasons:
 * It is OSS and thus easier/cheaper for others to obtain and use to verify/test their specific configurations and compare them with the data obtained here (more easily verifiable by others).
 * It is not platform specific. The original tool was only (mostly) available on Windows; JMeter, however, is 100% Java and provides both *.bat and *.sh launch scripts, so it should be usable regardless of platform (for the client; the server was independent of the load generator anyway, the interface being HTTP).

For the testing, I attempted to be realistic in the load placed on the server by ensuring that some form of "think time" was used in the clients. The current testing scenario also includes a set of clients that are read only and a set of clients that are creating new tickets. Obviously, because tickets are being created (more tickets added over time), a number of the read-only requests take longer and longer to generate as the testing continues, since they have to render successively more data as data is added to the system.

In some of my initial scenarios, this meant that the more transactions per second a server delivered, the more data was added by the clients; however, with the current configuration in JMeter, servers that provide more TPS (transactions per second) *should* not be penalized. That is, if you are not careful, the data writers (clients adding new tickets) have a tendency to "equalize" the system: they add more tickets to a system with more TPS, and thus the "average" performance of two configurations with different TPS appears more equal than it is. I have worked to engineer the testing such that this effect is minimized.

Testing was conducted on a "warm" machine: it was not rebooted between tests, and the first few test runs, used to warm up the machine/configuration, were not recorded.

== Test Cases

This section provides the details (raw text) of the test cases created for JMeter. Thankfully, JMeter saves its test cases in XML format, so you can copy and paste these into an XML file, download JMeter, and use it to reproduce these tests on your own hardware for comparison with the results I have posted here.

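The "equalizing" effect described under Testing Methodology can be illustrated with a small model. This is only a sketch under stated assumptions, not the actual JMeter test plan: the function names, the pacing approach, and all numbers are hypothetical. It compares writers that wait for each response and then "think" (closed loop) with writers that start a request at a fixed interval (paced).

```python
# Hypothetical model of ticket-creating clients; not taken from the JMeter plan.
# All timings below are made-up illustration values, in seconds.

def tickets_closed_loop(duration_s, response_s, think_s, writers):
    """Tickets created when each writer waits for the response, then thinks."""
    cycle_s = response_s + think_s
    return writers * int(duration_s // cycle_s)


def tickets_paced(duration_s, interval_s, writers):
    """Tickets created when each writer starts one request per interval.

    Only valid while the response time stays below interval_s.
    """
    return writers * int(duration_s // interval_s)


DURATION_S = 600.0
THINK_S = 5.0
WRITERS = 2

# A faster server (1 s per request) versus a slower one (3 s per request).
fast_closed = tickets_closed_loop(DURATION_S, 1.0, THINK_S, WRITERS)
slow_closed = tickets_closed_loop(DURATION_S, 3.0, THINK_S, WRITERS)

# With pacing, the response time does not enter the calculation at all.
fast_paced = tickets_paced(DURATION_S, 8.0, WRITERS)
slow_paced = tickets_paced(DURATION_S, 8.0, WRITERS)

print("closed loop: fast=%d slow=%d" % (fast_closed, slow_closed))
print("paced:       fast=%d slow=%d" % (fast_paced, slow_paced))
```

In the closed-loop model the faster server ends up with more tickets to render in its read requests, which is the penalty described above; in the paced model both servers receive the same amount of new data.
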
Two different test cases were used, one for tracd and one for configurations using Apache. The reason is that (for the most part, and due to Python) tracd is pretty much single threaded, while configurations leveraging Apache are not. Since the test server was a dual core, configurations using Apache could sustain a higher TPS than the configurations with tracd. I therefore decided to increase the number of client threads when testing the Apache configurations, in order to represent its capabilities by more than just a reduced response time, i.e. to show that it can sustain a higher TPS while still maintaining a low response time.

=== tracd ===

=== mod_python ===

== Testing Results

While the testing produced many MB of log files and test data, I am only providing key metrics here. The testing could most likely be recreated, and (one would hope) similar results obtained, if anyone so desired.

=== Trac-0.11-stable

Testing done on the stable branch of Trac-0.11 (r???).

==== Tracd Results (without --http11)

''needs to be redone since I fixed the issue with the errors on "get ticket !#1" -- lance''

||sampler_label||aggregate_report_count||average||aggregate_report_min||aggregate_report_max||aggregate_report_stddev||aggregate_report_error%||aggregate_report_rate||aggregate_report_bandwidth||average_bytes||
||New Ticket||71||1156||75||2953||517.4831967866745||0.0||0.11957753761218815||1.094648278883449||9374.0||
||Report All Active Tickets||491||1664||44||4965||848.4068696191049||0.0||0.8168466723895843||19.444474337954922||24375.617107942973||
||Get Wiki Start Page||502||1054||38||4006||545.5024255304309||0.0||0.8383139535854621||3.255015897906052||3976.0||
||Timeline||480||5239||209||14236||3246.8928787097334||0.0||0.7975303144595569||22.95792064427988||29477.1375||
||Roadmap||484||1059||49||3319||522.7798070814957||0.0||0.8127951420376304||6.4672116692150485||8147.716942148761||
||Browse Repo||495||810||43||2633||417.17340953994295||0.0||0.8283215694936696||4.945662183480757||6114.0||
||Get Ticket !#1||540||1910||29||5026||910.1263621397461||0.09444444444444444||0.9065726517250063||13.173966661944934||14880.375925925926||
||Create Ticket||71||2406||95||5754||1247.4502324835682||0.0||0.11675231243576568||1.6366039311408016||14354.169014084508||
||TOTAL||3134||1933||29||14236||2042.9401288283566||0.016273133375877474||4.835569051123722||67.50873074123534||14295.926611359286||

==== Tracd Results (with --http11)

''needs to be redone since I fixed the issue with the errors on "get ticket !#1" -- lance''

||sampler_label||aggregate_report_count||average||aggregate_report_min||aggregate_report_max||aggregate_report_stddev||aggregate_report_error%||aggregate_report_rate||aggregate_report_bandwidth||average_bytes||
||New Ticket||71||1373||155||3840||597.9516901599379||0.0||0.12039591603487072||1.1021399579207796||9374.0||
||Browse Repo||507||882||43||3776||489.6717565743946||0.0||0.8460462772295072||5.051491151348834||6114.0||
||Get Ticket !#1||523||2049||29||5540||868.1825280326381||0.015296367112810707||0.8724285876331996||13.496463017806384||15841.271510516252||
||Report All Active Tickets||501||1718||43||5545||925.2201978234693||0.0||0.8407055885945953||20.182527096918587||24582.8123752495||
||Roadmap||481||1121||50||3036||550.1249580973916||0.0||0.8109274582397648||6.524822720257001||8239.23076923077||
||Timeline||455||5314||212||16424||3450.3874664743503||0.0||0.7606749443704204||21.85913704092013||29426.178021978023||
||Get Wiki Start Page||468||1156||41||3727||563.0147932889474||0.0||0.7906340276721909||3.0698836855709293||3976.0||
||Create Ticket||71||2614||95||6646||1276.9382370567832||0.0||0.10596398120109426||1.4911780671535537||14410.239436619719||
||TOTAL||3077||2002||29||16424||2073.3038492683213||0.0025999350016249595||4.513113256115144||63.88590915782842||14495.353266168346||

==== mod_python

||sampler_label||aggregate_report_count||average||aggregate_report_min||aggregate_report_max||aggregate_report_stddev||aggregate_report_error%||aggregate_report_rate||aggregate_report_bandwidth||average_bytes||
||New Ticket||82||611||76||1353||254.62547556725843||0.0||0.1410221353158982||1.2909584926281537||9374.0||
||Roadmap||877||369||48||2024||219.36793202953453||0.0||1.4648135744958752||11.780790836480373||8235.539338654504||
||Timeline||856||1940||218||6636||1029.9737836970821||0.0||1.4200869305550947||43.67996965621786||31496.866822429907||
||Get Wiki Start Page||910||244||37||1529||183.2723474641132||0.0||1.5245485828399492||5.91953629430824||3976.0||
||Get Ticket !#1||828||641||115||2079||306.54157480274296||0.0||1.3831166227900202||22.081270133910078||16348.021739130434||
||Browse Repo||866||258||43||1288||179.212920757861||0.0||1.4497096395646853||8.65578587529149||6114.0||
||Report All Active Tickets||874||2235||68||7207||1606.4483203058567||0.0||1.4639424606963274||42.157228129532946||29488.181922196796||
||Create Ticket||82||844||97||1909||440.54067306237306||0.0||0.13265045311453558||1.8483733535975773||14268.585365853658||
||TOTAL||5375||936||37||7207||1132.8669042178017||0.0||8.321180539429237||127.51528091676575||15691.961860465117||

=== Trac-0.10-stable

Testing done on the "stable" branch of the 0.10 version (r???).

== Discussion

=== About the presentation
 - what are the units used in the table above? (asked by Shane Caraveo on Trac-Users)
 - maybe you could put the jmeter config files as attachments instead of inline (cboos)

=== About the tests

I tried to run the tests locally and noticed that the retrieval operations were done only on the main page; the sub-resources (images, CSS, etc.) were not queried. Maybe you should switch that option on instead, as I think this is more representative of a real load? (cboos)

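For anyone who wants to compare the configurations numerically, the TOTAL rows of the Trac-0.11-stable tables can be parsed directly from the wiki markup. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that "average" is a response time in milliseconds and "aggregate_report_rate" is in requests per second, which this page does not confirm.

```python
# Sketch: compare the TOTAL rows of two result tables from this page.
# Assumptions (not confirmed on this page): "average" is in milliseconds and
# "aggregate_report_rate" is in requests per second.
# split_row, parse_row and compare are hypothetical helper names.

HEADER = (
    "||sampler_label||aggregate_report_count||average||aggregate_report_min"
    "||aggregate_report_max||aggregate_report_stddev||aggregate_report_error%"
    "||aggregate_report_rate||aggregate_report_bandwidth||average_bytes||"
)

# TOTAL rows copied from the tracd (without --http11) and mod_python tables.
TRACD_TOTAL = (
    "||TOTAL||3134||1933||29||14236||2042.9401288283566"
    "||0.016273133375877474||4.835569051123722||67.50873074123534"
    "||14295.926611359286||"
)
MOD_PYTHON_TOTAL = (
    "||TOTAL||5375||936||37||7207||1132.8669042178017"
    "||0.0||8.321180539429237||127.51528091676575||15691.961860465117||"
)


def split_row(row):
    """Split one wiki table row into a list of cell texts."""
    cells = row.strip().strip("|").split("||")
    return [cell.strip() for cell in cells]


def parse_row(header, row):
    """Map column names from the header row to the cell texts of a data row."""
    return dict(zip(split_row(header), split_row(row)))


def compare(baseline, candidate):
    """Return how much faster and higher-throughput candidate is than baseline."""
    return {
        "response_time_ratio":
            float(baseline["average"]) / float(candidate["average"]),
        "throughput_ratio":
            float(candidate["aggregate_report_rate"])
            / float(baseline["aggregate_report_rate"]),
    }


tracd = parse_row(HEADER, TRACD_TOTAL)
mod_python = parse_row(HEADER, MOD_PYTHON_TOTAL)
result = compare(tracd, mod_python)

print("tracd average / mod_python average: %.2f" % result["response_time_ratio"])
print("mod_python rate / tracd rate: %.2f" % result["throughput_ratio"])
```

Keep in mind that the Apache test case used more client threads than the tracd one, so the TOTAL rows are not a like-for-like comparison.
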