Edgewall Software
Modify

Ticket #4245 (reopened defect)

Opened 5 years ago

Last modified 44 hours ago

[PATCH] Tickets are slow with large amount of users and permissions

Reported by: ants.aasma@… Owned by: cmlenz
Priority: high Milestone: next-minor-0.12.x
Component: general Version: devel
Severity: major Keywords: slow restrict_owner get_users_with_permission
Cc: zach@…, leho@…, ismael@…, mmitar@…, osimons, peter@…, verm@…, ryano@…
Release Notes:
API Changes:

Description

TicketSystem?.get_ticket_fields() implements an exremely inefficient algorithm to find users that have a certain permission (O(n2users * perms_per_useravg)). I have added a more efficient function to PermissionSystem? that enables querying users having a specified permission directly. It correctly resolves group memberships and hierarchical permissions.

Attachments

trac.get_users_with_permission.trunk.patch (4.3 KB) - added by ants.aasma@… 5 years ago.
Patch
trac.get_users_with_permission.v2.patch (4.0 KB) - added by ants.aasma@… 5 years ago.
Second version of the patch. Using the env.get_known_users() interface to get users.
get_users_with_permission.groupprovider_fix.patch (2.5 KB) - added by ants.aasma@… 5 years ago.
Adds support for the implicit group providers
groupprovider_fix.diff_against_prev_impl.patch (1.4 KB) - added by ants.aasma@… 5 years ago.
Diff against the previous implementation to highlight the changes.
perm-addcache.patch (4.2 KB) - added by Chris Everest <ceverest@…> 2 years ago.
Add caching logic to perms.py

Download all attachments as: .zip

Change History

Changed 5 years ago by ants.aasma@…

Patch

comment:1 Changed 5 years ago by cmlenz

  • Milestone set to 0.11
  • Owner changed from jonas to cmlenz

I'll look into this tomorrow. Candidate for being backported to 0.10-stable.

Changed 5 years ago by ants.aasma@…

Second version of the patch. Using the env.get_known_users() interface to get users.

comment:2 Changed 5 years ago by cmlenz

  • Resolution set to fixed
  • Status changed from new to closed

Slightly modified patch applied in [4625].

comment:3 Changed 5 years ago by ecarter

  • Milestone changed from 0.11 to 0.11.1
  • Resolution fixed deleted
  • Status changed from closed to reopened

This change did not account for the magic 'authenticated' group, causing the regression reported in #4630. See [5849]. We went back to a very inefficient solution, so we need to improve that.

Changed 5 years ago by ants.aasma@…

Adds support for the implicit group providers

Changed 5 years ago by ants.aasma@…

Diff against the previous implementation to highlight the changes.

comment:4 Changed 5 years ago by ants.aasma@…

Attached a fix that accounts for the group providers. This would be a lot more efficient, if the group providers had an get_group_members method.

This isn't tested, because I don't have a test enviroment set up nor the time to do so. But I figured I'd get the code out there if anyone has.

comment:5 Changed 5 years ago by ants.aasma@…

Already spotted an oops, line 168 in new should read:

                        implicit_groups[groupname] = set([username])

(missing list brackets from set construction)

comment:6 Changed 3 years ago by cboos

  • Keywords restrict_owner get_users_with_permission added
  • Priority changed from normal to high
  • Severity changed from normal to major

#8034 / #8212 are recent duplicates. Especially #8212 somewhat proving the "back to a very inefficient solution" part of comment:3...

I see there is a patch here that has been overlooked (attachment:get_users_with_permission.groupprovider_fix.patch).

comment:7 Changed 3 years ago by cboos

The above patch fails on the regression test for #4630:

FAIL: Test for regression of http://trac.edgewall.org/ticket/4630 b
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Workspace\src\trac\repos\0.11-stable\trac\ticket\tests\functional.py", line 941, in runTest
    self.assertEqual(users, ['admin', 'user'])
AssertionError: [u'a', u'i', u'd', u'admin', u'user', u'm', u'n'] != ['admin', 'user']

comment:8 Changed 2 years ago by zach@…

  • Cc zach@… added

comment:9 Changed 2 years ago by fabien

Hi, I also have the same problem with a setup using 1000 users.
It's slow when "restrict_owner = true"...

I modified eventually_restrict_owner to not check for permissions (not needed for my config), and now it's much faster, but not as fast as without restrict_owner...
Maybe it would help to cache users and permissions ?

comment:10 Changed 2 years ago by Chris Everest <ceverest@…>

In regard to this ticket I am also seeing significant performance decreases using restrict_owner in combination with the PrivateTicket? plugin and a large user base. The main cause of the performance hit is in perms.py.

IPermissionStore.get_user_permissions() is calling 'SELECT username,action FROM permission' against all users in the system. This has caused 1000's of database queries for each page request to display a single ticket.

I have added two levels of caching to reduce this performance issue and provided a patch. Pulling from existing cache logic in DefaultPermissionPolicy?.check_permission() I created new cache logic in PermissionSystem?.get_users_with_permission() and IPermissionStore.get_user_permissions().

The cache in IPermissionStore.get_user_permissions() is the more effective change as it limits the query 'SELECT username,action FROM permission' to one execution and uses the cached results for each additional check of that routine, per page request.

The cache in PermissionSystem?.get_users_with_permission() is not activated as often, but will certianly avoid redundant calls to its subroutines/methods.

In each case, the cache variables are very small and should only survive a single page request, thus uncompromising any security or permissions.

Changed 2 years ago by Chris Everest <ceverest@…>

Add caching logic to perms.py

comment:11 Changed 2 years ago by lkraav <leho@…>

  • Cc leho@… added

comment:12 Changed 23 months ago by rblank

#9184 was closed as a duplicate, and contains a patch to increase the permission cache duration.

comment:13 Changed 23 months ago by Ismael de Esteban <ismael@…>

#9184 Makes a big improvement when you have many users and restrict_owner option.

We can consider to include that since it has make a big improvement in the instance I use.

comment:14 Changed 23 months ago by Ismael de Esteban <ismael@…>

  • Cc ismael@… added

comment:15 Changed 14 months ago by Mitar

  • Cc mmitar@… added

comment:16 Changed 13 months ago by anonymous

any solution to this huge problem?

comment:17 Changed 13 months ago by cboos

No.

(answer designed to be as useful as the question...)

For the longer run, see TracDev/Proposals/EvenFinerGrainedPermissions#Performance.

comment:18 follow-up: Changed 5 months ago by osimons

  • Cc osimons added

Helping out with a user on IRC, we've done some profiling of requests and looking into what actually takes time on a project with 1800 authenticated users (using the DefaultPermissionStore). In summary - all times should be seen as relative of course:

  • ~2.5 seconds = restrict owner false
  • ~63 seconds = restrict owner true
  • ~14 seconds = restrict owner true, with caching full permissions table by a simple patch that just stores the table after first read and reuses it for any subsequent use (not for production use, but a test to see how much the SQL impacts times in a single process setting)
  • ~40 seconds = restrict owner true, using @cached decorator on the get_all_permissions() output (+ invalidate code when granting or revoking permissions). This obviously re-introduces som SQL calls to handle cache status.

Having 1800 users requires creating 1800 permission caches and checking the permission against each one. It it bound to take time. 40 seconds is of course better than 63 seconds, but still too much work involved. 14 seconds would only be attainable if we used something like self.config.touch() to invalidate the cached permissions table across processes...

Further it may well be more useful to cache the actual list made by eventually_restrict_owner() and store it for each ticket. Then we'd need an IPermissionChangeListener interface so that others can listen in on changes and invalidate caches (#3302).

comment:19 Changed 5 months ago by peter@…

  • Cc peter@… added

Said user on IRC.

comment:20 in reply to: ↑ 18 Changed 5 months ago by osimons

Replying to osimons:

Further it may well be more useful to cache the actual list made by eventually_restrict_owner() and store it for each ticket. Then we'd need an IPermissionChangeListener interface so that others can listen in on changes and invalidate caches (#3302).

No, that won't be good. Users can be added and removed from that list without an actual permission change - new users could register, sessions purged, or a security policy could change which users are approved. Anything really. So, bad idea.

comment:21 Changed 5 weeks ago by verm

  • Cc verm@… added

I hit this problem tonight.. I spent a couple of hours tracing it back using a 1GB tracefile from python -m trace -t tracd.. It was running 13,000 queries per ticket. This bumped it from taking ~14 seconds per ticket to instantaneous. I'm also running the FlexibleAssignTo plugin which requires restrict_owner to be true in order to work.

If there's a way to make this plugin work without restrict_owner being set then it'll be a great workaround.

comment:22 Changed 5 weeks ago by verm

To be clear, I'm not asking for any support for the FlexibleAssignTo? plugin I added it in case anyone else was interested in poking around.

comment:23 Changed 44 hours ago by Ryan J Ollos <ryano@…>

  • Cc ryano@… added
View

Add a comment

Modify Ticket

Change Properties
<Author field>
Action
as reopened
as The resolution will be set. Next status will be 'closed'
to The owner will be changed from cmlenz. Next status will be 'new'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.