Edgewall Software
Modify

Opened 10 years ago

Closed 6 years ago

#8834 closed enhancement (duplicate)

Add a generic IResourceChangeListener

Reported by: Christian Boos Owned by: Christian Boos
Priority: high Milestone:
Component: general Version: none
Severity: major Keywords: resource
Cc: felix.schwarz@…, leho@… Branch:
Release Notes:
API Changes:

Description

Most individual resources have their own change listener interface, with a few more or less gratuitous differences between them and a few shortcomings, like not providing the model object to the listeners.

Some kind of generic notification interface would be helpful, mostly for plugins in need to address any kind of changes.

See discussion in ticket #6543, notably ticket:6543#comment:13 and further clarifications in ticket:6543#comment:22.

Attachments (0)

Change History (32)

comment:1 by doki_pen@…, 10 years ago

This would be a great addition and would simplify the AnnouncerPlugin codebase. Here are some thoughts I've jotted down. Feel free to pick them apart.

In AnnouncerPlugin, there are 4 types of events that I know of(since it is extensible, there could be others that I don't know of):

  • ticket change events
  • wiki change events
  • fullblog comments and posts
  • accountmanager events (created, deleted, changed)

Of course, there should be standard event action names, like created, deleted, version_deleted and changed. But I think it should be more extensible then just static method calls. The action should be passed as a string. I can imagine the need for custom actions in some plugins. I would recommend something like the AccouncementEvent and AnnouncementSubscriber Interfaces, along with some embellishments.

One problem I'm trying to solve below, with the uid() stuff is coming up with a way to track email replies with resources. I'm also working on an email2trac replacement that is more tightly integrated with Trac. I'm thinking the email header Message-ID can be the resource UID encoded in base64. The current impl of email2trac tries to scrape the email text for clues about what resource(ticket) it is a response too. The important thing is that there is a standard way to identify any trac Resource so we can track replies to notifications via the Message-ID header.

class IResource(Interface):
    def realm(): pass
    def url(req): pass
    def uid():
        """Some unique identifier for the resource like
           "${projecturl}.${realm}.${id}" """

class IResourceChangeEvent(Interface):
    def action():  pass
    def resource():  pass
    def author():  pass
    def comment():  pass
    def old_values(): pass

class IResourceManager(Component):
    def distribute(event):
        """distribute event to interested listeners"""

    def find_resource(uid):
        """returns a resource from a uid.  Uid is something 
           like "${projecturl}.${realm}.${id}"."""

class IResourceChangeListener(Interface):
    def subscription_realms():
        """Return iterable realms that this listener 
           is interested in"""
    def subscription_actions(realm):
        """Return iterable actions that this listener 
           is interested in for the given realm"""
    def resource_event(event):
        """Handle events"""

The first two methods of ResourceChangeListener allow us to make perf enhancements if necessary. We could cache the list of interested listeners instead of figuring it out each time.

I'm not totally sure we need the old_value stuff. Shouldn't we be able to glean this data from the resource itself?

comment:2 by doki_pen@…, 10 years ago

After reading #6543 I see that some things we want events for are not really resources, like ticket-enums. In that case we have the following side effects:

  • url() isn't needed
  • old_values is very needed
  • find_resource becomes complex
  • resource should be called something else.

Announce shouldn't really care about non-resources though, me thinks.

in reply to:  1 comment:3 by Remy Blank, 10 years ago

Replying to doki_pen@…:

The first two methods of ResourceChangeListener allow us to make perf enhancements if necessary. We could cache the list of interested listeners instead of figuring it out each time.

Unfortunately, that's probably not possible. Not all components are instantiated at startup, and they can be added at any time during the lifetime of a Trac process.

I'm not totally sure we need the old_value stuff. Shouldn't we be able to glean this data from the resource itself?

At the point where the handlers are called, the resource has already been committed, and the _old information has been reset in the model object. So it's not available from there anymore, hence the additional argument.

comment:4 by doki_pen@…, 10 years ago

At the point where the handlers are called, the resource has already been committed, and the _old information has been reset in the model object. So it's not available from there anymore, hence the additional argument.

But I was thinking of wiki and tickets particularly. They both allow us to programmatically get the older version. What resources don't keep history?

in reply to:  4 ; comment:5 by Christian Boos, 10 years ago

Cc: felix.schwarz@… added

Replying to doki_pen@…:

I was thinking of wiki and tickets particularly. They both allow us to programmatically get the older version. What resources don't keep history?

Anything else ;-) But even if we add history for e.g. Milestone, there's some efficiency concern. We already should pass the current data model when we have it at hand. IIRC that was discussed on Trac-dev, and was brought up by Felix Schwartz, I can't find the reference however, CC'ing Felix. So for the same performance reasons, I think it's fine to pass the old_values dict even if we could retrieve it by different means.

in reply to:  4 comment:6 by Remy Blank, 10 years ago

Replying to doki_pen@…:

But I was thinking of wiki and tickets particularly. They both allow us to programmatically get the older version.

Ah, you meant to retrieve it from the DB. Yes, that's mostly correct, except when we manipulate history (e.g. #4582, #5658).

in reply to:  5 comment:7 by Felix Schwarz <felix.schwarz@…>, 10 years ago

Replying to cboos:

But even if we add history for e.g. Milestone, there's some efficiency concern. We already should pass the current data model when we have it at hand. IIRC that was discussed on Trac-dev, and was brought up by Felix Schwartz, I can't find the reference however, CC'ing Felix. So for the same performance reasons, I think it's fine to pass the old_values dict even if we could retrieve it by different means.

I think the point back then was about policies - we might need to load db objects quite often and this might be expensive for heavily linked objects.

comment:8 by Christian Boos, 10 years ago

Ah yes, thanks for the precision Felix, this explains why I didn't find anything ;-)

Still, a listener interested in a specific realm will also know how to deal with the corresponding model instance (and here we have the parallel with the permission policy discussion), so we could as well pass the instance we already have at hand when notifying.

comment:9 by Doki Pen <doki_pen@…>, 10 years ago

What do people think about the UID idea? Any other suggestions on resource lookup based on email Message-Id ?

comment:10 by Felix Schwarz <felix.schwarz@…>, 10 years ago

I like the UID stuff. Actually we (agile42) will need something like this anyway in the future :-)

comment:11 by Christian Boos, 10 years ago

See also #7758 for some discussion of selectively disabling "notifications" in some situations.

comment:12 by Doki Pen <doki_pen@…>, 10 years ago

I thought about the UID some more and keeping track of resources that emails are associated with. I think a better strategy would be to take advantage of the Reply-To header and encode the UID that way.

It isn't necessary to have the UID on each resource. I can use the decorator pattern to set the Reply-To header on emails that support replies. The decorator could know about the type of resource it's dealing with and know how to create the UID itself. Two such examples would be a ticket email decorator and a fullblog email decorator.

UID still may be a good idea for other purposes, but I don't think it's needed for the announcer system.

in reply to:  11 comment:13 by Doki Pen <doki_pen@…>, 10 years ago

Replying to cboos:

See also #7758 for some discussion of selectively disabling "notifications" in some situations.

I think whether or not an announcement is sent should be the job of the subscriber. The event should be able to store arbitrary name/value pairs. This will allow the event producer and event subscriber to conspire without the knowledge of the rest of the system and handle all the special cases.

I plan on adding anti-subscribers to announcer in the near future. These will allow the user to opt-out of certain announcements. In the case of commit tickets, a tag could be added to the ticket or event that the anti-subscriber would see and stop the email from being sent.

comment:14 by Doki Pen <doki_pen@…>, 10 years ago

I've been thinking about the potential of many IResourceChangeListeners slowing down trac. What are the pros and cons of having events pushed on a Queue and a separate thread to actually handle the events? It seems like handling the events should be lower priority then a snappy UI. Does this seem plausible?

in reply to:  14 comment:15 by Christian Boos, 10 years ago

Replying to Doki Pen <doki_pen@…>:

… Does this seem plausible?

Using threads for this doesn't sound possible. We must support a few situations (TracCgi, TracFastCgi) where we don't control the duration of the process, so a separate thread handling the event queue could be killed any time.

Maybe it would be possible to achieve this by spawning another process to do the job (using multiprocessing). That process would be short-lived, simply used to dispatch the event in an asynchronous way.

comment:16 by Doki Pen <doki_pen@…>, 10 years ago

We also support python 2.5, which doesn't have multiprocessing, correct? Do you think more users are using cgi/fastcgi then are using < python 2.6? Either way we go, we'll need on/off configuration.

in reply to:  16 comment:17 by Christian Boos, 10 years ago

Replying to Doki Pen <doki_pen@…>:

We also support python 2.5, which doesn't have multiprocessing, correct?

Yes, and we will still support 2.4 for 0.12. Dropping 2.4 support for the next major release (0.13) has not yet been discussed. I think that for Python versions not supporting multiprocessing, we'll simply keep the current synchronous behavior, I'm afraid that's the only alternative. Whether we should need a way to disable asynchronous behavior when we have multiprocessing really depends if some downsides are identified or not. If it's all benefits, then I don't see a reason to disable it.

comment:18 by Felix Schwarz <felix.schwarz@…>, 10 years ago

What about measuring first if there is a performance problem? Premature optimization etc…

in reply to:  18 ; comment:19 by Remy Blank, 10 years ago

Replying to Felix Schwarz <felix.schwarz@…>:

What about measuring first if there is a performance problem? Premature optimization etc…

I was going to write the same comment… But maybe Doki Pen has already made some measurements in th:AnnouncerPlugin?

comment:20 by Christian Boos, 10 years ago

Well, we already know that there's a problem (#3220, #7709). So far, we decided to not address it and tolerate the delays, asking the admin to optimize his setup. But I think that with the upcoming work on the Announcer, it's probably time to address this:

  • taking into account recipient's preferences (#4056, but mostly about content preferences like #2625) and eventually recipient's language (#8903) means we're going to have more than a single e-mail to dispatch; the simplest solution would be one e-mail per recipient, we wouldn't have to bother with grouping logic and in practice we'll probably end up with 1 or 2 recipients per group anyway; in this scenario, even a well-configured SMTP relay could impose some noticeable delays
  • other notification methods could have their own, incompressible, delays
  • some setups (fcgi) have a timeout for limiting the time a request could take, killing the process past that limit; so there's another incentive for short-lived requests, beyond providing a snappy user interface ;-) (I wonder if some problems we had with undelivered notification on t.e.o were not due to that issue, as we're using lighttpd, idle-timeout of 20)

Beyond notifications, a wide variety of plugins can be imagined for reacting on IResourceChangeListener, each could trigger various, potentially slow, side-effects. The difference with other interfaces is that the user is not notified in case of failure (such errors are logged and the admin is able to see them - the only one who could do anything about those errors anyway). Those are really side-effects and as such, it doesn't make sense to make the user wait for them until they have completed, as there won't be any difference in the resulting page.

If we decide to not go the asynchronous way, then we should at least consider adding some notification to the user once the request is done (using req.add_notice or req.add_warning, depending on success status), or even "live" progress status some AJAXy way. But that wouldn't solve the issue with the timeout.

in reply to:  19 ; comment:21 by Doki Pen <doki_pen@…>, 10 years ago

Replying to rblank:

Replying to Felix Schwarz <felix.schwarz@…>:

What about measuring first if there is a performance problem? Premature optimization etc…

I was going to write the same comment… But maybe Doki Pen has already made some measurements in th:AnnouncerPlugin?

Correct. It directly effects the user experience to have to wait, especially for emails to be sent off. It is threaded in announcerplugin, but just the email delivery. I was thinking that for little cost, I could move it up to the event dispatch layer.

in reply to:  20 comment:22 by Doki Pen <doki_pen@…>, 10 years ago

Replying to cboos:

If we decide to not go the asynchronous way, then we should at least consider adding some notification to the user once the request is done (using req.add_notice or req.add_warning, depending on success status), or even "live" progress status some AJAXy way. But that wouldn't solve the issue with the timeout.

I really don't think it's worth it. I have a ticket on t-h for this behavior that I've been ignoring. Currently, the announcer system doesn't even pass around the req object, so doing custom notifications is impossible.

I _could_ keep the thread/process at the distribution level though and eliminate all the current issues. I just was thinking that it would be easy to move it up even further at a very low cost.

in reply to:  21 ; comment:23 by Remy Blank, 10 years ago

Replying to Doki Pen <doki_pen@…>:

Correct. It directly effects the user experience to have to wait, especially for emails to be sent off. It is threaded in announcerplugin, but just the email delivery.

Do you know exactly what (in the delivery) takes so much time? Is it the SMTP connection? The server response time?

We have an alternate delivery method using the sendmail executable. I haven't done any timings, but I would expect it to be able to send several dozen e-mails per second, as it probably writes directly to the delivery queue. Still, not having to wait at all would certainly be better.

in reply to:  23 comment:24 by Doki Pen <doki_pen@…>, 10 years ago

Replying to rblank:

Replying to Doki Pen <doki_pen@…>:

Correct. It directly effects the user experience to have to wait, especially for emails to be sent off. It is threaded in announcerplugin, but just the email delivery.

Do you know exactly what (in the delivery) takes so much time? Is it the SMTP connection? The server response time?

I'm my tests, with all the modules enabled, the entire announcement process takes between .30 and .70 seconds. This is with a Core Duo 2Ghz and 1-3 users that recieve notifications on a tracd instance, but on one active user (me).

For smtp I'm using lamson on localhost and the smtp part takes between .07 and .20 seconds. I haven't profiled the smtp connection at all. Do you think it's worth it?

Point .7 seconds seems like a long time to add to the request. Granted, I haven't done any optimization of the announcement process. But it would be nice to wait .01 seconds by queuing the event.

We have an alternate delivery method using the sendmail executable. I haven't done any timings, but I would expect it to be able to send several dozen e-mails per second, as it probably writes directly to the delivery queue. Still, not having to wait at all would certainly be better.

This is an interesting option but I'm not sure it's worth the trouble since all users can't use it. A lot of users use gmail to send their mails (and this is probably _really_ slow).

in reply to:  20 comment:25 by Christian Boos, 10 years ago

Replying to cboos:

… we should at least consider adding some notification to the user once the request is done (using req.add_notice or req.add_warning, depending on success status)

Forgot that we already do this in case of errors during notification (r8739).

comment:26 by Doki Pen <doki_pen@…>, 10 years ago

One thing that a co-worker brought up to me today, was that he would like a solution to notification spam. With the wiki change subscriber, sometimes we have this problem. A user will make multiple updates to a wiki page.

I know that in a perfect world they would be using the Preview functionality, but sometimes it doesn't happen. It would be nice if the announcer was smart enough to combine notifications that happened within a certain amount of time.

We could do something like this with a specialized Queue. Just another (+) for the concept of dispatching events to a queue before distributing notifications.

comment:27 by Carsten Klein <carsten.klein@…>, 10 years ago

I once implemented such a notification mechanism in PHP for my former employee. However, it requires a cronjob on the server that would check and read in available notification messages from the database.

Does mod_python allow services to be run, or at least sub processes or threads? Would the standalone trac server support such background threads?

In that case, such a solution could be easily implemented, I think, coalescing multiple pending notifications into one single notification mail.

comment:28 by Carsten Klein <carsten.klein@…>, 10 years ago

considering this once more, with high traffic sites in mind, it would be best to move all pending notifications into either a dedicated spooling folder of some sort, or a dedicated spooling database, so that the temporary notification data does not go into the standard trac database.

comment:29 by doki_pen, 10 years ago

I think it would be best to store it in normal tracdb by default, and make such a high performance setup a configurable option. trac should be easy to install and get working. It's a good idea though, and one I've been thinking about adding to announcer.

It could be a plugin that implements the AnnouncementDistributor interface. A seperate process could poll with xmlrpc for distribution jobs. Then the default install would just be normal, realtime notifications and the messiness could be separated out to a plugin.

in reply to:  29 comment:30 by Carsten Klein <carsten.klein@…>, 9 years ago

Replying to doki_pen:

+1 for that

comment:31 by lkraav <leho@…>, 6 years ago

Cc: leho@… added

comment:32 by Christian Boos, 6 years ago

Milestone: next-major-releases
Resolution: duplicate
Status: newclosed

There's been a recent implementation proposal for the ideas developed here, and more, in #11148. Closing this one in favor of the new one.

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Christian Boos.
The resolution will be deleted. Next status will be 'reopened'.
to as closed The owner will be changed from Christian Boos to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.