Opened 18 years ago

Closed 18 years ago

Last modified 18 years ago

#5586 closed enhancement (fixed)

WSGI application entry point should be cleaned up.

Reported by: Graham.Dumpleton@…
Owned by: Christopher Lenz
Priority: low
Milestone: 0.11
Component: general
Version: 0.10
Severity: minor
Keywords: wsgi
Cc: Branch:
Release Notes:
API Changes:
Internal Changes:

Description

One purpose of the WSGI environment passed to a WSGI application is configuration. Trac, however, makes poor use of the WSGI environment: it is not actually possible to configure Trac through it. One is instead forced to set the configuration using environment variables in os.environ, rather than the more obvious method for WSGI applications of passing it in the WSGI environment. As a result, for some WSGI web server adapters one has to use fiddles outside of the normal configuration mechanisms for WSGI application entry points.

Further the WSGI application entry point for Trac includes code which is specific to mod_python when it shouldn't do this. The mod_python specific code should be in the mod_python front end and the information passed from the mod_python front end to the WSGI entry point should be the same as for any other WSGI application. This would leave the WSGI application entry point as a pure WSGI application rather than it mixing in mod_python specific code.

Suggested Changes

  1. Move the mod_python specific code from trac.web.main.dispatch_request() into trac.web.modpython_frontend where it belongs. In particular, do not have the mod_python front end pass:
  • mod_python.options
  • mod_python.subprocess_env

Instead, have the mod_python front end itself extract from those tables supplied by the mod_python request objects the values it needs and pass only:

  • trac.env_path
  • trac.env_parent_dir
  • trac.env_index_template
  • trac.template_vars
  • trac.locale

This would see the following code eliminated from the WSGI entry point and replaced with equivalent code in the mod_python front end.

    if 'mod_python.options' in environ:
        options = environ['mod_python.options']
        environ.setdefault('trac.env_path', options.get('TracEnv'))
        environ.setdefault('trac.env_parent_dir',
                           options.get('TracEnvParentDir'))
        environ.setdefault('trac.env_index_template',
                           options.get('TracEnvIndexTemplate'))
        environ.setdefault('trac.template_vars',
                           options.get('TracTemplateVars'))
        environ.setdefault('trac.locale', options.get('TracLocale'))

Most importantly though, it shouldn't pass through values in the WSGI environment where there was no equivalent entry set in the mod_python options.

For example, in the mod_python front end it would use:

        if options.has_key('TracEnv'):
            environ['trac.env_path'] = options['TracEnv']

If the WSGI entry point has to add dummy entries for values which weren't defined then it should be doing it and the mod_python front end shouldn't be required to provide dummy entries.
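A minimal sketch of what that copying could look like in the mod_python front end, assuming the options table behaves like a mapping. The helper name `copy_trac_options` and the `OPTION_TO_KEY` table are hypothetical and are not taken from Trac's code:

```python
# Hypothetical table mapping mod_python option names to WSGI environ keys.
OPTION_TO_KEY = {
    'TracEnv': 'trac.env_path',
    'TracEnvParentDir': 'trac.env_parent_dir',
    'TracEnvIndexTemplate': 'trac.env_index_template',
    'TracTemplateVars': 'trac.template_vars',
    'TracLocale': 'trac.locale',
}


def copy_trac_options(options, environ):
    """Copy only the options that were actually set into the WSGI environ."""
    for option, key in OPTION_TO_KEY.items():
        if option in options:
            environ[key] = options[option]
    return environ


# Only TracEnv is set, so only trac.env_path appears in the result.
environ = copy_trac_options({'TracEnv': '/usr/local/trac/mysite'}, {})
print(environ)
```

Options that were never set produce no key at all, which leaves the WSGI entry point free to apply its own defaults.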

  2. Have the mod_python front end also do the mod_python specific fiddle of SCRIPT_NAME and PATH_INFO required when the mod_python configuration is not hosted at the root of the URL namespace.

This fiddle is only required because of shortcomings in how the WSGI adapter for mod_python is written; the fix-up of these values should therefore live in the mod_python front end.

This would see the following code eliminated from the WSGI entry point and replaced with equivalent code in the mod_python front end.

        if 'TracUriRoot' in options:
            # Special handling of SCRIPT_NAME/PATH_INFO for mod_python, which
            # tends to get confused for whatever reason
            root_uri = options['TracUriRoot'].rstrip('/')
            request_uri = environ['REQUEST_URI'].split('?', 1)[0]
            if not request_uri.startswith(root_uri):
                raise ValueError('TracUriRoot set to %s but request URL '
                                 'is %s' % (root_uri, request_uri))
            environ['SCRIPT_NAME'] = root_uri
            environ['PATH_INFO'] = urllib.unquote(request_uri[len(root_uri):])
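
For illustration, the same fix-up can be written as a standalone function that the front end would call before handing the environ to the WSGI entry point. This is a sketch under assumptions: the function name `fix_script_name` is hypothetical, and it uses Python 3's `urllib.parse.unquote` where the ticket's code uses `urllib.unquote`:

```python
from urllib.parse import unquote


def fix_script_name(environ, root_uri):
    """Derive SCRIPT_NAME and PATH_INFO from REQUEST_URI and the mount point."""
    root_uri = root_uri.rstrip('/')
    # Drop the query string before comparing against the mount point.
    request_uri = environ['REQUEST_URI'].split('?', 1)[0]
    if not request_uri.startswith(root_uri):
        raise ValueError('TracUriRoot set to %s but request URL is %s'
                         % (root_uri, request_uri))
    environ['SCRIPT_NAME'] = root_uri
    environ['PATH_INFO'] = unquote(request_uri[len(root_uri):])
    return environ


environ = fix_script_name(
    {'REQUEST_URI': '/projects/trac/wiki/Start%20Page?format=txt'},
    '/projects/trac/')
print(environ['SCRIPT_NAME'], environ['PATH_INFO'])
```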

  3. Change the WSGI entry point so that if the values:
  • trac.env_path
  • trac.env_parent_dir
  • trac.env_index_template
  • trac.template_vars

exist in the WSGI environment, they are used in preference to the following environment variables defined in os.environ:

  • TRAC_ENV
  • TRAC_ENV_PARENT_DIR
  • TRAC_ENV_INDEX_TEMPLATE
  • TRAC_TEMPLATE_VARS

Also don't wipe out the value for:

  • trac.locale

This would thus see the code:

        environ.setdefault('trac.env_path', os.getenv('TRAC_ENV'))
        environ.setdefault('trac.env_parent_dir',
                           os.getenv('TRAC_ENV_PARENT_DIR'))
        environ.setdefault('trac.env_index_template',
                           os.getenv('TRAC_ENV_INDEX_TEMPLATE'))
        environ.setdefault('trac.template_vars',
                           os.getenv('TRAC_TEMPLATE_VARS'))
        environ.setdefault('trac.locale', '')

needing to be changed. In other words for each entry it should say:

        if not environ.has_key('trac.env_path'):
            environ.setdefault('trac.env_path', os.getenv('TRAC_ENV'))

  4. Change how the Python egg cache directory is set so that the value is read directly from an entry in the WSGI environment. This would require the mod_python front end to copy the entry, if it exists, from the mod_python subprocess_env table into the WSGI environment.

This sees the code:

    if 'mod_python.subprocess_env' in environ:
        egg_cache = environ['mod_python.subprocess_env'].get('PYTHON_EGG_CACHE')
        if egg_cache:
            os.environ['PYTHON_EGG_CACHE'] = egg_cache

changing.
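A sketch of what the changed code might look like, assuming the front end has already copied PYTHON_EGG_CACHE into the WSGI environment. The function name `apply_egg_cache` is hypothetical:

```python
import os


def apply_egg_cache(environ):
    """Export PYTHON_EGG_CACHE from the WSGI environ into os.environ, if set."""
    egg_cache = environ.get('PYTHON_EGG_CACHE')
    if egg_cache:
        os.environ['PYTHON_EGG_CACHE'] = egg_cache
    return egg_cache
```

With this shape the entry point no longer needs to know about `mod_python.subprocess_env`; any hosting mechanism that can place the key in the WSGI environment is handled the same way.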

Outcome Of Changes

The result of the changes is that one has a pure WSGI entry point for Trac that can be properly configured through the WSGI environment. The possible configuration values that could be supplied in this way would be:

  • trac.env_path
  • trac.env_parent_dir
  • trac.env_index_template
  • trac.template_vars
  • trac.locale
  • PYTHON_EGG_CACHE

The mod_python front end would then be a WSGI adapter for Trac that yields a proper WSGI environment adhering to this interface. This includes SCRIPT_NAME and PATH_INFO having already been fixed up to work around not being able to have a WSGI adapter for mod_python that can work this out itself.

For other WSGI hosting solutions such as mod_wsgi, its standard mechanism for configuring WSGI applications could then be used, namely whereby any Apache variables set using SetEnv will be passed through to the WSGI application in exactly the same way that such variables would be automatically passed through to CGI scripts. For example:

SetEnv trac.env_path /usr/local/trac/mysite
SetEnv PYTHON_EGG_CACHE /usr/local/trac/mysite/eggs

This would replace the current need to step outside of the normal configuration mechanisms for Apache modules to configure Trac by having to explicitly set variables in os.environ in the WSGI application script file itself.

import os
os.environ['TRAC_ENV'] = '/usr/local/trac/mysite'
os.environ['PYTHON_EGG_CACHE'] = '/usr/local/trac/mysite/eggs'

import trac.web.main

application = trac.web.main.dispatch_request

This is beneficial for Apache at least because one can then use a single WSGI application script for Trac and have the configuration generated dynamically by the Apache configuration using rewrite rules, making it somewhat easier to manage a mass-hosting-like situation with distinct Trac instances.

The only way around this at present for mod_wsgi is to create a WSGI middleware component that wraps the Trac WSGI entry point which fakes up the mod_python.options and mod_python.subprocess_env entries in the WSGI environment so that Trac thinks it is actually mod_python and will use those dynamic entries.
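Such a wrapper could look roughly like the following. This is only a sketch of the idea, not the middleware documented for mod_wsgi: the class name `FakeModPython` is hypothetical, only two options are mapped, and a stand-in application replaces `trac.web.main.dispatch_request` so the block runs on its own:

```python
class FakeModPython:
    """Hypothetical middleware that presents SetEnv values as mod_python tables."""

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        options = {}
        for option, key in (('TracEnv', 'trac.env_path'),
                            ('TracEnvParentDir', 'trac.env_parent_dir')):
            if key in environ:
                options[option] = environ[key]
        subprocess_env = {}
        if 'PYTHON_EGG_CACHE' in environ:
            subprocess_env['PYTHON_EGG_CACHE'] = environ['PYTHON_EGG_CACHE']
        # These are the two entries the Trac entry point looks for.
        environ['mod_python.options'] = options
        environ['mod_python.subprocess_env'] = subprocess_env
        return self.application(environ, start_response)


def demo_app(environ, start_response):
    """Stand-in for the Trac entry point: echoes the TracEnv option it sees."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [environ['mod_python.options'].get('TracEnv', '').encode()]


application = FakeModPython(demo_app)
```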

Making the Trac WSGI entry point a better behaved WSGI application, by cleaning it up so that it isn't also doing mod_python stuff and can accept configuration through the WSGI environment, may also benefit other WSGI deployment solutions such as those provided by Paste.

Attachments (0)

Change History (7)

comment:1 by Graham.Dumpleton@…, 18 years ago

Note that the mod_wsgi documentation has now been updated to document a workaround for the above problem whereby a WSGI middleware component is used to wrap the Trac application, making Trac think it is being run under mod_python, allowing configuration to be supplied from the Apache configuration files rather than having to be hardwired into the script file. Details can be found at:

http://code.google.com/p/modwsgi/wiki/IntegrationWithTrac

comment:2 by Christian Boos, 18 years ago

Keywords: wsgi added
Milestone: 0.11.1
Owner: changed from Jonas Borgström to Christopher Lenz

It would be nice to hear what cmlenz thinks of those suggestions.

comment:3 by Christopher Lenz, 18 years ago

Status: new → assigned

I must be missing something here. We're using setdefault in all the places you point out being problematic. That means that if the environ already has an entry with that key, that entry is not overridden. Basically,

if 'foobar' not in environ:
    environ['foobar'] = 42

is equivalent to:

environ.setdefault('foobar', 42)

So in fact, if all the Trac-related keys are already in the environ, the WSGI dispatcher shouldn't be replacing them.
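A short runnable check of that behaviour, using the environ keys from this ticket with made-up values:

```python
environ = {'trac.env_path': '/var/trac/foobar'}

# Key already present: setdefault leaves the existing value alone.
environ.setdefault('trac.env_path', '/value/from/os/environ')

# Key missing: setdefault adds it with the given default.
environ.setdefault('trac.locale', '')

print(environ)
# {'trac.env_path': '/var/trac/foobar', 'trac.locale': ''}
```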

I do agree that we should move the mod_python stuff into the ModPythonGateway though.

comment:4 by Christopher Lenz, 18 years ago

I've just set up a Trac on mod_wsgi on a test machine, and it seems to work okay, using:

from trac.web.main import dispatch_request as application

in a trac.wsgi file mapped with WSGIScriptAlias, and a simple:

SetEnv trac.env_path /var/trac/foobar

in the Apache config.

BTW, why is it we need this script file between mod_wsgi and the WSGI app?
Why can't I just map WSGIScriptAlias to trac.web.main:dispatch_request (or can I)?

comment:5 by Graham.Dumpleton@…, 18 years ago

Ahhh, trapped by setdefault again. I always forget which way that works. :-(

Doesn't help that I don't have Trac myself so couldn't test. Was thus reacting to someone wanting to know how to make it work and assuming that they had actually tried to get it working properly in the first place. Was most likely trying to get a solution to them quickly and didn't look and think properly.

BTW, the reason that WSGIScriptAlias maps to a script file rather than a module/function is that using files in this way as an intermediary is the Apache way of doing things. It means that all Apache's access controls on what a user can designate as a script can still be applied. This sort of thing is important where Apache needs to be controlled quite well such as in a shared hosting environment.

A script file is also used as it gives the user a greater measure of control to set up the environment or any other prerequisites which couldn't otherwise be handled through a simple key/value setting in the Apache configuration. In many cases, such as a shared hosting environment, a user wouldn't even necessarily have any ability to add key/value settings to the Apache configuration themselves.

Another thing, and a big one, is that without the use of a script file and the way the directive works within Apache, the SCRIPT_NAME, i.e., the mount point, wouldn't be able to be deduced automatically. This would mean one would need to use the horrible hack required with mod_python of having to manually specify what the mount point is when a WSGI application isn't mounted at the root of the URL namespace.

There are other reasons why the script file is beneficial as well, but all up, doing things in the script file also suits the WSGI model better anyway, as it is much easier to then take that file and use it as is under a different WSGI hosting mechanism without needing to separately translate some Apache configuration to work with the alternate WSGI hosting mechanism. For example, it is very simple to add a few more lines to the file to check for when it is being run as a program and run it as WSGI under CGI instead. Similar things might be done for flup based FastCGI/SCGI, thus allowing one script to serve for all purposes.
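As an illustration of the CGI case, a script file could be shaped like this, using the standard library's `wsgiref.handlers.CGIHandler`. It is a sketch under assumptions: the stand-in `application` takes the place of `from trac.web.main import dispatch_request as application`, and `run_as_cgi` is a hypothetical helper name:

```python
from wsgiref.handlers import CGIHandler


def application(environ, start_response):
    """Stand-in WSGI app; a real Trac script would import dispatch_request."""
    body = b'Hello from WSGI\n'
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


def run_as_cgi():
    """Serve one CGI request, reading os.environ/stdin and writing stdout."""
    CGIHandler().run(application)


# A real script file would end with the usual guard, so that mod_wsgi just
# imports `application` while executing the file directly runs it as CGI:
#
#     if __name__ == '__main__':
#         run_as_cgi()
```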

Anyway, I'll fix up my Trac documentation for mod_wsgi and drop the unnecessary workaround now that you have pointed out my folly. :-)

comment:6 by Christopher Lenz, 18 years ago

Milestone: 0.11.1 → 0.11
Priority: normal → low
Resolution: fixed
Severity: normal → minor
Status: assigned → closed
Version: 0.10

Okay, I've moved the mod_python stuff into the mod_python frontend in [5863].

comment:7 by Christopher Lenz, 18 years ago (in reply to: comment:5)

Replying to Graham.Dumpleton@gmail.com:

Another thing, and a big one, is that without the use of a script file and the way the directive works within Apache, the SCRIPT_NAME, i.e., the mount point, wouldn't be able to be deduced automatically. This would mean one would need to use the horrible hack required with mod_python of having to manually specify what the mount point is when a WSGI application isn't mounted at the root of the URL namespace.

Okay, this is a very good reason :-)

Thanks for the detailed explanation!
