Edgewall Software

Opened 11 years ago

Last modified 6 years ago

#9536 closed task

Remove Python 2.4 compatibility — at Version 23

Reported by: Remy Blank Owned by: Remy Blank
Priority: normal Milestone: 1.0
Component: general Version: 0.13dev
Severity: normal Keywords: python25
Cc: Branch:
Release Notes:

Removed Python 2.4 support

API Changes:

db API makeover by introducing context managers and a shortcut for query execution; see TracDev/DatabaseApi#Trac0.13API for details

Internal Changes:

Description (last modified by Remy Blank)

The goal of this task is to remove Python 2.4 compatibility code, and to start using Python 2.5-only features. See what's new in Python 2.5. This includes:

  • Use context managers and the with statement for: database transactions, handling locks, handling files.
  • Use ... if ... else ... expressions instead of the more error-prone ... and ... or ....
  • Use try: except: finally: instead of nested try: finally: and try: except:.
  • Use except Exception: instead of except:.
  • Remove our own implementations for functionality that was introduced in Python 2.5.
  • Use generator expressions instead of list comprehensions where adequate.
  • Use tuple arguments to str.startswith() and str.endswith().
  • Use relative imports for parent, sibling and child modules?
  • Remove workarounds for 2.3 and 2.4.

Please add new items to this list as they pop up.

Change History (29)

comment:1 by Christian Boos, 11 years ago

Description: modified (diff)
Keywords: python25 added
Version: 0.13dev

The change concerning the context managers and the with statement mostly concerns the refactor transaction handling topic (#8751), as discussed in the googlegroups:trac-dev:4efe0b99b4e78d23 mail and follow-ups.

But we could also use it for other things, like holding locks.

Slightly related, a few places could also benefit from the unification of try/except/finally.

comment:2 by Christian Boos, 11 years ago

  • Use ... if ... else ... expressions instead of the more error-prone ... and ... or ....

Btw, Genshi supports this at least since the merge of the AST branch (see log:trunk/scripts/ast_generator.py) nearly 2 years ago, so we can also use that in templates (same thing for generator expressions).

comment:3 by Remy Blank, 11 years ago

Description: modified (diff)

Copying items to the description, to get a nice overview. Keep 'em coming!

We should probably also start using except Exception: instead of except:.

What about using relative imports for parent, sibling and child modules (e.g. use from .api import ...)? I wouldn't want to use relative imports for cross-package imports though.

in reply to:  1 ; comment:4 by Christian Boos, 11 years ago

The change concerning the context managers and the with statement mostly concerns the refactor transaction handling topic (#8751), as discussed in the googlegroups:trac-dev:4efe0b99b4e78d23 mail and follow-ups.

I'd like to use with env.db_transaction as db (TracDev/ApiChanges/0.13@3#DatabaseAPIChanges).

See 9536-db-context-manager-r10102.patch.

Example usage:

  • trac/versioncontrol/api.py

    diff -r c3a770df352e trac/versioncontrol/api.py
    a b class DbRepositoryProvider(Component):  
    205205        if type_ and type_ not in rm.get_supported_types():
    206206            raise TracError(_("The repository type '%(type)s' is not "
    207207                              "supported", type=type_))
    208         @self.env.with_transaction()
    209         def do_add(db):
     208        with self.env.db_transaction as db:
    210209            id = rm.get_repository_id(reponame)
    211210            cursor = db.cursor()
    212211            cursor.executemany("""

As you can see in the patch, I'd like to have the context manager to be simpler to use than the @with_transaction decorator of 0.12, in particular we assume no one will obtain a database connection by other means so no need to keep a db parameter. I'll do more clean-ups in that direction later (in particular to handle the issue discussed in #9470).

by Christian Boos, 11 years ago

db: introduce context managers for accessing Connection instances.

in reply to:  4 ; comment:5 by Remy Blank, 11 years ago

Replying to cboos:

See 9536-db-context-manager-r10102.patch.

A few comments:

  • Why set self._env to None after getting the connection? Wouldn't it be more intuitive to set a _db = None at class scope, and check for that in __exit__() to decide if a commit or rollback is needed?
  • Does it make sense to use "private" attributes like _env and _db, when the context manager won't ever be accessible to the users anyway? I would prefer dropping the underscores.
  • I like the idea of preventing commits and rollbacks in query connections, but could we add that functionality to ConnectionWrapper instead? (I haven't got time to look now, so correct me if I'm wrong.) (Yes, I'm wrong, we probably need another wrapper.)
Last edited 11 years ago by Remy Blank (previous) (diff)

in reply to:  5 ; comment:6 by Christian Boos, 11 years ago

Replying to rblank:

Replying to cboos:

See 9536-db-context-manager-r10102.patch.

A few comments:

  • Why set self._env to None after getting the connection? Wouldn't it be more intuitive to set a _db = None at class scope, and check for that in __exit__() to decide if a commit or rollback is needed?

The idea was that the _env was "consumed" after being used. Your idea is slightly better, though.

  • Does it make sense to use "private" attributes like _env and _db, when the context manager won't ever be accessible to the users? I would prefer dropping the underscores.

Well, it's really a private, but as it's highly unlikely that someone would write db = env.db_transaction.__enter__().db, I concede we could drop the _.

  • I like the idea of preventing commits and rollbacks in query connections, but could we add that functionality to ConnectionWrapper instead? (I haven't got time to look now, so correct me if I'm wrong.)

Well, I first thought about subclassing, but as the main method __getattr__ would have had to be reimplemented anyway, it was not worth the import… However we could add a flag readonly, and that would be shorter.

Refined patch in 9536-db-context-manager-r10102.2.patch.

Now a bit of clean-up remains to be done. We currently have the following clumsy sequence:

  1. Environment.db_transaction
  2. trac.db.api.TransactionContextManager
  3. Environment.get_db_cnx()
  4. trac.db.api.get_read_db()
  5. _transaction_local.db or DatabaseManager(env).get_connection()

We have 1. for the same reason we had Environment.with_transaction so that we don't need to import trac.db.api from everywhere. For 2. it's fine to have the context managers themselves in trac.db.api. Now, 3. is problematic: at least we should rename it to _get_db_cnx to discourage its use (would that be enough? people would also do the same rename and curse us ;-) ). The interest of having to go through the Environment here as opposed to inline directly DatabaseManager(env).get_connection() is that the tests (and more precisely the EnvironmentStub) do some hacking at this level in order to reuse one single DatabaseManager instance across all test environments, especially useful when dealing with for PostgreSQL and MySQL backend.

I'd really prefer to inline DatabaseManager(env).get_connection() in the context managers and fine a way to make the tests work nevertheless.

Last edited 11 years ago by Christian Boos (previous) (diff)

by Christian Boos, 11 years ago

Updated patch, reuse ConnectionWrapper

in reply to:  6 ; comment:7 by Remy Blank, 11 years ago

Replying to cboos:

Well, I first thought about subclassing, but as the main method __getattr__ would have had to be reimplemented anyway, it was not worth the import… However we could add a flag readonly, and that would be shorter.

What still bothers me is that for nested queries, or queries nested in a transaction, this wraps the connection repeatedly. We probably won't have more than 3-4 levels of nesting, but it's still… unaesthetic. We could check if the connection is already wrapped as read-only, and only wrap if it isn't. That would only re-wrap the first query level in a transaction, and we probably can't avoid that.

Now a bit of clean-up remains to be done. We currently have the following clumsy sequence:

I agree this should be simplified.

in reply to:  7 comment:8 by Christian Boos, 11 years ago

Replying to rblank:

Replying to cboos:

Well, I first thought about subclassing, but as the main method __getattr__ would have had to be reimplemented anyway, it was not worth the import… However we could add a flag readonly, and that would be shorter.

What still bothers me is that for nested queries, or queries nested in a transaction, this wraps the connection repeatedly. … We could check if the connection is already wrapped as read-only, and only wrap if it isn't.

Sure.

Now a bit of clean-up remains to be done. We currently have the following clumsy sequence:

I agree this should be simplified.

Ok, I think I found a way which is not that bad (I just hope I won't get shot by the anti-monkey-patching patrol ;-) ).

Please have a look at 9536-db-context-manager-r10102.3.patch.

For the follow-up clean-ups, I think we should keep the db or cnx parameters, but document the fact they're not used anymore and are going to be removed in 0.14.

And while we're at refactoring the db queries, what about adding an "execute" convenience method on our ConnectionWrapper, so that instead of:

        with self.env.db_query as db:
            cursor = db.cursor()
            cursor.execute("""
                SELECT type, id, filename, time, description, author
                FROM attachment
                WHERE time > %s AND time < %s AND type = %s
                """, (to_utimestamp(start), to_utimestamp(stop), realm))
            for realm, id, filename, ts, description, author in cursor:
                time = from_utimestamp(ts)
                ...

we could just write:

        with self.env.db_query as db:
            for realm, id, filename, ts, description, author in db.execute("""
                    SELECT type, id, filename, time, description, author
                    FROM attachment
                    WHERE time > %s AND time < %s AND type = %s
                    """, (to_utimestamp(start), to_utimestamp(stop), realm)):
                time = from_utimestamp(ts)
                ...
Last edited 11 years ago by Christian Boos (previous) (diff)

by Christian Boos, 11 years ago

only wrap once with a readonly ConnectionWrapper, and remove trac.env / trac.db.api round trips for obtaining a connection

comment:9 by Christian Boos, 11 years ago

I've been busy those last days with some heavy refactoring of the db api, following up the changes above. I'm running a last complete test with the postgres, mysql and sqlite (file and in-memory) to be really sure, but it should all work now. …

Yes, testing with the regular SQLite backend and not only the in-memory one is now possible, a long story (ticket:8169#comment:22) which comes to an end ;-)

Would you be OK with me committing directly the changes on trunk?

The changes come in two sequences, first the trac/db and trac/tests changes, then updates of every other part of the code to use the new style (more than 2600+ and 2400- as I also did lots of other smaller style clean-ups along the way).

For now, the queries are using the with self.env.db_query as db: style as in the first example of comment:8, but I'd actually like to go with one of the following:

with env.db_query as db:     
    for a, b, c in db.select("""
            a, b, c FROM test
            """):
       ...

if we need the db instance (e.g. for db.quote or db.cast needed), or if we don't need it, directly:

for a, b, c in env.db_query.select("""
        a, b, c FROM test
        """):
    ...

The advantage of using select("a, b, c") over execute("SELECT a, b, c") is that we can't misuse the "read-only" Connection.

in reply to:  9 ; comment:10 by Remy Blank, 11 years ago

Replying to cboos:

Would you be OK with me committing directly the changes on trunk?

Sure, we can do any fine tuning on trunk if necessary.

For now, the queries are using the with self.env.db_query as db: style as in the first example of comment:8, but I'd actually like to go with one of the following:

with env.db_query as db:     
    for a, b, c in db.select("""
            a, b, c FROM test
            """):
       ...

What's the use of a context manager whose __exit__() is a no-op? Couldn't env.db_query just return the wrapped read-only connection?

The advantage of using select("a, b, c") over execute("SELECT a, b, c") is that we can't misuse the "read-only" Connection.

Nice idea. Using a select() generator method means the cursor will be kept in the generator frame. Hopefully this won't keep open cursors around, e.g. in the case where the iterator isn't consumed completely. Also, this will instantiate a new cursor for every SELECT, but I assume this isn't too expensive an operation.

in reply to:  10 comment:11 by Christian Boos, 11 years ago

Replying to rblank:

Replying to cboos:

Would you be OK with me committing directly the changes on trunk?

Sure, we can do any fine tuning on trunk if necessary.

Great, thanks!

The complete set of tests did work fine. However in some situations I still have to fight against a strange:

psycopg2.ProgrammingError: permission denied for relation system

I hope it won't take me too long…

For now, the queries are using the with self.env.db_query as db: style as in the first example of comment:8, but I'd actually like to go with one of the following:

with env.db_query as db:     
    for a, b, c in db.select("""
            a, b, c FROM test
            """):
       ...

What's the use of a context manager whose __exit__() is a no-op? Couldn't env.db_query just return the wrapped read-only connection?

I think doing this change at the same occasion as the with env.db_transaction one is good for a) symmetry reasons with the write operation (only one "idiom" to use), b) the visual change which makes apparent the "duration" of the connection, and finally c) it's likely that it won't stay a no-op…

Even though I didn't implement that yet, I'd like to find ways to enforce that the db is not used longer than it should. Closing the cursors opened during that time could be a possible exit operation. Also, do you remember that discussion about relying on the gc for returning the connections to the pool? Doing an explicit return from the outer context's __exit__ would be a possible solution.

The advantage of using select("a, b, c") over execute("SELECT a, b, c") is that we can't misuse the "read-only" Connection.

Nice idea. Using a select() generator method means the cursor will be kept in the generator frame. Hopefully this won't keep open cursors around, e.g. in the case where the iterator isn't consumed completely.

Well, doing this would maintain the cursor alive, and therefore the connection for pysqlite. The idea of using select(...) is on the contrary to consume the cursor entirely in order to use the connection the shortest time possible. That's what we do anyway for sqlite, with the EagerCursor, which btw would then become useless. If in the future there are places where this could be a problem, we would need to use LIMIT.

Also, this will instantiate a new cursor for every SELECT, but I assume this isn't too expensive an operation.

That shouldn't be a problem, no. What is problematic is that we have really too many different SQL queries, it would be good to cut that down a bit so that we could benefit more from the statement cache in PySqlite ;-)

comment:12 by Remy Blank, 11 years ago

Description: modified (diff)

I have cleaned up trac.util.compat and fixed all occurrences where it is imported in [10128].

comment:13 by Remy Blank, 11 years ago

Removed some compatibility code in [10129].

comment:14 by Christian Boos, 11 years ago

Great! On my side, I pushed the ideas in comment:11 further and now I'm close to have it all working in a way I like. However the changes are quite big, so I'll push those in a clone first, so that we can discuss them.

comment:15 by Christian Boos, 11 years ago

Please have a look at the branch db-context-managers. I did some extensive review and testing, but as the changes are also quite extensive, it's possible I've missed one thing or two…

I'm especially interested in comments on the first patch [e74279c9ae85/cboos], which introduces the changes at the trac.db level. Note that this change one alone doesn't pass the tests, the next changeset is needed for that, as it fixes the test machinery. The follow-up changes are mainly for applying the new style to the rest of the code, module by module.

Ah yes, I didn't add the from __future__ import with_statement for 2.5 yet, so if you'd like to test, use 2.6 or 2.7.

comment:16 by Christian Boos, 11 years ago

With r10189, the branch has been merged to trunk.

Now the next cosmetic fix I have in mind is improving the imports, by using multiline imports and relative imports…

comment:17 by Christian Boos, 11 years ago

API Changes: modified (diff)
Release Notes: modified (diff)

comment:18 by Christian Boos, 11 years ago

On the files touched by new changes, we could also do some preliminary clean-ups, like I did for example in r10239.

comment:19 by Christian Boos, 11 years ago

I replaced a bunch of finally statements with their context manager counterpart in attachment:9536-cleanup-finally-r10401.diff.

All tests pass, even with Python 2.5, but an extra review wouldn't hurt.

by Christian Boos, 11 years ago

use with for automatically closing files and releasing locks.

in reply to:  19 comment:20 by Christian Boos, 11 years ago

by Remy Blank, 11 years ago

First part of the conversion from and - or to if - else.

comment:21 by Remy Blank, 11 years ago

I have started converting the most obvious uses of ... and ... or ... into ... if ... else ... (current state above). There are quite a lot of them, so before I do them all, I'd just like to get a quick "ok" or "ko" for the change. Personally, I find the if - else construct much clearer.

comment:22 by Christian Boos, 11 years ago

I've reviewed the patch, all changes look correct (at least I couldn't spot an error ;-) and the few semantic changes seem for the better). So yes, +1 and if you'd like a review on the next batch, don't hesitate to post it here as a follow-up patch.

comment:23 by Remy Blank, 11 years ago

Description: modified (diff)

First batch of changes applied in [10580].

by Remy Blank, 11 years ago

Second (and final) "and or → if else" batch.

Note: See TracTickets for help on using tickets.