Edgewall Software
Modify

Opened 10 years ago

Closed 8 years ago

Last modified 8 years ago

#11872 closed enhancement (fixed)

DatabaseManager should support adding and removing table columns

Reported by: Ryan J Ollos Owned by: Ryan J Ollos
Priority: normal Milestone: 1.2
Component: database backend Version:
Severity: normal Keywords:
Cc: Branch:
Release Notes:
API Changes:

Added drop_tables method to DatabaseManager class. Added method to ConnectionBase class for dropping a column: drop_column.

Internal Changes:

Description

The DatabaseManager class has methods to create and drop tables (#11512). It would be useful to have a method to add a column to a table.

Attachments (0)

Change History (37)

comment:1 by Ryan J Ollos, 10 years ago

A drop_column method could be useful as well. One use-case in #9612.

The approach to adding columns has been to create a temporary table for the data and drop/re-create the table. For example: tags/trac-1.0.2/trac/upgrades/db11.py.

Last edited 8 years ago by Ryan J Ollos (previous) (diff)

comment:2 by Ryan J Ollos, 10 years ago

Summary: DatabaseManager should support adding a column to a tableDatabaseManager should support adding and removing table columns

comment:3 by Ryan J Ollos, 9 years ago

Milestone: next-dev-1.1.xnext-dev-1.3.x

Narrowing focus for milestone:1.2. Please move ticket to milestone:1.2 if you intend to fix it.

comment:4 by Ryan J Ollos, 8 years ago

Milestone: next-dev-1.3.x1.2
Owner: set to Ryan J Ollos
Status: newassigned

comment:5 by Ryan J Ollos, 8 years ago

API Changes: modified (diff)

This is being addressed as part of #9612.

comment:6 by Jun Omae, 8 years ago

+1. dropping column would be faster than drop/create table for #9612.

comment:7 by Christian Boos, 8 years ago

SQLite seems to support ADD COLUMN, but still not DROP COLUMN. The other backends could of course do it as an optimization.

comment:8 by Ryan J Ollos, 8 years ago

API Changes: modified (diff)

Added in [6ce60c5d/rjollos.git].

Notes:

TravisCI tests are passing.

Last edited 8 years ago by Ryan J Ollos (previous) (diff)

in reply to:  8 ; comment:9 by Jun Omae, 8 years ago

Replying to Ryan J Ollos:

Unit tests pass with PostgreSQL 8.4.20 on CentOS 6.

comment:10 by Ryan J Ollos, 8 years ago

I noticed a mix of self.cnx.cursor and self.cursor() used in the methods. Is there a good way to know which should be used?: trunk/trac/db/sqlite_backend.py@14911:365,401#L364

comment:11 by Ryan J Ollos, 8 years ago

API Changes: modified (diff)

comment:12 by Ryan J Ollos, 8 years ago

Resolution: fixed
Status: assignedclosed

Committed to 1.2-stable in r15025, merged to trunk in r15026.

in reply to:  9 comment:13 by Ryan J Ollos, 8 years ago

Replying to Jun Omae:

Replying to Ryan J Ollos:

Unit tests pass with PostgreSQL 8.4.20 on CentOS 6.

Thanks for testing with the older version on CentOS.

On a related note, Travis CI is currently running against PostgreSQL 9.1. It looks like we could add build configurations for other versions as well.

in reply to:  10 comment:14 by Jun Omae, 8 years ago

Replying to Ryan J Ollos:

I noticed a mix of self.cnx.cursor and self.cursor() used in the methods. Is there a good way to know which should be used?: trunk/trac/db/sqlite_backend.py@14911:365,401#L364

According to SQLiteConnection.rollback, it seems all active cursors should be closed before calling rollback. See trunk/trac/db/sqlite_backend.py@14911:335-338#L335. Then, I think we should use self.cursor().

  • trac/db/sqlite_backend.py

    diff --git a/trac/db/sqlite_backend.py b/trac/db/sqlite_backend.py
    index 10a02161a..5aa8730e7 100644
    a b class SQLiteConnection(ConnectionWrapper):  
    341341            cursor.execute("DROP TABLE IF EXISTS " + self.quote(table))
    342342
    343343    def get_column_names(self, table):
    344         cursor = self.cnx.cursor()
    345         rows = cursor.execute("PRAGMA table_info(%s)"
    346                               % self.quote(table))
    347         return [row[1] for row in rows]
     344        cursor = self.cursor()
     345        cursor.execute("PRAGMA table_info(%s)" % self.quote(table))
     346        return [row[1] for row in cursor]
    348347
    349348    def get_last_id(self, cursor, table, column='id'):
    350349        return cursor.lastrowid

comment:15 by Ryan J Ollos, 8 years ago

That makes sense. Should we just call self.execute, as in get_table_names?

comment:16 by Jun Omae, 8 years ago

get_column_names with self.execute doesn't work. The ConnectionWrapper.execute closes automatically used cursor, however, returns results set only for SELECT query. Therefore, get_column_names needs to use self.cursor().

Last edited 8 years ago by Ryan J Ollos (previous) (diff)

comment:17 by Ryan J Ollos, 8 years ago

Thank you for recommendations. Committed to 1.0-stable in r15034, merged to 1.2-stable in r15035, merged to trunk in r15036.

comment:18 by Jun Omae, 8 years ago

SQLiteConnection.drop_column removes all indices from the given table.

Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import trac
>>> trac.__version__
'1.2dev'
>>> from trac.test import EnvironmentStub
>>> from trac.db.api import get_column_names
>>> from trac.util.text import print_table
>>>
>>> env = EnvironmentStub()
>>>
>>> def query(stmt, *args):
...   with env.db_query as db:
...     cursor = db.cursor()
...     cursor.execute(stmt, args)
...     print_table(cursor.fetchall(), get_column_names(cursor))
...
>>> query('PRAGMA table_info(wiki)')

cid  name      type     notnull  dflt_value  pk
-----------------------------------------------
0    name      text     0                    0
1    version   integer  0                    0
2    time      integer  0                    0
3    author    text     0                    0
4    ipnr      text     0                    0
5    text      text     0                    0
6    comment   text     0                    0
7    readonly  integer  0                    0

>>> query('PRAGMA index_list(wiki)')

seq  name                     unique
------------------------------------
0    wiki_time_idx            0
1    sqlite_autoindex_wiki_1  1

>>> with env.db_transaction as db:
...   db.drop_column('wiki', 'ipnr')
...
>>> query('PRAGMA table_info(wiki)')

cid  name      type  notnull  dflt_value  pk
--------------------------------------------
0    name      TEXT  0                    0
1    version   INT   0                    0
2    time      INT   0                    0
3    author    TEXT  0                    0
4    text      TEXT  0                    0
5    comment   TEXT  0                    0
6    readonly  INT   0                    0

>>> query('PRAGMA index_list(wiki)')

seq  name  unique
-----------------

comment:19 by Jun Omae, 8 years ago

Proposed changes in [5d0eed1a4/jomae.git].

In the changes, it generates Table instance using table_info, index_list and index_info via PRAGMA and creates new table using Table.remove_columns and _to_sql. We should add more unit tests for SQLite.

comment:20 by Jun Omae, 8 years ago

I noticed that removing column which is used in its index behaves differently between PostgreSQL and MySQL. Should we block removing a column used in pk/index?

PostgreSQL

The pk/index also would be dropped.

trac=# \d foo
     Table "tractest.foo"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 name   | text    |
Indexes:
    "foo_pkey" PRIMARY KEY, btree (id)
    "idx_foo" btree (id, name)

trac=# alter table foo drop column name;
ALTER TABLE
trac=# \d foo
     Table "tractest.foo"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
Indexes:
    "foo_pkey" PRIMARY KEY, btree (id)

MySQL

The column would be dropped from pk/index. The pk/index wouldn't be dropped.

mysql> show create table foo\G
*************************** 1. row ***************************
       Table: foo
Create Table: CREATE TABLE `foo` (
  `id` int(11) NOT NULL,
  `name` varchar(30) COLLATE utf8mb4_bin DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `id` (`id`,`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)

mysql> alter table foo drop column name;
Query OK, 0 rows affected (0.33 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show create table foo\G
*************************** 1. row ***************************
       Table: foo
Create Table: CREATE TABLE `foo` (
  `id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)

in reply to:  20 ; comment:21 by Christian Boos, 8 years ago

I'm not sure I understand… When you say pk/index in the above, you actually mean the other, non-primary key index in which the dropped column is used.

Replying to Jun Omae:

PostgreSQL

The pk/index also would be dropped.

...
Indexes:
    "foo_pkey" PRIMARY KEY, btree (id)
    "idx_foo" btree (id, name)

trac=# alter table foo drop column name;
...
Indexes:
    "foo_pkey" PRIMARY KEY, btree (id)

Here, the index for (id, name) is removed entirely.

MySQL

The column would be dropped from pk/index. The pk/index wouldn't be dropped.

...
  KEY `id` (`id`,`name`)
... 
mysql> alter table foo drop column name;
...
  KEY `id` (`id`)

Here, the index for (id, name) is altered, and is a bit redundant as we already have the primary key index on id.

I think PostgreSQL's behavior is the one that makes most sense.

In drop_columns, wouldn't it be possible to drop all the indexes in MySQL in which the dropped columns are involved, to standardize on PostgreSQL behavior? Same thing for SQLite, only recreate indexes that are not affected by the drop of columns (on top of [5d0eed1a4/jomae.git]).

If we were to redo upgrade_tables in terms of drop_columns (and add_columns), and if the re-organized indexes still makes sense, they would have to be specified in the Tables of the "new_schema, and upgrade_tables would then take care of recreating those (Table.remove_columns should also update the .indices).

comment:22 by Ryan J Ollos, 8 years ago

I can't say what the right solution is, but does something need to be done before creating the 1.2 release candidate? I was planning to create it tomorrow.

comment:23 by Jun Omae, 8 years ago

I think it's nothing to be done before releasing 1.2rc because the drop_column is not used in Trac core. But we should fix it before 1.2.

comment:24 by Ryan J Ollos, 8 years ago

Resolution: fixed
Status: closedreopened

in reply to:  21 comment:25 by Jun Omae, 8 years ago

Replying to Christian Boos:

In drop_columns, wouldn't it be possible to drop all the indexes in MySQL in which the dropped columns are involved, to standardize on PostgreSQL behavior? Same thing for SQLite, only recreate indexes that are not affected by the drop of columns (on top of [5d0eed1a4/jomae.git]).

I Agree.

In MySQL, ALTER TABLE .. DROP COLUMN .. would fail if dropping column leads unique key violation. We should drop all the indices before call of the ALTER TABLE in MySQL.

mysql> create table foo (id int, name varchar(10), primary key (id, name));
Query OK, 0 rows affected (0.06 sec)

mysql> desc foo;
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     | NO   | PRI | 0       |       |
| name  | varchar(10) | NO   | PRI |         |       |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

mysql> insert into foo values (1, 'a'), (1, 'b'), (2, 'a'), (2, 'b');
Query OK, 4 rows affected (0.02 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select * from foo;
+----+------+
| id | name |
+----+------+
|  1 | a    |
|  1 | b    |
|  2 | a    |
|  2 | b    |
+----+------+
4 rows in set (0.00 sec)

mysql> alter table foo drop column name;
ERROR 1062 (23000): Duplicate entry '1' for key 'PRIMARY'

I'll post new patch.

comment:26 by Jun Omae, 8 years ago

Proposed changes in jomae.git@t11872.3.

comment:27 by Jun Omae, 8 years ago

Revised jomae.git@t11872.3. Added unit tests with PostgreSQL in [3ba8b1fc8/jomae.git].

comment:28 by Ryan J Ollos, 8 years ago

The changes look good. Thanks for fixing!

I encountered a few test failures with in-memory and on-disk SQLite, but tests pass for other databases. Here is one of the failures:

======================================================================
ERROR: test_remove_composite_keys (trac.db.tests.sqlite_test.SQLiteConnectionTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/rjollos/Documents/Workspace/trac-dev/teo-rjollos.git/trac/db/tests/sqlite_test.py", line 198, in test_remove_composite_keys
    indices_0 = self._index_info('test_composite')
  File "/Users/rjollos/Documents/Workspace/trac-dev/teo-rjollos.git/trac/db/tests/sqlite_test.py", line 138, in _index_info
    for seq, index, unique in cursor)
  File "/Users/rjollos/Documents/Workspace/trac-dev/teo-rjollos.git/trac/db/tests/sqlite_test.py", line 138, in <genexpr>
    for seq, index, unique in cursor)
ValueError: too many values to unpack

My SQLite3 version on OSX is 3.8.10.2.

comment:29 by Jun Omae, 8 years ago

Thanks for the catching! It seems extra columns have been added to PRAGMA index_list() since SQLite 3.8.9, SQLite: Check-in [2743846c].

Fixed in [6127ee25d/jomae.git] and verified with SQLite 3.8.2 and 3.14.1.

comment:30 by Jun Omae, 8 years ago

Additional changes in [faaf82fbe/jomae.git]. I'll push the changes tonight.

comment:31 by Ryan J Ollos, 8 years ago

Thanks, I'll prepare release candidate 1 tomorrow after the changes are pushed.

comment:32 by Jun Omae, 8 years ago

Resolution: fixed
Status: reopenedclosed

Committed in [15137] and merged to trunk in [15138].

comment:33 by Jun Omae, 8 years ago

1 failure with PostgreSQL 9.3.5 on Mac OS X…. I'll investigate it.

======================================================================
FAIL: test_remove_composite_keys (trac.db.tests.postgres_test.PostgresConnectionTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/travis/build/edgewall/trac/trac/db/tests/postgres_test.py", line 302, in test_remove_composite_keys
    indices_0['test_composite_name_enabled_idx'])
AssertionError: {'pk': False, 'unique': True, 'columns': ['name', 'enabled']} != {'pk': False, 'unique': True, 'columns': [u'enabled', u'name']}

----------------------------------------------------------------------
Ran 2013 tests in 299.604s

FAILED (failures=1)

comment:34 by Ryan J Ollos, 8 years ago

For reference, I don't see the failure on OSX with Postgres 9.5.4.

comment:35 by Jun Omae, 8 years ago

Proposed changes in [dcc272e11/jomae.git]. It's needed to use pg_index.indkey and pg_attribute.attnum to retrieve column position in the index. Confirmed tests running with PostgreSQL and Mac OS X on Travis CI, https://travis-ci.org/jun66j5/trac/jobs/159481907.

I'll push it later.

comment:36 by Ryan J Ollos, 8 years ago

Tests pass for me on OSX with PostgreSQL 9.5.4.

comment:37 by Jun Omae, 8 years ago

Thanks for the testing. Committed in [15139] and merged to trunk in [15140].

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Ryan J Ollos.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from Ryan J Ollos to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.