Opened 10 years ago
Last modified 10 years ago
#12390 new defect
Attachment table primary key not unique when migrating from SQLite to MySQL
| Reported by: | Ryan J Ollos | Owned by: | |
|---|---|---|---|
| Priority: | normal | Milestone: | next-major-releases |
| Component: | attachment | Version: | |
| Severity: | normal | Keywords: | mysql |
| Cc: | Branch: | ||
| Release Notes: | |||
| API Changes: | |||
| Internal Changes: | |||
Description (last modified by )
We might just file this under the category of poor database schema, but I'm hoping to get feedback.
I'm working on a migration from SQLite to MySQL and encountered the following error:
Copying tables: attachment table... IntegrityError: (1062, "Duplicate entry 'ticket-393-lv_pause_you_can_say_orderstatus_requestacatalog_cust' for key 'PRIMARY'")
The length of each primary key is 111, which is calculated as: tags/trac-1.0.10/trac/db/mysql_backend.py@:149#L140.
Here is the schema:
mysql> SHOW CREATE TABLE attachment; +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | attachment | CREATE TABLE `attachment` ( `type` text COLLATE utf8_bin NOT NULL, `id` text COLLATE utf8_bin NOT NULL, `filename` text COLLATE utf8_bin NOT NULL, `size` int(11) DEFAULT NULL, `time` bigint(20) DEFAULT NULL, `description` text COLLATE utf8_bin, `author` text COLLATE utf8_bin, `ipnr` text COLLATE utf8_bin, PRIMARY KEY (`type`(111),`id`(111),`filename`(111)) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
The issue is that a particular ticket has several attachments that differ only in one of the last characters, having a .1 and .2 suffix. The attachment filename length is greater than 111 characters.
SELECT LENGTH(filename),filename FROM attachment WHERE id=393 AND LENGTH(filename) > 111 ORDER BY filename; 127|lv_pause_you_can_say_orderstatus_requestacatalog_customerservice_or_placeanorder_lv_say_repeat_to_hear_the_options_again..2.wav 125|lv_pause_you_can_say_orderstatus_requestacatalog_customerservice_or_placeanorder_lv_say_repeat_to_hear_the_options_again..wav 120|lv_pause_you_can_say_placeaorder_orderstatus_requestacatalog_or_customerservice_say_repeat_to_hear_the_options_again.wav 121|lv_you_can_say_orderstatus_requestacatalog_customerservice_or_placeanorder_lv_say_repeat_to_hear_the_options_again..2.wav 119|lv_you_can_say_orderstatus_requestacatalog_customerservice_or_placeanorder_lv_say_repeat_to_hear_the_options_again..wav 114|lv_you_can_say_placeaorder_orderstatus_customerservice_or_requestacatalog_say_repeat_to_hear_the_options_again.wav
I haven't tested, but assume that if the organization had always been using MySQL the users would have seen IntegrityErrors when adding some of these attachments.
Would it make sense to change the database schema to use the hashed filename as a primary key instead of the filename?
Attachments (0)
Change History (5)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
Replying to Ryan J Ollos:
Would it make sense to change the database schema to use the hashed filename as a primary key instead of the filename?
Sounds good. The id column in primary key has the same. We could add also the hashed id as a primary key instead of id column.
comment:3 by , 10 years ago
| Keywords: | mysql added |
|---|
comment:4 by , 10 years ago
| Description: | modified (diff) |
|---|
Would it be simpler to just add an auto-incrementing primary key, like was done in [13782]?
comment:5 by , 10 years ago
No. The attachment system expects (type, id, filename) is unique. Replacing with auto-increment primary key couldn't keep the uniqueness. It would be good to modify the attachment table like this:
-
trac/db_default.py
diff --git a/trac/db_default.py b/trac/db_default.py index fb2aac848..a01d3d162 100644
a b schema = [ 63 63 Column('key')], 64 64 65 65 # Attachments 66 Table('attachment', key=('type', ' id', 'filename'))[66 Table('attachment', key=('type', 'hashed_id', 'hashed_filename'))[ 67 67 Column('type'), 68 Column('hashed_id', size=40), # sha1(id) 69 Column('hashed_filename'), # sha1(filename) + filename's extension 68 70 Column('id'), 69 71 Column('filename'), 70 72 Column('size', type='int'), 71 73 Column('time', type='int64'), 72 74 Column('description'), 73 75 Column('author'), 74 Column('ipnr')], 76 Column('ipnr'), 77 Index(['type', 'id', 'filename'])], 75 78 76 79 # Wiki system 77 80 Table('wiki', key=('name', 'version'))[



I suppose. It always bothered me that we lost an easy way to associate the original attachment filename with the on-disk attachment hashed filename. Having both in the same record would restore that (in addition to fixing the issue you reported).