Opened 9 years ago
Last modified 9 years ago
#12390 new defect
Attachment table primary key not unique when migrating from SQLite to MySQL
Reported by: | Ryan J Ollos | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | next-major-releases |
Component: | attachment | Version: | |
Severity: | normal | Keywords: | mysql |
Cc: | Branch: | ||
Release Notes: | |||
API Changes: | |||
Internal Changes: |
Description (last modified by )
We might just file this under the category of poor database schema, but I'm hoping to get feedback.
I'm working on a migration from SQLite to MySQL and encountered the following error:
Copying tables: attachment table... IntegrityError: (1062, "Duplicate entry 'ticket-393-lv_pause_you_can_say_orderstatus_requestacatalog_cust' for key 'PRIMARY'")
The length of each primary key is 111, which is calculated as: tags/trac-1.0.10/trac/db/mysql_backend.py@:149#L140.
Here is the schema:
mysql> SHOW CREATE TABLE attachment; +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | attachment | CREATE TABLE `attachment` ( `type` text COLLATE utf8_bin NOT NULL, `id` text COLLATE utf8_bin NOT NULL, `filename` text COLLATE utf8_bin NOT NULL, `size` int(11) DEFAULT NULL, `time` bigint(20) DEFAULT NULL, `description` text COLLATE utf8_bin, `author` text COLLATE utf8_bin, `ipnr` text COLLATE utf8_bin, PRIMARY KEY (`type`(111),`id`(111),`filename`(111)) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
The issue is that a particular ticket has several attachments that differ only in one of the last characters, having a .1
and .2
suffix. The attachment filename length is greater than 111 characters.
SELECT LENGTH(filename),filename FROM attachment WHERE id=393 AND LENGTH(filename) > 111 ORDER BY filename; 127|lv_pause_you_can_say_orderstatus_requestacatalog_customerservice_or_placeanorder_lv_say_repeat_to_hear_the_options_again..2.wav 125|lv_pause_you_can_say_orderstatus_requestacatalog_customerservice_or_placeanorder_lv_say_repeat_to_hear_the_options_again..wav 120|lv_pause_you_can_say_placeaorder_orderstatus_requestacatalog_or_customerservice_say_repeat_to_hear_the_options_again.wav 121|lv_you_can_say_orderstatus_requestacatalog_customerservice_or_placeanorder_lv_say_repeat_to_hear_the_options_again..2.wav 119|lv_you_can_say_orderstatus_requestacatalog_customerservice_or_placeanorder_lv_say_repeat_to_hear_the_options_again..wav 114|lv_you_can_say_placeaorder_orderstatus_customerservice_or_requestacatalog_say_repeat_to_hear_the_options_again.wav
I haven't tested, but assume that if the organization had always been using MySQL the users would have seen IntegrityError
s when adding some of these attachments.
Would it make sense to change the database schema to use the hashed filename as a primary key instead of the filename?
Attachments (0)
Change History (5)
comment:1 by , 9 years ago
comment:2 by , 9 years ago
Replying to Ryan J Ollos:
Would it make sense to change the database schema to use the hashed filename as a primary key instead of the filename?
Sounds good. The id
column in primary key has the same. We could add also the hashed id
as a primary key instead of id
column.
comment:3 by , 9 years ago
Keywords: | mysql added |
---|
comment:4 by , 9 years ago
Description: | modified (diff) |
---|
Would it be simpler to just add an auto-incrementing primary key, like was done in [13782]?
comment:5 by , 9 years ago
No. The attachment system expects (type, id, filename) is unique. Replacing with auto-increment primary key couldn't keep the uniqueness. It would be good to modify the attachment table like this:
-
trac/db_default.py
diff --git a/trac/db_default.py b/trac/db_default.py index fb2aac848..a01d3d162 100644
a b schema = [ 63 63 Column('key')], 64 64 65 65 # Attachments 66 Table('attachment', key=('type', ' id', 'filename'))[66 Table('attachment', key=('type', 'hashed_id', 'hashed_filename'))[ 67 67 Column('type'), 68 Column('hashed_id', size=40), # sha1(id) 69 Column('hashed_filename'), # sha1(filename) + filename's extension 68 70 Column('id'), 69 71 Column('filename'), 70 72 Column('size', type='int'), 71 73 Column('time', type='int64'), 72 74 Column('description'), 73 75 Column('author'), 74 Column('ipnr')], 76 Column('ipnr'), 77 Index(['type', 'id', 'filename'])], 75 78 76 79 # Wiki system 77 80 Table('wiki', key=('name', 'version'))[
I suppose. It always bothered me that we lost an easy way to associate the original attachment filename with the on-disk attachment hashed filename. Having both in the same record would restore that (in addition to fixing the issue you reported).