Edgewall Software
Modify

Opened 8 years ago

Last modified 8 years ago

#12390 new defect

Attachment table primary key not unique when migrating from SQLite to MySQL

Reported by: Ryan J Ollos Owned by:
Priority: normal Milestone: next-major-releases
Component: attachment Version:
Severity: normal Keywords: mysql
Cc: Branch:
Release Notes:
API Changes:
Internal Changes:

Description (last modified by Ryan J Ollos)

We might just file this under the category of poor database schema, but I'm hoping to get feedback.

I'm working on a migration from SQLite to MySQL and encountered the following error:

Copying tables:
  attachment table... IntegrityError: (1062, "Duplicate entry 'ticket-393-lv_pause_you_can_say_orderstatus_requestacatalog_cust' for key 'PRIMARY'")

The length of each primary key is 111, which is calculated as: tags/trac-1.0.10/trac/db/mysql_backend.py@:149#L140.

Here is the schema:

mysql> SHOW CREATE TABLE attachment;
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table      | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| attachment | CREATE TABLE `attachment` (
  `type` text COLLATE utf8_bin NOT NULL,
  `id` text COLLATE utf8_bin NOT NULL,
  `filename` text COLLATE utf8_bin NOT NULL,
  `size` int(11) DEFAULT NULL,
  `time` bigint(20) DEFAULT NULL,
  `description` text COLLATE utf8_bin,
  `author` text COLLATE utf8_bin,
  `ipnr` text COLLATE utf8_bin,
  PRIMARY KEY (`type`(111),`id`(111),`filename`(111))
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin |

The issue is that a particular ticket has several attachments that differ only in one of the last characters, having a .1 and .2 suffix. The attachment filename length is greater than 111 characters.

SELECT LENGTH(filename),filename FROM attachment WHERE id=393 AND LENGTH(filename) > 111 ORDER BY filename;
127|lv_pause_you_can_say_orderstatus_requestacatalog_customerservice_or_placeanorder_lv_say_repeat_to_hear_the_options_again..2.wav
125|lv_pause_you_can_say_orderstatus_requestacatalog_customerservice_or_placeanorder_lv_say_repeat_to_hear_the_options_again..wav
120|lv_pause_you_can_say_placeaorder_orderstatus_requestacatalog_or_customerservice_say_repeat_to_hear_the_options_again.wav
121|lv_you_can_say_orderstatus_requestacatalog_customerservice_or_placeanorder_lv_say_repeat_to_hear_the_options_again..2.wav
119|lv_you_can_say_orderstatus_requestacatalog_customerservice_or_placeanorder_lv_say_repeat_to_hear_the_options_again..wav
114|lv_you_can_say_placeaorder_orderstatus_customerservice_or_requestacatalog_say_repeat_to_hear_the_options_again.wav

I haven't tested, but assume that if the organization had always been using MySQL the users would have seen IntegrityErrors when adding some of these attachments.

Would it make sense to change the database schema to use the hashed filename as a primary key instead of the filename?

Attachments (0)

Change History (5)

in reply to:  description comment:1 by Christian Boos, 8 years ago

Would it make sense to change the database schema to use the hashed filename as a primary key instead of the filename?

I suppose. It always bothered me that we lost an easy way to associate the original attachment filename with the on-disk attachment hashed filename. Having both in the same record would restore that (in addition to fixing the issue you reported).

in reply to:  description comment:2 by Jun Omae, 8 years ago

Replying to Ryan J Ollos:

Would it make sense to change the database schema to use the hashed filename as a primary key instead of the filename?

Sounds good. The id column in primary key has the same. We could add also the hashed id as a primary key instead of id column.

comment:3 by Ryan J Ollos, 8 years ago

Keywords: mysql added

comment:4 by Ryan J Ollos, 8 years ago

Description: modified (diff)

Would it be simpler to just add an auto-incrementing primary key, like was done in [13782]?

comment:5 by Jun Omae, 8 years ago

No. The attachment system expects (type, id, filename) is unique. Replacing with auto-increment primary key couldn't keep the uniqueness. It would be good to modify the attachment table like this:

  • trac/db_default.py

    diff --git a/trac/db_default.py b/trac/db_default.py
    index fb2aac848..a01d3d162 100644
    a b schema = [  
    6363        Column('key')],
    6464
    6565    # Attachments
    66     Table('attachment', key=('type', 'id', 'filename'))[
     66    Table('attachment', key=('type', 'hashed_id', 'hashed_filename'))[
    6767        Column('type'),
     68        Column('hashed_id', size=40),   # sha1(id)
     69        Column('hashed_filename'),      # sha1(filename) + filename's extension
    6870        Column('id'),
    6971        Column('filename'),
    7072        Column('size', type='int'),
    7173        Column('time', type='int64'),
    7274        Column('description'),
    7375        Column('author'),
    74         Column('ipnr')],
     76        Column('ipnr'),
     77        Index(['type', 'id', 'filename'])],
    7578
    7679    # Wiki system
    7780    Table('wiki', key=('name', 'version'))[

Modify Ticket

Change Properties
Set your email in Preferences
Action
as new The ticket will remain with no owner.
The ticket will be disowned.
as The resolution will be set. Next status will be 'closed'.
The owner will be changed from (none) to anonymous. Next status will be 'assigned'.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.