Opened 18 years ago
Closed 17 years ago
#4378 closed defect (fixed)
MySQL structure case insensitive causes collision with file names
Reported by: | Owned by: | Jonas Borgström | |
---|---|---|---|
Priority: | high | Milestone: | 0.11.1 |
Component: | version control | Version: | 0.10.3rc1 |
Severity: | critical | Keywords: | mysql, 4378, patch |
Cc: | Branch: | ||
Release Notes: | |||
API Changes: | |||
Internal Changes: |
Description (last modified by )
When running init with a MySQL backend on a subversion repository that has a case collision, the following error comes up. The repo in this case has two symlinks inside of it named "CTE" and "cte". This needs to be changed for case sensitive file systems.
Failed to initialize environment. (1062, "Duplicate entry '1-html/cte-A' for key 1") Traceback (most recent call last): File "/usr/lib/python2.3/site-packages/trac/scripts/admin.py", line 628, in do_initenv repos = self.__env.get_repository() File "/usr/lib/python2.3/site-packages/trac/env.py", line 195, in get_repository return RepositoryManager(self).get_repository(authname) File "/usr/lib/python2.3/site-packages/trac/versioncontrol/api.py", line 101, in get_repository repos = self._connector.get_repository(rtype, rdir, authname) File "/usr/lib/python2.3/site-packages/trac/versioncontrol/svn_fs.py", line 260, in get_repository crepos = CachedRepository(self.env.get_db_cnx(), repos, None, self.log) File "/usr/lib/python2.3/site-packages/trac/versioncontrol/cache.py", line 34, in __init__ self.sync() File "/usr/lib/python2.3/site-packages/trac/versioncontrol/cache.py", line 100, in sync (str(current_rev), path, kind, action, File "/usr/lib/python2.3/site-packages/trac/db/util.py", line 50, in execute return self.cursor.execute(sql_escape_percent(sql), args) File "/usr/lib/python2.3/site-packages/trac/db/util.py", line 50, in execute return self.cursor.execute(sql_escape_percent(sql), args) File "/usr/lib/python2.3/site-packages/MySQLdb/cursors.py", line 163, in execute self.errorhandler(self, exc, value) File "/usr/lib/python2.3/site-packages/MySQLdb/connections.py", line 35, in defaulterrorhandler raise errorclass, errorvalue IntegrityError: (1062, "Duplicate entry '1-html/cte-A' for key 1")
Attachments (3)
Change History (21)
comment:1 by , 18 years ago
Component: | trac-admin → general |
---|---|
Description: | modified (diff) |
Milestone: | → 0.10.4 |
Owner: | changed from | to
comment:2 by , 18 years ago
(The case sensitive file systems comment was referring to what's in the subversion repo. If subversion was storing from something on a case insensitive file system, the file system used would not allow the prerequisite condition to exist, and no error would occur in trac either. The comment was not applying to how the database or trac is storing anything, but how it tracks the subversion repo)
We're doing some investigation, and though this is page talks about char/varchar, it appears to be true of text fields as well. From http://sql-info.de/mysql/gotchas.html#1_4:
The manual says: … Values in CHAR and VARCHAR columns are sorted and compared in case-insensitive fashion, unless the BINARY attribute was specified when the table was created. The BINARY attribute means that column values are sorted and compared in case-sensitive fashion according to the ASCII order of the machine where the MySQL server is running. BINARY doesn't affect how the column is stored or retrieved. … —http://dev.mysql.com/doc/mysql/en/CHAR.html
Note that MySQL's behaviour in this regard is the diametrical opposite of the default behaviour of most other databases. (Tested: DB2 8.1, Firebird 1.5.1, Oracle 8.1.7 and PostgreSQL 7.4.3, details of other databases welcome).
The database structure for trac in mysql would need to be changed to account for this, and since this doesn't happen in Postgres (assumed from above) or sqlite (confirmed by testing), it's definitely a MySQL issue that needs to be worked around.
comment:3 by , 18 years ago
Workaround found, patched attached. It basically makes the path column in the node change table be a binary (latin1_bin) field. I'm not sure if this will effect unicode.
by , 18 years ago
Attachment: | mysql_backend.path.binary.diff added |
---|
changes mysql_backend.py to use binary to store the path in node_change
by , 18 years ago
Attachment: | mysql_backend.path.binary.2.diff added |
---|
changes mysql_backend.py to use binary to store the path in node_change
comment:4 by , 18 years ago
Keywords: | needinfo added |
---|
follow-up: 6 comment:5 by , 18 years ago
CREATE DATABASE trac
DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;
and to avoid #3673 changed limit to 333, still get the same error.
Patching as the above attachement still fixes the problem. Text fields in MySQL are apparently still case insensitive even in unicode.
comment:6 by , 18 years ago
Replying to jackie.m@iddl.vt.edu:
CREATE DATABASE
trac
DEFAULT CHARACTER SET utf8 COLLATE utf8_bin; and to avoid #3673 changed limit to 333, still get the same error.Patching as the above attachement still fixes the problem. Text fields in MySQL are apparently still case insensitive even in unicode.
And with COLLATE utf8_general_ci
instead of COLLATE utf8_bin
?
comment:7 by , 18 years ago
Component: | general → version control |
---|---|
Keywords: | needinfo removed |
Milestone: | 0.10.5 → 0.12 |
Owner: | changed from | to
The new repository cache scheduled for milestone:0.12 will fix this.
Of course, if someone has a better idea in the meantime for 0.11, feel free to submit it.
comment:8 by , 18 years ago
Milestone: | 0.12 |
---|---|
Resolution: | → duplicate |
Status: | new → closed |
oops, I intended to make the above comment on #3676, see there.
comment:9 by , 17 years ago
Milestone: | → 0.10.5 |
---|---|
Resolution: | duplicate |
Status: | closed → reopened |
As 0.11 is going to stay for a while, I'm going to apply the patch.
comment:10 by , 17 years ago
Keywords: | 4378 patch added |
---|---|
Severity: | normal → critical |
Hi all:
After applying the patch, nothing has changed. I was wondering if anyone was able to figure out where the clashing keys are and a routine to just delete it?
comment:11 by , 17 years ago
btw, the reason I classified this as "critical" is due to the fact that all web based functionality has stopped.
comment:12 by , 17 years ago
Milestone: | 0.10.5 → 0.12 |
---|
The new VcCache needs to take this limitation into account.
comment:13 by , 17 years ago
Hmm, are you guys sure this patch really works, even with non-ascii text?
Even though I hate mysql as much as the next guy I took a quick look at this and this is how I understand the situation:
- Trac pretty much expect the database to be case insensitive and all other supported backends except mysql always are.
- There are two different ways to get a mysql database to be case insensitive:
2.1. Make the entire database case insensitive by specifying the collation at database creation time:
CREATE DATABASE foo CHARACTER SET utf8 COLLATION utf8_bin;
2.2. Specify the collation at the column level:
CREATE TABLE foo (x VARCHAR(10) BINARY);
Using either approach the end result is the same. MySQL will "tag" all text values returned from the database as having the collation "BINARY". This in turn triggers a python-mysqldb bug where python-mysqldb thinks all text is binary blobs and is returned as str
objects instead of unicode
.
So until that python-mysqldb bug is fixed I can't see any way to configure mysql to be case sensitive and at the same time get pyhon-mysqldb to return proper unicode strings. The only solution I can think of right now is that we take care of the utf8→unicode decoding ourself in mysql_backend.py.
by , 17 years ago
Attachment: | mysql_unicode.patch added |
---|
Implements str→conversion in mysql_backend.py instead of relying on python-mysql
comment:14 by , 17 years ago
Attached a sample patch that implements the str→unicode conversion in mysql_backend.py instead of relying on the currently broken (bug) implementation in the python-mysql driver.
With this patch Trac seems to work correctly with non-ascii text and a case sensitive database (CREATE DATABASE name CHARACTER SET utf8 COLLATE utf8_bin
).
This change actually simplifies mysql_backend.py a bit but also makes the code more dependent on python-mysql internal structures (The Cursor class).
comment:15 by , 17 years ago
Owner: | changed from | to
---|---|
Status: | reopened → new |
comment:16 by , 17 years ago
Milestone: | 0.12 → 0.11.1 |
---|---|
Status: | new → assigned |
OK, I've now done some further testing and it seems to work ok so I've committed it to trunk (r7222).
I'll leave this ticket open for now since I hope to be able to include these changes in 0.11.1 unless some problems with this approach is found.
comment:17 by , 17 years ago
Milestone: | 0.11.2 → 0.11.1 |
---|
comment:18 by , 17 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Replying to jackie.m@iddl.vt.edu:
Well, the file system has nothing to do here, as we're taking the information straight from the repository. Looks like it would be MySQL which has this case sensitivity issue for its keys. There must be something to configure at the MySQL level, I can't believe it's that broken.
See also #3676, for a different but related problem for the same key.