[PATCH] improved bugzilla to trac conversion
|Reported by:||Eric Auer||Owned by:|
|Severity:||normal||Keywords:||bugzilla trac csv xml import converter update script patch|
Description
When trying to use bugzilla2trac.py, mentioned in http://trac.edgewall.org/wiki/TracImport, I ran into the problem that, for admin and firewall reasons, it was not possible to give the TRAC server direct access to the PostgreSQL database of the source Bugzilla instance, or to install suitable drivers for it.
Because of this, I wrote a script which converts the XML output of the Bugzilla search result web page into CSV suitable for the csv2trac.2.py script. An example of the search results is on:
At the bottom, you see a link CSV (which only returns columns like the bug titles) and a button XML. The XML button returns the full content of all bugs in the result list. Note that attached files are not included in this.
To find all bugs, just use all states as the search constraint. If simple search does not let you search without search terms, you can use advanced search. It is also configurable by Bugzilla admins whether trivial searches are allowed. Yet even at strict settings, it is possible to search "all states, no term".
To convert the Bugzilla XML data to CSV for csv2trac usage, I wrote a small Perl script. At least compared to bugzilla2trac, it is quite small. The script can be put under any free open source license of your choice. It can even be put into the public domain…
Dependency import lines of the Perl script are:
use XML::Reader;
use DateTime;
use DateTime::Format::ISO8601;
# ISO8601 includes YYYY[-MM[-DD]] [HH[:MM[:SS]]] [+hhmm] but
# DateTime::Format::ISO8601 needs e.g. YYYY-MM-DDThh:mm:ss[+-]hh:mm
Those extra modules are small: light dependencies.
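The timestamp issue noted in the comment above can be illustrated with a minimal sketch, written in Python rather than the Perl of the actual script. The function name to_iso8601 and the exact input layouts are assumptions; the real Bugzilla output may differ.

```python
from datetime import datetime

# Input layouts assumed for illustration: date and time separated by a
# space, with optional seconds and an optional +hhmm offset.
ASSUMED_FORMATS = (
    "%Y-%m-%d %H:%M:%S %z",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d %H:%M",
)

def to_iso8601(stamp):
    """Return the timestamp in strict YYYY-MM-DDThh:mm:ss[+-hh:mm] form."""
    for fmt in ASSUMED_FORMATS:
        try:
            return datetime.strptime(stamp.strip(), fmt).isoformat()
        except ValueError:
            continue  # try the next, less specific layout
    raise ValueError("unrecognised timestamp: %r" % stamp)
```

A value such as "2008-01-15 10:30:00 +0100" would come out as "2008-01-15T10:30:00+01:00", which is the strict form a parser like DateTime::Format::ISO8601 expects.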
To run the script, put your XML into bugs.xml; the CSV is written to bugs.csv. Do not forget to read or store the standard output of the script: it gives you a list of all converted bugs and all attachments.
Note that you have to copy attachments manually: The XML does not contain them and csv2trac does not handle them either. If you want to script a download of attachments from Bugzilla, they all have download URLs in the style of yourbugzilla.example.com/dir/attachment.cgi?id=N where N is their attachment number / ID value.
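If you do want to script the attachment download, a minimal Python sketch could look like this. The helper names, the http:// scheme in the usage note and the assumption that attachments can be fetched without logging in are all hypothetical; only the attachment.cgi?id=N URL pattern comes from the description above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

def attachment_url(base_url, attachment_id):
    """Build the download URL for one attachment ID."""
    query = urlencode({"id": int(attachment_id)})
    return "%s/attachment.cgi?%s" % (base_url.rstrip("/"), query)

def download_attachment(base_url, attachment_id, target_path):
    """Fetch one attachment and write it to target_path (needs network access)."""
    with urlopen(attachment_url(base_url, attachment_id)) as response:
        data = response.read()
    with open(target_path, "wb") as handle:
        handle.write(data)
```

For example, attachment_url("http://yourbugzilla.example.com/dir", 7) yields http://yourbugzilla.example.com/dir/attachment.cgi?id=7. The attachment IDs themselves can be taken from the log that the conversion script prints.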
The script combines all posts about one bug into one string for TRAC usage. Wiki markup keeps the posts in their original layout, with headings about the who and when of each post. Attachment metadata is also added there, using Wiki markup, as is information about duplicate-of, blocked-by and depends-on bug relations.
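The idea of merging all posts into one Wiki-formatted string can be sketched as follows, in Python rather than Perl. This is not the markup the actual script emits: the heading layout, the use of {{{ }}} preformatted blocks, the list format and the dictionary keys are all assumptions chosen for illustration.

```python
def combine_posts(posts, attachments=(), relations=()):
    """Merge the posts of one bug into a single Trac Wiki string.

    posts:       dicts with the hypothetical keys 'who', 'when', 'text'
    attachments: (attachment_id, filename, description) tuples
    relations:   (label, bug_number) tuples, e.g. ("depends on", 7)
    """
    parts = []
    for post in posts:
        # Heading with the who and when of the post.
        parts.append("=== %s, %s ===" % (post["who"], post["when"]))
        # A preformatted block keeps the original layout; text that itself
        # contains "}}}" would need extra escaping, which is omitted here.
        parts.append("{{{\n%s\n}}}" % post["text"].rstrip())
    for attachment_id, filename, description in attachments:
        parts.append(" * attachment %s: %s (%s)" % (attachment_id, filename, description))
    for label, bug_number in relations:
        parts.append(" * %s: bug %s" % (label, bug_number))
    return "\n\n".join(parts)
```

Called with a single post and relations=[("depends on", 7)], this produces one heading, one preformatted block and one list item, separated by blank lines.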
Bugs are normally converted to defects, but if the severity is enhancement, they are converted as enhancements. The type called task is never used by the current version of the script.
The isobsolete, ispatch and isprivate flags of Bugzilla attachments and the global properties of maintainer, urlbase and version of Bugzilla itself are not converted (although the latter three are logged to standard output). The name attribute of users is also not converted; the script copies the user name / email instead.
The Bugzilla bug number is used as start of the bug title (e.g. "bug 42: internet broken") because TRAC has to use other ticket numbers.
The versions unspecified and svn HEAD and the milestones --- and *unspecified* are not copied: they are converted into empty strings. The component field is converted as keywords like Component_examplecomponent but the value core is not converted.
The fields reporter_accessible, cclist_accessible, classification, …_id and everconfirmed are not converted.
The rep_platform is converted as a keyword like Reporter_Macintosh unless the value was All. The op_sys is converted as keyword as well, e.g. OS_Linux, unless the value was originally All.
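The keyword and version/milestone rules can be summarised in a short Python sketch (the real script is Perl). The function names are hypothetical, joining keywords with a space is an assumption, and the milestone placeholder is assumed to be the three-dash string "---".

```python
# Placeholder values that are turned into empty strings.
SKIPPED_VERSIONS = {"unspecified", "svn HEAD"}
SKIPPED_MILESTONES = {"---", "*unspecified*"}

def blank_if_skipped(value, skipped):
    """Return '' for placeholder values, the value itself otherwise."""
    return "" if value in skipped else value

def build_keywords(component, rep_platform, op_sys):
    """Turn component, platform and OS into prefixed keywords."""
    keywords = []
    if component and component != "core":
        keywords.append("Component_" + component)
    if rep_platform and rep_platform != "All":
        keywords.append("Reporter_" + rep_platform)
    if op_sys and op_sys != "All":
        keywords.append("OS_" + op_sys)
    # Values containing spaces would need extra handling; omitted here.
    return " ".join(keywords)
```

For a bug in component examplecomponent reported on Macintosh / Linux, build_keywords returns "Component_examplecomponent Reporter_Macintosh OS_Linux"; for core / All / All it returns the empty string.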
Resolutions FIXED, INVALID, WONTFIX, DUPLICATE and WORKSFORME are converted to lower case to fit into TRAC syntax. Open bugs have the empty string as resolution state. Also, the states NEW, ASSIGNED, REOPENED, CLOSED and RESOLVED are converted to lower case for TRAC, except that the value RESOLVED is replaced by closed.
Priorities P1 to P5 are mapped to TRAC strings highest, high, normal, low, lowest respectively. Severities blocker, critical, major, normal, minor, trivial and enhancement are copied as is, but if severity is enhancement then the type enhancement is used instead of defect.
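Taken together, the status, resolution, priority, type and title rules amount to a small lookup, sketched here in Python rather than the Perl of the actual script. The function name convert_fields and the input keys (bug_id, short_desc, bug_status, resolution, priority, bug_severity) are assumptions about how the parsed bug is represented.

```python
PRIORITY_MAP = {
    "P1": "highest",
    "P2": "high",
    "P3": "normal",
    "P4": "low",
    "P5": "lowest",
}

def convert_fields(bug):
    """Map one parsed Bugzilla bug (a dict) to TRAC field values."""
    status = bug["bug_status"].lower()
    if status == "resolved":
        status = "closed"  # RESOLVED is replaced by closed
    severity = bug["bug_severity"]  # copied as is
    return {
        # The Bugzilla number goes into the title, since TRAC renumbers tickets.
        "summary": "bug %s: %s" % (bug["bug_id"], bug["short_desc"]),
        "status": status,
        "resolution": bug.get("resolution", "").lower(),  # '' for open bugs
        "priority": PRIORITY_MAP[bug["priority"]],
        "severity": severity,
        "type": "enhancement" if severity == "enhancement" else "defect",
    }
```

A RESOLVED / FIXED bug with priority P1 and severity major would thus end up as status closed, resolution fixed, priority highest and type defect.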
The complete workflow is:
- Search the bugs you want to import.
- Download the XML and save as bugs.xml
- Run perl bugzilla-xml2csv.pl > log.txt
- Run python csv2trac.2.py bugs.csv /path/to/trac
- Copy attachments manually, if necessary (see log.txt)
It turned out to be necessary to update csv2trac.2.py to process even bugs with special characters in their discussion or title, so I also provide a patch for the csv2trac script. In short, the original
"""INSERT INTO component (name) VALUES ("%s")""" % value
style syntax was using values inline, causing SQL injection style problems. The patched script uses
'INSERT INTO component (name) VALUES (%s)', [value]
style syntax based on a variable / placeholder scheme.
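The difference between the two styles can be reproduced with a small self-contained Python sketch. It uses the standard sqlite3 module with an in-memory database as a stand-in, not the csv2trac code itself; note that sqlite3 spells the placeholder ? where the patched script uses %s. The table and sample value are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE component (name TEXT)")

# A value with quote characters, as found in real bug titles.
value = 'say "hello", it\'s broken'

# Inline formatting: the quotes in the value end up inside the SQL text
# and the statement no longer parses.
inline_failed = False
try:
    cur.execute("""INSERT INTO component (name) VALUES ("%s")""" % value)
except sqlite3.Error:
    inline_failed = True

# Placeholder style: the driver passes the value separately from the SQL.
cur.execute("INSERT INTO component (name) VALUES (?)", [value])
conn.commit()

stored = cur.execute("SELECT name FROM component").fetchone()[0]
```

After this runs, the table holds exactly one row, and it contains the value unchanged, including both kinds of quotes.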
I also added
in one place to allow non-ASCII characters in the content. Be aware that other fields do not use decode yet, so e.g. component names or version numbers still have to be plain ASCII even with my patch applied to csv2trac.
Change History (5)
comment:1 by , 8 years ago