Edgewall Software

Opened 8 years ago

Last modified 8 years ago

#9940 new enhancement

[PATCH] improved bugzilla to trac conversion

Reported by: Eric Auer
Owned by:
Priority: normal
Milestone: unscheduled
Component: contrib
Version:
Severity: normal
Keywords: bugzilla trac csv xml import converter update script patch
Cc: Thijs Triemstra
Branch:
Release Notes:
API Changes:

Description (last modified by Remy Blank)

When trying to use bugzilla2trac.py, mentioned in http://trac.edgewall.org/wiki/TracImport, I ran into a problem: for admin and firewall reasons, it was not possible to give the TRAC server direct access to the PostgreSQL database of the source Bugzilla instance, or to install suitable drivers for it.

Because of this, I wrote a script which converts the XML output of the Bugzilla search result web page into CSV suitable for the csv2trac.2.py script. An example of the search results is on:

https://landfill.bugzilla.org/bugzilla-tip/buglist.cgi?query_format=specific&order=relevance+desc&bug_status=__open__&product=&content=broken

At the bottom of that page there is a CSV link (which returns only summary columns such as the bug titles) and an XML button. The XML button returns the full content of all bugs in the result list. Note that attached files are not included in this.

To find all bugs, just use all states as the search constraint. If simple search does not let you search without search terms, you can use advanced search. It is also configurable by Bugzilla admins whether trivial searches are allowed. Yet even at strict settings, it is possible to search "all states, no term".


To convert the Bugzilla XML data to CSV for csv2trac usage, I wrote a small Perl script. At least compared to bugzilla2trac, it is quite small. The script can be put under any free open source license of your choice. It can even be put into the public domain…

Dependency import lines of the Perl script are:

use XML::Reader;
use DateTime;
use DateTime::Format::ISO8601;
# ISO8601 includes YYYY[-MM[-DD]] [HH[:MM[:SS]]] [+hhmm]] but
# DateTime::Format::ISO8601 needs e.g. YYYY-MM-DDThh:mm:ss[+-]hh:mm

These extra modules are small, so the dependencies are light.
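The date handling hinted at in the comment above can be illustrated in Python. This is only a sketch of the idea, not the attached Perl script: it assumes the XML carries timestamps shaped like "YYYY-MM-DD HH:MM:SS +hhmm" (the sample value is hypothetical) and turns them into the strict YYYY-MM-DDThh:mm:ss+hh:mm form.

```python
from datetime import datetime

# Assumed input layout; adjust if your Bugzilla emits something different.
BUGZILLA_TS_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def to_iso8601(raw: str) -> str:
    """Convert a loosely formatted timestamp to strict ISO 8601."""
    parsed = datetime.strptime(raw.strip(), BUGZILLA_TS_FORMAT)
    return parsed.isoformat()


# Hypothetical sample value, not taken from a real bug.
print(to_iso8601("2011-01-05 14:03:00 +0100"))
```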

To run the script, put your XML into bugs.xml; the CSV is written to bugs.csv. Do not forget to read or store the script's output: it lists all converted bugs and all attachments.

Note that you have to copy attachments manually: The XML does not contain them and csv2trac does not handle them either. If you want to script a download of attachments from Bugzilla, they all have download URLs in the style of yourbugzilla.example.com/dir/attachment.cgi?id=N where N is their attachment number / ID value.
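If you want to script the attachment download, something like the following Python sketch might do. It is not part of the attached files; the function names are hypothetical, the base URL is the placeholder from the text above, and it assumes the attachments can be fetched without logging in.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def attachment_url(base_url: str, attachment_id: int) -> str:
    """Build the attachment.cgi?id=N download URL described above."""
    query = urlencode({"id": attachment_id})
    return f"{base_url.rstrip('/')}/attachment.cgi?{query}"


def download_attachment(base_url: str, attachment_id: int, target_path: str) -> None:
    """Fetch one attachment and store it locally (assumes anonymous access)."""
    with urlopen(attachment_url(base_url, attachment_id)) as response:
        with open(target_path, "wb") as out:
            out.write(response.read())


# Only the URL is built here; no network access happens in this example.
print(attachment_url("https://yourbugzilla.example.com/dir/", 7))
```

The attachment IDs to feed into such a loop are the ones listed in the conversion log.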


The script combines all posts about one bug into one string for TRAC usage. Wiki markup keeps the posts in their original layout, with headings stating who wrote each post and when. Attachment metadata is also added there, using Wiki markup, as is information about duplicate-of, blocked-by and depends-on relations between bugs.
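To illustrate the idea of merging posts, here is a minimal Python sketch. The exact markup the Perl script emits is not shown in this ticket, so the heading layout and the use of {{{ }}} preformatted blocks are assumptions, as are the names Comment and merge_comments.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Comment:
    who: str   # user name / email of the poster
    when: str  # timestamp of the post
    text: str  # body of the post


def merge_comments(comments: List[Comment]) -> str:
    """Join all posts of one bug into a single wiki-formatted string."""
    parts = []
    for comment in comments:
        parts.append(f"=== {comment.who}, {comment.when} ===")  # heading: who and when
        parts.append("{{{")  # preformatted block keeps the original layout
        parts.append(comment.text)
        parts.append("}}}")
    return "\n".join(parts)


print(merge_comments([
    Comment("alice@example.com", "2011-01-05 14:03", "It is broken."),
    Comment("bob@example.com", "2011-01-06 09:00", "Confirmed."),
]))
```

A post that itself contains }}} would end the block early, so a real converter would need to handle that case.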

Bugs are normally converted to defects, but if the severity is enhancement, they are converted as enhancements. The type called task is never used by the current version of the script.

The isobsolete, ispatch and isprivate flags of Bugzilla attachments and the global properties maintainer, urlbase and version of Bugzilla itself are not converted (although the latter three are logged to standard output). The name attribute of users is not converted either; the script copies the user name / email instead.

The Bugzilla bug number is used as start of the bug title (e.g. "bug 42: internet broken") because TRAC has to use other ticket numbers.
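The title rule can be sketched in Python as follows. This is not the attached Perl script: the element names bug_id and short_desc follow Bugzilla's XML export as far as I know, while the CSV column name summary and the sample data are assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened sample of a Bugzilla XML export.
SAMPLE_XML = """<bugzilla version="4.0" urlbase="https://yourbugzilla.example.com/dir/" maintainer="admin@example.com">
  <bug>
    <bug_id>42</bug_id>
    <short_desc>internet "broken", again</short_desc>
  </bug>
</bugzilla>"""


def xml_to_csv(xml_text: str) -> str:
    """Write one CSV row per bug, with the bug number prefixed to the title."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["summary"])  # assumed column name
    for bug in root.iter("bug"):
        bug_id = bug.findtext("bug_id", default="")
        title = bug.findtext("short_desc", default="")
        writer.writerow([f"bug {bug_id}: {title}"])
    return out.getvalue()


print(xml_to_csv(SAMPLE_XML))
```

Letting the csv module do the quoting keeps commas and double quotes in titles from breaking the row layout.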

The versions unspecified and svn HEAD and the milestones --- and *unspecified* are not copied; they are converted into empty strings. The component field is converted into a keyword such as Component_examplecomponent, but the value core is not converted.

The flags reporter_accessible, cclist_accessible, classification and …_id and everconfirmed are not converted.

The rep_platform is converted as a keyword like Reporter_Macintosh unless the value was All. The op_sys is converted as keyword as well, e.g. OS_Linux, unless the value was originally All.
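The keyword rules for component, rep_platform and op_sys could look roughly like this in Python. The function name build_keywords and the choice of a space as separator are assumptions; the Perl script may join keywords differently.

```python
def build_keywords(component: str, rep_platform: str, op_sys: str) -> str:
    """Turn the three Bugzilla fields into prefixed keywords, skipping defaults."""
    keywords = []
    if component and component != "core":
        keywords.append(f"Component_{component}")
    if rep_platform and rep_platform != "All":
        keywords.append(f"Reporter_{rep_platform}")
    if op_sys and op_sys != "All":
        keywords.append(f"OS_{op_sys}")
    return " ".join(keywords)  # separator is an assumption


print(build_keywords("examplecomponent", "Macintosh", "Linux"))
print(repr(build_keywords("core", "All", "All")))
```

Values containing spaces would split into several keywords with this separator, so they may need extra handling.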

Resolutions FIXED, INVALID, WONTFIX, DUPLICATE and WORKSFORME are converted to lower case to fit into TRAC syntax. Open bugs have the empty string as resolution. Likewise, the states NEW, ASSIGNED, REOPENED, CLOSED and RESOLVED are converted to lower case for TRAC, and the value RESOLVED is replaced by closed.
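As a small Python sketch of the status and resolution rules just described (the function names are hypothetical, and this is not the code of the Perl script):

```python
def convert_status(status: str) -> str:
    """Lower-case the Bugzilla state; RESOLVED becomes closed."""
    lowered = status.lower()
    return "closed" if lowered == "resolved" else lowered


def convert_resolution(resolution: str) -> str:
    """Lower-case the resolution; open bugs keep the empty string."""
    return resolution.lower()


for state in ("NEW", "ASSIGNED", "REOPENED", "CLOSED", "RESOLVED"):
    print(state, "->", convert_status(state))
```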

Priorities P1 to P5 are mapped to TRAC strings highest, high, normal, low, lowest respectively. Severities blocker, critical, major, normal, minor, trivial and enhancement are copied as is, but if severity is enhancement then the type enhancement is used instead of defect.
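The priority and type mapping can be written down as a lookup table. The following Python sketch mirrors the rules above; the names PRIORITY_MAP, convert_priority and ticket_type are made up for illustration.

```python
# Bugzilla priority -> TRAC priority, as listed above.
PRIORITY_MAP = {
    "P1": "highest",
    "P2": "high",
    "P3": "normal",
    "P4": "low",
    "P5": "lowest",
}


def convert_priority(priority: str) -> str:
    """Map P1..P5 to the TRAC priority string."""
    return PRIORITY_MAP[priority]


def ticket_type(severity: str) -> str:
    """Bugs become defects unless their severity is enhancement."""
    return "enhancement" if severity == "enhancement" else "defect"


print(convert_priority("P1"), ticket_type("major"))
```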


The complete workflow is:

  1. Search for the bugs you want to import.
  1. Download the XML and save it as bugs.xml
  1. Run perl bugzilla-xml2csv.pl > log.txt
  1. Run python csv2trac.2.py bugs.csv /path/to/trac
  1. Copy attachments manually, if necessary (see log.txt)

It turned out to be necessary to update csv2trac.2.py to process even bugs with special characters in their discussion or title, so I also provide a patch for the csv2trac script. In short, the original

"""INSERT INTO component (name) VALUES ("%s")""" % value

style syntax was using values inline, causing SQL injection style problems. The patched script uses

'INSERT INTO component (name) VALUES (%s)', [value]

style syntax based on a variable / placeholder scheme.
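The difference is easy to try out with Python's built-in sqlite3 module. This sketch is independent of csv2trac; note that sqlite3 itself uses ? as placeholder, whereas the %s placeholder in the patched script is handled by the database layer that script uses. The table and sample value are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE component (name TEXT)")

# A value with quotes that would break the inline string formatting style.
value = 'say "hello"; it\'s broken'

# Placeholder style: the driver passes the value separately from the SQL text.
conn.execute("INSERT INTO component (name) VALUES (?)", [value])

stored = conn.execute("SELECT name FROM component").fetchone()[0]
print(stored == value)
```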

I also added

description.decode('utf-8')

at one place to allow non-ASCII chars in the content. Be aware that other fields do not use decode yet, so e.g. component names or version numbers still have to be plain ASCII even with my patch applied to csv2trac.
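For fields that are not decoded yet, the same step could be applied to every column. The following is a Python 3 sketch of that idea with a hypothetical helper decode_row; csv2trac itself is older Python code, so this is an illustration rather than a drop-in patch.

```python
from typing import Dict, Union


def decode_row(row: Dict[str, Union[bytes, str]]) -> Dict[str, str]:
    """Decode every bytes field of a row as UTF-8; leave text fields alone."""
    return {
        key: value.decode("utf-8") if isinstance(value, bytes) else value
        for key, value in row.items()
    }


# Hypothetical row with non-ASCII content in more than one field.
raw_row = {
    "description": "Grüße aus Köln".encode("utf-8"),
    "component": "Übersetzung".encode("utf-8"),
    "version": "1.0",
}
print(decode_row(raw_row))
```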

Attachments (3)

bugzilla-xml2csv.pl (9.9 KB ) - added by anonymous 8 years ago.
Perl script to convert Bugzilla web search results XML output to CSV for csv2trac import into TRAC
csv2trac.2.py.diff (5.8 KB ) - added by anonymous 8 years ago.
patch for csv2trac.2.py to avoid SQL injection and allow special chars in ticket content
csv2trac.2.py (13.6 KB ) - added by anonymous 8 years ago.
patched version of csv2trac.2.py - avoids SQL injection, allows special chars in ticket content


Change History (5)

by anonymous, 8 years ago

Attachment: bugzilla-xml2csv.pl added

Perl script to convert Bugzilla web search results XML output to CSV for csv2trac import into TRAC

by anonymous, 8 years ago

Attachment: csv2trac.2.py.diff added

patch for csv2trac.2.py to avoid SQL injection and allow special chars in ticket content

by anonymous, 8 years ago

Attachment: csv2trac.2.py added

patched version of csv2trac.2.py - avoids SQL injection, allows special chars in ticket content

comment:1 by Remy Blank, 8 years ago

Description: modified (diff)
Milestone: unscheduled
Reporter: changed from anonymous to Eric Auer

Thanks for your contribution!

comment:2 by Thijs Triemstra, 8 years ago

Cc: Thijs Triemstra added
Keywords: patch added
Summary: changed from "patch: improved bugzilla to trac conversion" to "[PATCH] improved bugzilla to trac conversion"
