Converting wiki to DocBook XML

From FedoraProject

Useful References

http://fedoraproject.org/wiki/DocsProject/WorkFlow#WikitoDocBookXML

Converting from MediaWiki to XML

Before Conversion

The document needs to follow these lengthy but accurate guidelines:

Converting to XML

  1. Install the python-mwlib package.
  2. Run this command to render one page: /usr/bin/mw-render -c http://fedoraproject.org/w/ -w docbook PageTitle -o output.xml
  3. Make a script or for-loop to iterate through a list of pages.

If you have more than a few files, you may want to use this process to make it easier:

  • Make a plain text file /tmp/wiki_pages with the name of each wiki article on a line by itself. Use underscores ('_') instead of spaces, and leave out brackets and file extensions:
Chapter_One_-_Called_something
Chapter_Two_-_Called_another_thing
...
  • Use this for-loop to iterate through each page name in /tmp/wiki_pages, rendering it to XML:
mkdir -p XML_files
for i in $(cat /tmp/wiki_pages); do
    /usr/bin/mw-render -c http://fedoraproject.org/w/ -w docbook "$i" -o "XML_files/$i.xml"
done
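
If you are starting from human-readable titles, the list format described above can be produced mechanically. The following is a minimal sketch, not part of the original process; the function name title_to_page_name is hypothetical, and it assumes titles contain only spaces and an optional surrounding [[ ]] pair.

```shell
# Hypothetical helper: convert a wiki title into the form used in /tmp/wiki_pages.
# Strips a surrounding [[ ]] pair if present, then replaces spaces with underscores.
title_to_page_name() {
    printf '%s\n' "$1" | sed -e 's/^\[\[//' -e 's/]]$//' | tr ' ' '_'
}
```

For example, title_to_page_name 'Chapter One - Called something' >> /tmp/wiki_pages appends one entry to the list.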

Processing DocBook Pages

Note: This content needs updating for mw-render. It needs reviewing and updating based on the current output from mw-render and what is required to clean it up for publican.

Follow this process guideline with each XML file:

  1. Open the file in a full-featured text editor
  2. Ensure the XML file has the proper header, with proper chapters or sections
    1. Change to 'chapter' type
  3. Remove extraneous XML stylesheet call
  4. Change XML file type within the file
    1. book => (remove)
    2. article => chapter
    3. articleinfo => (remove)
  5. Search through the file for each of the markup output types covered in the section "Wiki markup output to XML, mapped to DocBook XML"; that is, do the following:
  6. Search for each instance of 'emphasis' and replace it with the proper DocBook contextual markup
  7. Search for each instance of 'code' and 'programlisting' and replace it with the proper DocBook contextual markup
  8. Search and replace empty literallayout containers
  9. Convert inlinemediaobject to proper admonition
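
Two of the mechanical steps above (removing the stylesheet call and changing article to chapter) could be scripted rather than done by hand. This is a rough sketch under stated assumptions: the stylesheet call is an xml-stylesheet processing instruction on a line of its own, and each article tag is opened and closed on a single line. The function name retag_article_as_chapter is hypothetical. Review the result in an editor afterwards.

```shell
# Sketch only: print a copy of the file with the xml-stylesheet line removed
# and <article> retagged as <chapter>. <articleinfo> is deliberately left
# untouched, because the pattern requires a space or '>' right after "article".
retag_article_as_chapter() {
    sed -e '/<?xml-stylesheet/d' \
        -e 's/<article\([ >]\)/<chapter\1/g' \
        -e 's/<\/article>/<\/chapter>/g' "$1"
}
```

A possible usage is retag_article_as_chapter XML_files/PageTitle.xml > PageTitle-chapter.xml, which keeps the original file for comparison.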

Wiki markup output to XML, mapped to DocBook XML

Was ''two-ticks'' (<code>''two-ticks''</code>) in wiki
     => <emphasis> => <application>, <guibutton>, <keycap>, <keycode>, <keycombo>,
                      <firstterm>, <menuchoice>, <guimenu>, <guisubmenu>,
                      <guimenuitem>, <guilabel>, <guiicon>, <glossterm>

Was '''three-ticks''' (<code>'''three-ticks'''</code>) in wiki
     => <emphasis> => <application>

Was <code></code> in wiki
     => <programlisting> => <command>, <filename>, <classname>
        <programlisting format="linespecific"> => <command>, <filename>, <classname>

Was <pre /> block in wiki
     => <programlisting> block

Was <something_replaceable> in wiki
     => <something_replaceable> => <replaceable>something</replaceable>
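
To see how much generic markup is still waiting to be retagged according to the mapping above, you can list the lines that contain it. This is a small sketch, assuming the converted file uses the element names shown in the mapping; the function name find_generic_markup is hypothetical.

```shell
# Sketch: print each line (prefixed with its line number) that still contains
# an <emphasis>, <code>, or <programlisting> opening tag.
find_generic_markup() {
    grep -n -E '<(emphasis|code|programlisting)[ >]' "$1"
}
```

Each hit still needs a human decision, because the right replacement (<application>, <guibutton>, <filename>, and so on) depends on context.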

guibutton, key*

Remove the "[] " and "+" characters; these are handled by the XSL/CSS.

Example Usage

The following examples use very short <para> (paragraph) elements.

menuchoice/gui*

To indicate a selection from a graphical menu in DocBook XML, retag like this:

<para>From the main menu, select <menuchoice>
<guimenu>System</guimenu>
<guisubmenu>Administration</guisubmenu>
<guimenuitem>Display</guimenuitem>
</menuchoice>.</para>

You do not need to put the > symbol in; that is handled by the XSL or CSS.

key*

To indicate a key combination in DocBook XML, retag like this:

<para>To reboot, hit <keycombo>
<keycap>Ctrl</keycap>
<keycap>Alt</keycap>
<keycap>Delete</keycap>
</keycombo>.</para>
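
If the wiki text wrote key combinations in the form Ctrl+Alt+Delete, the retagging shown above could be generated instead of typed. This is only an illustrative sketch; the function name keycombo is hypothetical, and it assumes the keys are separated by "+" with no surrounding spaces.

```shell
# Sketch: turn a "+"-separated key list into a DocBook <keycombo> element.
# The "+" separators are dropped, since the XSL/CSS supplies them on output.
keycombo() {
    printf '<keycombo>'
    printf '%s\n' "$1" | tr '+' '\n' | while IFS= read -r key; do
        printf '<keycap>%s</keycap>' "$key"
    done
    printf '</keycombo>\n'
}
```

Running keycombo 'Ctrl+Alt+Delete' prints the element on a single line, ready to paste into a <para>.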

screen

To show multiline commands, break out a single important command line, or show a section of a configuration file or output in DocBook XML, retag like this:

<para>Run the following commands:</para>
<screen><![CDATA[rpm -qa 'kernel' > /tmp/kernels1.txt
rpm -qa 'kernel*' > /tmp/kernels2.txt
diff -u /tmp/kernels?.txt]]></screen>

Notice that the <![CDATA[ ... ]]> section allows you to type anything verbatim between the markers. This means you don't have to change special characters like < > & into their XML character entity equivalents. This simplifies the process somewhat, but you can't use any XML tags between the <![CDATA[ ... ]]> markers, because they would be shown as literal text. Avoid putting in extra space such as line breaks, since <screen> preserves whitespace exactly.
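
For comparison, this is roughly what you would have to do without a CDATA section: replace each special character with its entity before pasting the text into <screen>. The sketch below is an illustration, not part of the original process; the function name xml_escape is hypothetical. The ampersand is replaced first so that the entities produced by the later substitutions are not escaped a second time.

```shell
# Sketch: escape &, < and > on standard input for use in XML character data.
xml_escape() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}
```

For example, echo "diff a b > out.txt" | xml_escape prints the command with > written as &gt;.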

ulink (URLs)

To make a link to a URL:

Visit my page at <ulink url="http://example.com/mypage.html" />.

It's best not to hyperlink other text because in some formats people may have a hard time finding the actual URL. Edit judiciously.