Converting XML to Another Format

InqScribe exports XML that looks like this:

<transcript>
<prologue>
</prologue>
<scene id="1" in="00:00:00.00" out="00:01:00.00">Line 1.</scene>
<scene id="2" in="00:01:15.00" out="00:03:00.00">Line 2.</scene>
</transcript>

But sometimes you may want to convert this XML to another format, like this:

<myformat>
  <transcript>
    <start>00:00:00.00</start>
    <end>00:01:00.00</end>
    <text>Line 1.</text>
  </transcript>
  <transcript>
    <start>00:01:15.00</start>
    <end>00:03:00.00</end>
    <text>Line 2.</text>
  </transcript>
</myformat>

You can do this fairly easily using a technology called XSLT. We won’t go into XSLT in depth here. If you want to read up on XSLT, try these starting points:

Here are a couple of good, free XSLT processors you can use to do the actual conversion.

Now for the good stuff. Here’s an example XSLT file that will convert the sample XML output above into the desired “myformat”.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<myformat>
  <xsl:for-each select="//scene">
    <transcript>
      <start><xsl:value-of select="@in"/></start>
      <end><xsl:value-of select="@out"/></end>
      <text><xsl:value-of select="."/></text>
    </transcript>
  </xsl:for-each>
</myformat>
</xsl:template>
</xsl:stylesheet>

A few comments about this example:

  • “//scene” means find every <scene> node in the XML file.
  • <xsl:value-of select="."/> means take the value of the current (.) node. Since we’re looping over all scene nodes, that copies the transcript text of each scene into the output.
  • Since “in” and “out” are XML attributes, not nodes, you use the @ sign to refer to them.
  • Note that if you want to assign values to attributes of a node, you can use a slightly different notation. For example, if you want output that looks like <item a="00:00:00.00" b="00:01:00.00"/>, you’d use this XSLT code: <item a="{@in}" b="{@out}"/>.
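
To make the attribute notation concrete, here is a sketch of the same stylesheet rewritten to emit one <item> element per scene, with the timecodes as attributes. It assumes the sample InqScribe XML shown at the top of this page. The xsl:output line is an optional addition that is not part of the example above, and the attribute names “a” and “b” are just placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Optional: asks the processor to indent the output for readability -->
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<myformat>
  <xsl:for-each select="//scene">
    <!-- The braces mark an attribute value template: the XPath inside
         them is evaluated against the current scene node, and the result
         becomes the attribute's value -->
    <item a="{@in}" b="{@out}"/>
  </xsl:for-each>
</myformat>
</xsl:template>
</xsl:stylesheet>
```

Run against the sample input, this should produce a <myformat> element containing <item a="00:00:00.00" b="00:01:00.00"/> and <item a="00:01:15.00" b="00:03:00.00"/>. The exact whitespace may vary by processor. Note that the scene text is dropped in this variant. To keep it, you could put <xsl:value-of select="."/> inside the <item> element instead of leaving it empty.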

Hopefully this makes InqScribe’s XML output even more useful to you.

 

inqscribe/subtitle_xslt.txt · Last modified: 2008/02/25 15:28 by ebaum
 