[Framers] OT: Word to XML

Lynne A. Price lprice at txstruct.com
Mon Jul 22 08:23:34 PDT 2019


On 7/22/2019 5:18 AM, Roger Shuttleworth wrote:
> I have a set of well over 100 Word documents (I know...) that I would 
> like to convert to simple XML (not the kind that Word exports!). They 
> are all four pages long and pretty consistent in terms of structure, 
> and paragraph styles are used for the most part, though not character 
> styles. If you were me, what methods would you look at?
>
> I have used structured FM for years and am familiar with DITA and 
> DocBook. I know that there is a route from Word doc > FrameMaker > 
> Structured FrameMaker > XML that would involve creating a conversion 
> table, a DTD, and a structured application. I have done that in the 
> past, though it was a few years ago. I realise that it would mean a 
> lot of up-front work to get it working, as well as ensuring that 
> styles are used fully and consistently in my source documents.

Roger,
    You now have three approaches to consider--a conversion table, MIF2Go, 
and Word XML. In such projects, the developer's experience often has a 
lot to do with the chosen route. I would probably start with a 
conversion table, and since you have done that before, it might be 
the most straightforward approach for you. I often touch up the 
structure produced by a conversion table with XSLT. I would be cautious 
about starting from Word XML because it is heavily focused on formatting 
details and there would be a lot to ignore.
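
    To illustrate the kind of touch-up meant here, the following is a 
minimal sketch written in Python rather than XSLT. It assumes a flat 
export in which list items come out as siblings of ordinary paragraphs, 
and it wraps each run of consecutive items in a list element. The element 
names (Section, Para, Item, List) and the function name are hypothetical, 
not taken from your documents.

```python
import xml.etree.ElementTree as ET


def wrap_items(parent, item_tag="Item", list_tag="List"):
    """Wrap each run of consecutive item_tag children in one list_tag element.

    Element names are illustrative assumptions; substitute the names
    your conversion table actually produces.
    """
    new_children = []
    current_list = None
    for child in list(parent):
        if child.tag == item_tag:
            # Start a new wrapper at the beginning of each run of items.
            if current_list is None:
                current_list = ET.Element(list_tag)
                new_children.append(current_list)
            current_list.append(child)
        else:
            # Any other element ends the current run.
            current_list = None
            wrap_items(child, item_tag, list_tag)
            new_children.append(child)
    # Replace the old flat children with the regrouped ones.
    for child in list(parent):
        parent.remove(child)
    parent.extend(new_children)
    return parent


flat = ("<Section><Para>a</Para><Item>1</Item><Item>2</Item>"
        "<Para>b</Para><Item>3</Item></Section>")
root = wrap_items(ET.fromstring(flat))
print(ET.tostring(root, encoding="unicode"))
```

    The same regrouping is a common job for an XSLT pass; the Python 
version is only meant to show the shape of the transformation.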

    Your last clause, "ensuring that styles are used fully and 
consistently in my source documents," may well indicate where the bulk 
of the work has to be done. You describe the documents as short and 
pretty consistent, with paragraph styles used for the most part, but 
even in the best of practical cases there is probably a lot of work to 
do.
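
    One way to gauge that work before committing to a route is to count 
which paragraph styles actually occur across the documents. The following 
is a minimal sketch, assuming the files are .docx (not legacy .doc) and 
using only the Python standard library. It reads word/document.xml from 
each file and looks only at the w:pStyle value and the text, ignoring the 
formatting markup. The function names and the "(default)" label are 
illustrative. Note that w:pStyle holds the style ID, which can differ 
from the name shown in Word's user interface.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML main namespace used in word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def paragraph_styles(docx):
    """Yield (style_id, text) for each paragraph in a .docx file.

    docx may be a path or a binary file-like object. Paragraphs with
    no explicit w:pStyle are reported as "(default)".
    """
    with zipfile.ZipFile(docx) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    for p in root.iter(f"{{{W}}}p"):
        style = p.find("w:pPr/w:pStyle", NS)
        style_id = style.get(f"{{{W}}}val") if style is not None else "(default)"
        # Join the text runs; run-level formatting is deliberately ignored.
        text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        yield style_id, text


def audit(docx_files):
    """Count non-empty paragraphs per style ID across several documents."""
    counts = Counter()
    for f in docx_files:
        for style_id, text in paragraph_styles(f):
            if text.strip():
                counts[style_id] += 1
    return counts
```

    A call such as audit(["a.docx", "b.docx"]) (hypothetical file names) 
returns a Counter, and a large "(default)" count or a long tail of 
rarely used style IDs would suggest where cleanup is needed before any 
conversion route will give consistent results.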

    Also, you mention creating a DTD as part of a conversion table 
approach. Does the target XML you want to create use a DTD? A schema? 
Neither? Has it been designed? FM can export XML without a DTD, although 
tables, graphics, and cross-references may require one.

    And I will join the other respondents and offer to meet with you 
online to look at a conversion table approach.
     --Lynne


-- 
Lynne A. Price
Text Structure Consulting, Inc.
Specializing in structured FrameMaker consulting, application development, and training
lprice at txstruct.com            http://www.txstruct.com
voice/fax: (510) 583-1505      cell phone: (510) 421-2284


