[Framers] OT: Word to XML
Lynne A. Price
lprice at txstruct.com
Mon Jul 22 08:23:34 PDT 2019
On 7/22/2019 5:18 AM, Roger Shuttleworth wrote:
> I have a set of well over 100 Word documents (I know...) that I would
> like to convert to simple XML (not the kind that Word exports!). They
> are all four pages long and pretty consistent in terms of structure,
> and paragraph styles are used for the most part, though not character
> styles. If you were me, what methods would you look at?
>
> I have used structured FM for years and am familiar with DITA and
> DocBook. I know that there is a route from Word doc > FrameMaker >
> Structured FrameMaker > XML that would involve creating a conversion
> table, a DTD, and a structured application. I have done that in the
> past, though it was a few years ago. I realise that it would mean a
> lot of up-front work to get it working, as well as ensuring that
> styles are used fully and consistently in my source documents.
Roger,
You now have three approaches to consider--conversion table, MIF2Go,
and Word XML. Often in such projects, the developer's experience has a
lot to do with the chosen route. I would probably start with a
conversion table and if you have past experience doing so, it might be
the most straightforward approach. I often touch up the structure
produced by a conversion table with XSLT. I would be cautious about
starting from Word XML because it is very focused on formatting details
and there would be a lot to ignore.
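
As a rough illustration of the kind of touch-up meant here, the sketch below wraps each run of consecutive sibling elements in a container element. It uses Python's standard library in place of XSLT, and the element names (section, para, item, list) and the function name wrap_runs are hypothetical, not taken from any particular conversion table or DTD.

```python
import xml.etree.ElementTree as ET


def wrap_runs(parent, child_tag="item", wrapper_tag="list"):
    """Wrap each run of consecutive <child_tag> children in one <wrapper_tag>.

    The tag names are placeholders; substitute whatever the conversion
    table actually produces.
    """
    new_children = []
    wrapper = None
    for child in list(parent):
        if child.tag == child_tag:
            # Start a new wrapper at the beginning of each run.
            if wrapper is None:
                wrapper = ET.Element(wrapper_tag)
                new_children.append(wrapper)
            wrapper.append(child)
        else:
            # Any other element ends the current run.
            wrapper = None
            new_children.append(child)
    # Replace the parent's children with the regrouped sequence.
    parent[:] = new_children
    return parent


if __name__ == "__main__":
    flat = ET.fromstring(
        "<section><para/><item>a</item><item>b</item><para/><item>c</item></section>"
    )
    wrap_runs(flat)
    print(ET.tostring(flat, encoding="unicode"))
```

The same regrouping could be written in XSLT; the point is only that a flat structure from a conversion table can be repaired in a separate pass.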
Your last clause, "ensuring that styles are used fully and
consistently in my source documents," may well indicate where the bulk
of the work has to be done. You say that the documents are four pages
each and that paragraph styles are used for the most part, but even in
the best of practical cases there is probably a lot of work to do.
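
One way to gauge that work before choosing a route is to count which paragraph styles each document actually uses. The sketch below is only an assumption-laden starting point: it assumes the files are saved as .docx (not the older binary .doc), reads the standard word/document.xml part, and uses a hypothetical function name, count_paragraph_styles.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML main namespace used inside .docx packages.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def count_paragraph_styles(docx):
    """Return a Counter of paragraph style IDs used in one .docx file.

    `docx` may be a path or a binary file-like object. Paragraphs with
    no explicit style are counted under "(default)".
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    counts = Counter()
    for para in root.iter(f"{{{W_NS}}}p"):
        style = para.find("w:pPr/w:pStyle", NS)
        if style is not None:
            style_id = style.get(f"{{{W_NS}}}val")
        else:
            style_id = "(default)"
        counts[style_id] += 1
    return counts
```

Running this over every file in a folder and comparing the counters should show quickly which documents stray from the expected styles. Note that it reports style IDs, which can differ from the style names shown in Word's user interface.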
Also, you mention creating a DTD as part of a conversion table
approach. Does the target XML you want to create use a DTD? A schema?
Neither? Has it been designed? FM can export XML without a DTD, although
tables, graphics, and cross-references may require one.
And I will join the other respondents and offer to meet with you
online to look at a conversion table approach.
--Lynne
--
Lynne A. Price
Text Structure Consulting, Inc.
Specializing in structured FrameMaker consulting, application development, and training
lprice at txstruct.com http://www.txstruct.com
voice/fax: (510) 583-1505 cell phone: (510) 421-2284