Structure/Schema - Custom or off the shelf?

Alan Houser arh at groupwellesley.com
Thu Feb 2 19:52:35 PST 2006


Marcus,

I've valued your opinions over the years, but I must take exception to 
your assessments of both DITA and DocBook. DITA architect Michael 
Priestley (a co-author of the 2001 paper you cited) has more recently 
addressed the misconception that DITA is an exchange format, not an 
authoring format 
(http://groups.yahoo.com/group/dita-users/message/1081). My anecdotal 
experience matches Michael's -- that about half of all implementations 
use the DITA DTD "out of the box" for content authoring.

Regarding DocBook -- I acknowledge that it's big, and has a "designed by 
committee" feel. However, I've seen too many companies use it 
successfully to dismiss it. Having a set of extensible XSLT 
transformations is absolutely invaluable -- not for the easy stuff, like 
transforming "<title>" to "<h1>", but for the hard stuff, like building 
a back-of-the-book index. Try writing that XSLT code from scratch.
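
To give a feel for what even a stripped-down index involves, here is a minimal sketch in Python rather than XSLT. It is not how the DocBook stylesheets do it. The sample document and the function names build_index and format_index are made up for illustration; only the indexterm, primary, and secondary element names come from DocBook. It uses chapter ids in place of page numbers and ignores see/seealso entries, page ranges, and locale-aware sorting, which is where most of the real difficulty lies.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document; only the element names are DocBook's.
SAMPLE = """<book>
  <chapter id="ch1"><title>Installing</title>
    <para>Run setup.<indexterm><primary>setup</primary></indexterm></para>
    <para>Set paths.<indexterm><primary>configuration</primary>
      <secondary>paths</secondary></indexterm></para>
  </chapter>
  <chapter id="ch2"><title>Tuning</title>
    <para>Tune caches.<indexterm><primary>configuration</primary>
      <secondary>caches</secondary></indexterm></para>
    <para>Rerun setup.<indexterm><primary>setup</primary></indexterm></para>
  </chapter>
</book>"""


def build_index(xml_text):
    """Return {letter: {primary: {secondary: [chapter ids]}}}."""
    root = ET.fromstring(xml_text)
    index = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for chapter in root.iter("chapter"):
        chapter_id = chapter.get("id")
        for term in chapter.iter("indexterm"):
            primary = term.findtext("primary", default="").strip()
            secondary = term.findtext("secondary", default="").strip()
            if not primary:
                continue  # skip malformed entries with no primary term
            locations = index[primary[0].upper()][primary][secondary]
            if chapter_id not in locations:  # merge duplicate references
                locations.append(chapter_id)
    return index


def format_index(index):
    """Render the nested index as indented plain text."""
    lines = []
    for letter in sorted(index):
        lines.append(letter)
        for primary in sorted(index[letter], key=str.lower):
            entries = index[letter][primary]
            if "" in entries:  # references to the primary term itself
                lines.append(f"  {primary}, {', '.join(entries[''])}")
            else:
                lines.append(f"  {primary}")
            for secondary in sorted(k for k in entries if k):
                refs = ", ".join(entries[secondary])
                lines.append(f"    {secondary}, {refs}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_index(build_index(SAMPLE)))
```

Even this toy version has to group by letter, merge duplicate references, and nest secondary terms under primaries, and it has not yet touched pagination or cross-references.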

DocBook's size isn't necessarily the problem it may appear to be. 
Authors tend to learn markup languages by example. Their approach is 
typically "here's how we mark up a help topic in DocBook" (bottom-up), 
as opposed to "which of DocBook's 400 elements do I need to mark up a 
help topic?" (top-down). I would also argue that the "ugly but legal" 
DocBook constructs you observed are due to limitations in the expressive 
capabilities of XML DTDs, not of DocBook per se.
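
One way to live with that limitation is to keep the DTD as it is and add a separate check after validation. The sketch below is only an illustration under assumptions of my own: the function name find_self_nested and the rule "no footnote inside a footnote, no table inside a table" are hypothetical house rules, not anything DocBook or its tools provide.

```python
import xml.etree.ElementTree as ET

# Hypothetical house rule: these elements may not contain themselves,
# at any depth, even though the DTD allows it.
SELF_NESTING = ("footnote", "table")


def find_self_nested(xml_text, names=SELF_NESTING):
    """Return one entry per element that contains a same-named descendant."""
    root = ET.fromstring(xml_text)
    problems = []
    for name in names:
        for outer in root.iter(name):
            # iter() yields the element itself first, so exclude it.
            if any(inner is not outer for inner in outer.iter(name)):
                problems.append(name)
    return problems


if __name__ == "__main__":
    nested = (
        "<para>Text<footnote><para>Note"
        "<footnote><para>Nested</para></footnote>"
        "</para></footnote></para>"
    )
    print(find_self_nested(nested))
```

A check like this can run alongside DTD validation, so the "ugly but legal" cases are rejected without changing the schema itself.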

I'm not saying "use DITA" or "use DocBook." There's lots of value in 
building custom DTDs, and organizations do it successfully all the time. 
However, many organizations underestimate the effort required to build 
the publishing component (e.g., XSLT transformations) to accompany a 
custom DTD. If you have the time and expertise to do this yourself, 
great. If not, or if you would prefer to devote this effort elsewhere, 
architectures like DITA, DocBook, or Scriptorium's DocFrame, which 
include the necessary publishing component, can become much more 
appealing than a home-grown alternative.

-Alan

mcarr at allette.com.au wrote:
> DITA was designed by IBM for data interchange, so was never really
> intended as a data authoring structure. This can be confirmed by the
> creators of DITA at
> http://www-128.ibm.com/developerworks/xml/library/x-dita1/ where it
> states:
>   
>> First, both SGML and XML are recognized as meta languages that allow
>> communities of data owners to describe their information assets in ways
>> that reflect how they develop, store, and process that information.
>> Because knowledge representation is so strongly related to corporate
>> cultures and community jargon, most attempts to define a universal DTD
>> have ended up either unused or unfinished. The ideal for information
>> interchange is to share the semantics and the transformational rules
>> for this information with other data-owning communities.
>>     
>
> So the bottom line is that DITA was never intended to replace a custom
> schema - it was designed to facilitate data exchange between arbitrary
> schemas. Nothing wrong with using DITA for that - it makes good sense in
> that role.
>
> DocBook is a worthless bucket of elements. Sorry. I had a look yesterday
> and quickly found two examples that were enough to reconfirm my opinion.
> The first was that footnotes can contain paras that can contain footnotes,
> so you could have bottomlessly recursive nested footnotes. No typesetting
> application is going to be able to make sense of that, so you're going to
> have to either tell it to ignore the ridiculous scenarios or remove them
> from the structure in the first place. The second was that table entries
> can nest tables in the same way. Good luck rendering that - FrameMaker
> won't even break an entry over a page, let alone support a handful of
> levels of tables nested in entries. Sure, I could cut DocBook down, but
> when the starting point involves removing the ridiculous before I can even
> think of doing anything sensible, I lose patience fast.
>
> A far better approach (IMHO) is to start with the simplest schema you can
> and extend it as required. When someone asks for a new element or looser
> structure, make 'em justify it. Be ruthless about this - someone has to
> code for the changes, so make sure they really need it. Extending then
> becomes an engineering exercise - you can evaluate the impact and the cost
> of the proposed change, carry out regression testing on any XSLT or other
> system components, document the change properly, etc.
>
> If you must use a DocBook tool, create your schema using DocBook naming
> conventions and structures, but build it from the ground up, don't cut the
> big one down. Your analysis will be more thorough and your results
> better. Oh yeah, and make sure that when you finish playing with DocBook
> you wash your hands...
>
>
> Marcus
-- 
---
Alan Houser, President
Group Wellesley, Inc.
412-363-3481
www.groupwellesley.com



