Frame's File Comparison Feature

Kevin Farwell kevinf at dim.com
Fri May 7 01:19:31 PDT 2010


Hi Dr. Reng,

You're right in some of the things you said, but other things need a 
little tuning. I've salted some responses and corrections in below.

Kevin

>Hi Steve,
>
>I do not use XML yet. Therefore I may be wrong. Please
>correct me, if this is not correct! My understanding is
>this:
>
>o Of course the translation memory system can import
>   XML as easily as FrameMaker. Some systems charge
>   additionally for a FrameMaker filter. Or XML (depending
>   on the EDD/DTD) might need a special definition in the
>   translation memory system to identify the part which
>   needs to be translated (e.g. attributes).

All TM tools I've used come with a filter for FrameMaker. I don't
know of any that charge extra for filters, and none should. My
opinion is that translation memory technology is still at version 1.0,
with only filters to differentiate tools. (Okay, maybe some
segmenting is better, but really, the process is just comparing
bytes, and the bytes have been pretty standard since, well, computers
were invented. No flames, please.) All TM tools will also handle XML,
but you're right that it is more difficult because XML is random as 
an input. Even standards like DITA can be used in wildly different 
ways. Users must define their own filters, which might result in a 
setup fee at the beginning of a project. However, once the filter is 
built, a word of FrameMaker should cost the same to translate as a 
word of XML.

It's worth mentioning that structured FrameMaker filters just like
unstructured FrameMaker. The structure parts are outside the text 
definition parts of the file, so any filter that works on one will 
work on the other. If you want to move to structure within 
FrameMaker, it shouldn't alarm your translation vendor much. The one 
exception, when last I looked, was Alchemy Publisher, but that was 
over six months ago.


>
>o When you give XML files to a translation agency, the
>   resulting translation memory could be re-used with
>   other translation memory systems (via TMX) better
>   than when you use FrameMaker files. FrameMaker files
>   contains lots of information which is handled differently
>   than XML.
>   Therefore the number of pretranslated segments and fuzzy-
>   matches would increase, _if_ you switch the system.
>   (That's just an assumption. This could be wrong.)

TMX is a bit of a sham, actually. True, it is a standard interchange, 
but tools don't necessarily have the same internal markup. A TM 
created in one tool might be unusable in another. TMX would make the 
TM legible to the second tool, but the database itself would not be. 
A six millimeter square peg would not fit a six millimeter round 
hole, even if a team of physicists agreed the standard millimeter was 
used.

You are right that FrameMaker has a lot of information that is handled
differently or not at all in XML, but the same is true going the 
other way. No DTP format compares well with any other, and none 
compares well with XML. Any sentence with no internal markup will be 
the same, but toss in an index marker, cross-reference, etc., and
fuzzy match rates start climbing. OpenTag, and subsequently XLIFF, 
made an effort to address this, but they suffer the same weakness as 
TMX. A standardized way of capturing disparate information doesn't 
help humans move the information around.
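
As a rough illustration of why inline markup hurts matching, here is
a minimal Python sketch. It uses the standard library's difflib as a
stand-in for a TM tool's scoring, which is an assumption: real tools
use their own algorithms and usually weight tags differently. The
segments and the <ph> placeholder are made up for the example.

```python
from difflib import SequenceMatcher


def match_percent(a: str, b: str) -> float:
    """Return a rough character-level similarity score from 0 to 100."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


# Hypothetical segments: the same sentence stored without and with a
# placeholder standing in for a FrameMaker index marker.
plain = "Press the power button to start the device."
marked = "Press the <ph>index marker</ph>power button to start the device."

print(round(match_percent(plain, plain)))   # identical text scores 100
print(round(match_percent(plain, marked)))  # same words, lower score
```

The words a translator sees are identical in both segments, yet the
second comparison no longer counts as an exact match.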

I have had much better luck fixing TMs outside of TM tools. Changing 
formats is not easy, but removing index markers from FrameMaker 
segments, for example, will make them match XML segments, which don't 
have markers of any kind. This is a through-the-looking-glass kind of 
process, but it results in better TM matching. The minimum gain has 
been 10% and my record is 28%. With moderate word and language counts,
the process pays for itself many times over.
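
For anyone who wants to try this kind of cleanup, the following is a
minimal sketch in Python, not the exact process described above. It
assumes the TM has been exported to TMX and that the markers show up
as inline elements inside <seg>. Which inline elements carry index
markers depends on the exporting tool, so the INLINE_TAGS set and the
sample data are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Inline TMX elements treated as removable markup in this sketch.
# Assumption: adjust this set to whatever your tool actually exports.
INLINE_TAGS = {"bpt", "ept", "ph", "it", "ut"}


def strip_inline(seg: ET.Element) -> None:
    """Drop inline-markup children of a <seg>, keeping surrounding text."""
    prev = None  # last child that was kept
    for child in list(seg):
        if child.tag in INLINE_TAGS:
            tail = child.tail or ""
            # Re-attach the text that followed the removed element.
            if prev is None:
                seg.text = (seg.text or "") + tail
            else:
                prev.tail = (prev.tail or "") + tail
            seg.remove(child)
        else:
            prev = child


def clean_tmx(tmx_text: str) -> str:
    """Return the TMX document with inline markup stripped from segments."""
    root = ET.fromstring(tmx_text)
    for seg in root.iter("seg"):
        strip_inline(seg)
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-language translation unit with a marker placeholder.
SAMPLE = """<tmx version="1.4">
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Press the <ph>marker</ph>power button.</seg></tuv>
      <tuv xml:lang="fr"><seg>Appuyez sur le <ph>marker</ph>bouton.</seg></tuv>
    </tu>
  </body>
</tmx>"""

cleaned = clean_tmx(SAMPLE)
print(cleaned)
```

Work on a copy of the export and check a sample of segments by hand
before importing the result back into any tool.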


>
>o Translation cost saving calculations with XML are mostly
>   based on chunking.
>   That means only those chunks are translated which are
>   actually changed. As DITA is supposed to split a FrameMaker
>   file into more XML files as compared to a regular FM file,
>   the files to be translated are smaller. This might save
>   money.
>   However, I could also use FrameMaker insets to have
>   smaller files.
>   Additionally, make sure that you will be notified of
>   terminology changes. Such changes must also be done
>   in already translated files.

Insets, absolutely! I wouldn't move to another format until the one 
you're in is exhausted. The drill these days is a rote prescription 
of DITA plus a content management system to solve all problems, but a 
few shared chapter files and a few insets and a variable here and 
there might solve the problems just as well with no additional 
purchases.

I'm all for XML, and mostly for content management, but neither 
should be used without a good reason. FrameMaker actually comes out 
of the box with a lot of good reasons not to move to XML.

Translation cost is the main part of localization, but keep 
formatting in mind. Automated formatting with XML tools also 
represents a fat savings. However, the smaller the chunks that save
translation cost, the harder and more expensive the publishing
becomes, no matter the source format or the automation involved. A balance
must be struck. If your vendor manages the publishing process, the 
balance shifts a bit toward simplicity.

>
>
>o When you use XML as your primary storage format, infos
>   like table column widths or graphics scaling get lost.
>   I want to have this information present after translation.
>   Therefore I would prefer to use structured FrameMaker and
>   not XML. But that's my opinion.
>   (Possibly FrameMaker does store such information in
>   processing instructions. Or someone wrote a plug-in
>   for this. I do not know.)

Table widths and graphic scaling can be stored in XML, depending on 
the content model being used. CALS tables store column widths quite 
nicely, and the CALS model is incorporated into most XML models (it's
a wheel too scary to reinvent).
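
To show what that looks like, here is a small Python sketch that
reads column widths back out of a CALS-style table. The table content
is made up for the example; the colspec element and its colname and
colwidth attributes come from the CALS table model. Whether a given
publishing tool honors those widths on output is a separate question.

```python
import xml.etree.ElementTree as ET

# Hypothetical CALS-style table: one fixed-width column and one
# proportional column.
CALS_TABLE = """<table>
  <tgroup cols="2">
    <colspec colname="c1" colwidth="1.5in"/>
    <colspec colname="c2" colwidth="2*"/>
    <tbody>
      <row><entry>Part</entry><entry>Description</entry></row>
    </tbody>
  </tgroup>
</table>"""


def column_widths(table_xml: str) -> dict:
    """Map each colspec's colname to its colwidth value."""
    root = ET.fromstring(table_xml)
    return {c.get("colname"): c.get("colwidth") for c in root.iter("colspec")}


print(column_widths(CALS_TABLE))
```

Because the widths live in ordinary attributes, they pass through
translation untouched as long as the filter leaves attributes alone.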


>
>Best regards
>
>Winfried
>
>>  -----Original Message-----
>>  From: Steve Johnson [mailto:chinaski69 at gmail.com]
>>  Sent: Thursday, May 06, 2010 4:27 AM
>>  To: Reng, Dr. Winfried
>>  Cc: FrameMaker Forum
>>  Subject: Re: Frame's File Comparison Feature
>>
>>  Can't translation vendors do memory diffs just as easily on Frame
>>  files vs. XML files?
>>
>>  I don't see the advantage there.
>>
>>  On Wed, May 5, 2010 at 3:04 AM, Reng, Dr. Winfried
>>  <wreng at tycoint.com> wrote:
>>  > Hi,
>>  >
>>  >> What you're considering is (or should be) neither necessary
>>  >> nor desirable. Your translation vendor should be using a
>>  >> translation memory (and you should request a copy of it,
>>  >> since you've paid for it, so that you're not locked into this
>>  >> vendor because it's holding your translation memory hostage).
>>  >>
>>  >> When you send an updated set of files for a book that's
>>  >> already been translated once, the unchanged paragraphs will
>>  >> match the translation memory. Only the portions that are new
>>  >> or changed need to be translated.
>>  >>
>>  >> If your vendor isn't using translation memory, find a new
>>  >> one. If it is using translation memory, there's no point in
>>  >> you trying to dissect files and reassemble them -- you'd gain
>>  >> nothing and risk all kinds of problems.
>>  >
>>  > Of course almost all translation agencies use a translation memory
>>  > system nowadays.
>>  >
>>  > If the vendor uses a translation memory system, such a system can
>>  > easily check the number of non-translated segments (a segment is a
>>  > translation unit) and segments which can be pretranslated or
>>  > translated with the help of fuzzy-matches.
>>  > However, the vendor will still charge for pretranslated segments.
>>  > The reason is that often the terminology must be changed with
>>  > new text. Or references to a previous segment will not be correct
>>  > any longer, because e.g. you inserted another segment. The reference
>>  > may still be correct in English but not in a foreign language.
>>  > The costs per pretranslated segment depend on your vendor, mostly
>>  > around 25 % of non-translated segments.
>>  >
>>  > Best regards
>>  >
>>  > Winfried