Extracting art from Word docs

Fred Ridder docudoc at hotmail.com
Mon Mar 28 18:12:47 PDT 2011


The WinZip technique only works for the new Microsoft Word formats (.docx, .docm, .dotx rather than .doc and .dot) that were introduced in Office 2007 and continued in Office 2010. These file formats all contain a collection of XML text files and graphics objects that have been zipped into a single wrapper. 
 
Word 2003 was still the era of the monolithic binary file format, and Word 2003 files *cannot* be opened with WinZip regardless of the filename extension. If you open a Word 2003 .doc file in Word 2007 or Word 2010, it will initially keep the file in "compatibility mode", which is not the zip-type file. Even if you specifically save it in Word 2007 .docx format, it will not necessarily handle the graphics the same way as if it were a native Word 2007 file. All of which leaves you with the other two methods of extracting graphics if you are starting with a Word 2003 or older file.
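
For anyone who would rather check a file than guess from its extension, here is a minimal Python sketch (not something from the thread itself). It assumes the older binary files begin with the usual OLE compound-file signature bytes and that the newer formats pass the standard library's zip test. The function name classify_word_file and its return strings are made up for illustration.

```python
import zipfile

# Signature at the start of OLE compound files, the container used by
# the older binary .doc/.dot formats (assumption: the file is intact).
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def classify_word_file(path):
    """Return "binary", "zip", or "unknown" based on file content.

    The filename extension is deliberately ignored; only the bytes
    inside the file are examined.
    """
    with open(path, "rb") as f:
        header = f.read(8)
    if header == OLE_MAGIC:
        return "binary"
    if zipfile.is_zipfile(path):
        return "zip"
    return "unknown"
```

A result of "zip" suggests the WinZip approach should work on that file; "binary" suggests it will not, whatever the extension says.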
 
-Fred Ridder
 


Date: Mon, 28 Mar 2011 17:02:31 -0700
From: mkrupp128 at yahoo.com
Subject: RE: Extracting art from Word docs
To: framers at lists.frameusers.com; docudoc at hotmail.com






Thanks for the good advice, Fred!
 
I'm with you on all but the last point. Unfortunately, this is one of those legacy docs that has a long and checkered history. It was originally done in PageMaker 6.5, and it was only through a series of gyrations and extractions that I got a Word doc at all. No source art files, nobody left to tell the tale!
 
I tried changing the extension, but I have only Word 2003 at work, and WinZip wouldn't buy it. Will have to try it at home with a more recent version, but I do see your point. Will try more maneuvers tomorrow.
 
This list is such a great resource! You've saved me enormous amounts of work so many times!
 
Thanks again,
Marguerite
 

--- On Mon, 3/28/11, Fred Ridder <docudoc at hotmail.com> wrote:


From: Fred Ridder <docudoc at hotmail.com>
Subject: RE: Extracting art from Word docs
To: mkrupp128 at yahoo.com, framers at lists.frameusers.com
Date: Monday, March 28, 2011, 6:03 PM




Marguerite Krupp wrote:

>Does the technique of changing the file extension to .zip require 
>that the graphics be imported by reference, so they exist as separate 
>files when unpacked?

First, note that it is *not* necessary to change the filename extension. All that is necessary is to open it with a tool like WinZip that looks past the extension to see what's inside the file itself. 
 
Second, the word/media folder you will find inside the Word file will contain a graphic object for every graphic in the document whether it was pasted, inserted by reference, or embedded as an editable object. Graphics that are in a vector format (WMF, Visio objects, etc.) are in the word/media folder as .emf or .wmf objects, and raster graphics seem to be in .png format. I was just working on a document that had a mixture of pasted vector figures, pasted raster images, and embedded Visio drawing objects, and all of them were present in the word/media folder.
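
The same extraction can be scripted instead of done by hand in WinZip. What follows is only a sketch in Python, assuming the file really is one of the zip-based formats and that its graphics sit under word/media as described above. The function name extract_media and the output folder argument are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_media(docx_path, out_dir):
    """Copy every entry under word/media/ into out_dir.

    Returns the list of files written. The document is opened as a
    zip archive directly, so no renaming to .zip is needed.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(docx_path) as zf:
        for info in zf.infolist():
            name = info.filename
            if name.startswith("word/media/") and not info.is_dir():
                # Keep only the base name so files land flat in out_dir.
                target = out / Path(name).name
                target.write_bytes(zf.read(info))
                saved.append(target)
    return saved
```

Calling extract_media("manual.docx", "art") would, under those assumptions, leave the .emf, .wmf, and .png objects in a folder named art. A binary Word 2003 file will raise zipfile.BadZipFile instead.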
 
Third, if the graphics were imported by reference you'd already have separate external files for each one, so there seems to be little point in extracting another copy from the Word document.
 
-Fred Ridder  
 		 	   		  