Converting images with searchable text

Fred Ridder docudoc at hotmail.com
Sun Jun 23 06:30:20 PDT 2013


Tim DeWees wrote:

> I’m working in a large, complex book (3000+ pages). One of my co-workers was tasked with creating some new images for the book using Visio. I suggested converting the Visio images into PDF “images” and pulling them into the book that way, thinking that the text within the images would be searchable. The text within the original PDF images is searchable. But once the entire book is converted to PDF, the text in the images is no longer searchable.
>
> I understand that I can run OCR on the new PDF file and the text will be made searchable. I was wondering, though, whether there is something I can do in the process of converting the Visio figures and pulling them into the Frame book so that the text would still be searchable after the Frame book is converted to PDF. I did some research and there seemed to be varying opinions on this, with some recommending third-party software. But I was wondering what I could do to improve this process without using third-party software?
>
> Thanks in advance for your help.

In theory, any supported vector-based graphic format (PDF, EPS, WMF, EMF, SVG), as opposed to a raster image format (TIFF, GIF, PNG, JPEG, etc.), should produce searchable text because the text is rendered in the graphic as glyphs rather than as a pattern of pixels.

Ever since FM7 made its use reliable, I have used PDF as my graphics meta-file format of choice for source files of all types--Visio, Illustrator, Excel, PowerPoint, etc.  In all cases, the result has been fully searchable text in the PDF deliverables.  The only time that text in graphics is not searchable is when I have used a PDF that was created from a CAD or schematic capture application, since these tools render text as a series of vector graphics objects rather than as glyphs from a scalable font library.

I'd have to guess that the problem lies in how the PDFs were created from Visio. What version of Visio are you using? And what method are you using to create the PDFs? If you have some recent version of Acrobat installed (and it might help to know which version), there can be four different methods of making a PDF which use at least two completely separate mechanisms (one from Adobe, one from Microsoft).
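
As a rough first check on the intermediate PDFs, something like the following Python sketch might help narrow things down. It is only a heuristic under stated assumptions: it uses nothing but the standard library, the function names are made up for illustration, and it simply looks for font resources (the "/Font" key) in the raw file and in any Flate-compressed streams. A PDF whose text was converted to outlines usually carries no fonts at all, but the presence of a font does not prove that every label in the figure is searchable, and encrypted or unusually encoded files will defeat it.

```python
import re
import zlib

# Matches the body of each "stream ... endstream" section in a PDF file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def iter_chunks(pdf_bytes):
    """Yield the raw file bytes, then each stream that inflates with zlib."""
    yield pdf_bytes
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates a stream that lost a trailing byte.
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            # Not Flate-compressed (or not decodable here); skip it.
            continue


def has_font_resources(pdf_bytes):
    """Heuristic: True if any font resource is referenced in the file."""
    return any(b"/Font" in chunk for chunk in iter_chunks(pdf_bytes))
```

To try it on a real file (the file name here is hypothetical), read the bytes with open("figure.pdf", "rb").read() and pass them to has_font_resources(). A result of False on a PDF exported from Visio would suggest that the export step is turning the text into outlines.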

-Fred Ridder


 		 	   		  