OT: Cannot extract text from PDF

Lise Bible rentagoodbook at gmail.com
Tue Mar 10 11:15:42 PDT 2009


I'm not an Acrobat expert by any means, but have you tried OCR Text
Recognition? I'm on Acrobat 8, and for me, that's Document > OCR Text
Recognition > Recognize Text Using OCR...
I think it's also available in Acrobat 7, but I'm not sure.
I've used it occasionally, with varying degrees of success.

Figured it might be worth a shot.
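
One more thought before retyping: the all-caps sample Roger quoted below does not look random. Every letter seems to sit exactly 26 character codes below the character it should be (lowercase "s" turns up as "Y", capital "A" as an apostrophe). Here is a minimal Python sketch of undoing that shift. The offset of 26 is only inferred from the sample, not read from the PDF's fonts, and the names GARBLED, OFFSET and unshift are made up for illustration. The "_" is treated as a space, which is a guess: it also appears to stand in for "y" and for punctuation that did not survive the export.

```python
# Sample copied from Roger's message (the two quoted lines joined together).
GARBLED = (
    r"___'YYUIOGZK_SKSHKXY_YNGRR_HK_SKSHKXY_UL_ZNK_V[HROI_UX_;=5_LGI[RZ___]NU_NG\K_"
    r"GT_OTZKXKYZ_OT_GZZKTJOTM_')+_SKKZOTMY_GTJ_VXUMXGSY_"
)

# Assumption: a constant shift, inferred from the sample rather than the PDF.
OFFSET = 26


def unshift(text: str, offset: int = OFFSET) -> str:
    """Add a fixed offset to each character code; treat "_" as a space."""
    return "".join(" " if ch == "_" else chr(ord(ch) + offset) for ch in text)


# Collapse the runs of spaces left behind by consecutive "_" placeholders.
decoded = " ".join(unshift(GARBLED).split())
print(decoded)
```

On this sample the result reads as plain English starting "Associate members shall be members of the public", with "faculty" coming out as "facult" because of the ambiguous "_". If the Save As Text output for the whole document is shifted the same way, running it through something like this and fixing the "_" spots by hand might be quicker than retyping.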

-Lise

On Tue, Mar 10, 2009 at 12:59 PM, Art Campbell <art.campbell at gmail.com> wrote:

> Well, at least you've got it down to a font problem.
>
> If you don't have access to a Mac that may have the missing fonts, you
> may want to try a third-party tool, such as:
> http://www.pdftodocconverterpro.com/ which at least gives you a free
> trial.
>
> But if you can't find a Mac and the converters don't work, you
> probably need to start typing.
>
> Art
>
> Art Campbell
>               art.campbell at gmail.com
>  "... In my opinion, there's nothing in this world beats a '52
> Vincent and a redheaded girl." -- Richard Thompson
>                                                      No disclaimers apply.
>                                                               DoD 358
>
>
>
> On Tue, Mar 10, 2009 at 1:51 PM, Shuttleworth, Roger
> <Roger_Shuttleworth at tvworks.com> wrote:
> > Wow, that was worth a try! However...
> >
> > I reprinted the PDF to the Adobe PDF printer. No problems. The file
> > displays OK.
> >
> > I tried Save As RTF from the redistilled version and got an informative
> > message:
> >
> > "Acrobat was able to make this document accessible but found the
> > following oddities:
> >
> > Some font(s) missing information needed to determine the characters that
> > correspond to the symbols (glyphs) in the font. [90 of 90 glyphs (Apple
> > Chancery)]"
> >
> > [I wonder what "accessible" means in this context? I'm none too familiar
> > with Accessibility settings, but when I tried a Full Check it said, "All of
> > the text in this document lacks a language specification." But perhaps
> > I'm barking up the wrong tree here.]
> >
> > Apple Chancery is indeed an embedded subset in the original PDF.
> > The resultant RTF is rather interesting but of no use to me. It consists
> > of all caps, and a sample appears below:
> >
> ___'YYUIOGZK_SKSHKXY_YNGRR_HK_SKSHKXY_UL_ZNK_V[HROI_UX_;=5_LGI[RZ___]NU_NG\K_
> GT_OTZKXKYZ_OT_GZZKTJOTM_')+_SKKZOTMY_GTJ_VXUMXGSY_
> >
> > Saving as text produces similar all-cap text.
> >
> > It's beginning to look as though I'll have to retype the doc...the
> > original source doc is lost (not by me, I might add!).
> >
> > Roger
> >
> >
> >
> > -----Original Message-----
> > From: knowhowpro at gmail.com [mailto:knowhowpro at gmail.com] On Behalf Of
> > Peter Gold
> > Sent: March 10, 2009 1:08 PM
> > To: Shuttleworth, Roger
> > Cc: Art Campbell; framers at lists.frameusers.com
> > Subject: Re: OT: Cannot extract text from PDF
> >
> > Have you tried:
> >
> > * Copy/Paste
> > * Printing to PDF from Acrobat Pro, then trying to extract text by Save
> >   As?
> >
> > HTH
> >
> > Regards,
> >
> > Peter Gold
> > KnowHow ProServices
> >
> > On Tue, Mar 10, 2009 at 11:54 AM, Shuttleworth, Roger
> > <Roger_Shuttleworth at tvworks.com> wrote:
> >> Thanks for your help.
> >>
> >> I can save other PDFs without a problem.
> >> My Acrobat version is Acrobat Pro 7.1.0.
> >> The Application was AppleWorks. The PDF Producer is Mac OSX 10.3.9
> >> Quartz PdfContext according to the Document Properties window. There seems
> >> to be nothing else interesting in the metadata, and no security applied.
> >>
> >> Roger
> >>
> >> -----Original Message-----
> >> From: knowhowpro at gmail.com [mailto:knowhowpro at gmail.com] On Behalf Of
> >> Peter Gold
> >> Sent: March 10, 2009 12:47 PM
> >> To: Art Campbell
> >> Cc: Shuttleworth, Roger; framers at lists.frameusers.com
> >> Subject: Re: OT: Cannot extract text from PDF
> >>
> >>>> I have a PDF that was created using Mac OSX 10.3.9. It displays fine
> >>>> on my Windows XP SP3 machine, but I cannot extract the text and create a
> >>>> Word doc. When I try Save As, I get nothing produced except an error:
> >>>>
> >>>>
> >>>>
> >>>> Bad PDF; could not read page structure. <Bad PDF; error in processing
> >>>> fonts: cannot find CMAP resource file> [33]
> >>
> >> If the PDF was made using Mac's Preview application, this could be the
> >> problem; check document info for Creator.
> >>
> >> If you get the same error when trying to Save As with all documents,
> >> the Acrobat installation may be corrupted.
> >>
> >