Processing PDF: How to Go from PDF to E-text to Audio
Gaeir Dietrich
Director
High Tech Center Training Unit of the California Community Colleges
Foothill Community College District
PDF from Publishers
Portable document format (PDF)
  Reads the same on any computer
  Looks like the book
  Smaller than TIFFs
  Contains all the text
Always check to make sure the book is the right one!
Easy for publishers
Requesting through ATN
Access Text Network
  Now free for requesting files from ATN-member publishers
  Paid membership to exchange files
  www.accesstext.org
Not all publishers
  But ATN does have the largest ones
Other Resources at ATN
Accessible Textbook Finder
  http://www.accesstext.org/atf.php
Link to Publisher Lookup
  http://www.publisherlookup.org/
  Will have to contact non-ATN member publishers directly
Using Publisher PDFs
Sometimes students can use files directly
Often files will need further processing for student use
At the very least, large files may need to be broken into chapters
PDF Strengths
Good format for large print
  Cropping
  Fit to page on large pages
  Print sections on large pages (tiling)
Adobe Reader has some nice features
  Change colors
  Reflow
  Limited voicing
Works on both Mac and PC
Easy for most publishers to create
PDF Weaknesses
Not always fully accessible
  Screen readers do not always like them, even when they are text-based
  Reading order can be problematic
May be graphics (pictures of text)
May have too much security
As an Aside…
When faculty create PDFs…
  The PDF always started as something else…usually a Word file
  Try to get the starting document if the student prefers audio
Security concerns?
  Word files can be password protected
  Office Button > Prepare > Encrypt
Types of PDF Documents
Text-based
  Text can be selected
Graphical
  Picture of text (i.e., a graphic)
  Text cannot be selected
Use text-select tool to tell the difference
  Files may be “locked”
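The text-select check can be approximated in a script once some tool has pulled the text layer from each page. The following is a minimal sketch, not a feature of Acrobat: `page_texts` is a hypothetical list holding one extracted string per page (how you obtain it depends on your PDF tool), and the 20-character threshold is an arbitrary illustration.

```python
def classify_pages(page_texts, min_chars=20):
    """Label each page 'text' or 'graphical' from its extracted text layer.

    page_texts: hypothetical list of strings, one per page, produced by
    whatever text-extraction tool you use (an assumption, not Acrobat output).
    min_chars: illustrative threshold; a page with fewer non-space characters
    is treated as a picture of text that will need OCR.
    """
    labels = []
    for text in page_texts:
        visible = "".join(text.split())  # drop all whitespace
        labels.append("text" if len(visible) >= min_chars else "graphical")
    return labels


def needs_ocr(page_texts, min_chars=20):
    """Return True if any page looks graphical."""
    return "graphical" in classify_pages(page_texts, min_chars)
```

A file that mixes both kinds of pages is reported as needing OCR, which matches the advice to check before choosing a workflow.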
Processing PDFs
Adobe Acrobat Professional
  Check on College Buys for discount
Good OCR program
  ABBYY FineReader
  Nuance OmniPage
IF you are a Kurzweil campus, you will also need Kurzweil
Adobe Tools
Adobe Reader
  Free
  Useful for students who need minimal accessibility features
  http://www.adobe.com/products/reader/
Adobe Acrobat Professional
  Essential for alt media specialists
  Extract text, create accessible PDFs, enable Adobe Reader features
  www.uscollegebuy.com
  Discounted price
Acrobat Reader
Reads aloud
  But does not highlight or track
Enlarges text
  Nice reflow feature
Changes text/background colors
Text highlighting, sticky notes, and comments
Access for text-based PDFs
Production Features in Reader
Really designed for reading, not reformatting
Export PDF
  Subscription service (about $20/year)
  Upload PDF file, service auto-converts to Word, download
Process with Acrobat Pro
Cropping
Enlargement for printing
Tiling
Extracting/deleting pages
Combining/inserting pages
Text extraction
  Works best with text-based PDF
  Does have built-in OCR capability
Customize Quick Tools
Click on the “gear”
View > Show/hide > Toolbar Items > Quick Tools
Quick Tools Menu
Customize
Please Note
To enable single-key shortcuts
  Open Preferences dialog box: Ctrl + K
  Under General, select Use Single-Key Accelerators To Access Tools (first checkbox under Basic Tools)
Cropping
Tools > Pages > Crop
Shortcut: C (Please note: this shortcut brings up the mouse-driven cropping tool; you must double-click to open the dialog box!)
Crop Tool
Crop Toolbox
Enlarging
Choose paper size/printer
File > Print > Size…to Fit
Shortcut: Ctrl + P (tab through)
Tip: Crop document before enlarging
Print to Fit
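The tip about cropping first comes down to simple arithmetic: the smaller the page box, the more it can be enlarged before it hits the edge of the paper. This is a rough sketch that assumes uniform scaling and ignores printer margins; the sizes are made-up inches for illustration, not values from Acrobat.

```python
def fit_scale(page_w, page_h, paper_w, paper_h):
    """Largest uniform scale at which the page still fits on the paper."""
    return min(paper_w / page_w, paper_h / page_h)


# Illustrative sizes in inches (assumed, not taken from the slides).
full_page = fit_scale(8.5, 11.0, 11.0, 17.0)  # uncropped letter page on 11 x 17 paper
cropped = fit_scale(6.5, 9.0, 11.0, 17.0)     # same page with the margins cropped away

print(f"uncropped: {full_page:.2f}x, cropped: {cropped:.2f}x")
```

With these numbers the cropped page enlarges noticeably more than the uncropped one, so the text itself prints larger on the same sheet.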
Tiling
Choose paper size/printer
File > Print > Poster > Tile Scale and Overlap
Shortcut: Ctrl + P (tab through)
Tip: Crop document before tiling
Enlarge with Tiling
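To estimate how many sheets a tiled page will take, the scale and overlap settings can be run through a short calculation. This sketch assumes each sheet after the first contributes its size minus the overlap; Acrobat's own layout (margins, cut marks, labels) may differ, so treat the result as an estimate.

```python
import math


def tiles_needed(page_w, page_h, scale, paper_w, paper_h, overlap=0.0):
    """Estimate (sheets across, sheets down) for a tiled enlargement.

    All sizes share one unit (e.g., inches). overlap is the strip repeated
    on neighbouring sheets; it must be smaller than the paper itself.
    """
    if overlap >= min(paper_w, paper_h):
        raise ValueError("overlap must be smaller than the paper size")
    out_w, out_h = page_w * scale, page_h * scale
    # n sheets cover n * paper - (n - 1) * overlap, so solve for n and round up.
    across = max(1, math.ceil((out_w - overlap) / (paper_w - overlap)))
    down = max(1, math.ceil((out_h - overlap) / (paper_h - overlap)))
    return across, down
```

For example, a letter page at 200% on letter paper needs a 2 x 2 grid with no overlap, and a half-inch overlap pushes it to 3 x 3, which is another reason to crop before tiling.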
Extracting Pages
Tools > Pages > Extract
Delete Shortcut: Ctrl + Shift + D
Extract Pages Shortcut: Alt V + T + P (opens Pages pane; F6 focuses in pane and can arrow down)
Extraction Tool
Tips for Extracting Chapters
Crop on complete file before extracting
Work on a copy!!!!!
Extract from end toward front!
Use table of contents to help
Place focus on first page of chapter to extract (beginning with last)
Starting from the Back
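Working from the end toward the front matters when extracted pages are also deleted from the working copy: the pages in front of the extracted chapter keep their numbers. A minimal sketch of planning the ranges from a table of contents follows. `chapter_starts` is a hypothetical list you would build by hand, using the PDF's page positions rather than the printed page numbers.

```python
def chapter_ranges(chapter_starts, last_page):
    """Turn chapter start pages into (title, first, last) ranges, last chapter first.

    chapter_starts: hypothetical list of (title, first_page) pairs sorted by
    page, using 1-based positions in the PDF (not printed page numbers).
    last_page: position of the final page to include in the last chapter.
    """
    ranges = []
    for i, (title, first) in enumerate(chapter_starts):
        if i + 1 < len(chapter_starts):
            last = chapter_starts[i + 1][1] - 1  # stop just before the next chapter
        else:
            last = last_page
        ranges.append((title, first, last))
    return list(reversed(ranges))  # extract from the back of the book forward
```

Printing the returned list gives a checklist to follow in the extraction dialog, starting with the last chapter.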
Combining
File > Pages > Insert
OR
Create > Combine files
Inserting Pages
Combining Pages
Auto Extracting Text
File > Save As > MS Word
  Retains styles and paragraphs
File > Save As > More options…
  Text (Accessible)
    Loses styles, places hard returns at end of line
  Text (Plain)
    Loses styles, keeps paragraphs
Shortcut: Alt F + A
Save As Options
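If a file has already been saved as Text (Accessible), the hard returns at the end of each line can be undone with a short script. This sketch assumes paragraphs are separated by at least one blank line in the export, which may not hold for every file, and it does not repair words hyphenated across lines.

```python
def unwrap_paragraphs(text):
    """Join hard-wrapped lines into paragraphs, using blank lines as boundaries."""
    paragraphs, current = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "\n\n".join(paragraphs)
```

Spot-check the result against the PDF, since headings and list items have no blank line after them in some exports and will be merged into the following paragraph.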
Better Text Extraction
OCR programs analyze text and structure
Acrobat Pro has built-in OCR, but other programs provide more control
  Can control which text to include
More Control over Text
For graphical PDFs
Or
To maintain more control over extracting text from text-based PDFs
Use an OCR program!
Processing Graphical PDFs
Must run optical character recognition (OCR)
  Computers cannot read pictures
  OCR programs recognize the “characters” in the picture
How you process the file depends on the end format the student wants!
Want to Stay in PDF?
Sometimes students do want a text-based PDF
Can OCR in Acrobat Pro
  Tools > Recognize Text
Under Tools
Want Text Out
OmniPage or FineReader
  FineReader generally easier to learn
  Save to Word or HTML or Text based on student preference
Use virtual printer with Kurzweil
  Create KESI files
R&W
  Save as Word
Which One When?
Want a Word file?
  Best choice is OmniPage or FineReader
Want a Kurzweil document?
  Use Kurzweil to process the PDF
For students to do themselves?
  Whichever program they prefer
Why?
OCR programs are designed to make extraction and editing easy
Document readers (R&W, Kurzweil, etc.) are designed to make reading easy…NOT editing.
NEVER!!!
Do NOT run OCR with FineReader or OmniPage…save to PDF…and then take into Kurzweil, R&W, etc.
Kurzweil, R&W, WYNN will run their own OCR on the PDF!
  Wastes time, adds error to do OCR twice
OCR Programs
Treat PDFs the same as a TIFF
  If you OCR scanned documents, use the same process
Load image file
Select zones
Create templates as needed
OCR Process Details
Crop before loading into OCR engine
Turn on multiple languages as needed
  If doing math, turn on Greek
  Only turn on the languages you need
Edit in the OCR program
  Some OCR programs have font matching features
Save to Word
Captions and Such
For students who want audio or who are using screen readers
  Separate the main body of the text and the “ancillary text” (captions, sidebars, footnotes)
Create two documents
  00 Chapter and 00A Chapter
Allows the student to hear main text uninterrupted
Two Doc Workflow
Open PDF in OCR Program
Analyze layout for entire document
Save a copy
On one copy…delete all ancillary text
  Save to Word as 00 Chapter
On other copy…delete all main body text
  Save as 00A Chapter
Keep page numbers in both documents!
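The two-document idea can be pictured as a simple split over labeled blocks of text. The sketch below is only an illustration of the logic, not what an OCR program does internally. The `(page, kind, text)` tuples are a hypothetical representation of zones you have already labeled, and reading "00" as a two-digit chapter number is an assumption.

```python
ANCILLARY_KINDS = {"caption", "sidebar", "footnote"}


def split_chapter(blocks):
    """Split labeled blocks into (main_lines, ancillary_lines).

    blocks: hypothetical list of (page, kind, text) tuples in reading order,
    where kind is "body" or one of ANCILLARY_KINDS.
    A "Page N" marker goes into BOTH outputs whenever the page changes,
    so page numbers survive in both documents.
    """
    main, ancillary = [], []
    current_page = None
    for page, kind, text in blocks:
        if page != current_page:
            marker = f"Page {page}"
            main.append(marker)
            ancillary.append(marker)
            current_page = page
        if kind in ANCILLARY_KINDS:
            ancillary.append(text)
        else:
            main.append(text)
    return main, ancillary


def chapter_filenames(chapter_number):
    """Names in the '00 Chapter' / '00A Chapter' pattern (two-digit number assumed)."""
    return f"{chapter_number:02d} Chapter", f"{chapter_number:02d}A Chapter"
```

Keeping the page markers in both outputs lets a student jump from a figure reference in the main text to the matching caption in the "A" file.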
Once in Word
Learn to use “show hidden”
  Ctrl + Shift + 8
Beware of the optional hyphen
  Search and replace to delete
  Search for ^- replace with nothing
Run spell check
Use styles to structure files for braille program
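The same optional-hyphen cleanup can be done outside Word on a saved text file. This sketch assumes the optional hyphens arrive in the exported text as the Unicode soft hyphen (U+00AD); ordinary hyphens are left alone.

```python
SOFT_HYPHEN = "\u00ad"  # assumed export form of Word's optional hyphen (^-)


def strip_optional_hyphens(text):
    """Remove soft hyphens while keeping real hyphens untouched."""
    return text.replace(SOFT_HYPHEN, "")
```

For instance, `strip_optional_hyphens("acces\u00adsible")` returns `"accessible"`, while `"well-known"` comes back unchanged.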
Converting Files
Mobile Readers?
Check formats that device can handle
  Some handle PDF and DOC, some do not
All readers handle TXT
  Also called text, ASCII
  Can save from Word as plain text
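When a device really does need strict ASCII, curly quotes and accented letters have to be flattened first. The following is a rough sketch using only the Python standard library. The punctuation table is a small illustrative selection, and any character it cannot map is simply dropped, so check the result for lost symbols.

```python
import unicodedata

# Illustrative replacements for common "smart" punctuation.
PUNCTUATION = {
    "\u2018": "'", "\u2019": "'",   # single curly quotes
    "\u201c": '"', "\u201d": '"',   # double curly quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u2026": "...",                # ellipsis
}


def to_ascii(text):
    """Flatten text to plain ASCII, dropping anything that has no ASCII form."""
    for fancy, plain in PUNCTUATION.items():
        text = text.replace(fancy, plain)
    # NFKD splits accented letters into base letter + combining mark;
    # encoding with "ignore" then discards the marks and other non-ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

Math symbols and non-Latin scripts do not survive this step, so it suits plain prose better than technical chapters.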
Magic Conversion Tool
Calibre
  Converts to and from many formats
  Fairly intuitive
  Free!
  http://calibre-ebook.com/
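Calibre also installs a command-line converter, `ebook-convert`, which takes an input file and an output file and picks the output format from the file extension. A minimal sketch of calling it from a script follows; the file names are hypothetical, and it assumes Calibre is installed with `ebook-convert` on the PATH.

```python
import shutil
import subprocess


def build_convert_command(source, target):
    """Build the ebook-convert call; the output format follows target's extension."""
    return ["ebook-convert", source, target]


def convert(source, target):
    """Run the conversion, failing clearly if Calibre's tool is not available."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed and on PATH?")
    subprocess.run(build_convert_command(source, target), check=True)


# Hypothetical usage: convert("03 Chapter.html", "03 Chapter.epub")
```

Scripting the call is mainly useful when a whole folder of chapter files has to be converted for the same device.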
Another Conversion Tool
TechAdapt
  http://www.techadapt.com/
TechAdapt Accessible Media Center (TAMC)
  For converting NIMAS and DAISY
DAISY to…
  RTF
  HTML
File Transfer
Can use Dropbox or Box to transfer files for most readers
Kindle and iPad can often use e-mail
Resource Info
Gaeir Dietrich
  [email protected]
  408-996-6047
www.htctu.net
  Alt media listserv
  Manuals online