Extracting your data shouldn’t be like pulling teeth
Turning Content into Data With Intelligent Data Extraction
Advanced Capture from DocuFi, Inc.
©2014 DocuFi
Time moves on…
…many businesses have moved beyond scanning for storage purposes only.
Users want the brainwork taken out of working with their scans and files.
Capture software should see the images, extract the content, and integrate it into the workflow.
Brain mri.jpg, National Institutes of Health
Recognize Extract Integrate
Key Elements of Intelligent Data Capture
Recognition technologies such as OCR and barcode recognition can be used to pull data from structured or unstructured scans or existing files painlessly.
OCR has had the greatest impact on the growth of intelligent data extraction, and its potential continues to grow as the technology improves.
Barcode recognition offers the most trustworthy recognition technology for data capture and is widely deployed.
See What Can Barcodes Do for Me?
Other Recognition Technologies
OMR (Optical Mark Recognition)
• captures human-marked data from document forms such as surveys and tests
• continues to improve in accuracy and demand
ICR (Intelligent Character Recognition)
• handwriting recognition
• not as accurate as OCR
• plays a limited role in some capture systems
• continues to improve in accuracy and demand
After the data has been captured (from barcode, OCR, etc.), pattern matching technology identifies the key data.
Regular expressions (regex) provide a fast and powerful method to search, extract and replace specific data found within scanned documents.
A regular expression is essentially a special text string that describes a search pattern. You could think of regular expressions as extremely powerful wildcards.
See Using Regular Expressions in Document Management Data Capture and Indexing
Regex’s Lookahead, Lookbehind and Line Item Extraction features go beyond basic zonal OCR and let you identify and extract data from unstructured documents. They let you search for an identifiable keyword or string, like “PO Number”, and then apply a word pattern to identify the desired text to extract.
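As an illustration of the keyword-then-pattern idea, here is a minimal Python sketch using the standard `re` module. The sample text, the "PO Number: " label format and the variable names are assumptions made up for this example; they are not taken from ImageRamp or from a real document.

```python
import re

# Hypothetical OCR output from a scanned purchase order (illustrative only).
ocr_text = """ACME Supply Co.
PO Number: 4500123
Ship Date: 2014-03-18
"""

# Lookbehind: match digits only when they directly follow the keyword "PO Number: ".
po_pattern = re.compile(r"(?<=PO Number: )\d+")

# Lookahead: match a word only when " Date:" comes right after it.
label_pattern = re.compile(r"\w+(?= Date:)")

po_match = po_pattern.search(ocr_text)
po_number = po_match.group(0) if po_match else None
date_labels = label_pattern.findall(ocr_text)

print(po_number)    # the digits after the keyword
print(date_labels)  # words that are followed by " Date:"
```

Note that Python's `re` module only accepts fixed-width lookbehind patterns, so a label with variable spacing is usually easier to handle with a capturing group such as `PO Number:\s*(\d+)`.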
There’s a Mountain of It!
Here is a partial invoice where you might need to capture the “Catalogue Number” with line item extraction technology.
Real World Example
So once the key data has been identified or “extracted”, how can it be used?
Split Files
A large single file can be split into multiple files based on information extracted from barcodes and content.
Name Files and Folders
Name files, folders and subfolders with extracted information from the file or system information.
Route Files
Route the files to another directory (and even create the folder and subfolder names) using content.
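Naming and routing from extracted values could be sketched as follows. This is only an illustration in Python: the function names, the vendor/invoice folder layout and the sample values are hypothetical, and this is not how ImageRamp itself is implemented.

```python
import re
import shutil
from pathlib import Path

def safe_name(value: str) -> str:
    """Replace characters that are unsafe in file or folder names with underscores."""
    return re.sub(r"[^A-Za-z0-9._-]+", "_", value.strip()).strip("._")

def build_destination(root: Path, vendor: str, invoice_number: str,
                      suffix: str = ".pdf") -> Path:
    """Build root/<vendor>/<invoice_number><suffix> from extracted values."""
    return root / safe_name(vendor) / f"{safe_name(invoice_number)}{suffix}"

def route_file(source: Path, destination: Path) -> Path:
    """Create any missing folders, then move the file to its new name and location."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(destination))
    return destination

# Hypothetical extracted values; only the target path is computed here, nothing is moved.
target = build_destination(Path("archive"), "ACME Supply Co.", "INV/2014-0042")
print(target)
```

Calling `route_file(Path("scan_0001.pdf"), target)` would then create the vendor folder if needed and move the scan into it under its new name.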
Index
Create indexes from extracted information for the “searchable” fields.
Create PDF Bookmarks
Create PDF bookmarks based on extracted information.
Validation
Data can be validated against business rules to reduce errors.
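A simple way to picture validation is a function that checks each extracted field against a rule and reports what failed. The Python sketch below assumes three made-up rules (an "INV-" plus six digits number format, a YYYY-MM-DD date and a positive total); real business rules would differ.

```python
import re
from datetime import datetime

def validate_invoice(fields: dict) -> list:
    """Return a list of rule violations; an empty list means the record passed."""
    errors = []

    # Rule 1 (assumed format): invoice numbers look like INV- followed by six digits.
    if not re.fullmatch(r"INV-\d{6}", fields.get("invoice_number", "")):
        errors.append("invoice_number must look like INV-000000")

    # Rule 2: the date must be a real calendar date in YYYY-MM-DD form.
    try:
        datetime.strptime(fields.get("invoice_date", ""), "%Y-%m-%d")
    except ValueError:
        errors.append("invoice_date must be a valid YYYY-MM-DD date")

    # Rule 3: the total must be a number greater than zero.
    try:
        if float(fields.get("total", "")) <= 0:
            errors.append("total must be greater than zero")
    except ValueError:
        errors.append("total must be a number")

    return errors

# Hypothetical extracted records, one clean and one with OCR mistakes.
good = {"invoice_number": "INV-000042", "invoice_date": "2014-03-18", "total": "118.50"}
bad = {"invoice_number": "INV-12", "invoice_date": "2014-02-30", "total": "abc"}

print(validate_invoice(good))
print(validate_invoice(bad))
```

Records that return a non-empty list can be held back for manual review instead of being passed on to the next system.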
Integrate
Integration means sharing the information with:
• A simple search and retrieval system
• A Document Management (DM) system
• An Enterprise Content Management (ECM) system
• A back-end application such as an Enterprise Resource Planning (ERP) system
Molaire sur implant, jbessade — Travail, www.fr.wikipedia.org
Henry Schein, Dentrix, Dentrix Enterprise, Dentrix Ascend, Easy Dental, Viive, DentalVision, axiUm
… ImageRamp can share the extracted data with anyone who can accept a standard XML or CSV file
Laserfiche
Filenet
MyMedicalRecords
Eaglesoft
Allscripts
Dentrix
CSV or XML
Anyone
Documentum
Epic
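To show what "a standard XML or CSV file" of extracted data might look like, here is a small Python sketch using only the standard library. The field names, the `documents`/`document` element names and the sample records are assumptions for illustration; they are not ImageRamp's actual export layout.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical extracted records (illustrative values only).
records = [
    {"invoice_number": "INV-000042", "vendor": "ACME Supply Co.", "total": "118.50"},
    {"invoice_number": "INV-000043", "vendor": "Globex", "total": "72.00"},
]

def to_csv(rows):
    """Serialize the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def to_xml(rows):
    """Serialize the records as XML text, one <document> element per record."""
    root = ET.Element("documents")
    for row in rows:
        doc = ET.SubElement(root, "document")
        for key, value in row.items():
            ET.SubElement(doc, key).text = value
    return ET.tostring(root, encoding="unicode")

csv_text = to_csv(records)
xml_text = to_xml(records)
print(csv_text)
print(xml_text)
```

Either text can be written to a file and picked up by whichever receiving system accepts that format.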
So smile, this is where the content becomes data.
There’s a Mountain of It!
If a stack of invoices were scanned at one time, the file could be split at each unique occurrence of the Invoice Number and named with the extracted invoice number. Furthermore, the Invoice Number could be shared with an AP system.
The Catalogue Numbers could be extracted and shared with an ERP for inventory purposes.
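The invoice example can be sketched in Python as grouping page text by invoice number and collecting catalogue numbers per invoice. The page text, the "Invoice Number:" label and the two-letters-dash-four-digits catalogue format are all invented for this illustration, and the actual PDF splitting step is left out.

```python
import re

# Assumed label and catalogue number formats (hypothetical, for illustration only).
INVOICE_RE = re.compile(r"Invoice Number:\s*(\S+)")
CATALOGUE_RE = re.compile(r"^([A-Z]{2}-\d{4})\s+", re.MULTILINE)

# Hypothetical OCR text, one string per scanned page.
pages = [
    "Invoice Number: 1001\nCatalogue Number  Description  Qty\n"
    "AB-1234  Widget  2\nCD-5678  Gasket  10\n",
    "Invoice Number: 1001\nEF-9012  Bracket  1\n",
    "Invoice Number: 1002\nAB-1234  Widget  5\n",
]

def split_by_invoice(page_texts):
    """Group page indexes and catalogue numbers under each invoice number found."""
    documents = {}
    current = None
    for index, text in enumerate(page_texts):
        match = INVOICE_RE.search(text)
        if match:
            current = match.group(1)
        if current is None:
            continue  # pages before the first invoice number are skipped in this sketch
        doc = documents.setdefault(current, {"pages": [], "catalogue_numbers": []})
        doc["pages"].append(index)
        doc["catalogue_numbers"].extend(CATALOGUE_RE.findall(text))
    return documents

for number, doc in split_by_invoice(pages).items():
    # Each group would become one output file named after its invoice number.
    print(f"{number}.pdf", doc["pages"], doc["catalogue_numbers"])
```

The page indexes tell a splitter which pages belong in each output file, and the catalogue numbers are the line item values that could be passed to an ERP.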
Remember our Real World Example?
So what needs brushing up?
What does the future hold for intelligent data capture?
digicla, “Be good for your teeth and they will be good for you”.
Continued Improvement in Recognition Technologies Including:
• OCR expansion to include services like translation
• Better accuracy of ICR (handwriting recognition)
• Faster, more accurate
Increased Mobility Integration for Smart Phones, Tablets, etc.
Increased Cloud Computing Options
Improved Validation Against Complex Business Rules
Increased Technical Support to Manage the Complexity
Increased Information Governance Issues and Complexity
Want to Learn More about Document Imaging and Capture?
For more on:
• Extracting meta data
• Data extraction from unstructured data
• Intelligent data capture
• Data extraction
• Using regex to extract data
• Document scanning
• Extracting data
• Extract meta data
• Scanner software
• Barcode recognition
• OCR software
• Capture tutorial
• PDF scanning
• Scanning software
• Indexing
• Document indexing
• Automated capture
• Meta data
• Scan to index
• Batch processing
• Bulk scanning
• DocuFi
• ImageRamp
• Data capture
• Migration to document management
DocuFi
30 years’ experience in the Document Imaging and Capture market
Capture Products www.docufi.com
Copyright ©2014
makers of ImageRamp, Intelligent Capture Solution
Just take a bite and get started with us.