Capture and Extraction: Where ECM Begins
With ChronoScan
Capture means many things when speaking of Document/Records Management or Enterprise Content Management.
AIIM (Association for Information and Image Management):
“Capture boils down to entering content into the system.”
Extraction is an important element of Capture…
By extraction we mean pulling the important information from the content to use for classification or taxonomy purposes, creation of the appropriate metadata or tags, and more.
So Why Are Capture and Extraction So Important?
All Information Governance and Content Management Depends on Correct Metadata
You have to correctly identify the document or content to:
• Find key information on demand
• Apply the correct data security/privacy rules
• Determine the correct data retention
• Protect your entity regarding eDiscovery/legal compliance issues
• Turn your content or knowledge into a competitive advantage
ChronoScan is:
A comprehensive suite of software for document scanning, data extraction and integration into your ECM, CMIS-compliant system, or line-of-business database.
Let’s categorize capture by what we’ll call the Exterior and the Interior
Exterior: the capture of the “thing”:
• Scans
• Faxes
• Emails
• Print streams
Interior: the capture of the content of the “thing”:
Actual data and information extracted from the “thing” such as invoice number, line items, customer number, vendor number, patient name…whatever your information concerns.
This presentation looks at the “interior” capture accomplished by ChronoScan’s “extraction” features.
ChronoScan’s Extraction Features We’ll Examine
OCR technology is the foundation for many of ChronoScan’s auto extraction capabilities.
Using sophisticated OCR technologies such as Zonal OCR and Grid OCR, ChronoScan can extract data to classify the document and create indexes (metadata or tags) from structured and unstructured documents.
Zonal OCR Capture
Extract data only from the area of your document where your important information is found, for fast, automatic data extraction.
Use Dynamic Text Anchors to link to moving text using constant or variable patterns, thus accommodating unstructured documents.
Here, ChronoScan finds the word “subtotal” and captures the data to the right. Extracted data can be further manipulated and used for validation.
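The anchor idea can be illustrated outside ChronoScan with a short Python sketch. It is not ChronoScan's implementation: it assumes the page has already been OCR'd to plain text, and the function name `value_right_of` and the sample page are hypothetical. It looks for the anchor word and captures the amount to its right on the same line, wherever that line falls on the page.

```python
import re
from decimal import Decimal
from typing import Optional


def value_right_of(anchor: str, ocr_text: str) -> Optional[Decimal]:
    """Return the first amount found to the right of `anchor` on the same line."""
    # [ \t]* (not \s*) keeps the match on the anchor's own line.
    pattern = re.compile(
        rf"{re.escape(anchor)}[ \t]*:?[ \t]*\$?[ \t]*([\d,]+\.\d{{2}})",
        re.IGNORECASE,
    )
    match = pattern.search(ocr_text)
    if match is None:
        return None
    # Drop thousands separators before converting to a number.
    return Decimal(match.group(1).replace(",", ""))


# Hypothetical OCR output; the anchor line can move between documents.
sample_page = (
    "Widget A      2   10.00    20.00\n"
    "Subtotal:             $1,234.50\n"
    "Tax                      98.76\n"
)
subtotal = value_right_of("subtotal", sample_page)
```

Because the search keys on the anchor text rather than on a fixed page position, the same rule still works when the subtotal line moves up or down from one document to the next.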
Optimize for your documents with multiple parameters like image processing, OCR engine, type of data to find, regular expression validation and more.
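As a rough illustration of regular expression validation, the sketch below checks extracted field values against one pattern per field. The field names and patterns are assumptions made for this example, not ChronoScan settings.

```python
import re

# Hypothetical per-field rules; real formats depend on your documents.
FIELD_RULES = {
    "invoice_number": re.compile(r"INV-\d{6}"),
    "invoice_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "total": re.compile(r"\d+\.\d{2}"),
}


def validate_fields(fields: dict) -> dict:
    """Map each ruled field to True if its whole value matches its pattern."""
    return {
        name: bool(rule.fullmatch(fields.get(name, "").strip()))
        for name, rule in FIELD_RULES.items()
    }


# "0O4217" contains the letter O, a typical OCR misread of zero.
extracted = {
    "invoice_number": "INV-0O4217",
    "invoice_date": "2014-03-18",
    "total": "1234.50",
}
results = validate_fields(extracted)
```

A field that fails its rule can then be flagged for manual review instead of being exported with a bad value.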
Grid OCR is used for Line Item Extraction and Advanced Report Breakdown or Dismount.
With Line Item Extraction, extract and manipulate line data found on such forms as invoices or delivery tickets.
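To make line item extraction concrete, here is a minimal Python sketch that turns OCR'd table rows into structured records. It assumes a simple four-column layout (description, quantity, unit price, amount) separated by runs of spaces; the `LineItem` type and the sample rows are hypothetical, and a real grid OCR would work from column positions rather than a single pattern.

```python
import re
from decimal import Decimal
from typing import NamedTuple


class LineItem(NamedTuple):
    description: str
    quantity: int
    unit_price: Decimal
    amount: Decimal


# Description, then at least two spaces, then quantity, price and amount.
ROW = re.compile(
    r"^(?P<desc>.+?)\s{2,}(?P<qty>\d+)\s+"
    r"(?P<price>\d+\.\d{2})\s+(?P<amount>\d+\.\d{2})\s*$"
)


def parse_line_items(ocr_rows):
    """Keep only rows that look like line items; skip headers and totals."""
    items = []
    for row in ocr_rows:
        m = ROW.match(row)
        if m is None:
            continue
        items.append(
            LineItem(m["desc"].strip(), int(m["qty"]),
                     Decimal(m["price"]), Decimal(m["amount"]))
        )
    return items


rows = [
    "Description      Qty   Price   Amount",
    "Widget A           2   10.00    20.00",
    "Gasket, 3 in.     12    1.25    15.00",
    "Subtotal                         35.00",
]
items = parse_line_items(rows)
# Manipulating the extracted data: flag rows where qty * price != amount.
mismatched = [i for i in items if i.quantity * i.unit_price != i.amount]
```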
Advanced Report Breakdown or Dismount
Convert complex PDF or scanned OCR reports into a structured data format.
Using sophisticated Grid OCR, ChronoScan breaks down complex reports automatically, splitting each record into an independent processing unit. Extraction can be adapted to different rules and page limits, so visually complex documents are broken down and structured into a structured data file (CSV/XLS).
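The breakdown step can be pictured with a small Python sketch. It assumes the report is already available as text, one string per page, and that each record starts with a hypothetical `ACCOUNT:` marker; the marker, field names and sample data are invented for illustration and say nothing about how ChronoScan detects records.

```python
import csv
import io
import re

RECORD_START = re.compile(r"^ACCOUNT:\s*(\S+)")         # hypothetical record marker
FIELD = re.compile(r"^\s*(Name|Balance):\s*(.+?)\s*$")  # hypothetical fields


def break_down_report(pages):
    """Split report text into one dict per record, ignoring page limits."""
    records = []
    current = None
    for page in pages:
        for line in page.splitlines():
            start = RECORD_START.match(line)
            if start:
                current = {"Account": start.group(1), "Name": "", "Balance": ""}
                records.append(current)
                continue
            field = FIELD.match(line)
            if field and current is not None:
                current[field.group(1)] = field.group(2)
    return records


def to_csv(records):
    """Serialize the records as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["Account", "Name", "Balance"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Record 1002 starts on page 1 and finishes on page 2.
pages = [
    "MONTHLY REPORT  Page 1\nACCOUNT: 1001\n  Name: Acme Corp\n"
    "  Balance: 250.00\nACCOUNT: 1002\n  Name: Birch LLC\n",
    "MONTHLY REPORT  Page 2\n  Balance: 75.10\nACCOUNT: 1003\n"
    "  Name: Cole & Sons\n  Balance: 0.00\n",
]
records = break_down_report(pages)
report_csv = to_csv(records)
```

Treating each record as its own unit is what lets a record that straddles a page break end up on a single CSV row.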
Nuance OCR Plug-In Option
The world's most accurate and robust OCR available.
• Dramatically increases zonal OCR confidence
• Improves OCR trigger precision
• Better and faster background OCR increases precision on regular expression rules
• Better image orientation detection
Barcodes are tried and true information tags.
1. Read Barcodes from Images
Extract 1D/2D barcodes from your documents and assign any part of them to fields for indexing, database export, TXT report, file naming, etc.
2. Process Captured Data
Assign custom actions based on the barcoded values, such as setting field values or splitting documents.
Barcodes can be used on separator or slip sheets to designate where documents should end and begin when a stack of documents is scanned. The barcode information on the separator sheets can also be extracted for indexing, naming and routing purposes.
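The separator-sheet logic can be sketched in a few lines of Python. The sketch assumes barcode decoding has already happened and each scanned page carries its decoded value, if any; the `SEP-` prefix, the `Page` and `Document` types, and the sample stack are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

SEPARATOR_PREFIX = "SEP-"  # hypothetical convention for separator sheets


@dataclass
class Page:
    number: int
    barcode: Optional[str] = None  # decoded barcode value, if the page has one


@dataclass
class Document:
    index_value: str
    pages: list = field(default_factory=list)


def split_on_separators(stack):
    """Start a new document at each separator sheet and index it by the barcode."""
    documents = []
    current = None
    for page in stack:
        if page.barcode and page.barcode.startswith(SEPARATOR_PREFIX):
            current = Document(index_value=page.barcode[len(SEPARATOR_PREFIX):])
            documents.append(current)
            continue  # the separator sheet itself is not kept
        if current is None:
            # Pages scanned before any separator sheet.
            current = Document(index_value="UNINDEXED")
            documents.append(current)
        current.pages.append(page.number)
    return documents


stack = [
    Page(1, "SEP-PATIENT-0042"), Page(2), Page(3),
    Page(4, "SEP-PATIENT-0077"), Page(5),
]
docs = split_on_separators(stack)
```

Here the value printed on each separator sheet doubles as the index value for the pages that follow it, which is what makes it usable for naming and routing.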
Automate PDF Processing Tasks
Automatically extract fields and tables from PDF files.
ChronoScan imports PDF files with native text so you can index the fields you want and export your data to TXT, CSV, Excel, Word, HTML, and OLE/ODBC databases to feed your indexing or database application.
Automatic Document Learning:
Training ChronoScan to identify documents with Intelligent Document Recognition to automatically capture information
ChronoScan learns the Document Type using comprehensive layout recognition features to “remember” user actions. Each document type can be assigned to a different template or job to customize OCR areas, settings and actions.
Result: Scan or import documents together, without prior preparation, to automate repetitive tasks and improve data input.
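As a deliberately simplified stand-in for document type recognition, the Python sketch below assigns a page to a template by counting how many of that template's keywords appear in the OCR text. ChronoScan is described above as using layout recognition, which this sketch does not attempt; the template names, keywords and threshold are assumptions for illustration only.

```python
import re

# Hypothetical templates: document type -> words that identify it.
TEMPLATES = {
    "invoice": {"invoice", "subtotal", "due"},
    "delivery_ticket": {"delivery", "received", "carrier"},
}


def classify(ocr_text, min_score=2):
    """Return the best-matching document type, or None if nothing matches well."""
    words = set(re.findall(r"[a-z]+", ocr_text.lower()))
    scores = {name: len(keywords & words) for name, keywords in TEMPLATES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else None


type_1 = classify("INVOICE No. 4217\nSubtotal 35.00\nAmount due 37.80")
type_2 = classify("Delivery ticket\nCarrier: Northline\nReceived by: J. Doe")
unknown = classify("Meeting notes for Tuesday")
```

Once a type is assigned, the matching template decides which zones, settings and actions apply, so mixed batches need no manual sorting beforehand.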
Type 1 Documents
Type 2 Documents
Once data is identified, it can be used for many purposes besides indexing or metadata creation.
• Validation
• File Naming
• File Splitting
• Routing
• Classification
• ECM Integration
• Bookmarking
• Metadata
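Two of these uses, file naming and routing, can be illustrated with a short Python sketch that builds a target folder and file name from extracted fields. The field names, the folder-per-type convention and the sample values are assumptions for this example.

```python
import re
from pathlib import PurePosixPath


def safe(value):
    """Replace characters that are awkward in file names with underscores."""
    return re.sub(r"[^A-Za-z0-9._-]+", "_", value).strip("_")


def build_target_path(doc_type, fields):
    """Route by document type (folder) and name the file from extracted fields."""
    name = "_".join(
        safe(fields[key]) for key in ("vendor", "invoice_number", "invoice_date")
    )
    return PurePosixPath(safe(doc_type)) / f"{name}.pdf"


path = build_target_path(
    "invoice",
    {"vendor": "Acme Corp", "invoice_number": "INV-004217",
     "invoice_date": "2014-03-18"},
)
```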
Relying on manual scrutiny to bring this “wild content” under control simply will not work. The failure of humans to consistently tag and classify new documents as they are filed has created the mess in the first place.
© AIIM 2014, www.aiim.org
Remember, Everything Depends on Correct Metadata
The Key: Automatic Metadata Creation
With ChronoScan
For more on: Automated document classification • Automated metadata creation • Batch Document processing • Batch PDF mining • Batch text mining • Batch TIF mining • Text mining • Extracting metadata • Data extraction from unstructured data • Intelligent data capture • Data extraction • Using regex to extract data • Document scanning • Extracting data • Extract meta data • Scanner software • Barcode recognition • OCR software • Capture tutorial • Pdf scanning • Scanning software • Indexing • Document indexing • Automated capture • Meta data • Docufi • Imageramp • ChronoScan • Data capture • What is ChronoScan • US Chronoscan reseller • ChronoScan in the US
www.docufi.com [email protected] ©2014
Get Started With Us
Our solutions include ImageRamp Batch for folder processing and ChronoScan Capture for advanced data mining and barcode requirements.
Built on over 30 years’ experience in the Document Imaging and Capture market
DocuFi is a premier ChronoScan Solutions Partner offering extensive professional services to configure the system to your specific requirements. DocuFi has been providing custom solutions for health care, financial services, retail, educational and other markets since 2010.
Learn More: