23
PHP XPDF Documentation Release 0.1 Romain Neutron - Alchemy February 02, 2017

PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

Embed Size (px)

Citation preview

Page 1: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

PHP XPDF DocumentationRelease 0.1

Romain Neutron - Alchemy

February 02, 2017

Page 2: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files
Page 3: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

Contents

1 Introduction 1

2 Installation 3

3 PDF2Text 53.1 Basic Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53.2 Using Custom Binary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53.3 Charset encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63.4 Extract page range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63.5 Silex Service Provider . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

4 Handling Exceptions 7

5 Report a bug 9

6 Ask for a feature 11

7 Contribute 13

8 Run tests 15

9 About 17

10 License 19

i

Page 4: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

ii

Page 5: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

CHAPTER 1

Introduction

PHP-XPDF is an object oriented library to handle XPDF, an open source project self-described as “a viewer forPortable Document Format files. The Xpdf project also includes a PDF text extractor, PDF-to-PostScript converter,and various other utilities.”

For the moment, there’s only one handler for PDF2Text.

This library depends on Symfony Process Component and Monolog.

1

Page 6: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

PHP XPDF Documentation, Release 0.1

2 Chapter 1. Introduction

Page 7: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

CHAPTER 2

Installation

PHP-XPDF relies on composer. If you do not yet use composer for your project, you can start with thiscomposer.json at the root of your project:

{"require": {

"php-xpdf/php-xpdf": "master"}

}

Install composer :

# Install composercurl -s http://getcomposer.org/installer | php# Upgrade your installphp composer.phar install

You now just have to autoload the library to use it :

<?phprequire 'vendor/autoload.php';

This is a very short intro to composer. If you ever experience an issue or want to know more about composer, you willfind help on its dedicated website http://getcomposer.org/.

3

Page 8: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

PHP XPDF Documentation, Release 0.1

4 Chapter 2. Installation

Page 9: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

CHAPTER 3

PDF2Text

3.1 Basic Usage

<?phpuse Monolog\Logger;use Monolog\Handler\NullHandler;use XPDF\PdfToText;

// Create a logger$logger = new Logger('MyLogger');$logger->pushHandler(new NullHandler());

// You have to pass a Monolog logger// This logger provides some usefull infos about what's happening$pdfToText = PdfToText::load($logger);

// open PDF$pdfToText->open('PDF-book.pdf');

// PDF text is now in the $text variable$text = $pdfToText->getText();$pdfToText->close();

3.2 Using Custom Binary

The PDF2Text is automatically detected if its path install is in the PATH. if you want to use your own PdfTotext binary,use as follow :

<?php$pdfToText = new PdfToText('/path/to/your/binary', $logger);

or, on Windows platform :

<?php$pdfToText = new PdfToText('C:\XPDF\PDF2Text.exe', $logger);

5

Page 10: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

PHP XPDF Documentation, Release 0.1

3.3 Charset encoding

By default, output text is UTF-8 encoded. But if you want a custom output , use the setOutputEncoding method

<?php$pdfToText->setOutputEncoding('ISO-8859-5');

Note: The charset value should be an iconv compatible value.

3.4 Extract page range

You can restrict the text extraction on page range. For example to extract pages 3 to 6 ;

<?php$pdfToText->getText(3, 6);

3.5 Silex Service Provider

XPDF is bundled with a Silex Service Provider. Use it is very simple :

<?phpuse Silex\Application;use XPDF\XPDFServiceProvider;

$app = new Application();$app->register(new XPDFServiceProvider());

// You have access to PDF2Text$app['xpdf.pdf2text']->open(...);

You can, of course, customize it :

<?phpuse Silex\Application;use XPDF\XPDFServiceProvider;

$app = new Application();$app->register(new XPDFServiceProvider(), array(

'xpdf.pdf2text.binary' => '/your/custom/binary','xpdf.logger' => $my_logger,

));

// You have access to PDF2Text$app['xpdf.pdf2text']->open(...);

6 Chapter 3. PDF2Text

Page 11: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

CHAPTER 4

Handling Exceptions

XPDF throws 4 different types of exception :

• \XPDF\Exception\BinaryNotFoundException is thrown when no acceptable pdf2text binary isfound.

• \XPDF\Exception\InvalidFileArgumentException is thrown when an invalid file is supplied fortext extraction

• \XPDF\Exception\LogicException which extends SPL LogicException

• \XPDF\Exception\RuntimeException which extends SPL RuntimeException

All these Exception implements \XPDF\Exception\Exception so you can catch any of these exceptions bycatching this exception interface.

7

Page 12: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

PHP XPDF Documentation, Release 0.1

8 Chapter 4. Handling Exceptions

Page 13: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

CHAPTER 5

Report a bug

If you experience an issue, please report it in our issue tracker. Before reporting an issue, please be sure that it is notalready reported by browsing open issues.

When reporting, please give us information to reproduce it by giving your platform (Linux / MacOS / Windows) andits version, the version of PHP you use (the output of php --version), and the version of xpdf you use (the outputof xpdf -v).

9

Page 14: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

PHP XPDF Documentation, Release 0.1

10 Chapter 5. Report a bug

Page 15: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

CHAPTER 6

Ask for a feature

We would be glad you ask for a feature ! Feel free to add a feature request in the issues manager on GitHub !

11

Page 16: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

PHP XPDF Documentation, Release 0.1

12 Chapter 6. Ask for a feature

Page 17: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

CHAPTER 7

Contribute

You find a bug and resolved it ? You added a feature and want to share ? You found a typo in this doc and fixed it ?Feel free to send a Pull Request on GitHub, we will be glad to merge your code.

13

Page 18: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

PHP XPDF Documentation, Release 0.1

14 Chapter 7. Contribute

Page 19: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

CHAPTER 8

Run tests

PHP-XPDF relies on PHPUnit for unit tests. To run tests on your system, ensure you have PHPUnit installed, and, atthe root of PHP-XPDF, execute it :

phpunit

15

Page 20: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

PHP XPDF Documentation, Release 0.1

16 Chapter 8. Run tests

Page 21: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

CHAPTER 9

About

PHP-XPDF has been written by Romain Neutron @ Alchemy for Phraseanet, our DAM software. Try it, it’s awesome!

17

Page 22: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

PHP XPDF Documentation, Release 0.1

18 Chapter 9. About

Page 23: PHP XPDF Documentation - Read the Docs · PHP-XPDF is an object oriented library to handleXPDF, an open source project self-described as “a viewer for Portable Document Format files

CHAPTER 10

License

PHP-XPDF is licensed under the MIT License

19