# Reading Files

## Security

XML-based formats such as Office Open XML, Excel2003 XML, OASIS and
Gnumeric are susceptible to XML External Entity Processing (XXE)
injection attacks (for an explanation of XXE injection see
http://websec.io/2012/08/27/Preventing-XEE-in-PHP.html) when reading
spreadsheet files. This can lead to:

- Disclosure of whether a file exists
- Server Side Request Forgery
- Command Execution (depending on the installed PHP wrappers)

To prevent this, PhpSpreadsheet sets `libxml_disable_entity_loader` to
`true` for the XML-based Readers by default.
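As an additional, application-level precaution, some projects reject untrusted XML that declares a DOCTYPE before handing it to any parser, since an XXE payload needs one to define its entities. The sketch below illustrates that idea in plain PHP. `containsDoctype()` is a hypothetical helper, not part of PhpSpreadsheet, and it is not how the library's readers are implemented.

``` php
<?php
/**
 * Hypothetical helper (not a PhpSpreadsheet API): report whether an XML
 * string declares a DOCTYPE, which an XXE payload needs to define entities.
 */
function containsDoctype(string $xml): bool
{
    // Case-insensitive search for a DOCTYPE declaration anywhere in the input.
    return preg_match('/<!DOCTYPE/i', $xml) === 1;
}

$safe = '<?xml version="1.0"?><workbook><sheet name="Data"/></workbook>';
$suspicious = '<?xml version="1.0"?>'
    . '<!DOCTYPE x [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    . '<workbook>&xxe;</workbook>';

var_dump(containsDoctype($safe));       // bool(false)
var_dump(containsDoctype($suspicious)); // bool(true)
```

A byte-level check like this can be bypassed by alternative encodings such as UTF-16, so treat it as a complement to, not a replacement for, the protection described above.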

## Loading a Spreadsheet File

The simplest way to load a workbook file is to let PhpSpreadsheet's IO
Factory identify the file type and load it, calling the static load()
method of the \PhpOffice\PhpSpreadsheet\IOFactory class.

``` php
$inputFileName = './sampleData/example1.xls';

/** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load($inputFileName);
```

> See Examples/Reader/exampleReader01.php for a working example of this
> code.

The load() method will attempt to identify the file type and
instantiate a loader for that file type, using it to load the file and
store the data and any formatting in a `Spreadsheet` object.

The method makes an initial guess at the loader to instantiate based on
the file extension, but it tests the file before actually executing the
load. If, for example, the file is actually a CSV file or contains
HTML markup but has been given a .xls extension (quite a common
practice), it will reject the Xls loader that it would normally use for
a .xls file, test the file using the other loaders until it finds
the appropriate loader, and then use that to read the file.
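To illustrate why the content matters more than the extension, here is a minimal sketch of signature sniffing in plain PHP. `sniffContainer()` is a hypothetical helper rather than PhpSpreadsheet's own detection logic. It only checks two well-known file signatures: the OLE2 compound-document header used by BIFF .xls files, and the ZIP header used by .xlsx and .ods packages.

``` php
<?php
/**
 * Hypothetical helper (not a PhpSpreadsheet API): classify a file by its
 * leading bytes instead of trusting its extension.
 */
function sniffContainer(string $leadingBytes): string
{
    // OLE2 compound document signature, used by BIFF .xls workbooks.
    if (strncmp($leadingBytes, "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1", 8) === 0) {
        return 'ole2';
    }
    // ZIP local file header, used by .xlsx and .ods packages.
    if (strncmp($leadingBytes, "PK\x03\x04", 4) === 0) {
        return 'zip';
    }
    // Anything else may be CSV, HTML, XML or another text-based format.
    return 'other';
}

// A CSV file that was merely renamed to .xls still starts with plain text.
echo sniffContainer("id,name\n1,Alice\n"), PHP_EOL;                         // other
echo sniffContainer("\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1" . 'rest'), PHP_EOL;  // ole2
echo sniffContainer("PK\x03\x04" . 'rest'), PHP_EOL;                        // zip
```

For a real file, the leading bytes could be obtained with `file_get_contents($inputFileName, false, null, 0, 8)`.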

While this approach is easy to implement and means you don't need to
worry about the file type, it isn't the most efficient way to load a
file, and it lacks the flexibility to configure the loader in any way
before actually reading the file into a `Spreadsheet` object.

## Creating a Reader and Loading a Spreadsheet File

If you know the file type of the spreadsheet file that you need to load,
you can instantiate a new reader object for that file type, then use the
reader's load() method to read the file to a `Spreadsheet` object. It is
possible to instantiate the reader objects for each of the different
supported file types by name. However, you may get unpredictable results
if the file isn't of the right type (e.g. it is a CSV with an extension
of .xls), although this type of exception should normally be trapped.

``` php
$inputFileName = './sampleData/example1.xls';

/** Create a new Xls Reader **/
$reader = new \PhpOffice\PhpSpreadsheet\Reader\Xls();
// $reader = new \PhpOffice\PhpSpreadsheet\Reader\Xlsx();
// $reader = new \PhpOffice\PhpSpreadsheet\Reader\Excel2003XML();
// $reader = new \PhpOffice\PhpSpreadsheet\Reader\Ods();
// $reader = new \PhpOffice\PhpSpreadsheet\Reader\SYLK();
// $reader = new \PhpOffice\PhpSpreadsheet\Reader\Gnumeric();
// $reader = new \PhpOffice\PhpSpreadsheet\Reader\CSV();
/** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName);
```

> See Examples/Reader/exampleReader02.php for a working example of this
> code.
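Because a mismatched file type surfaces as an exception at load time, it can help to wrap the call. The sketch below is a generic wrapper in plain PHP. `tryLoad()` is a hypothetical helper, and it assumes that the reader signals failure by throwing something derived from PHP's base `\Exception` class.

``` php
<?php
/**
 * Hypothetical helper: run a loader callable and return [result, error]
 * instead of letting a load failure propagate.
 */
function tryLoad(callable $loader, string $inputFileName): array
{
    try {
        return [$loader($inputFileName), null];
    } catch (\Exception $e) {
        // Assumption: load failures are reported as exceptions.
        return [null, $e->getMessage()];
    }
}

// Stand-in loader that fails the way a mismatched reader might.
$failingLoader = function (string $file) {
    throw new \RuntimeException("Cannot read {$file} with this reader");
};

[$spreadsheet, $error] = tryLoad($failingLoader, './sampleData/example1.xls');
if ($spreadsheet === null) {
    echo 'Load failed: ', $error, PHP_EOL;
}
```

With a real reader object, the callable passed in would be `[$reader, 'load']`.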

Alternatively, you can use the IO Factory's createReader() method to
instantiate the reader object for you, simply telling it the file type
of the reader that you want to instantiate.

``` php
$inputFileType = 'Xls';
// $inputFileType = 'Xlsx';
// $inputFileType = 'Excel2003XML';
// $inputFileType = 'Ods';
// $inputFileType = 'SYLK';
// $inputFileType = 'Gnumeric';
// $inputFileType = 'CSV';
$inputFileName = './sampleData/example1.xls';

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName);
```

> See Examples/Reader/exampleReader03.php for a working example of this
> code.

If you're uncertain of the file type, you can use the IO Factory's
identify() method to identify the reader that you need, before using the
createReader() method to instantiate the reader object.

``` php
$inputFileName = './sampleData/example1.xls';

/** Identify the type of $inputFileName **/
$inputFileType = \PhpOffice\PhpSpreadsheet\IOFactory::identify($inputFileName);
/** Create a new Reader of the type that has been identified **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName);
```

> See Examples/Reader/exampleReader04.php for a working example of this
> code.

## Spreadsheet Reader Options

Once you have created a reader object for the workbook that you want to
load, you have the opportunity to set additional options before
executing the load() method.

### Reading Only Data from a Spreadsheet File

If you're only interested in the cell values in a workbook, but don't
need any of the cell formatting information, then you can set the reader
to read only the data values and any formulae from each cell using the
setReadDataOnly() method.

``` php
$inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls';

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Advise the Reader that we only want to load cell data **/
$reader->setReadDataOnly(true);
/** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName);
```

> See Examples/Reader/exampleReader05.php for a working example of this
> code.

It is important to note that workbooks (and PhpSpreadsheet) store dates
and times as simple numeric values: they can only be distinguished from
other numeric values by the format mask that is applied to that cell.
When read data only is set to true, PhpSpreadsheet doesn't read the
cell format masks, so it is not possible to differentiate between
dates/times and numbers.

The Gnumeric loader has been written to read the format masks for date
values even when read data only has been set to true, so it can
differentiate between dates/times and numbers; but this change hasn't
yet been implemented for the other readers.
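If you know from context that a numeric value read this way is really a date, you can convert it yourself. The following sketch assumes the workbook uses the default 1900 date system, in which serial 25569 corresponds to 1970-01-01. `excelSerialToDate()` is a hypothetical helper written in plain PHP for illustration.

``` php
<?php
/**
 * Hypothetical helper: convert an Excel serial date (1900 date system)
 * to an ISO date string. Serial 25569 corresponds to 1970-01-01.
 */
function excelSerialToDate(float $serial): string
{
    // Days since the Unix epoch, converted to seconds.
    $unixTimestamp = (int) round(($serial - 25569) * 86400);

    return gmdate('Y-m-d', $unixTimestamp);
}

echo excelSerialToDate(39919), PHP_EOL; // 2009-04-16
echo excelSerialToDate(25569), PHP_EOL; // 1970-01-01
```

The fractional part of a serial value carries the time of day, which this sketch deliberately discards.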

Reading Only Data from a Spreadsheet File applies to Readers:

Reader    | Y/N |Reader  | Y/N |Reader        | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx      | YES | Xls    | YES | Excel2003XML | YES |
Ods       | YES | SYLK   | NO  | Gnumeric     | YES |
CSV       | NO  | HTML   | NO  |              |     |

### Reading Only Named WorkSheets from a File

If your workbook contains a number of worksheets, but you are only
interested in reading some of those, then you can use the
setLoadSheetsOnly() method to identify those sheets you are interested
in reading.

To read a single sheet, you can pass that sheet name as a parameter to
the setLoadSheetsOnly() method.

``` php
$inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #2';

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Advise the Reader of which WorkSheets we want to load **/
$reader->setLoadSheetsOnly($sheetname);
/** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName);
```

> See Examples/Reader/exampleReader07.php for a working example of this
> code.

If you want to read more than just a single sheet, you can pass a list
of sheet names as an array parameter to the setLoadSheetsOnly() method.

``` php
$inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls';
$sheetnames = array('Data Sheet #1','Data Sheet #3');

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Advise the Reader of which WorkSheets we want to load **/
$reader->setLoadSheetsOnly($sheetnames);
/** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName);
```

> See Examples/Reader/exampleReader08.php for a working example of this
> code.

To reset this option to the default, you can call the setLoadAllSheets()
method.

``` php
$inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls';

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Advise the Reader to load all Worksheets **/
$reader->setLoadAllSheets();
/** Load $inputFileName to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName);
```

> See Examples/Reader/exampleReader06.php for a working example of this
> code.

Reading Only Named WorkSheets from a File applies to Readers:

Reader    | Y/N |Reader  | Y/N |Reader        | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx      | YES | Xls    | YES | Excel2003XML | YES |
Ods       | YES | SYLK   | NO  | Gnumeric     | YES |
CSV       | NO  | HTML   | NO  |              |     |

### Reading Only Specific Columns and Rows from a File (Read Filters)

If you are only interested in reading part of a worksheet, then you can
write a filter class that identifies whether or not individual cells
should be read by the loader. A read filter must implement the
\PhpOffice\PhpSpreadsheet\Reader\IReadFilter interface, and contain a
readCell() method that accepts arguments of \$column, \$row and
\$worksheetName, and returns a boolean true or false that indicates
whether a workbook cell identified by those arguments should be read or
not.

``` php
$inputFileType = 'Xls';
$inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #3';

/** Define a Read Filter class implementing \PhpOffice\PhpSpreadsheet\Reader\IReadFilter */
class MyReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
{
    public function readCell($column, $row, $worksheetName = '') {
        // Read rows 1 to 7 and columns A to E only
        if ($row >= 1 && $row <= 7) {
            if (in_array($column,range('A','E'))) {
                return true;
            }
        }
        return false;
    }
}

/** Create an Instance of our Read Filter **/
$filterSubset = new MyReadFilter();

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Tell the Reader that we want to use the Read Filter **/
$reader->setReadFilter($filterSubset);
/** Load only the rows and columns that match our filter to Spreadsheet **/
$spreadsheet = $reader->load($inputFileName);
```

> See Examples/Reader/exampleReader09.php for a working example of this
> code.

This example is not particularly useful, because it can only be used in
a very specific circumstance (when you only want cells in the range
A1:E7 from your worksheet). A generic Read Filter would probably be more
useful:

``` php
/** Define a Read Filter class implementing \PhpOffice\PhpSpreadsheet\Reader\IReadFilter */
class MyReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
{
    private $_startRow = 0;
    private $_endRow = 0;
    private $_columns = array();

    /** Get the list of rows and columns to read */
    public function __construct($startRow, $endRow, $columns) {
        $this->_startRow = $startRow;
        $this->_endRow = $endRow;
        $this->_columns = $columns;
    }

    public function readCell($column, $row, $worksheetName = '') {
        // Only read the rows and columns that were configured
        if ($row >= $this->_startRow && $row <= $this->_endRow) {
            if (in_array($column,$this->_columns)) {
                return true;
            }
        }
        return false;
    }
}

/** Create an Instance of our Read Filter, passing in the cell range **/
$filterSubset = new MyReadFilter(9,15,range('G','K'));
```

> See Examples/Reader/exampleReader10.php for a working example of this
> code.
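Note that testing columns with `in_array()` against `range('G','K')` only covers single-letter columns: according to the PHP manual, `range()` uses just the first character of a longer string argument, so it cannot express a range such as Z to AD. One possible workaround, sketched here in plain PHP with hypothetical helper names, is to compare numeric column indexes instead.

``` php
<?php
/** Hypothetical helper: convert a column letter such as 'AB' to its 1-based index. */
function columnIndex(string $column): int
{
    $index = 0;
    foreach (str_split(strtoupper($column)) as $letter) {
        // Treat the letters as base-26 digits where A = 1 ... Z = 26.
        $index = $index * 26 + (ord($letter) - ord('A') + 1);
    }

    return $index;
}

/** Hypothetical helper: is $column within [$first, $last], inclusive? */
function columnInRange(string $column, string $first, string $last): bool
{
    $index = columnIndex($column);

    return $index >= columnIndex($first) && $index <= columnIndex($last);
}

var_dump(columnIndex('AA'));              // int(27)
var_dump(columnInRange('AB', 'Z', 'AD')); // bool(true)
var_dump(columnInRange('AE', 'Z', 'AD')); // bool(false)
```

Inside readCell(), the `in_array()` test could then be replaced with a call such as `columnInRange($column, 'G', 'K')`.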

Read Filters can be particularly useful for conserving memory, by
allowing you to read and process a large workbook in “chunks”: an
example of this usage might be when transferring data from an Excel
worksheet to a database.

``` php
$inputFileType = 'Xls';
$inputFileName = './sampleData/example2.xls';

/** Define a Read Filter class implementing \PhpOffice\PhpSpreadsheet\Reader\IReadFilter */
class chunkReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
{
    private $_startRow = 0;
    private $_endRow = 0;

    /** Set the list of rows that we want to read */
    public function setRows($startRow, $chunkSize) {
        $this->_startRow = $startRow;
        $this->_endRow = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '') {
        // Only read the heading row, and the configured rows
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
            return true;
        }
        return false;
    }
}

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);

/** Define how many rows we want to read for each "chunk" **/
$chunkSize = 2048;
/** Create a new Instance of our Read Filter **/
$chunkFilter = new chunkReadFilter();

/** Tell the Reader that we want to use the Read Filter **/
$reader->setReadFilter($chunkFilter);

/** Loop to read our worksheet in "chunk size" blocks **/
for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) {
    /** Tell the Read Filter which rows we want this iteration **/
    $chunkFilter->setRows($startRow,$chunkSize);
    /** Load only the rows that match our filter **/
    $spreadsheet = $reader->load($inputFileName);
    // Do some processing here
}
```

> See Examples/Reader/exampleReader12.php for a working example of this
> code.
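To make the row arithmetic explicit: setRows() stores an exclusive end row, so each chunk covers rows `$startRow` to `$startRow + $chunkSize - 1`, plus the heading row. The sketch below uses a hypothetical `chunkBounds()` helper in plain PHP to list the ranges that such a loop would request for a small sheet.

``` php
<?php
/**
 * Hypothetical helper: list the [first, last] data rows of each chunk,
 * mirroring a loop that starts at row 2 and advances by $chunkSize.
 */
function chunkBounds(int $lastRow, int $chunkSize): array
{
    $bounds = [];
    for ($startRow = 2; $startRow <= $lastRow; $startRow += $chunkSize) {
        // The filter's end row is exclusive, so the last row read is one less.
        $bounds[] = [$startRow, min($startRow + $chunkSize - 1, $lastRow)];
    }

    return $bounds;
}

foreach (chunkBounds(10, 4) as [$first, $last]) {
    echo "rows {$first}-{$last}", PHP_EOL;
}
// rows 2-5
// rows 6-9
// rows 10-10
```

Because the example loop always runs to row 65536, it performs the same number of loads however small the file is; stopping at the sheet's real last row, where that is known, would avoid the empty iterations.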

Using Read Filters applies to:

Reader    | Y/N |Reader  | Y/N |Reader        | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx      | YES | Xls    | YES | Excel2003XML | YES |
Ods       | YES | SYLK   | NO  | Gnumeric     | YES |
CSV       | YES | HTML   | NO  |              |     |

### Combining Multiple Files into a Single Spreadsheet Object

While you can limit the number of worksheets that are read from a
workbook file using the setLoadSheetsOnly() method, certain readers also
allow you to combine several individual "sheets" from different files
into a single `Spreadsheet` object, where each individual file is a
single worksheet within that workbook. For each file that you read, you
need to indicate which worksheet index it should be loaded into using
the setSheetIndex() method of the \$reader, then use the
loadIntoExisting() method rather than the load() method to actually read
the file into that worksheet.

``` php
$inputFileType = 'CSV';
$inputFileNames = array('./sampleData/example1.csv',
    './sampleData/example2.csv',
    './sampleData/example3.csv'
);

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);

/** Extract the first named file from the array list **/
$inputFileName = array_shift($inputFileNames);
/** Load the initial file to the first worksheet in a `Spreadsheet` Object **/
$spreadsheet = $reader->load($inputFileName);
/** Set the worksheet title (to the filename that we've loaded) **/
$spreadsheet->getActiveSheet()
    ->setTitle(pathinfo($inputFileName,PATHINFO_BASENAME));

/** Loop through all the remaining files in the list **/
foreach($inputFileNames as $sheet => $inputFileName) {
    /** Increment the worksheet index pointer for the Reader **/
    $reader->setSheetIndex($sheet+1);
    /** Load the current file into a new worksheet in Spreadsheet **/
    $reader->loadIntoExisting($inputFileName,$spreadsheet);
    /** Set the worksheet title (to the filename that we've loaded) **/
    $spreadsheet->getActiveSheet()
        ->setTitle(pathinfo($inputFileName,PATHINFO_BASENAME));
}
```

> See Examples/Reader/exampleReader13.php for a working example of this
> code.

Note that using the same sheet index for multiple sheets won't append
files into the same sheet, but overwrite the results of the previous
load. You cannot load multiple CSV files into the same worksheet.

Combining Multiple Files into a Single Spreadsheet Object applies to:

Reader    | Y/N |Reader  | Y/N |Reader        | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx      | NO  | Xls    | NO  | Excel2003XML | NO  |
Ods       | NO  | SYLK   | YES | Gnumeric     | NO  |
CSV       | YES | HTML   | NO  |              |     |

### Combining Read Filters with the setSheetIndex() method to split a large CSV file across multiple Worksheets

An Xls BIFF .xls file is limited to 65536 rows in a worksheet, while the
Xlsx Microsoft Office Open XML SpreadsheetML .xlsx file is limited to
1,048,576 rows in a worksheet; but a CSV file is not limited other than
by available disk space. This means that we wouldn’t ordinarily be able
to read all the rows from a very large CSV file that exceeded those
limits, and save it as an Xls or Xlsx file. However, by using Read
Filters to read the CSV file in “chunks” (using the chunkReadFilter
class that we defined above), and the setSheetIndex() method of the
\$reader, we can split the CSV file across several individual
worksheets.

``` php
$inputFileType = 'CSV';
$inputFileName = './sampleData/example2.csv';

echo 'Loading file ',pathinfo($inputFileName,PATHINFO_BASENAME),' using IOFactory with a defined reader type of ',$inputFileType,'<br />';
/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);

/** Define how many rows we want to read for each "chunk" **/
$chunkSize = 65530;
/** Create a new Instance of our Read Filter **/
$chunkFilter = new chunkReadFilter();

/** Tell the Reader that we want to use the Read Filter **/
/** and that we want to store it in contiguous rows/columns **/
$reader->setReadFilter($chunkFilter)
    ->setContiguous(true);

/** Instantiate a new Spreadsheet object manually **/
$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();

/** Set a sheet index **/
$sheet = 0;
/** Loop to read our worksheet in "chunk size" blocks **/
/** $startRow is set to 2 initially because we always read the headings in row #1 **/
for ($startRow = 2; $startRow <= 1000000; $startRow += $chunkSize) {
    /** Tell the Read Filter which rows we want to read this loop **/
    $chunkFilter->setRows($startRow,$chunkSize);

    /** Increment the worksheet index pointer for the Reader **/
    $reader->setSheetIndex($sheet);
    /** Load only the rows that match our filter into a new worksheet **/
    $reader->loadIntoExisting($inputFileName,$spreadsheet);
    /** Set the worksheet title for the sheet that we've just loaded **/
    /** and increment the sheet index as well **/
    $spreadsheet->getActiveSheet()->setTitle('Country Data #'.(++$sheet));
}
```

> See Examples/Reader/exampleReader14.php for a working example of this
> code.

This code will read 65,530 rows at a time from the CSV file that we’re
loading, and store each "chunk" in a new worksheet.

The setContiguous() method for the Reader is important here. It is
applicable only when working with a Read Filter, and identifies whether
or not the cells should be stored by their position within the CSV file,
or their position relative to the filter.

For example, if the filter returned true for cells in the range B2:C3,
then with setContiguous set to false (the default) these would be loaded
as B2:C3 in the `Spreadsheet` object; but with setContiguous set to
true, they would be loaded as A1:B2.

Splitting a single loaded file across multiple worksheets applies to:

Reader    | Y/N |Reader  | Y/N |Reader        | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx      | NO  | Xls    | NO  | Excel2003XML | NO  |
Ods       | NO  | SYLK   | NO  | Gnumeric     | NO  |
CSV       | YES | HTML   | NO  |              |     |

### Pipe or Tab Separated Value Files

The CSV loader defaults to loading a file where comma is used as the
separator, but you can modify this to load tab- or pipe-separated value
files using the setDelimiter() method.

``` php
$inputFileType = 'CSV';
$inputFileName = './sampleData/example1.tsv';

/** Create a new Reader of the type defined in $inputFileType **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
/** Set the delimiter to a TAB character **/
$reader->setDelimiter("\t");
// $reader->setDelimiter('|');

/** Load the file to a Spreadsheet Object **/
$spreadsheet = $reader->load($inputFileName);
```

> See Examples/Reader/exampleReader15.php for a working example of this
> code.

In addition to the delimiter, you can also use the following methods to
set other attributes for the data load:

Method             | Default
-------------------|-----------
setEnclosure()     | `"`
setLineEnding()    | `PHP_EOL`
setInputEncoding() | `UTF-8`
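For intuition about what the delimiter and enclosure settings control, the same two characters appear as parameters of PHP's built-in `str_getcsv()`. The snippet below uses that function directly rather than the CSV Reader, so it illustrates the parsing rules only, not PhpSpreadsheet's implementation.

``` php
<?php
// One tab-separated line in which the second field is enclosed in quotes
// because it contains the delimiter character itself.
$line = "1\t\"Smith\tJohn\"\t42";

// Arguments: input, delimiter, enclosure, escape character.
$fields = str_getcsv($line, "\t", '"', '\\');

echo json_encode($fields), PHP_EOL; // ["1","Smith\tJohn","42"]

// The same call with a pipe delimiter.
$pipeFields = str_getcsv('a|b|c', '|', '"', '\\');

echo json_encode($pipeFields), PHP_EOL; // ["a","b","c"]
```

The enclosure keeps the embedded tab inside the second field instead of splitting it into two columns.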

Setting CSV delimiter applies to:

Reader    | Y/N |Reader  | Y/N |Reader        | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx      | NO  | Xls    | NO  | Excel2003XML | NO  |
Ods       | NO  | SYLK   | NO  | Gnumeric     | NO  |
CSV       | YES | HTML   | NO  |              |     |
|
|
|
|
|
|
|
|
|
|
### A Brief Word about the Advanced Value Binder
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
When loading data from a file that contains no formatting information,
|
|
|
|
|
such as a CSV file, then data is read either as strings or numbers
|
|
|
|
|
(float or integer). This means that PhpSpreadsheet does not
|
|
|
|
|
automatically recognise dates/times (such as "16-Apr-2009" or "13:30"),
|
|
|
|
|
booleans ("TRUE" or "FALSE"), percentages ("75%"), hyperlinks
|
|
|
|
|
("http://www.phpexcel.net"), etc as anything other than simple strings.
|
|
|
|
|
However, you can apply additional processing that is executed against
|
|
|
|
|
these values during the load process within a Value Binder.
|
|
|
|
|
|
|
|
|
|
A Value Binder is a class that implements the
\PhpOffice\PhpSpreadsheet\Cell\IValueBinder interface. It must contain a
bindValue() method that accepts a \PhpOffice\PhpSpreadsheet\Cell and a
value as arguments, and returns a boolean true or false that indicates
whether the workbook cell has been populated with the value or not. The
Advanced Value Binder implements such a class: amongst other tests, it
identifies a string comprising "TRUE" or "FALSE" (based on locale
settings) and sets it to a boolean; or a number in scientific format
(e.g. "1.234e-5") and converts it to a float; or dates and times,
converting them to their Excel timestamp value, before storing the
value in the cell object. It also sets formatting for strings that are
identified as dates, times or percentages. It could easily be extended
to provide additional handling (including text or cell formatting) when
it encounters a hyperlink, or HTML markup within a CSV file.
|
|
|
|
|
|
|
|
|
|
So using a Value Binder allows a great deal more flexibility in the
|
|
|
|
|
loader logic when reading unformatted text files.
|
|
|
|
|
|
|
|
|
|
``` php
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Tell PhpSpreadsheet that we want to use the Advanced Value Binder **/
|
|
|
|
|
\PhpOffice\PhpSpreadsheet\Cell::setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() );
|
|
|
|
|
|
|
|
|
|
$inputFileType = 'CSV';
|
|
|
|
|
$inputFileName = './sampleData/example1.tsv';
|
|
|
|
|
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
|
|
|
|
|
$reader->setDelimiter("\t");
|
|
|
|
|
$spreadsheet = $reader->load($inputFileName);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
```
|
2016-12-03 15:00:54 +00:00
|
|
|
|
|
|
|
|
|
> See Examples/Reader/exampleReader15.php for a working example of this
|
|
|
|
|
> code.
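The kind of conversion described above can be sketched in plain PHP. This is not the Advanced Value Binder's actual code: `percentToFloat()` and `dateToExcelSerial()` are hypothetical helpers, and the sketch assumes the Windows 1900 calendar, in which an Excel timestamp counts days from 1899-12-30 (so the Unix epoch falls on day 25569).

``` php
<?php
/**
 * Hypothetical helper: convert a percentage string such as "75%" to a
 * float, or return null if the string is not a percentage.
 */
function percentToFloat(string $text): ?float
{
    if (preg_match('/^\s*(-?\d+(?:\.\d+)?)\s*%\s*$/', $text, $matches) === 1) {
        return (float) $matches[1] / 100;
    }
    return null;
}

/**
 * Hypothetical helper: convert a "16-Apr-2009" style date string to an
 * Excel serial number, or return null if the string cannot be parsed.
 */
function dateToExcelSerial(string $text): ?float
{
    // The leading "!" resets the time part to midnight.
    $date = DateTimeImmutable::createFromFormat('!d-M-Y', $text, new DateTimeZone('UTC'));
    if ($date === false) {
        return null;
    }
    // Whole days since the Unix epoch, shifted to Excel's day numbering.
    return $date->getTimestamp() / 86400 + 25569;
}

var_dump(percentToFloat('75%'));
var_dump(dateToExcelSerial('16-Apr-2009'));
```

A real Value Binder would perform checks like these inside bindValue() and also apply a matching number format to the cell, which is what allows the stored number to be displayed as a date or percentage again.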
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
Loading using a Value Binder applies to:
|
|
|
|
|
|
|
|
|
|
Reader    | Y/N |Reader  | Y/N |Reader        | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx      | NO  | Xls    | NO  | Excel2003XML | NO  |
Ods       | NO  | SYLK   | NO  | Gnumeric     | NO  |
CSV       | YES | HTML   | YES |              |     |
|
|
|
|
|
|
|
|
|
|
## Spreadsheet Reader Options
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
Once you have created a reader object for the workbook that you want to
|
|
|
|
|
load, you have the opportunity to set additional options before
|
|
|
|
|
executing the load() method.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
### Reading Only Data from a Spreadsheet File
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
If you're only interested in the cell values in a workbook, but don't
|
|
|
|
|
need any of the cell formatting information, then you can set the reader
|
|
|
|
|
to read only the data values and any formulae from each cell using the
|
|
|
|
|
setReadDataOnly() method.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
``` php
|
2016-12-03 13:16:45 +00:00
|
|
|
|
$inputFileType = 'Xls';
|
|
|
|
|
$inputFileName = './sampleData/example1.xls';
|
|
|
|
|
|
|
|
|
|
/** Create a new Reader of the type defined in $inputFileType **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Advise the Reader that we only want to load cell data **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader->setReadDataOnly(true);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Load $inputFileName to a Spreadsheet Object **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$spreadsheet = $reader->load($inputFileName);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
```
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
> See Examples/Reader/exampleReader05.php for a working example of this
|
|
|
|
|
> code.
|
|
|
|
|
|
|
|
|
|
It is important to note that Workbooks (and PhpSpreadsheet) store dates
|
|
|
|
|
and times as simple numeric values: they can only be distinguished from
|
|
|
|
|
other numeric values by the format mask that is applied to that cell.
|
|
|
|
|
When setting read data only to true, PhpSpreadsheet doesn't read the
|
|
|
|
|
cell format masks, so it is not possible to differentiate between
|
|
|
|
|
dates/times and numbers.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
The Gnumeric loader has been written to read the format masks for date
|
|
|
|
|
values even when read data only has been set to true, so it can
|
|
|
|
|
differentiate between dates/times and numbers; but this change hasn't
|
|
|
|
|
yet been implemented for the other readers.
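The ambiguity can be seen with a small plain-PHP sketch. `excelSerialToDate()` is a hypothetical helper written for this illustration, and it assumes the Windows 1900 calendar (day 25569 is 1970-01-01). The same stored number is a valid quantity and a valid date; only the format mask says which was intended.

``` php
<?php
/**
 * Hypothetical helper: interpret a numeric cell value as an Excel
 * date/time serial and return it as a UTC DateTimeImmutable.
 */
function excelSerialToDate(float $serial): DateTimeImmutable
{
    $seconds = (int) round(($serial - 25569) * 86400);
    return (new DateTimeImmutable('@' . $seconds))->setTimezone(new DateTimeZone('UTC'));
}

$cellValue = 39919.5625; // could be a measurement, or a date and time

echo 'As a number: ', $cellValue, PHP_EOL;
echo 'As a date:   ', excelSerialToDate($cellValue)->format('Y-m-d H:i'), PHP_EOL;
```

If your application knows which columns hold dates, a conversion of this kind can be applied after loading with read data only set to true.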
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
Reading Only Data from a Spreadsheet File applies to Readers:
|
|
|
|
|
|
|
|
|
|
Reader    | Y/N |Reader  | Y/N |Reader        | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx      | YES | Xls    | YES | Excel2003XML | YES |
Ods       | YES | SYLK   | NO  | Gnumeric     | YES |
CSV       | NO  | HTML   | NO  |              |     |
|
|
|
|
|
|
|
|
|
|
### Reading Only Named WorkSheets from a File
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
If your workbook contains a number of worksheets, but you are only
|
|
|
|
|
interested in reading some of those, then you can use the
|
|
|
|
|
setLoadSheetsOnly() method to identify those sheets you are interested
|
|
|
|
|
in reading.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
To read a single sheet, you can pass that sheet name as a parameter to
|
|
|
|
|
the setLoadSheetsOnly() method.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
``` php
|
2016-12-03 13:16:45 +00:00
|
|
|
|
$inputFileType = 'Xls';
|
|
|
|
|
$inputFileName = './sampleData/example1.xls';
|
|
|
|
|
$sheetname = 'Data Sheet #2';
|
|
|
|
|
|
|
|
|
|
/** Create a new Reader of the type defined in $inputFileType **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Advise the Reader of which WorkSheets we want to load **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader->setLoadSheetsOnly($sheetname);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Load $inputFileName to a Spreadsheet Object **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$spreadsheet = $reader->load($inputFileName);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
```
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
> See Examples/Reader/exampleReader07.php for a working example of this
|
|
|
|
|
> code.
|
|
|
|
|
|
|
|
|
|
If you want to read more than just a single sheet, you can pass a list
|
|
|
|
|
of sheet names as an array parameter to the setLoadSheetsOnly() method.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
``` php
|
2016-12-03 13:16:45 +00:00
|
|
|
|
$inputFileType = 'Xls';
|
|
|
|
|
$inputFileName = './sampleData/example1.xls';
|
|
|
|
|
$sheetnames = array('Data Sheet #1','Data Sheet #3');
|
|
|
|
|
|
|
|
|
|
/** Create a new Reader of the type defined in $inputFileType **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Advise the Reader of which WorkSheets we want to load **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader->setLoadSheetsOnly($sheetnames);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Load $inputFileName to a Spreadsheet Object **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$spreadsheet = $reader->load($inputFileName);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
```
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
> See Examples/Reader/exampleReader08.php for a working example of this
|
|
|
|
|
> code.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
To reset this option to the default, you can call the setLoadAllSheets()
|
|
|
|
|
method.
|
|
|
|
|
|
|
|
|
|
``` php
|
2016-12-03 13:16:45 +00:00
|
|
|
|
$inputFileType = 'Xls';
|
|
|
|
|
$inputFileName = './sampleData/example1.xls';
|
|
|
|
|
|
|
|
|
|
/** Create a new Reader of the type defined in $inputFileType **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Advise the Reader to load all Worksheets **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader->setLoadAllSheets();
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Load $inputFileName to a Spreadsheet Object **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$spreadsheet = $reader->load($inputFileName);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
```
|
2016-12-03 15:00:54 +00:00
|
|
|
|
|
|
|
|
|
> See Examples/Reader/exampleReader06.php for a working example of this
|
|
|
|
|
> code.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
Reading Only Named WorkSheets from a File applies to Readers:
|
|
|
|
|
|
|
|
|
|
Reader    | Y/N |Reader  | Y/N |Reader        | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx      | YES | Xls    | YES | Excel2003XML | YES |
Ods       | YES | SYLK   | NO  | Gnumeric     | YES |
CSV       | NO  | HTML   | NO  |              |     |
|
|
|
|
|
|
|
|
|
|
### Reading Only Specific Columns and Rows from a File (Read Filters)
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
If you are only interested in reading part of a worksheet, then you can
write a filter class that identifies whether or not individual cells
should be read by the loader. A read filter must implement the
\PhpOffice\PhpSpreadsheet\Reader\IReadFilter interface, and contain a
readCell() method that accepts arguments of \$column, \$row and
\$worksheetName, and returns a boolean true or false that indicates
whether a workbook cell identified by those arguments should be read or
not.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
``` php
|
2016-12-03 13:16:45 +00:00
|
|
|
|
$inputFileType = 'Xls';
|
|
|
|
|
$inputFileName = './sampleData/example1.xls';
|
|
|
|
|
$sheetname = 'Data Sheet #3';
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
/** Define a Read Filter class implementing \PhpOffice\PhpSpreadsheet\Reader\IReadFilter */
|
|
|
|
|
class MyReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
|
|
|
|
|
{
|
|
|
|
|
public function readCell($column, $row, $worksheetName = '') {
|
|
|
|
|
// Read rows 1 to 7 and columns A to E only
|
|
|
|
|
if ($row >= 1 && $row <= 7) {
|
|
|
|
|
if (in_array($column,range('A','E'))) {
|
|
|
|
|
return true;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
return false;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/** Create an Instance of our Read Filter **/
|
|
|
|
|
$filterSubset = new MyReadFilter();
|
|
|
|
|
|
|
|
|
|
/** Create a new Reader of the type defined in $inputFileType **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Tell the Reader that we want to use the Read Filter **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader->setReadFilter($filterSubset);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Load only the rows and columns that match our filter to Spreadsheet **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$spreadsheet = $reader->load($inputFileName);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
```
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
> See Examples/Reader/exampleReader09.php for a working example of this
|
|
|
|
|
> code.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
This example is not particularly useful, because it can only be used in
a very specific circumstance (when you only want cells in the range
A1:E7 from your worksheet). A generic Read Filter would probably be more
useful:
|
|
|
|
|
|
|
|
|
|
``` php
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Define a Read Filter class implementing \PhpOffice\PhpSpreadsheet\Reader\IReadFilter */
|
|
|
|
|
class MyReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
|
|
|
|
|
{
|
|
|
|
|
private $_startRow = 0;
|
|
|
|
|
private $_endRow = 0;
|
|
|
|
|
private $_columns = array();
|
|
|
|
|
|
|
|
|
|
/** Get the list of rows and columns to read */
|
|
|
|
|
public function __construct($startRow, $endRow, $columns) {
|
|
|
|
|
$this->_startRow = $startRow;
|
|
|
|
|
$this->_endRow = $endRow;
|
|
|
|
|
$this->_columns = $columns;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
public function readCell($column, $row, $worksheetName = '') {
|
|
|
|
|
// Only read the rows and columns that were configured
|
|
|
|
|
if ($row >= $this->_startRow && $row <= $this->_endRow) {
|
|
|
|
|
if (in_array($column,$this->_columns)) {
|
|
|
|
|
return true;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
return false;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/** Create an Instance of our Read Filter, passing in the cell range **/
|
|
|
|
|
$filterSubset = new MyReadFilter(9,15,range('G','K'));
|
|
|
|
|
```
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
> See Examples/Reader/exampleReader10.php for a working example of this
|
|
|
|
|
> code.
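The generic filter above is given its columns through `range('G','K')`, which is fine for single-letter columns. PHP's `range()` works on single characters, so it is not suited to a span such as Y to AB. The following is a minimal sketch of one way to build such a list; `columnIndex()`, `columnLetters()` and `columnRange()` are hypothetical helpers, not part of PhpSpreadsheet.

``` php
<?php
/** Hypothetical helper: convert column letters to a 1-based index (A = 1, AA = 27). */
function columnIndex(string $letters): int
{
    $index = 0;
    foreach (str_split(strtoupper($letters)) as $character) {
        $index = $index * 26 + (ord($character) - ord('A') + 1);
    }
    return $index;
}

/** Hypothetical helper: convert a 1-based index back to column letters. */
function columnLetters(int $index): string
{
    $letters = '';
    while ($index > 0) {
        $remainder = ($index - 1) % 26;
        $letters = chr(ord('A') + $remainder) . $letters;
        $index = intdiv($index - 1, 26);
    }
    return $letters;
}

/** Hypothetical helper: list every column from $first to $last inclusive. */
function columnRange(string $first, string $last): array
{
    return array_map('columnLetters', range(columnIndex($first), columnIndex($last)));
}

print_r(columnRange('Y', 'AB'));
```

The resulting array can be passed as the third constructor argument of the filter class in place of `range('G','K')`.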
|
|
|
|
|
|
|
|
|
|
This can be particularly useful for conserving memory, by allowing you
|
|
|
|
|
to read and process a large workbook in “chunks”: an example of this
|
|
|
|
|
usage might be when transferring data from an Excel worksheet to a
|
|
|
|
|
database.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
``` php
|
2016-12-03 13:16:45 +00:00
|
|
|
|
$inputFileType = 'Xls';
|
|
|
|
|
$inputFileName = './sampleData/example2.xls';
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
/** Define a Read Filter class implementing \PhpOffice\PhpSpreadsheet\Reader\IReadFilter */
|
|
|
|
|
class chunkReadFilter implements \PhpOffice\PhpSpreadsheet\Reader\IReadFilter
|
|
|
|
|
{
|
|
|
|
|
private $_startRow = 0;
|
|
|
|
|
private $_endRow = 0;
|
|
|
|
|
|
|
|
|
|
/** Set the list of rows that we want to read */
|
|
|
|
|
public function setRows($startRow, $chunkSize) {
|
|
|
|
|
$this->_startRow = $startRow;
|
|
|
|
|
$this->_endRow = $startRow + $chunkSize;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
public function readCell($column, $row, $worksheetName = '') {
|
|
|
|
|
// Only read the heading row, and the configured rows
|
|
|
|
|
if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
|
|
|
|
|
return true;
|
|
|
|
|
}
|
|
|
|
|
return false;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
/** Create a new Reader of the type defined in $inputFileType **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
/** Define how many rows we want to read for each "chunk" **/
|
|
|
|
|
$chunkSize = 2048;
|
|
|
|
|
/** Create a new Instance of our Read Filter **/
|
|
|
|
|
$chunkFilter = new chunkReadFilter();
|
|
|
|
|
|
|
|
|
|
/** Tell the Reader that we want to use the Read Filter **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader->setReadFilter($chunkFilter);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
/** Loop to read our worksheet in "chunk size" blocks **/
|
|
|
|
|
for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) {
|
|
|
|
|
/** Tell the Read Filter which rows we want this iteration **/
|
|
|
|
|
$chunkFilter->setRows($startRow,$chunkSize);
|
|
|
|
|
/** Load only the rows that match our filter **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$spreadsheet = $reader->load($inputFileName);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
// Do some processing here
|
|
|
|
|
}
|
|
|
|
|
```
|
2016-12-03 15:00:54 +00:00
|
|
|
|
|
|
|
|
|
> See Examples/Reader/exampleReader12.php for a working example of this
|
|
|
|
|
> code.
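To see which rows each pass of the loop above asks for, the chunk boundaries can be computed separately. This is a plain-PHP sketch; `chunkBounds()` is a hypothetical helper that mirrors the loop counters and the half-open test (`$row < $this->_endRow`) used by the filter, without touching any file.

``` php
<?php
/**
 * Hypothetical helper: return the [first row, last row] pair requested
 * by each iteration of a chunked read.
 */
function chunkBounds(int $firstRow, int $lastRow, int $chunkSize): array
{
    $bounds = [];
    for ($startRow = $firstRow; $startRow <= $lastRow; $startRow += $chunkSize) {
        // The filter reads rows from $startRow up to, but not including,
        // $startRow + $chunkSize; the final chunk is capped at $lastRow.
        $bounds[] = [$startRow, min($startRow + $chunkSize - 1, $lastRow)];
    }
    return $bounds;
}

$bounds = chunkBounds(2, 65536, 2048);
echo count($bounds), ' chunks', PHP_EOL;
echo 'First: ', implode('-', $bounds[0]), PHP_EOL;
echo 'Last:  ', implode('-', $bounds[count($bounds) - 1]), PHP_EOL;
```

Because the heading row is read on every pass in addition to these rows, each loaded chunk holds at most one row more than `$chunkSize`.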
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
Using Read Filters applies to:
|
|
|
|
|
|
|
|
|
|
Reader    | Y/N |Reader  | Y/N |Reader        | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx      | YES | Xls    | YES | Excel2003XML | YES |
Ods       | YES | SYLK   | NO  | Gnumeric     | YES |
CSV       | YES | HTML   | NO  |              |     |
|
|
|
|
|
|
|
|
|
|
### Combining Multiple Files into a Single Spreadsheet Object
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
While you can limit the number of worksheets that are read from a
|
|
|
|
|
workbook file using the setLoadSheetsOnly() method, certain readers also
|
|
|
|
|
allow you to combine several individual "sheets" from different files
|
|
|
|
|
into a single `Spreadsheet` object, where each individual file is a
|
|
|
|
|
single worksheet within that workbook. For each file that you read, you
|
|
|
|
|
need to indicate which worksheet index it should be loaded into using
|
|
|
|
|
the setSheetIndex() method of the \$reader, then use the
|
|
|
|
|
loadIntoExisting() method rather than the load() method to actually read
|
|
|
|
|
the file into that worksheet.
|
|
|
|
|
|
|
|
|
|
``` php
|
2016-12-03 13:16:45 +00:00
|
|
|
|
$inputFileType = 'CSV';
|
|
|
|
|
$inputFileNames = array('./sampleData/example1.csv',
    './sampleData/example2.csv',
    './sampleData/example3.csv'
);
|
|
|
|
|
|
|
|
|
|
/** Create a new Reader of the type defined in $inputFileType **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
/** Extract the first named file from the array list **/
|
|
|
|
|
$inputFileName = array_shift($inputFileNames);
|
|
|
|
|
/** Load the initial file to the first worksheet in a `Spreadsheet` Object **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$spreadsheet = $reader->load($inputFileName);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Set the worksheet title (to the filename that we've loaded) **/
|
|
|
|
|
$spreadsheet->getActiveSheet()
|
|
|
|
|
->setTitle(pathinfo($inputFileName,PATHINFO_BASENAME));
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
/** Loop through all the remaining files in the list **/
|
|
|
|
|
foreach($inputFileNames as $sheet => $inputFileName) {
|
|
|
|
|
/** Increment the worksheet index pointer for the Reader **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader->setSheetIndex($sheet+1);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Load the current file into a new worksheet in Spreadsheet **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader->loadIntoExisting($inputFileName,$spreadsheet);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Set the worksheet title (to the filename that we've loaded) **/
|
|
|
|
|
$spreadsheet->getActiveSheet()
|
|
|
|
|
->setTitle(pathinfo($inputFileName,PATHINFO_BASENAME));
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
> See Examples/Reader/exampleReader13.php for a working example of this
|
|
|
|
|
> code.
|
|
|
|
|
|
|
|
|
|
Note that using the same sheet index for multiple sheets won't append
|
|
|
|
|
files into the same sheet, but overwrite the results of the previous
|
|
|
|
|
load. You cannot load multiple CSV files into the same worksheet.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
Combining Multiple Files into a Single Spreadsheet Object applies to:
|
|
|
|
|
|
|
|
|
|
Reader    | Y/N |Reader  | Y/N |Reader        | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx      | NO  | Xls    | NO  | Excel2003XML | NO  |
Ods       | NO  | SYLK   | YES | Gnumeric     | NO  |
CSV       | YES | HTML   | NO  |              |     |
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
### Combining Read Filters with the setSheetIndex() method to split a large CSV file across multiple Worksheets
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
An Xls BIFF .xls file is limited to 65536 rows in a worksheet, while the
|
|
|
|
|
Xlsx Microsoft Office Open XML SpreadsheetML .xlsx file is limited to
|
|
|
|
|
1,048,576 rows in a worksheet; but a CSV file is not limited other than
|
|
|
|
|
by available disk space. This means that we wouldn’t ordinarily be able
|
|
|
|
|
to read all the rows from a very large CSV file that exceeded those
|
|
|
|
|
limits, and save it as an Xls or Xlsx file. However, by using Read
|
|
|
|
|
Filters to read the CSV file in “chunks” (using the chunkReadFilter
class that we defined in the Read Filters section above),
|
|
|
|
|
and the setSheetIndex() method of the \$reader, we can split the CSV
|
|
|
|
|
file across several individual worksheets.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
``` php
|
2016-12-03 13:16:45 +00:00
|
|
|
|
$inputFileType = 'CSV';
|
|
|
|
|
$inputFileName = './sampleData/example2.csv';
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
echo 'Loading file ',pathinfo($inputFileName,PATHINFO_BASENAME),' using IOFactory with a defined reader type of ',$inputFileType,'<br />';
|
|
|
|
|
/** Create a new Reader of the type defined in $inputFileType **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
|
|
|
|
|
/** Define how many rows we want to read for each "chunk" **/
|
|
|
|
|
$chunkSize = 65530;
|
|
|
|
|
/** Create a new Instance of our Read Filter **/
|
|
|
|
|
$chunkFilter = new chunkReadFilter();
|
|
|
|
|
|
|
|
|
|
/** Tell the Reader that we want to use the Read Filter **/
|
|
|
|
|
/** and that we want to store it in contiguous rows/columns **/
|
|
|
|
|
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader->setReadFilter($chunkFilter)
|
2016-12-03 13:16:45 +00:00
|
|
|
|
->setContiguous(true);
|
|
|
|
|
|
|
|
|
|
/** Instantiate a new Spreadsheet object manually **/
|
|
|
|
|
$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();
|
|
|
|
|
|
|
|
|
|
/** Set a sheet index **/
|
|
|
|
|
$sheet = 0;
|
|
|
|
|
/** Loop to read our worksheet in "chunk size" blocks **/
|
|
|
|
|
/** $startRow is set to 2 initially because we always read the headings in row #1 **/
|
|
|
|
|
for ($startRow = 2; $startRow <= 1000000; $startRow += $chunkSize) {
|
|
|
|
|
/** Tell the Read Filter which rows we want to read this loop **/
|
|
|
|
|
$chunkFilter->setRows($startRow,$chunkSize);
|
|
|
|
|
|
|
|
|
|
/** Increment the worksheet index pointer for the Reader **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader->setSheetIndex($sheet);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
/** Load only the rows that match our filter into a new worksheet **/
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader->loadIntoExisting($inputFileName,$spreadsheet);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
    /** Set the worksheet title for the sheet that we've just loaded **/
|
|
|
|
|
/** and increment the sheet index as well **/
|
|
|
|
|
$spreadsheet->getActiveSheet()->setTitle('Country Data #'.(++$sheet));
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
> See Examples/Reader/exampleReader14.php for a working example of this
|
|
|
|
|
> code.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
This code will read 65,530 rows at a time from the CSV file that we’re
|
|
|
|
|
loading, and store each "chunk" in a new worksheet.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
The setContiguous() method for the Reader is important here. It is
|
|
|
|
|
applicable only when working with a Read Filter, and identifies whether
|
|
|
|
|
or not the cells should be stored by their position within the CSV file,
|
|
|
|
|
or their position relative to the filter.
|
|
|
|
|
|
|
|
|
|
For example, if the filter returned true for cells in the range B2:C3,
|
|
|
|
|
then with setContiguous set to false (the default) these would be loaded
|
|
|
|
|
as B2:C3 in the `Spreadsheet` object; but with setContiguous set to
|
|
|
|
|
true, they would be loaded as A1:B2.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
Splitting a single loaded file across multiple worksheets applies to:
|
|
|
|
|
|
|
|
|
|
Reader    | Y/N |Reader  | Y/N |Reader        | Y/N |
----------|:---:|--------|:---:|--------------|:---:|
Xlsx      | NO  | Xls    | NO  | Excel2003XML | NO  |
Ods       | NO  | SYLK   | NO  | Gnumeric     | NO  |
CSV       | YES | HTML   | NO  |              |     |
|
|
|
|
|
|
|
|
|
|
## Error Handling
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
Of course, you should always apply some error handling to your scripts
|
|
|
|
|
as well. PhpSpreadsheet throws exceptions, so you can wrap all your code
|
|
|
|
|
that accesses the library methods within Try/Catch blocks to trap for
|
|
|
|
|
any problems that are encountered, and deal with them in an appropriate
|
|
|
|
|
manner.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
The PhpSpreadsheet Readers throw a
|
|
|
|
|
\PhpOffice\PhpSpreadsheet\Reader\Exception.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
``` php
|
2016-12-03 13:16:45 +00:00
|
|
|
|
$inputFileName = './sampleData/example-1.xls';
|
|
|
|
|
|
|
|
|
|
try {
|
|
|
|
|
/** Load $inputFileName to a Spreadsheet Object **/
|
|
|
|
|
$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load($inputFileName);
|
|
|
|
|
} catch(\PhpOffice\PhpSpreadsheet\Reader\Exception $e) {
|
|
|
|
|
die('Error loading file: '.$e->getMessage());
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
> See Examples/Reader/exampleReader16.php for a working example of this
|
|
|
|
|
> code.
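Calling die() is rarely what you want outside a command-line script. As a minimal sketch of an alternative, the hypothetical `tryLoad()` helper below runs any loader callable and hands the error message back to the caller. It catches the generic `\Exception` so that it can be demonstrated without the library; the stand-in loader and its message are assumptions made for the illustration.

``` php
<?php
/**
 * Hypothetical helper: run a loader callable and return [result, error]
 * instead of terminating the script.
 */
function tryLoad(callable $loader, string $fileName): array
{
    try {
        return [$loader($fileName), null];
    } catch (\Exception $e) {
        // In real code, catch the narrower Reader\Exception first if reader
        // failures should be treated differently from other errors.
        return [null, $e->getMessage()];
    }
}

// Stand-in loader for the demonstration; it always fails.
$failingLoader = function (string $fileName) {
    throw new \RuntimeException('Could not open ' . $fileName);
};

[$spreadsheet, $error] = tryLoad($failingLoader, './sampleData/example-1.xls');
if ($error !== null) {
    echo 'Error loading file: ', $error, PHP_EOL;
}
```

In a real script the loader argument would be `[\PhpOffice\PhpSpreadsheet\IOFactory::class, 'load']`, so that the static load() method shown earlier is the code being guarded.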
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
## Helper Methods
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
You can retrieve a list of worksheet names contained in a file without
|
|
|
|
|
loading the whole file by using the Reader’s `listWorksheetNames()`
|
|
|
|
|
method; similarly, a `listWorksheetInfo()` method will retrieve the
|
|
|
|
|
dimensions of each worksheet in a file without needing to load and parse the
|
|
|
|
|
whole file.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
### listWorksheetNames
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
The `listWorksheetNames()` method returns a simple array listing each
|
|
|
|
|
worksheet name within the workbook:
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
``` php
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$worksheetNames = $reader->listWorksheetNames($inputFileName);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
echo '<h3>Worksheet Names</h3>';
|
|
|
|
|
echo '<ol>';
|
|
|
|
|
foreach ($worksheetNames as $worksheetName) {
|
|
|
|
|
echo '<li>', $worksheetName, '</li>';
|
|
|
|
|
}
|
|
|
|
|
echo '</ol>';
|
|
|
|
|
```
|
2016-12-03 15:00:54 +00:00
|
|
|
|
|
|
|
|
|
> See Examples/Reader/exampleReader18.php for a working example of this
|
|
|
|
|
> code.
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
### listWorksheetInfo
|
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
The `listWorksheetInfo()` method returns a nested array, with each entry
|
|
|
|
|
listing the name and dimensions for a worksheet:
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 15:00:54 +00:00
|
|
|
|
``` php
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
2016-12-03 13:32:54 +00:00
|
|
|
|
$worksheetData = $reader->listWorksheetInfo($inputFileName);
|
2016-12-03 13:16:45 +00:00
|
|
|
|
|
|
|
|
|
echo '<h3>Worksheet Information</h3>';
|
|
|
|
|
echo '<ol>';
|
|
|
|
|
foreach ($worksheetData as $worksheet) {
|
|
|
|
|
echo '<li>', $worksheet['worksheetName'], '<br />';
|
|
|
|
|
echo 'Rows: ', $worksheet['totalRows'],
|
|
|
|
|
' Columns: ', $worksheet['totalColumns'], '<br />';
|
|
|
|
|
echo 'Cell Range: A1:',
|
|
|
|
|
$worksheet['lastColumnLetter'], $worksheet['totalRows'];
|
|
|
|
|
echo '</li>';
|
|
|
|
|
}
|
|
|
|
|
echo '</ol>';
|
|
|
|
|
```
|
2016-12-03 15:00:54 +00:00
|
|
|
|
|
|
|
|
|
> See Examples/Reader/exampleReader19.php for a working example of this
|
|
|
|
|
> code.
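One possible use of this information is to decide which worksheets to load before reading any cell data. The sketch below is plain PHP: `sheetsWithinLimit()` is a hypothetical helper, and the sample array is invented data in the shape shown above (the keys are the ones used in the example).

``` php
<?php
/**
 * Hypothetical helper: return the names of the worksheets whose row
 * count does not exceed $maxRows.
 */
function sheetsWithinLimit(array $worksheetData, int $maxRows): array
{
    $names = [];
    foreach ($worksheetData as $worksheet) {
        if ($worksheet['totalRows'] <= $maxRows) {
            $names[] = $worksheet['worksheetName'];
        }
    }
    return $names;
}

// Invented sample data standing in for the result of listWorksheetInfo().
$worksheetData = [
    ['worksheetName' => 'Data Sheet #1', 'lastColumnLetter' => 'E',
     'totalRows' => 120, 'totalColumns' => 5],
    ['worksheetName' => 'Data Sheet #2', 'lastColumnLetter' => 'K',
     'totalRows' => 70000, 'totalColumns' => 11],
];

$sheetnames = sheetsWithinLimit($worksheetData, 65536);
print_r($sheetnames);
```

The resulting list of names can then be passed to the setLoadSheetsOnly() method described earlier.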
|