Improve Coverage for CSV (#1475)
I believe that both CSV Reader and Writer are 100% covered now.
There were some errors uncovered during development.
The reader specifically permits encodings other than UTF-8 to be used.
However, fgetcsv will not properly handle other encodings.
I tried replacing it with fgets/iconv/str_getcsv, but that could not
handle line breaks within a cell, even for UTF-8.
This is, I'm sure, a very rare use case.
I eventually handled it by using php://memory to hold the translated
file contents for non-UTF8. There were no tests for this situation,
and now there are (probably too many).
"Contiguous" read was not handled correctly. There is a file
in samples which uses it. It was designed to read a large sheet,
and split it into three. The first sheet was correct, but the
second and third were almost entirely empty. This has been corrected,
and the sample code was adapted into a formal test with assertions
to confirm that it works as designed.
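To illustrate the chunked, contiguous read described above, here is a minimal sketch. `ChunkRowFilter` and `packContiguously` are hypothetical names, not the test suite's `CsvContiguousFilter` or the reader's actual code; a real filter would implement `PhpOffice\PhpSpreadsheet\Reader\IReadFilter`. The sketch only models the assumed idea that accepted rows land in consecutive destination rows.

```php
<?php

// Hypothetical stand-in for a chunk read filter; in PhpSpreadsheet a real
// filter would implement PhpOffice\PhpSpreadsheet\Reader\IReadFilter.
final class ChunkRowFilter
{
    private int $startRow = 0;
    private int $endRow = 0;

    public function setRows(int $startRow, int $chunkSize): void
    {
        $this->startRow = $startRow;
        $this->endRow = $startRow + $chunkSize;
    }

    // Accept the heading row plus the rows of the current chunk.
    public function readCell(string $column, int $row, string $worksheetName = ''): bool
    {
        return $row === 1 || ($row >= $this->startRow && $row < $this->endRow);
    }
}

// Assumed model of a "contiguous" load: accepted source rows are written to
// consecutive destination rows instead of keeping their original positions.
// Returns [destinationRow => sourceRow].
function packContiguously(ChunkRowFilter $filter, int $lastSourceRow): array
{
    $placed = [];
    $destinationRow = 0;
    for ($sourceRow = 1; $sourceRow <= $lastSourceRow; ++$sourceRow) {
        if ($filter->readCell('A', $sourceRow)) {
            $placed[++$destinationRow] = $sourceRow;
        }
    }

    return $placed;
}

// Second chunk of a 240-row file: heading row plus source rows 102..201.
$filter = new ChunkRowFilter();
$filter->setRows(102, 100);
$placed = packContiguously($filter, 240);
```

In this model the second chunk's first data row (source row 102) ends up in destination row 2, directly under the headings, rather than leaving rows 2 to 101 empty.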
I made a minor documentation change. Unlike HTML, where you never
need a BOM because you can declare the encoding in the file,
a CSV with non-ASCII characters must explicitly include a BOM
for Excel to handle it correctly. This was explained in the Reading CSV
section, but was glossed over in the Writing CSV section, which I
have updated.
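As a rough illustration of the BOM point above, the following plain-PHP sketch writes the three UTF-8 BOM bytes ahead of the CSV rows. `csvWithBom` is a hypothetical helper written for this example and is not part of PhpSpreadsheet.

```php
<?php

// Hypothetical helper: returns CSV text prefixed with a UTF-8 BOM so that
// Excel can detect the encoding of non-ASCII content.
function csvWithBom(array $rows): string
{
    $handle = fopen('php://memory', 'r+b');
    fwrite($handle, "\xEF\xBB\xBF"); // UTF-8 byte order mark
    foreach ($rows as $row) {
        fputcsv($handle, $row, ',', '"', '\\');
    }
    rewind($handle);
    $csv = stream_get_contents($handle);
    fclose($handle);

    return $csv;
}

$csv = csvWithBom([['City', 'Country'], ["Z\u{00FC}rich", 'Switzerland']]);
```

With the library's own CSV writer, the equivalent step should be a call to `setUseBOM(true)` before saving.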
2020-05-17 09:15:18 +00:00
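The php://memory approach mentioned in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the reader's actual implementation: `readCsvAsUtf8` is a hypothetical helper, and it uses `mb_convert_encoding` (mbstring extension assumed) to translate the whole file before `fgetcsv` parses it, so that quoted cells containing line breaks still work.

```php
<?php

// Hypothetical helper: convert a non-UTF-8 CSV file to UTF-8 in memory,
// then let fgetcsv parse the translated contents.
function readCsvAsUtf8(string $path, string $inputEncoding): array
{
    $contents = file_get_contents($path);
    if ($contents === false) {
        throw new RuntimeException("Cannot read $path");
    }
    if (strtoupper($inputEncoding) !== 'UTF-8') {
        $contents = mb_convert_encoding($contents, 'UTF-8', $inputEncoding);
    }

    // Hold the translated contents in a memory stream instead of a temp file.
    $handle = fopen('php://memory', 'r+b');
    fwrite($handle, $contents);
    rewind($handle);

    $rows = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($handle);

    return $rows;
}
```

Because the parser sees the entire translated stream rather than one line at a time, a line break inside a quoted cell stays part of that cell.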
<?php

namespace PhpOffice\PhpSpreadsheetTests\Reader;

use PhpOffice\PhpSpreadsheet\Reader\Csv;
use PhpOffice\PhpSpreadsheet\Spreadsheet;
use PHPUnit\Framework\TestCase;

class CsvContiguousTest extends TestCase
{
2020-05-17 09:35:55 +00:00
    private $inputFileName = 'samples/Reader/sampleData/example2.csv';
2020-05-18 04:49:57 +00:00
    public function testContiguous(): void
    {
        // Create a new CSV Reader
        $reader = new Csv();

        // Define how many rows we want to read for each "chunk"
        $chunkSize = 100;
        // Create a new instance of our Read Filter
        $chunkFilter = new CsvContiguousFilter();

        // Tell the Reader that we want to use the Read Filter that we've instantiated
        // and that we want to store the data in contiguous rows/columns
        self::assertFalse($reader->getContiguous());
        $reader->setReadFilter($chunkFilter)
            ->setContiguous(true);

        // Instantiate a new PhpSpreadsheet object manually
        $spreadsheet = new Spreadsheet();

        // Set a sheet index
        $sheet = 0;
        // Loop to read our worksheet in "chunk size" blocks
        // $startRow is set to 2 initially because we always read the headings in row #1
        for ($startRow = 2; $startRow <= 240; $startRow += $chunkSize) {
            // Tell the Read Filter the limits on which rows we want to read this iteration
            $chunkFilter->setRows($startRow, $chunkSize);

            // Tell the Reader which worksheet index to load this chunk into
            $reader->setSheetIndex($sheet);
            // Load only the rows that match our filter into a new worksheet in the PhpSpreadsheet object
            $reader->loadIntoExisting($this->inputFileName, $spreadsheet);
            // Set the worksheet title (to reference the "sheet" of data that we've loaded)
            // and increment the sheet index as well
            $spreadsheet->getActiveSheet()->setTitle('Country Data #' . (++$sheet));
        }

        $sheet = $spreadsheet->getSheetByName('Country Data #1');
        self::assertEquals('Kabul', $sheet->getCell('A2')->getValue());
        $sheet = $spreadsheet->getSheetByName('Country Data #2');
        self::assertEquals('Lesotho', $sheet->getCell('B4')->getValue());
        $sheet = $spreadsheet->getSheetByName('Country Data #3');
        self::assertEquals(-20.1, $sheet->getCell('C6')->getValue());
    }

    public function testContiguous2(): void
    {
        // Create a new CSV Reader
        $reader = new Csv();

        // Create a new instance of our Read Filter
        $chunkFilter = new CsvContiguousFilter();
        $chunkFilter->setFilterType(1);

        // Tell the Reader that we want to use the Read Filter that we've instantiated
        // and that we want to store the data in contiguous rows/columns
        $reader->setReadFilter($chunkFilter)
            ->setContiguous(true);

        // Instantiate a new PhpSpreadsheet object manually
        $spreadsheet = new Spreadsheet();

        // Load the filtered rows into the spreadsheet in a single pass
        $reader->loadIntoExisting($this->inputFileName, $spreadsheet);

        $sheet = $spreadsheet->getActiveSheet();
        self::assertEquals('Kabul', $sheet->getCell('A2')->getValue());
        self::assertEquals('Kuwait', $sheet->getCell('B11')->getValue());
    }
}