35 Commits

Author SHA1 Message Date
oleibman 38fab4e632
Fix for #1505 (#1525)
This problem is the same as #1238, which was resolved by #1239.
For that issue, the fix was to check in one place whether
$this->mapCellXfIndex[$xfIndex] was set before using it.
The sample spreadsheet supplied as a description for this
problem had exactly the same problem in 2 other places in the code.
In addition, there were 7 other places in the code where that
particular item was used unchecked. This fix corrects all 9 locations.
The spreadsheet supplied with the problem is used as the basis
for some new tests, which particularly test column dimensions
and styles, the problems involved in this case.
2020-06-19 21:01:18 +02:00
oleibman 41b95c1542
CSV Sample File Was Miscoded (#1489)
File author erroneously assumed that backslash was used to escape
quotes in CSV; in fact, doubling the quote is used for escape.
The test still worked, but mainly because the content of the cell
with the escape wasn't tested. The file is now fixed, and
a new test added.
2020-05-24 19:57:39 +09:00
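
The escaping rule described in the commit above can be illustrated with a minimal sketch. It uses Python's standard `csv` module as a stand-in, not the library's own CSV reader; the helper name `parse_csv_line` and the sample row are assumptions made for illustration only.

```python
import csv
import io


def parse_csv_line(line):
    """Parse a single CSV record with the default dialect (quotes are escaped by doubling)."""
    return next(csv.reader(io.StringIO(line)))


# A quote inside a quoted field is written as two quotes, not as backslash-quote.
row = parse_csv_line('1,"She said ""hi""",3')
print(row)

# Writing the same row back shows the writer producing the doubled form again.
out = io.StringIO(newline="")
csv.writer(out).writerow(row)
print(out.getvalue())
```

A test that only loads such a file without asserting on the cell content would pass either way, which is why checking the parsed value of the escaped cell matters.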
oleibman 7517cdd008
Improve Coverage for CSV (#1475)
I believe that both CSV Reader and Writer are 100% covered now.

There were some errors uncovered during development.

The reader specifically permits encodings other than UTF-8 to be used.
However, fgetcsv will not properly handle other encodings.
I tried replacing it with fgets/iconv/str_getcsv, but that could not
handle line breaks within a cell, even for UTF-8.
This is, I'm sure, a very rare use case.
I eventually handled it by using php://memory to hold the translated
file contents for non-UTF-8. There were no tests for this situation,
and now there are (probably too many).

"Contiguous" read was not handled correctly. There is a file
in samples which uses it. It was designed to read a large sheet,
and split it into three. The first sheet was correct, but the
second and third were almost entirely empty. This has been corrected,
and the sample code was adapted into a formal test with assertions
to confirm that it works as designed.

I made a minor documentation change. Unlike HTML, where you never
need a BOM because you can declare the encoding in the file,
a CSV with non-ASCII characters must explicitly include a BOM
for Excel to handle it correctly. This was explained in the Reading CSV
section, but was glossed over in the Writing CSV section, which I
have updated.
2020-05-17 18:15:18 +09:00
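
The two encoding points in the commit above (transcoding a non-UTF-8 file into an in-memory buffer before parsing, and writing a BOM so Excel recognizes UTF-8) can be sketched as follows. This is a Python analog under stated assumptions, not the library's PHP implementation: `io.StringIO` plays the role that `php://memory` plays in the commit, and the function names `read_csv_bytes` and `write_csv_for_excel` are hypothetical.

```python
import csv
import io


def read_csv_bytes(raw, encoding="utf-8"):
    """Decode the whole file first, then parse from memory.

    Parsing the decoded buffer as a stream (rather than line by line)
    keeps line breaks inside quoted cells intact.
    """
    buffer = io.StringIO(raw.decode(encoding), newline="")
    return list(csv.reader(buffer))


def write_csv_for_excel(rows):
    """Serialize rows as UTF-8 with a leading BOM ('utf-8-sig' adds it)."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode("utf-8-sig")


# An ISO-8859-1 file with a non-ASCII character and a line break inside a cell.
latin1 = 'id,text\r\n1,"caf\u00e9\nau lait"\r\n'.encode("iso-8859-1")
rows = read_csv_bytes(latin1, encoding="iso-8859-1")
print(rows)
```

Splitting the input on line breaks before parsing would cut the quoted cell in two, which is the failure the commit describes for the fgets-based attempt.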
oleibman 082266aacd Conditionals - Extend Support for (NOT)CONTAINSBLANKS (#1278)
Support for the CONTAINSBLANKS conditional style was added a while ago.
However, that support was on write only; any cells which used
CONTAINSBLANKS on a file being read would drop that style.

I am also adding support for NOTCONTAINSBLANKS, on read and write.
2020-01-04 18:50:04 +01:00
oleibman afd070a756 Handle ConditionalStyle NumberFormat When Reading Xlsx File (#1296)
* Handle ConditionalStyle NumberFormat When Reading Xlsx File

ReadStyle in Reader/Xlsx/Styles.php expects numberFormat to be a string.
However, when reading conditional style in Xlsx file, NumberFormat
   is actually a SimpleXMLElement, so is not handled correctly.
While testing this change, it turned out that reader always expects
   that there is a "SharedString" portion of the XML, which is not
   true for spreadsheets with no string data, which causes a
   run-time message.
Likewise, when conditional number format is not one of the built-in
   formats, a run-time message is issued because 'isset' is used
   to determine existence rather than 'array_key_exists'.
The new workbook added to the testing data demonstrates both those
   problems (prior to the code changes).

* Move Comment to Resolve Conflict

GitHub reports a conflict involving placement of one comment statement.

* Respond to Scrutinizer Style Suggestion

Change detection for empty SimpleXMLElement.
2020-01-04 00:10:41 +01:00
Mahmoud Abdo 785705b712
Best effort to support invalid colspan values in HTML reader
Closes #878
2019-07-27 23:31:23 -07:00
Mark Baker d8047b071b
Basic unit test and fix for loading data validations from xlsx file (#1063) 2019-07-08 19:55:14 +02:00
Mark Baker 0e6238c69e
CVE-2019-12331 (#1041)
* Detect doubly-encoded xml to hide XXE attacks
Correct use of LibXml_Disable_Entity_Loader

* New test for double-encoded xml in security scanner
2019-07-01 00:55:25 +02:00
Mark Baker 1e711541f1
Refactoring xlsx reader (#1033)
Start work on breaking up monolithic Reader and Writer classes into dedicated subclasses to make maintenance work easier
2019-06-30 23:42:25 +02:00
Mark Baker 6c25b6f422
Refactor Xlsx Properties Reader code into a separate class (#1001)
* Unit tests for refactoring Spreadsheet properties
* Refactor Xlsx Properties Reader code into a separate class
2019-06-10 16:44:55 +02:00
kraser 906bdc613c Fix failure when parsing xlsx with drawing having double (redefined) … (#945)
* Fix failure when parsing xlsx with drawing having double (redefined) attributes
2019-05-30 11:42:00 +02:00
AlexPravdin ebc0b56959 Fix #853 when loading and saving XLSX file with empty drawing cause c… (#882)
* Fix #853: loading and saving an XLSX file with an empty drawing caused a corrupted output file. Store the empty drawing as an unparsed entity and save it as-is when saving the file.

* Fix code style
2019-05-30 10:38:03 +02:00
Mark Baker 9b004b1e6a
Ignore escaped enclosures within an enclosure when inferring csv separator (#906) 2019-02-25 23:20:50 +01:00
Patrick Brouwers 1c99f4999c [Feature] Html reader improvements (#884)
* Extract character set, so we can convert to UTF-8 if required

* Set column width and row height when defined on tr/td

* Parse align and valign on td

* Specify number format of cell via html attribute

* Formatting of b, strong, i and em tags

* Inserting image in cell when using img tag in html

* Add applying inline styles: border, fonts, alignment, dimensions

* Add tests for applying inline styles
2019-02-16 23:11:16 +01:00
MarkBaker 41bcf9a21c Support for additional callback in XML Security Scanner 2018-11-25 14:00:35 +01:00
MarkBaker 7a06d71e1c Add UTF-7 XXE Unit test data 2018-11-19 23:22:59 +01:00
Laurent 79d86ef5cc
Csv reader avoid notice when the file is empty
Fixes #337
2018-10-28 14:16:53 +11:00
Paul Barton 813855b2b2
Fix CSV delimiter detection on line breaks
The CSV Reader can now correctly ignore line breaks inside
enclosures which allows it to determine the delimiter
correctly.

Fixes #716
Fixes #717
2018-10-21 18:23:55 +11:00
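
The idea behind the delimiter fix above, counting candidate delimiters only outside enclosures so that quoted line breaks and quoted separators do not distort the result, can be sketched like this. It is a simplified illustration in Python, not the CSV Reader's actual algorithm; the function name `infer_delimiter`, the candidate list, and the scoring rule (highest minimum count per record) are assumptions.

```python
def infer_delimiter(text, candidates=(",", ";", "\t", "|"), quote='"'):
    """Guess the delimiter by counting candidates per record, skipping quoted content."""
    per_record = {c: [] for c in candidates}   # one count per logical record
    current = dict.fromkeys(candidates, 0)     # counts for the record in progress
    in_quotes = False
    has_content = False

    def flush():
        for c in candidates:
            per_record[c].append(current[c])
            current[c] = 0

    for ch in text:
        if ch == quote:
            # A doubled quote toggles twice, so the quoted state is preserved.
            in_quotes = not in_quotes
            has_content = True
        elif in_quotes:
            continue                            # ignore delimiters and line breaks in enclosures
        elif ch == "\n":
            if has_content:
                flush()                         # end of a logical record
            has_content = False
        elif ch == "\r":
            continue
        else:
            has_content = True
            if ch in current:
                current[ch] += 1
    if has_content:
        flush()                                 # last record without a trailing newline

    # Prefer the candidate that is present in every record; ties go to the earliest candidate.
    return max(candidates, key=lambda c: min(per_record[c], default=0))


sample = 'name;note\n"a,b,c";"line one\nline; two"\nx;y\n'
print(repr(infer_delimiter(sample)))
```

In the sample, a purely line-based count would see commas and a stray semicolon on the physical lines of the second record; tracking the enclosure state keeps the count at one semicolon per record.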
bayzhanov 08b4456641
Xls file threw exception during open by Xls reader
Ignore some exceptions in properties if the stream is empty

Fixes #402
Fixes #659
2018-10-07 18:49:01 +11:00
Adrien Crivelli 9fdcaabe3c
Could not open CSV file containing HTML fragment
We now always trust the file extension to avoid false positives in MIME
detection for most simple cases. But we still try to guess the MIME type
if the file extension does not match or is missing.

Fixes #564
2018-06-25 11:12:27 +09:00
Robin D'Arcy c723833d6f Allow CSV escape character to be set
Fixes #492
Closes #510
2018-05-23 10:31:41 +09:00
Adrien Crivelli e31878ceb1
Check for MIME type to know if CSV reader can read a file
CSV reader used to accept any file without any kind of check. That made
users incorrectly believe that things were ok, even though there is no
way for CSV reader to read anything other than plain text files.

Fixes #167
2018-02-05 21:33:23 +09:00
Adrien Crivelli 481fc4a7c6
Support XML file without styles
Closes #331
Closes https://github.com/PHPOffice/PHPExcel/pull/559
Fixes https://github.com/PHPOffice/PHPExcel/issues/558
2018-01-14 17:08:50 +09:00
Adrien Crivelli 139d85d874
Better auto-detection of CSV separators
Closes #305
2017-12-28 12:25:37 +09:00
GreatHumorist 2abe56b946 Support missing attribute `r` in `c` node when reading xlsx
When describing a cell, the cell reference (r="A1") is optional.
When not present, we should just increment the index of the last processed row.

Fixes #201 
Closes #225
2017-09-22 14:49:38 +09:00
GreatHumorist 0477e6fcfe In Xml reader throw exception in case of invalid XML (#222)
When the XML file is not a standard XML file, `simplexml_load_string` returns false, which causes an error on `$xml->getNamespaces(true);`. So instead of showing the error, we throw an exception.
2017-09-20 14:20:12 +09:00
Markus Lanthaler 3ee9cc5ce6
Infer CSV delimiter if it hasn't been set explicitly
Closes #141
2017-04-20 17:02:03 +09:00
Paolo Agostinetto c954eddf57 Ods reader: fix sheet count and added a test for sheet names 2017-02-20 21:02:04 +01:00
Paolo Agostinetto 1dba2d1766 Ods reader: tests for repeated spaces and rich text 2017-02-18 20:49:48 +01:00
Paolo Agostinetto bcd1bc364c Ods reader: test loading of Worksheets 2017-02-18 13:55:22 +01:00
Paolo Agostinetto 3c7b2e23da Added unit tests for Ods reader 2017-02-18 13:36:08 +01:00
Adrien Crivelli e6bbc4bd25
Convert all line endings to Unix style 2016-11-27 15:45:15 +09:00
Alexander Kurilo 408da0c17a Make HTML checks more strict 2016-11-16 22:21:30 +09:00
Alexander Kurilo edb3974a0d Move XEEE test data to add data for other readers 2016-11-16 22:21:30 +09:00
Adrien Crivelli e1f81f0fe0
Refactor tests data from custom format to PHP
FIX #14
2016-08-16 21:00:19 +09:00