* Improve Coverage for ODS Reader
Reader/ODS/Properties is now 100% covered.
Reader/ODS is covered except for 1 statement. As the original author
put it, "table-header-rows TODO: figure this out ... I'm not sure that
PhpExcel has an API for this". I'm still thinking about it, but, so far,
I agree with the author.
There are minimal code changes.
- Several places test !zip->open() to see whether the test failed.
However, zip->open() returns true or a string, so the test never
detects failure. Change to zip->open() !== true. No previous tests.
- Suppress warning messages from simplexml_load_string (there had
been no tests for invalid xml).
- One document property was misnamed, and one non-existent property
was tested for.
I added a number of tests, creating an ODS directory, and moving
OdsTest to that directory.
* Scrutinizer Recommendation
Unused variable in one test.
* Update CHANGELOG
Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
Reader/Html is now covered except for 1 statement.
There is some coverage of RichText when you know in advance that the
html will expand into a single cell.
It is a tougher nut, one that I have not yet cracked,
to try to handle rich text while converting unkown html to multiple cells.
The original author left this as a TODO, and so for now must I.
It made sense to restructure some of the code. There are some changes.
- Issue #1532 is fixed (links are now saved when using rowspan).
- Colors can now be specified as html color name. To accomplish this,
Helper/Html function colourNameLookup was changed from protected
to public, and changed to static.
- Superfluous empty lines were eliminated in a number of places, e.g.
<ul><li>A</li><li>B</li><li>C</li></ul>
had formerly caused a wrapped cell to be created with 2 empty lines
followed by A, B, and C on separate lines; it will now just have the
3 A/B/C lines, which seems like a more sensible interpretation.
- Img alt tag, which had been cast to float, is now used as a string.
Private member "encoding" is not used. Functions getEncoding and setEncoding
have therefore been marked deprecated. In fact, I was unable to get
SecurityScanner to pass *any* html which is not UTF-8. There are
possibly ways of getting around this (in Reader/Html - I have no
intention of messing with Security Scanner), as can be seen in my
companion pull request for Excel2003 Xml Reader. Doing this would be
easier for ASCII-compatible character sets (like ISO-8859-1),
than for non-compatible charsets (like UTF-16). I am not
convinced that the effort is worth it, but am willing to investigate
further.
I added a number of tests, creating an Html directory, and moving
HtmlTest to that directory.
This problem is the same as #1238, which was resolved by #1239.
For that issue, the fix was to check in one place whether
$this->mapCellXfIndex[$xfIndex] was set before using it.
The sample spreadsheet supplied as a description for this
problem had exactly the same problem in 2 other places in the code.
In addition, there were 7 other places in the code where that
particular item was used unchecked. This fix corrects all 9 locations.
The spreadsheet supplied with the problem is used as the basis
for some new tests, which particularly test column dimensions
and styles, the problems involved in this case.
File author erroneously assumed that backslash was used to escape
quotes in CSV; in fact, doubling the quote is used for escape.
The test still worked, but mainly because the content of the cell
with the escape wasn't tested. The file is now fixed, and
a new test added.
I believe that both CSV Reader and Writer are 100% covered now.
There were some errors uncovered during development.
The reader specifically permits encodings other than UTF-8 to be used.
However, fgetcsv will not properly handle other encodings.
I tried replacing it with fgets/iconv/strgetcsv, but that could not
handle line breaks within a cell, even for UTF-8.
This is, I'm sure, a very rare use case.
I eventually handled it by using php://memory to hold the translated
file contents for non-UTF8. There were no tests for this situation,
and now there are (probably too many).
"Contiguous" read was not handle correctly. There is a file
in samples which uses it. It was designed to read a large sheet,
and split it into three. The first sheet was corrrect, but the
second and third were almost entirely empty. This has been corrected,
and the sample code was adapted into a formal test with assertions
to confirm that it works as designed.
I made a minor documentation change. Unlike HTML, where you never
need a BOM because you can declare the encoding in the file,
a CSV with non-ASCII characters must explicitly include a BOM
for Excel to handle it correctly. This was explained in the Reading CSV
section, but was glossed over in the Writing CSV section, which I
have updated.
Support for the CONTAINSBLANKS conditional style was added a while ago.
However, that support was on write only; any cells which used
CONTAINSBLANKS on a file being read would drop that style.
I am also adding support for NOTCONTAINSBLANKS, on read and write.
* Handle ConditionalStyle NumberFormat When Reading Xlsx File
ReadStyle in Reader/Xlsx/Styles.php expects numberFormat to be a string.
However, when reading conditional style in Xlsx file, NumberFormat
is actually a SimpleXMLElement, so is not handled correctly.
While testing this change, it turned out that reader always expects
that there is a "SharedString" portion of the XML, which is not
true for spreadsheets with no string data, which causes a
run-time message.
Likewise, when conditional number format is not one of the built-in
formats, a run-time message is issued because 'isset' is used
to determine existence rather than 'array_key_exists'.
The new workbook added to the testing data demonstrates both those
problems (prior to the code changes).
* Move Comment to Resolve Conflict
Github reports conflict involving placement of one comment statement.
* Respond to Scrutinizer Style Suggestion
Change detection for empty SimpleXMLElement.
* Fix failure when parsing xlsx with drawing having double (redefined) attributes
* Fix failure when parsing xlsx with drawing having double (redefined) attributes
* Fix#853 when loading and saving XLSX file with empty drawing cause corrupted output file. Store empty drawing as unparsed entity and save it as is when saving the file.
* Fix code style
* Extract character set, so we can convert to UTF-8 if required
* Set column width and row height when defined on tr/td
* Parse align and valign on td
* Specify number format of cell via html attribute
* Formatting of b, strong, i and em tags
* Inserting image in cell when using img tag in html
* Add applying inline styles: border, fonts, alignment, dimensions
* Add tests for applying inline styles
We now always trust the file extension to avoid false positive of mime
detection for most simple cases. But we still try to guess the mime type
if the file extension does not match or is missing.
Fixes#564
CSV reader used to accept any file without any kind of check. That made
users incorrectly believe that things were ok, even though there is no
way for CSV reader to read anything else that plain text files.
Fixes#167
When describing a cell, the cell reference (r="A1") is optional.
When not present, we should just increment the index of the last processed row.
Fixes#201Closes#225
When the xml file is not a standard xml file, the `simplexml_load_string` will return false, this will cause an error on "$xml->getNamespaces(true);" . So instead of showing the error, we throw an exception.