Commit Graph

177 Commits

Author SHA1 Message Date
oleibman 497a934374
Fix for 3 Issues Involving ReadXlsx and NamedRange (#1742)
* Fix for 3 Issues Involving ReadXlsx and NamedRange

Issues #1686 and #1723, which provide sample spreadsheets, are probably
solved by this ticket. Issue #1730 is also probably solved, but I have
no way to verify.

There are two problems with how PhpSpreadsheet is handling things now.
Although the first problem is much less severe, and isn't really a factor
in the issues named above, it is helpful to get it out of the way first.
If you define a named range in Excel, and then delete the sheet where
the range exists, Excel saves the range as #REF!. If there is a cell which
references the range, it will similarly have the value #REF! when you open
the Excel file.
Currently, PhpSpreadsheet discards the #REF! definition, so a cell which
references the range will appear as #NAME? rather than #REF!.
This PR changes the behavior so that PhpSpreadsheet retains the #REF!
definition, and cells which reference it will appear as #REF!.

The second problem is the more severe, and is, I believe, responsible
for the 3 issues identified above.
If you define a named range and the sheet on which the range is defined
does not exist at the time, Excel will save the range as something like:

'[1]Unknown Sheet'!$A$1

If a cell references such a range, Excel will again display #REF!.
PhpSpreadsheet currently throws an Exception when it encounters
such a definition while reading the file. This PR changes
the behavior so that PhpSpreadsheet saves the definition as #REF!,
and cells which reference it will behave similarly.
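
The rule described above can be sketched roughly as follows. This is not the reader code from this PR: `normalizeDefinedNameValue()` is a hypothetical helper, and the list of sheet titles is passed in by hand. It treats a definition as unresolvable when the sheet part carries an external-workbook index such as `[1]`, or names a sheet that is not in the workbook, and it stores `#REF!` instead of throwing.

```php
<?php

/**
 * Hypothetical helper: return the defined-name value to store, or '#REF!'
 * when the sheet it points at cannot be resolved.
 *
 * @param string[] $sheetTitles titles of the sheets present in the workbook
 */
function normalizeDefinedNameValue(string $value, array $sheetTitles): string
{
    if ($value === '#REF!') {
        return $value; // keep Excel's own marker rather than discarding it
    }

    // Split on the last "!" so that sheet titles containing "!" still work.
    $pos = strrpos($value, '!');
    if ($pos === false) {
        return $value; // no sheet qualifier, nothing to validate here
    }

    $sheet = trim(substr($value, 0, $pos), "'");
    $sheet = str_replace("''", "'", $sheet);

    // "[1]Unknown Sheet" style prefix, or a title that is not in the workbook.
    if (preg_match('/^\[\d+\]/', $sheet) === 1 || !in_array($sheet, $sheetTitles, true)) {
        return '#REF!';
    }

    return $value;
}

echo normalizeDefinedNameValue("'[1]Unknown Sheet'!\$A\$1", ['Sheet1']) . PHP_EOL;
echo normalizeDefinedNameValue('Sheet1!$A$1:$B$2', ['Sheet1']) . PHP_EOL;
```

A cell that references the first name would then evaluate to `#REF!`, while the second definition is kept unchanged.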

For the record, I will note that Excel does not magically recalculate when a
missing sheet is subsequently added, despite the fact that the reference
might now become resolvable. PhpSpreadsheet behaves likewise.

* Remove Dead Code in Test

Identified it after push but before merge.
2020-12-10 18:08:10 +01:00
oleibman 1741766a9c
Improving Coverage for Excel2003 XML Reader (#1557)
* Improving Coverage for Excel2003 XML Reader

Reader/Xml is now 100% covered.

File templates/Excel2003XMLTest.xml, used in some tests, is *not*
readable by a current version of Excel. I have substituted a new file
excel2003.xml to be used in its place. I have not deleted the original
in case someone in future (possibly me) wants to see what it needs to
make it usable.

There are minimal code changes.
- Unused protected functions pixel2WidthUnits and widthUnits2Pixel
  are deleted.
- One regex looking to convert hex characters is changed from a-z to a-f,
  and made case insensitive.
- No calculation performed for "error" cell (previously calculation
  was attempted and threw exception).
- Empty relative row/cell is now handled correctly.
- Style applied to empty cell when appropriate.
- Support added for textRotation.
- Support added for border styles.
- Support added for diagonal borders.
- Support added for superscript and subscript.
- Support added for fill patterns.

In theory, encodings other than UTF-8 were supported.
In fact, I was unable to get SecurityScanner to pass *any* xml which is
not UTF-8. Eliminating the assumption that strings might not be UTF-8
allowed much of the code to be greatly simplified.
After that, I added some code that would permit the use of
some ASCII-compatible encodings (there is a test of ISO-8859-1).
It would be more difficult to handle other encodings (such as UTF-16).
I am not convinced that even the ISO-8859 effort is worth it,
but am willing to investigate either expanding or eliminating
non-UTF8 support.

I added a number of tests, creating an Xml directory, and moving
XmlTest to that directory.

Pull Request had problems reading old invalid sample in the code
coverage phase, not in any of the other test phases, and not in
the code coverage phase on my local machine.
As it turns out, aside from being invalid, the sample
is much larger than any of the other samples. Tests have been
adjusted accordingly.

* Smaller Test File

Should eliminate need to avoid test during xml coverage.

* Break Up Style Test into Multiple Tests

Per suggestion from Mark Baker.

* Integrate AddressHelper Change

The introduction of AddressHelper introduced a conflict which needed to
be resolved. I wanted to test it locally before resolving. This required
me to add (unchanged) AddressHelper to my local copy. I hope this is
an okay manner of resolving the conflict.

* Weird Travis Error

XmlOddTest works just fine on my local machine, but Travis failed it.
Even worse, the lines which Travis flags don't even make any sense
(one was the empty line between two methods!).
This test is not essential to the rest of the change. I am removing
it from the package, and will attempt to re-add it when I have a chance
to sync up my fork with the main project.
2020-10-11 13:26:56 +02:00
Mark Baker 9683e5be18
More unit tests for statistical functions, including a bugfix to LARGE() (#1601)
* More unit tests for statistical functions, including a bugfix to LARGE() that was identified in testing
2020-07-29 23:56:37 +02:00
Mark Baker a9c8470b3b
Identify HYPGEOM.DIST() as a separate Excel function, and additional unit tests (including unhappy path) (#1595) 2020-07-26 22:10:53 +02:00
Mark Baker 8b0aaf7ecf
Named formula implementation, and improved handling of Defined Names generally (#1535)
* Initial work modifying the way named ranges are stored, and handled by the calculation engine
This should provide better support for:
  - both union and intersection operators in composite named range values
  - MS Excel implementation of the union operator duplicating values
  - named formulae
  - named ranges and formulae that reference other named ranges and formulae
  - ranges and formulae that reference multiple ranges across multiple worksheets

* Initial work on handling defined names (named ranges and named formulae) correctly
 - UTF-8 names (already extracted as a separate PR and merged)
 - distinction between named ranges and named formulae
 - correct handling of union and intersection operators in named ranges
 - correct evaluation of named range operators in calculations
 - calculation support for named formulae
 - support for nested ranges and formulae (named ranges and formulae that reference other named ranges/formulae) in calculations

* Minor tweaks before resolving merge conflicts

* Fix extractSheetTitle() method to work on the last ! in a cell reference rather than the first

* Throw exception if the reference to a defined name in a formula doesn't exist as a defined name

* Properly assess scope for defined names in calculation engine

* Elimination of some redundant code

* Minor tweaks to simplify entries on the stack where we need to check type

* Ensure correct scoping rules are applied when evaluating named ranges and formulae

* Adjustments to Gnumeric Reader for new defined names structure

* Initial work modifying the Ods Reader to handle named ranges, they weren't actually supported previously... this is still ongoing work

* Handle Ranges formatted as 3-d ranges, as long as the references are both to the same worksheet

* Additional testing for Named Ranges formatted as 3-d ranges, as long as the references are both to the same worksheet

* Skip composite named range tests for the moment

* Clean handling for `undefined name` exception when thrown in the calculation engine. Catch and replace with `#NAME?`

* Adjust method we use to determine whether a defined name is a range or a formula

* PHPCS Recommendations

* PHP doesn't support `mixed` yet, at least not at the minimum version that we're working with

* More phpcs fixes

* More phpcs appeasements

* Final phpcs fixes for the moment
Still have a lot of echo and var_dump() statements in the code that scrutinizer will hate, but they stay for the moment while this is still WIP

* Please let this be the last of the phpcs fixes

* Unit tests to determine whether a defined name value is a range value or a formula

* phpcs appeasement

* Named tests from provider

* Initial steps for named ranges and formulae in the Ods Reader

* Reading pseudo-3d range addresses in Ods; treat second sheet reference as being identical to the first, which is the majority of cases where this will occur

* Initial work on Gnumeric reader for named ranges and formulae

* Suppress debug logging again

* Remove more debugging displays

* Last minor tweaks before phase two

* Minor refinements

* And all for the want of a space

* A little tidying up

* More tidying up

* phpcs fix

* Modify defined names in rebindParent()

* Renaming variables

* Resolve an issue with locally scoped defined names that don't contain any worksheet reference

* Keep phpcs happy

* Fix quote handling in regexp

* Fix a couple of scrutinizer issues

* Fix a couple of scrutinizer issues

* Update Xlsx Writer to work with the new defined name internal definition
Additional validation checks

* When adding new defined names through the readers, worksheet may not exist if we're only loading selected sheets rather than the full spreadsheet

* If the only thing that phpcs can pickup on is strings in double quotes instead of single quotes, then I know I'm getting close to ready

* Refactor Defined Names logic for Xlsx Writer into its own class

* phpcs keeping me on my toes

* Restore a couple of files that I managed to change without intending to

* Initial work on Ods Writer to provide support for saving named ranges and formulae

* Resolve commas to semi-colons as argument separator when writing named formulae for Ods

* Extract Named Expression Writer for Ods into its own class

* Keep phpcs happy

* Refactoring of formula conversion when reading SpreadsheetML; preparation for reading named ranges because they will also need to use the same conversion method

* First pass at reading Named Ranges/Formulae from SpreadsheetML format xml files

* Remove unused namespace reference

* Defined names being written correctly for Xls; but not yet writing cell formulae that reference those defined names... that's the next big step
And I anticipate that defined names that reference other defined names will also be a problem

* Just to keep phpcs happy
... and yes, I know that there are still diagnostic echo statements in the code

* I had to miss some of the phpcs issues didn't I

* Work on the Xls Writer's Parser Tree to identify named range tokens in a formula, and to distinguish them from function tokens

* Still working on packing that d*** defined name reference in the writer

* Throw an exception in the Parser for saving Xls output if we encounter a defined name in a formula... writer will simply write the calculated cell value, and not the formula as at present
Strip out diagnostic output

* Some phpcs appeasement

* Fix a couple of Scrutinizer issues

* Additional verifications to differentiate a formula from a range value
Add explicit getters/setters for named ranges, named formulae and defined names
Additional unit tests

* Styling for closures

* Remove redundant docblocks

* Spaces

* Gah! Namespace use complaints

* Consistency of making calls to DefinedName rather than NamedRange; NamedRange should now be used only for Named Ranges, and should exclude Named Formulae

* Styling

* spurious newline

* No need to test for variable === null when we're typing it in the function argument definition

* Additional unit tests for local/global scoped named ranges and formulae; and a fix to getNamedFormula()

* Fix silly typo that led to breaking test

* Void return signature for unit tests

* Why weren't these picked up in the last pass?

* Refactoring of getNamedRange()/getNamedFormula()

* Eliminate unused constants, and defaults for private method parameters when always called with a value

* Use strict comparisons when comparing object hash codes

* Initial update to documentation for working with named formulae

* Fix for calculation of relative cell references in named ranges/formulae

* Fix current named range tests, because we should be using absolute references; tests for relative named ranges to be added later

* Fix for calculation of relative cell references in named ranges/formulae

* Updates to changelog and documentation for handling of absolute/relative references in named ranges

* Fix last remaining unit test with a named range reference

* Refactor formula conversion for Ods into a separate class; I hadn't realised that it previously wrote formulae as the MS Excel syntax without any conversion to Ods format

* Fix Ods Writer test xml to reflect Ods-native format for formula

* Docblocks

* Drop dollar prefix from Ods formulae and ranges unless it's necessary

* Set the formula convertor in the content writer constructor

* Documentation update

* Minor updates

* Remove var_dumps from file

* Fix the spurious single quote that was breaking named expressions in the Ods Writer... big sigh of relief that I finally spotted it

* Starting work on documentation for Defined Names, and some examples of using Named Ranges and Formulae

* Starting work on documentation for Defined Names, and some examples of using Named Ranges and Formulae

* Example of a relative named range for the documentation

* Mustn't have phpcs problems in sample code either

* More updates to the documentation

* That should conclude the documentation for Named Ranges, now time to move on to documenting Named Formulae

* That should conclude the documentation for Named Ranges, now time to move on to documenting Named Formulae

* PHPCS appeasement in sample code

* Initial documentation on Named Formulae

* PHPCS appeasements

* Additional comments in the documentation, and modify the named range name validation to support a \ as the first character in a name

* Fix breaking build

* Make defined names case-insensitive

* Fix case-insensitivity

* Improved documentation, and additional unit tests

* Additional unit tests, and a fix for removing a globally scoped defined name even if a worksheet is specified in the method call

* Fix unit test for removing named formulae

* Use assertCount instead of assertSame

* Forgotten voids

* Fix arguments for assertCount

* Unit tests for removing defined names, and a fix for removing locally scoped names

* Unit tests for absolute and relative named ranges in calculation engine, and fix an issue with worksheet name in the offset adjustments for relative references

* PHPCS Appeasement

* Additional unit tests, more documentation, and a fix to the calculation engine when no worksheet reference is provided with a named formula

* PHPCS appeasements

* Additional documentation and examples of using Named Formulae

* Additional examples to go with documentation

* A few minor phpcs appeasements

* Minor refactor of updateFormulaReferencesAnyWorksheet() method

* Discard an unused method argument

* Additional unit tests

* Additional unit tests

* Remove unused argument

* Stricter typing

* Fix return typehinting from remove named range/formula; should return the Spreadsheet object

* Use return typehint of self rather than explicit object type

* Redundant code just to keep scrutinizer happy

* Minor change to handle merge conflict

* phpcs fixes after merge

* Namespace usage ordering

* Please let this be the last phpcs fix needed

Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
2020-07-26 12:00:06 +02:00
Adrien Crivelli 4739f8b2e7
Merge branch 'readhtml' 2020-07-26 13:11:15 +09:00
oleibman 735103c120
Improve Coverage for ODS Reader (#1545)
* Improve Coverage for ODS Reader

Reader/ODS/Properties is now 100% covered.
Reader/ODS is covered except for 1 statement. As the original author
put it, "table-header-rows TODO: figure this out ... I'm not sure that
PhpExcel has an API for this". I'm still thinking about it, but, so far,
I agree with the author.

There are minimal code changes.
- Several places test !zip->open() to see whether the open failed.
  However, zip->open() returns true or an integer error code, and a
  non-zero code is truthy, so the test never detects failure.
  Change to zip->open() !== true. No previous tests.
- Suppress warning messages from simplexml_load_string (there had
  been no tests for invalid xml).
- One document property was misnamed, and one non-existent property
  was tested for.
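
To illustrate the first item in the list above, here is a small sketch. `openArchive()` is a stand-in written for this example (it is not part of PhpSpreadsheet); it follows the same convention as `ZipArchive::open()`, returning `true` on success and a non-zero integer error code on failure.

```php
<?php

/**
 * Stand-in for an open() call that, like ZipArchive::open(), returns
 * true on success and a non-zero integer error code on failure.
 * The function and its error value are illustrative only.
 *
 * @return true|int
 */
function openArchive(string $path)
{
    return is_file($path) ? true : 9; // 9 is an arbitrary failure code
}

$result = openArchive('/no/such/file.ods');

// Old check: a non-zero error code is truthy, so !$result is false
// and the failure goes unnoticed.
$oldCheckDetectsFailure = !$result;

// New check: anything other than boolean true is a failure.
$newCheckDetectsFailure = $result !== true;

var_dump($oldCheckDetectsFailure, $newCheckDetectsFailure);
```

Only the strict comparison against `true` distinguishes success from an error code.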

I added a number of tests, creating an ODS directory, and moving
OdsTest to that directory.

* Scrutinizer Recommendation

Unused variable in one test.

* Update CHANGELOG

Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
2020-07-26 12:40:49 +09:00
Adrien Crivelli 0489e785d2
Merge branch 'master' into Page-Setup-Page-Order 2020-07-26 10:50:41 +09:00
MarkBaker 16a9ff14d4 Experiment 2020-07-25 23:17:26 +02:00
Mark Baker fe121e8f7a
Additional statistical unit tests for non-happy path (#1594)
* Additional statistical unit tests for non-happy path
2020-07-25 21:58:08 +02:00
Mark Baker 57213deb64
Implementation of MS Excel's LOGNORM.DIST(), NORM.S.DIST(), F.DIST(), GAUSS() and GAMMA() functions (#1588)
* `GAUSS()` and `GAMMA()`, `NORM.S.DIST()`, `LOGNORM.DIST()` and `F.DIST()` function implementations, and further unit tests for a number of the statistical functions

Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
2020-07-25 12:44:51 +02:00
Mark Baker 5233e9caaf
Merge branch 'master' into Page-Setup-Page-Order 2020-07-19 12:57:48 +02:00
Adrien Crivelli 7cb4884b96
WEBSERVICE is HTTP client agnostic
HTTP client must be configured via `Settings::setHttpClient()`. This is
a small breaking change, but only for the very few people who started using
WEBSERVICE from last version.

Fixes #1562
Closes #1568
2020-07-19 11:33:01 +09:00
Mark Baker b89968d206
Additional Unit Tests (#1582) 2020-07-14 10:58:50 +02:00
MarkBaker d009347e25 Forgot to check in the test files for the unit tests 2020-07-05 16:28:46 +02:00
paulkned 7f23ccb69d
Added support for the WEBSERVICE function (#1409)
Co-authored-by: Paul Kievits <kievits@rsm.nl>
2020-06-29 10:17:58 +09:00
Mark Baker a264cafe4c
Helper class for the conversion of cell addresses between A1 and R1C1 formats, and vice-versa (#1558)
* Helper class for the conversion of cell addresses between A1 and R1C1 formats, and vice-versa
2020-06-27 23:03:25 +02:00
Owen Leibman 6080c4561d Improve Coverage for HTML Reader
Reader/Html is now covered except for 1 statement.
There is some coverage of RichText when you know in advance that the
html will expand into a single cell.
It is a tougher nut, one that I have not yet cracked,
to try to handle rich text while converting unknown html to multiple cells.
The original author left this as a TODO, and so for now must I.

It made sense to restructure some of the code. There are some changes.
- Issue #1532 is fixed (links are now saved when using rowspan).
- Colors can now be specified as html color name. To accomplish this,
  Helper/Html function colourNameLookup was changed from protected
  to public, and changed to static.
- Superfluous empty lines were eliminated in a number of places, e.g.
  <ul><li>A</li><li>B</li><li>C</li></ul>
  had formerly caused a wrapped cell to be created with 2 empty lines
  followed by A, B, and C on separate lines; it will now just have the
  3 A/B/C lines, which seems like a more sensible interpretation.
- Img alt attribute, which had been cast to float, is now used as a string.

Private member "encoding" is not used. Functions getEncoding and setEncoding
have therefore been marked deprecated. In fact, I was unable to get
SecurityScanner to pass *any* html which is not UTF-8. There are
possibly ways of getting around this (in Reader/Html - I have no
intention of messing with Security Scanner), as can be seen in my
companion pull request for Excel2003 Xml Reader. Doing this would be
easier for ASCII-compatible character sets (like ISO-8859-1),
than for non-compatible charsets (like UTF-16). I am not
convinced that the effort is worth it, but am willing to investigate
further.

I added a number of tests, creating an Html directory, and moving
HtmlTest to that directory.
2020-06-25 22:42:38 -07:00
Dawid Warmuz 859bef1901
Add support for IFS() logical function (#1442)
* Add support for IFS() logical function

* Use Exception as false value in IFS logical function, so it never collides with string in spreadsheet
2020-06-20 18:21:19 +02:00
Christoph Ziegenberg ca506ba87f
Corrected date time detection (#1492)
* Corrected date time detection

German and Swiss ZIP codes (special formats provided in German Excel versions) were detected as date time value, because the regular expression for date time formats falsely matched their formats ("\C\H\-00000" and "\D-00000").
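
One way to avoid this kind of false positive is to remove literal text from the format code before looking for date/time placeholders. The sketch below is an illustration only and is not necessarily the change made in this commit; `looksLikeDateFormat()` is a hypothetical name, and the check deliberately ignores bracketed sections such as colours.

```php
<?php

/**
 * Hypothetical check: does a number format code contain date/time
 * placeholders once literal text has been removed?
 */
function looksLikeDateFormat(string $formatCode): bool
{
    // Drop backslash-escaped literals (\C, \H, \-) and quoted strings ("CH-").
    $stripped = preg_replace('/\\\\.|"[^"]*"/', '', $formatCode);

    // Whatever letters remain are real placeholders.
    return preg_match('/[dmyhs]/i', (string) $stripped) === 1;
}

var_dump(looksLikeDateFormat('\C\H\-00000')); // Swiss ZIP code format
var_dump(looksLikeDateFormat('\D-00000'));    // German ZIP code format
var_dump(looksLikeDateFormat('dd/mm/yyyy'));  // a genuine date format
```

With the escaped letters stripped, the two ZIP code formats no longer contain anything that resembles a date placeholder.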
2020-06-20 17:15:38 +02:00
Arne Jørgensen a5a0268050
Fix HLOOKUP on single row (#1512)
Fixes a bug when doing a HLOOKUP on a single row.

```php
<?php

require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\Spreadsheet;

$spreadsheet = new Spreadsheet();
$sheet = $spreadsheet->getActiveSheet();

/**
 * Single row.
 */
$singleRow = "=HLOOKUP(10, {5, 10, 15}, 1, 0)";
$sheet->getCell('A1')->setValue($singleRow);

// Should echo 10, but echoes '#N/A' along with some PHP notices and warnings.
echo $sheet->getCell('A1')->getCalculatedValue() . PHP_EOL;

/**
 * Multiple rows.
 */
$multipleRows = "=HLOOKUP(10, {5, 10, 15; 20, 25, 30}, 1, 0)";
$sheet->getCell('A2')->setValue($multipleRows);

// Should echo: 10 and also does.
echo $sheet->getCell('A2')->getCalculatedValue() . PHP_EOL;
```

Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
2020-06-19 21:06:41 +02:00
oleibman 38fab4e632
Fix for #1505 (#1525)
This problem is the same as #1238, which was resolved by #1239.
For that issue, the fix was to check in one place whether
$this->mapCellXfIndex[$xfIndex] was set before using it.
The sample spreadsheet supplied as a description for this
problem had exactly the same problem in 2 other places in the code.
In addition, there were 7 other places in the code where that
particular item was used unchecked. This fix corrects all 9 locations.
The spreadsheet supplied with the problem is used as the basis
for some new tests, which particularly test column dimensions
and styles, the problems involved in this case.
2020-06-19 21:01:18 +02:00
Arne Jørgensen 1a44ef9109
Fix MATCH when comparing different numeric types (#1521)
Let MATCH compare numerics of different type (e.g. integers and floats).

```php
<?php

require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\Spreadsheet;

$spreadsheet = new Spreadsheet();
$sheet = $spreadsheet->getActiveSheet();

// Column A: 1, 2, 3, 4, 5. MATCH for 4.6.
$sheet->getCell('A1')->setValue(1);
$sheet->getCell('A2')->setValue(2);
$sheet->getCell('A3')->setValue(3);
$sheet->getCell('A4')->setValue(4);
$sheet->getCell('A5')->setValue(5);

$sheet->getCell('B1')->setValue('=MATCH(4.6, A1:A5, 1)');

// Should echo 4, but echoes '#N/A'.
echo $sheet->getCell('B1')->getCalculatedValue() . PHP_EOL;

// Column C: 1, 2, 3, 3.8, 5. MATCH for 4.
$sheet->getCell('C1')->setValue(1);
$sheet->getCell('C2')->setValue(2);
$sheet->getCell('C3')->setValue(3);
$sheet->getCell('C4')->setValue(3.8);
$sheet->getCell('C5')->setValue(5);

$sheet->getCell('D1')->setValue('=MATCH(4, C1:C5, 1)');

// Should echo 4, but echoes 3.
echo $sheet->getCell('D1')->getCalculatedValue() . PHP_EOL;
```

Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
2020-06-19 20:54:04 +02:00
Arne Jørgensen 73c336ac96
Fix exact MATCH on ranges with empty cells (#1520)
Fixes a bug when doing exact match on ranges with empty cells.

```php
<?php

require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\Spreadsheet;

$spreadsheet = new Spreadsheet();
$sheet = $spreadsheet->getActiveSheet();

// Column A: 1, null, 4, null, 8.
$sheet->getCell('A1')->setValue(1);
$sheet->getCell('A3')->setValue(4);
$sheet->getCell('A5')->setValue(8);

$sheet->getCell('B1')->setValue('=MATCH(4, A1:A5, 1)');

// Should echo 3, but echoes '#N/A'.
echo $sheet->getCell('B1')->getCalculatedValue() . PHP_EOL;

// Column C: 1, null, 4, null, null.
$sheet->getCell('C1')->setValue(1);
$sheet->getCell('C3')->setValue(4);

$sheet->getCell('D1')->setValue('=MATCH(5, C1:C5, 1)');

// Should echo 3, but echoes '#N/A'.
echo $sheet->getCell('D1')->getCalculatedValue() . PHP_EOL;
```
2020-06-19 20:51:46 +02:00
Mark Baker 5c18bb5798
Range operator tests (#1501)
* Improved handling of named ranges, although there are still some issues (named ranges using a union type with an overlap don't handle the overlap twice, which is the MS Excel approach to set overlaps, as opposed to the mathematical approach, which only applies overlap values once)

* Fix tests that misused space and comma as simple separators in cell ranges
2020-06-02 07:38:35 +02:00
Reijn dfa6f77178
Add support for protection of worksheet by a specific hash algorithm 2020-05-31 20:29:20 +09:00
Alban Duval 7ed96e0be1
Calculation - DATEDIF - fix result for Y & YM units (#1466)
Bugfix for negative results and too small results

2000-02-02 => 2001-02-01
 > DATEDIF with Y unit: 0 year (returned -1 before fix)
 > DATEDIF with YM unit: 11 months (returned -1 before fix)
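
The expected values for this date pair can be cross-checked with PHP's own date arithmetic. This is only a sanity check of the example above, not the DATEDIF implementation; the variable names are chosen for illustration.

```php
<?php

$start = new DateTimeImmutable('2000-02-02');
$end = new DateTimeImmutable('2001-02-01');

// DateInterval splits the gap into whole years, then leftover months and days.
$interval = $start->diff($end);

$completeYears = $interval->y;   // what the "Y" unit should report
$monthsPastYear = $interval->m;  // what the "YM" unit should report

echo "Y: {$completeYears}, YM: {$monthsPastYear}" . PHP_EOL;
```

The end date is one day short of a full year, so no complete year has elapsed and eleven complete months remain.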
2020-05-25 21:33:48 +02:00
oleibman 5dd7e883c6
Fix Issue 1441 (isDateTime and Formulas) (#1480)
* Fix Issue 1441 (isDateTime and Formulas)

When you have a date-field which is a formula, isDateTime returns false.
https://github.com/PHPOffice/PhpSpreadsheet/issues/1441

Report makes sense; fixed as suggested. Also fixed a few minor
related issues, and added tests so that Shared/Date and Shared/TimeZone
are now completely covered.

Date/setDefaultTimeZone and TimeZone/setTimeZone were not consistent
about what to do in event of failure - return false or throw.
They will now both return false, which is what Date's function
said it would do in its doc block anyhow. Date/validateTimeZone will
continue to throw; it was protected, but was never called outside
Date, so I changed it to private.

TimeZone/getTimeZoneAdjustment checked for 'UST' when it probably
meant 'UTC', and, as it turns out, the check is not even needed.

The most serious problem was that TimeZone/validateTimeZone does not
check the backwards-compatible time zones. The timezone project
aggressively, and very controversially, "demotes" timezones;
such timezones eventually wind up in the PHP backwards-compatible list.
We want to make sure to check that list so that our applications do not
break when this happens.
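
A minimal sketch of a validator that also accepts the backwards-compatible names, using PHP's `DateTimeZone::listIdentifiers()` with the `DateTimeZone::ALL_WITH_BC` group. `isKnownTimeZone()` is a hypothetical name, not the method changed in this commit.

```php
<?php

/**
 * Hypothetical validator that accepts current identifiers as well as
 * those PHP keeps only for backwards compatibility.
 */
function isKnownTimeZone(string $name): bool
{
    $identifiers = DateTimeZone::listIdentifiers(DateTimeZone::ALL_WITH_BC);

    return in_array($name, $identifiers, true);
}

var_dump(isKnownTimeZone('Europe/London'));
var_dump(isKnownTimeZone('US/Eastern')); // a backwards-compatible alias
var_dump(isKnownTimeZone('Mars/Olympus_Mons'));
```

Checking against `DateTimeZone::ALL` alone would leave out aliases such as `US/Eastern` once they have been moved to the backwards-compatible list.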
2020-05-24 20:02:39 +02:00
oleibman 41b95c1542
CSV Sample File Was Miscoded (#1489)
File author erroneously assumed that backslash was used to escape
quotes in CSV; in fact, doubling the quote is used for escape.
The test still worked, but mainly because the content of the cell
with the escape wasn't tested. The file is now fixed, and
a new test added.
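
For reference, a short example of the doubled-quote convention using PHP's built-in `str_getcsv()`. The sample line is invented for this illustration, and the empty escape argument assumes PHP 7.4 or later.

```php
<?php

// One CSV record whose middle field contains quotes, escaped by doubling.
$line = 'id,"He said ""hello""",done';

// Empty escape argument: no backslash-style escape character is in play.
$fields = str_getcsv($line, ',', '"', '');

print_r($fields);
```

The middle field is read back with a single pair of quotes around the word hello.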
2020-05-24 19:57:39 +09:00
oleibman 84e03da5c7
Code Coverage for Shared\CodePage (#1491)
While investigating something else in Shared, I noticed that CodePage
had poor test coverage and a high complexity rating. This change
addresses both; Scrutinizer would love it, although its interface on
GitHub seems broken at the moment (all PRs show "Waiting for External
Code Coverage").
2020-05-24 19:51:28 +09:00
Vagir 3446bb0ef7
Fix saving XLSX with drawings (#1462)
* Fix incorrect behaviour when saving XLSX file with drawings
2020-05-23 13:09:10 +02:00
Gianni Genovesi 7b1957f996
fix: issue #1476 crash with numeric string value terminating with new line (#1481)
* fix: issue #1476 crash with numeric string value terminating with new line
* test: provided tests for issue #1476
2020-05-23 12:49:54 +02:00
Adrien Crivelli fcd9f10663
Update PHP-CS-Fixer rules 2020-05-18 13:49:57 +09:00
oleibman 97a80f383c
Improve HTML Writer (#1464)
There are a number of situations where HTML Writer was producing
HTML which could not be validated. These include:

  - inconsistent use of a slash terminating META, IMG, and COL tags
  - @page style tags in body rather than header. Aside from being
    non-standard, HTML Reader treats those as spreadsheet data.
  - <div style="page-break-before:always" />, a construct which is
    usually better handled through css anyhow.
  - no alt attribute for images (drawings and charts)

Other problems:

  - Windows file names not handled correctly for images
  - Memory drawings not handled in extendRowsForChartsAndImages
  - No handling of different values for showing gridlines
    for screen and print
  - Mpdf and Dompdf do not require the use of inline css.
    Tcpdf remains a holdout in the use of this inferior approach.
  - no need to chunk base64 encoding of embedded images
  - support for colors in number format was buggy (html tags
    run through htmlspecialchars)

Code has been refactored when practical to reduce the number of
very large functions.

Coverage is now 100% for the entire HTML Writer module,
from 75% lines and 39% methods beforehand.

All functions dealing only with charts
are bypassed for coverage because the version of Jpgraph available in
Composer is not suitable for PHP7. The code will, nevertheless,
run successfully, but with warning messages. I have confirmed that
the code is entirely covered, without warnings, when the current
version of Jpgraph is used in lieu of the one available in Composer.
I will be glad to revisit this when the Jpgraph problem is resolved.

Directory PhpSpreadsheetTests/Writer/Html was created to house
the new tests. It seemed logical to move HtmlCommentsTest to
the new directory from PhpSpreadsheetTests/Functional.

A function to generate all the HTML is useful, especially for testing,
but also in lieu of the multiple other generate* functions. I have
added and documented generateHTMLAll.

The documentation for the generate* functions (a) produces invalid html,
(b) produces html which cannot be handled correctly by HTML reader,
and (c) even if those were correct, does not actually affect
the display of the spreadsheet. The documentation has been replaced
by a valid, and more instructive, example.

The (undocumented) useEmbeddedCss property, and the functions
to test and set it are no longer needed. Rather than breaking
existing code by deleting them, I marked the functions deprecated.

This change borrows a change to LocaleFloatsTest from
pull request 1456, submitted a little over a week before this one.


## Improve NumberFormat Support

First phase of this change included correcting NumberFormat handling
in HTML Writer. Certain complex formats could not be handled without
changes to Style/NumberFormat, and I did not wish to combine those changes.

Once the original change had been pushed, I took this part of it back up.
HTML Writer can now handle conditions in formats like:
[Blue][>=3000.5]$#,##0.00;[Red][<0]$#,##0.00;$#,##0.00
In testing, I discovered several errors and omissions
in handling of some other formats.
These are now corrected, and tests added.
2020-05-18 12:43:18 +09:00
Owen Leibman 4f6d4af396
Save Excel 2010+ Functions Properly
For functions introduced in Excel 2010 and beyond, Excel saves them
in formulas with the _xlfn. prefix. PhpSpreadsheet does not do this;
as a result, when a spreadsheet so created is opened, the cells
which use the new functions display a #NAME? error.
This is the cause of bug report 1246:
https://github.com/PHPOffice/PhpSpreadsheet/issues/1246
This change corrects that problem when the Xlsx writer encounters
a 2010+ formula for a cell or a conditional style. A new class
Writer/Xlsx/Xlfn, with 2 static methods,
is introduced to facilitate this change.
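
The idea can be sketched as follows. This is not the Writer/Xlsx/Xlfn class itself: `addXlfnPrefix()` is a hypothetical function, the function list is a small illustrative sample, and the sketch makes no attempt to skip text inside string literals. Excel stores such function names with a `_xlfn.` prefix in the file.

```php
<?php

/**
 * Hypothetical sketch: prefix newer function names with "_xlfn." the way
 * Excel stores them. The list is a small illustrative sample only.
 */
function addXlfnPrefix(string $formula): string
{
    $newFunctions = ['IFS', 'CONCAT', 'STDEV.S', 'MODE.SNGL'];
    $alternatives = implode('|', array_map(static function (string $name): string {
        return preg_quote($name, '/');
    }, $newFunctions));

    // Match a listed name that is followed by "(" and not already part of a
    // longer identifier or an existing prefix.
    $pattern = '/(?<![A-Za-z0-9_.])(' . $alternatives . ')(?=\()/i';

    return (string) preg_replace($pattern, '_xlfn.$1', $formula);
}

echo addXlfnPrefix('=IFS(A1>0,"pos",A1<0,"neg")') . PHP_EOL;
echo addXlfnPrefix('=SUMIFS(A1:A5,B1:B5,">0")') . PHP_EOL;
```

The lookbehind keeps older functions such as SUMIFS untouched and makes the operation safe to apply twice.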

As part of the testing for this, I found some additional problems.
When an unknown function name is used, Excel generates a #NAME? error.
However, when an unknown function is used in PhpSpreadsheet:
  - if there are no parameters, it returns #VALUE!, which is wrong
  - if there are parameters, it throws an exception, which is horrible
Both of these situations will now return #NAME?
Tests have been added for these situations.

The MODE (and MODE.SNGL) function is not quite in alignment with Excel.
MODE(3, 3, 4, 4) returns 3 in both Excel and PhpSpreadsheet.
However, MODE(4, 3, 3, 4) returns 4 in Excel, but 3 in PhpSpreadsheet.
Both situations will now match Excel's result.
Also, Excel allows its parameters for MODE to be an array,
but PhpSpreadsheet did not; it now will.
There had not been any tests for MODE. Now there are.
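
The two MODE examples above are consistent with a tie-break in favour of the value that appears first in the argument list. A minimal sketch of that rule, assuming numeric inputs (`modeFirstSeen` is a hypothetical name, not the library function):

```php
<?php

/**
 * Sketch of a mode with first-seen tie-breaking: among the values sharing the
 * highest count, the one that occurs earliest in $values wins.
 * Returns '#N/A' when no value occurs more than once.
 */
function modeFirstSeen(array $values)
{
    $counts = [];
    foreach ($values as $value) {
        $key = (string) $value;
        $counts[$key] = ($counts[$key] ?? 0) + 1;
    }

    $best = null;
    $bestCount = 1; // a value must occur at least twice to qualify
    foreach ($values as $value) {
        // Strict comparison keeps the earliest value when counts are tied.
        if ($counts[(string) $value] > $bestCount) {
            $best = $value;
            $bestCount = $counts[(string) $value];
        }
    }

    return $best ?? '#N/A';
}
```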

The SHEET and SHEETS functions were introduced in Excel 2013,
but were not introduced in PhpSpreadsheet. They are now introduced
as DUMMY functions so that they can be parsed appropriately.

Finally, in common with the "rate" changes for which I am
creating a pull request at the same time as this one:
samples/Basic/13_CalculationCyclicFormulae
PhpUnit started reporting an error like "too much regression".
The test deals with an infinite cyclic formula, and allowed
the calculation engine to run for 100 cycles. The actual number of cycles
seems irrelevant for the purpose of this test. I changed it to 15,
and PhpUnit no longer complains.
2020-05-18 12:37:35 +09:00
oleibman 9ae521cdd4
Fix RATE, PRICE, XIRR, and XNPV Functions (#1456)
There were about 20 skipped tests for RATE and PRICE marked
"This test should be fixed". This change does that by fixing
the code for those functions, validating the existing tests,
and adding new ones. XIRR and XNPV are also substantially changed.
As part of this change, the following functions also have minor changes:

  - isValidFrequency
  - COUPDAYBS
  - COUPNUM (additional tests)
  - DB
  - DDB

PhpUnit reports 100% coverage for all the changed functions.

Since I was dealing with skipped tests, I also fixed
tests/PhpSpreadsheetTests/Writer/Xlsx/LocaleFloatsTest,
which was being skipped in Windows. I also delete the temporary
file which it creates.
There is now only one remaining test which is skipped -
ODS Reader is not complete enough to run some tests against it.
Unfortunately, that test is too complicated for me to deal with now.

In researching this change, I found several places in the code where special code was added for Gnumeric claiming:

   - Gnumeric does not handle free-format string dates
   - Gnumeric adds extra options, not available in Excel,
     for the frequency parameter for functions such as YIELD
   - Gnumeric rounds the results for DB and DDB to 2 decimal places

None of these claims is true, at least not on a recent version
of Gnumeric, and the code which supports these differences is removed.
There did not appear to be any tests targeted for
these supposed properties of Gnumeric.

The PRICE function needed relatively minor changes - mostly
additional tests for invalid input. The main problem with the PRICE
tests is that Excel appears to have a bug. The algorithm is published:
https://support.office.com/en-us/article/price-function-3ea9deac-8dfa-436f-a7c8-17ea02c21b0a
The results that Excel returns for basis codes 2 and 3 appear to be
incorrect in many cases. I have segregated these tests into a
new test PRICE3. The results of these tests agree with the published
algorithm, and with the results for LibreOffice and Gnumeric.
The results returned by Excel do not agree with them.
The tests which remain in the test PRICE all use basis codes other
than 2 or 3, and all agree with Excel, LibreOffice, and Gnumeric.

For the RATE function, there appears to be a problem with how the
secant method was implemented. I studied the implementation of RATE
in Python numpy, and adapted its implementation of secant method.
The results now agree with numpy, and, more important, with Excel.
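
For readers unfamiliar with the technique, a generic secant iteration on the annuity equation looks roughly like this. It is a sketch under stated assumptions (payments at the end of each period, hypothetical function names, fixed tolerance), not the code adopted in this change nor numpy's implementation.

```php
<?php

/**
 * Residual of the annuity equation for payments at the end of each period:
 *   pv*(1+r)^n + pmt*((1+r)^n - 1)/r + fv
 * A rate r solves RATE when this residual is zero.
 */
function annuityResidual(float $r, float $nper, float $pmt, float $pv, float $fv): float
{
    if ($r == 0.0) {
        return $pv + $pmt * $nper + $fv; // limit of the expression as r -> 0
    }
    $growth = (1 + $r) ** $nper;

    return $pv * $growth + $pmt * ($growth - 1) / $r + $fv;
}

/** Generic secant iteration; returns null when it fails to converge. */
function rateBySecant(float $nper, float $pmt, float $pv, float $fv = 0.0, float $guess = 0.1): ?float
{
    $r0 = $guess;
    $r1 = $guess + 0.01;
    $f0 = annuityResidual($r0, $nper, $pmt, $pv, $fv);
    $f1 = annuityResidual($r1, $nper, $pmt, $pv, $fv);
    for ($i = 0; $i < 100; ++$i) {
        if ($f1 == $f0) {
            return null; // flat secant line, cannot take a step
        }
        // Next estimate: where the line through the last two points crosses zero.
        $r2 = $r1 - $f1 * ($r1 - $r0) / ($f1 - $f0);
        if (abs($r2 - $r1) < 1.0e-12) {
            return $r2;
        }
        [$r0, $f0] = [$r1, $f1];
        $r1 = $r2;
        $f1 = annuityResidual($r1, $nper, $pmt, $pv, $fv);
    }

    return null;
}
```

With no periodic payment the answer has a closed form, which makes a convenient check: `rateBySecant(10, 0.0, -3500.0, 10000.0)` should agree with `(10000 / 3500) ** 0.1 - 1`.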

XIRR, which calls XNPV, permits its dates to be earlier than the
start date, whereas XNPV does not. I dealt with this by renaming
the existing XNPV function to xnpvOrdered, adding a parameter to
indicate whether start date has to be earliest. XNPV calls the new
function with that parameter set to TRUE, and XIRR calls it with
the parameter set to FALSE. Some additional error checking was
added to xnpvOrdered, and also to XIRR. XIRR tests benefited
from increasing the value of FINANCIAL_MAX_ITERATIONS.
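
The shared-routine arrangement described above can be pictured with the sketch below. It is illustrative only: `xnpvSketch` is a hypothetical stand-in for xnpvOrdered, dates are plain day serial numbers, and only the date-order check is shown out of the error handling.

```php
<?php

/**
 * Hypothetical sketch of one net-present-value routine serving both callers.
 * $dates are day serial numbers and $dates[0] is the start date.
 * With $ordered = true (the XNPV case) a date before the start date is an error;
 * with $ordered = false (the XIRR case) it is allowed.
 * Assumes $rate > -1.
 */
function xnpvSketch(float $rate, array $values, array $dates, bool $ordered)
{
    if (count($values) === 0 || count($values) !== count($dates)) {
        return '#NUM!';
    }
    $start = $dates[0];
    $total = 0.0;
    foreach ($values as $i => $value) {
        if ($ordered && $dates[$i] < $start) {
            return '#NUM!';
        }
        // Discount each cash flow by its distance in years from the start date.
        $total += $value / (1 + $rate) ** (($dates[$i] - $start) / 365);
    }

    return $total;
}
```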

Finally, since this change is very test-related:
samples/Basic/13_CalculationCyclicFormulae
PhpUnit started reporting an error like "too much regression".
The test deals with an infinite cyclic formula, and allowed
the calculation engine to run for 100 cycles. The actual number of cycles
seems irrelevant for the purpose of this test. I changed it to 15,
and PhpUnit no longer complains.
2020-05-17 19:50:01 +09:00
oleibman 7517cdd008
Improve Coverage for CSV (#1475)
I believe that both CSV Reader and Writer are 100% covered now.

There were some errors uncovered during development.

The reader specifically permits encodings other than UTF-8 to be used.
However, fgetcsv will not properly handle other encodings.
I tried replacing it with fgets/iconv/str_getcsv, but that could not
handle line breaks within a cell, even for UTF-8.
This is, I'm sure, a very rare use case.
I eventually handled it by using php://memory to hold the translated
file contents for non-UTF8. There were no tests for this situation,
and now there are (probably too many).
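
The php://memory approach can be sketched roughly as follows. This is a simplified illustration, not the reader's code: the function name and parameters are assumptions, and the whole file is converted in one go.

```php
<?php

/**
 * Sketch: read a non-UTF-8 CSV by converting its contents to UTF-8 into a
 * php://memory stream, then letting fgetcsv parse that stream.
 * Quoted fields containing line breaks still work because fgetcsv does the parsing.
 */
function readCsvWithEncoding(string $path, string $encoding): array
{
    $utf8 = iconv($encoding, 'UTF-8', file_get_contents($path));

    $handle = fopen('php://memory', 'r+b');
    fwrite($handle, $utf8);
    rewind($handle);

    $rows = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($handle);

    return $rows;
}
```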

"Contiguous" read was not handled correctly. There is a file
in samples which uses it. It was designed to read a large sheet,
and split it into three. The first sheet was correct, but the
second and third were almost entirely empty. This has been corrected,
and the sample code was adapted into a formal test with assertions
to confirm that it works as designed.

I made a minor documentation change. Unlike HTML, where you never
need a BOM because you can declare the encoding in the file,
a CSV with non-ASCII characters must explicitly include a BOM
for Excel to handle it correctly. This was explained in the Reading CSV
section, but was glossed over in the Writing CSV section, which I
have updated.
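
For reference, the BOM in question is the three bytes EF BB BF at the very start of the file. A minimal sketch using plain PHP file functions (`writeCsvWithBom` is a hypothetical helper, not part of the library):

```php
<?php

/** Sketch: write rows as UTF-8 CSV, optionally preceded by a byte order mark. */
function writeCsvWithBom(string $path, array $rows, bool $useBom = true): void
{
    $handle = fopen($path, 'wb');
    if ($useBom) {
        fwrite($handle, "\xEF\xBB\xBF"); // UTF-8 BOM, must be the first bytes in the file
    }
    foreach ($rows as $row) {
        fputcsv($handle, $row, ',', '"', '\\');
    }
    fclose($handle);
}
```

With the library's own CSV writer the equivalent is the setUseBOM(true) setting.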
2020-05-17 18:15:18 +09:00
n-longcape f9f9f4cacf
Fix ROUNDUP and ROUNDDOWN for negative number
Closes #1417
2020-04-27 17:03:07 +09:00
Paul Kievits a6c56d0f81
Added support for the FLOOR.MATH and FLOOR.PRECISE functions 2020-04-26 22:19:33 +09:00
Owen Leibman c4895b9468
MATCH with a static array should return the position of the found value based on the values submitted.
Currently it returns #N/A unless the element searched for is at the end of the array.

The problem is in Calculation.php line 4231:
                    if (!is_array($functionCall)) {
                        foreach ($args as &$arg) {
                            $arg = Functions::flattenSingleValue($arg);
                        }
                        unset($arg);
                    }

I believe this code is intended to handle functions where PhpSpreadsheet just passes
the call on to PHP without implementing the code on its own, e.g. for atan or acos.
In the bug report, the following code fails:
  $flat_rate = "=MATCH(6,{4,5,6,2}, 0)";
  $sheet->getCell('A1')->setValue($flat_rate);
The expected value is 3, but the actual result is "#N/A".
The reason for this result is that the parser replaces the braces with calls
to the MKMATRIX internal function, whose value for functioncall was:
'self::MKMATRIX'. Since this isn't an array, the flattening code is executed,
and the unintended result occurs. The fix is to change the definition for
functioncall in that case to [__CLASS__, 'mkMatrix'], avoiding the flattening.

However, there is also another part to this bug. The flattening should be
returning the first entry in the array, but is in fact returning the last.
This explains why the bug report specified "unless ... end of the array".
I confirmed that Excel does use the first item in the array rather than the last,
e.g. =atan({1,2,3}) entered into a cell will return atan(1), not atan(3).
The problem here is that flattenSingleValue, which says in its comments that
it is supposed to be returning the first item, uses array_pop rather than array_shift.
I have changed that as well. The same mistake was also present in
Cell.php function getCalculatedValue. The correct behavior can be verified
by entering =minverse({-2.5,1.5;2,-1}) into an Excel cell.
Excel flattens the result ({2,3;4,5}) to 2, and so should PhpSpreadsheet.
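
The difference between the two calls is easy to see in isolation. The helper below is a simplified stand-in for the flattening logic (hypothetical name), reduced to the first-element behaviour described above:

```php
<?php

/** Sketch: reduce a possibly nested array to its first element. */
function firstValue($value)
{
    while (is_array($value)) {
        // array_shift takes the first element; array_pop here would take the last.
        $value = array_shift($value);
    }

    return $value;
}

$matrix = [[2, 3], [4, 5]]; // the minverse result from the example above
var_dump(firstValue($matrix));
```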

Fixes #1271
Closes #1332
2020-04-26 22:09:31 +09:00
youkan a79b344d53
Fix ROUNDUP and ROUNDDOWN for floating-point rounding error (#1404)
Closes #1404
2020-03-07 12:48:54 +07:00
Paul Kievits a08415a7b5 Improved the ARABIC function to handle short-form roman numerals 2020-03-07 12:43:59 +07:00
oleibman cb18163a1d
Changes to WEEKNUM and YEARFRAC (#1316)
* Changes to WEEKNUM and YEARFRAC

The optional second parameter for WEEKNUM can take any of 10 values
(1, 2, 11-17, and 21), but currently only 1 and 2 are supported.
This change adds support for the other 8 possibilities.
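
As an illustration of what the ten values mean, the sketch below maps each return type to the weekday on which a week starts and delegates type 21 (ISO 8601 weeks) to PHP's date formatting. It is a simplified model with a hypothetical function name and a date string as input, not the code in this change.

```php
<?php

/**
 * Sketch of WEEKNUM by return type. For types other than 21, week 1 is the
 * week containing January 1 and the type only selects the first day of the week.
 */
function weekNumSketch(string $date, int $type = 1)
{
    // Return type => first day of the week as PHP's 'w' value (0 = Sunday ... 6 = Saturday).
    $weekStart = [1 => 0, 2 => 1, 11 => 1, 12 => 2, 13 => 3, 14 => 4, 15 => 5, 16 => 6, 17 => 0];

    $day = new DateTimeImmutable($date);
    if ($type === 21) {
        return (int) $day->format('W'); // ISO 8601 week number
    }
    if (!isset($weekStart[$type])) {
        return '#NUM!';
    }

    $jan1 = $day->setDate((int) $day->format('Y'), 1, 1);
    // Number of days from the start of week 1 up to January 1.
    $offset = ((int) $jan1->format('w') - $weekStart[$type] + 7) % 7;
    $dayOfYear = (int) $day->format('z'); // 0 for January 1

    return intdiv($dayOfYear + $offset, 7) + 1;
}
```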

YEARFRAC in Excel does not require that end date be before start date,
but PhpSpreadsheet was returning an error in that situation.

YEARFRAC third parameter (method) of 1 was not correctly implemented.
I was able to find a description of the algorithm, and documented
that location in the code, and implemented according to that spec.
PHPExcel had a (failing) test to assert the result of
YEARFRAC("1960-12-19", "2008-06-28", 1). This test had been dropped
from PhpSpreadsheet, and is now restored; several new tests have
been added, and verified against Excel.

* Add YEARFRAC Tests

Scrutinizer reported a very mysterious failure with no details.
project.metric_change("scrutinizer.test_coverage", < 0),
without even a link to explain what it is reporting.
It is possible that it was a complaint about code coverage.
If so, I have added some tests which will, I hope, eliminate the problem.

* Make Array Constant

Responding to review from Mark Baker.

* Merge with PR 1362 Bugfix 1161

Travis CI reported problem with Calculation.php (which is not part
  of this change).
That was changed in master a few days ago
(delete some unused code).
Perhaps the lack of that change is the problem here.
Merging it manually.
2020-02-19 20:22:31 +01:00
paulkned 0c52f173aa
Added support for the base function (#1344) 2020-02-19 20:12:30 +01:00
paulkned 25e3e45eb6
Added support for the ARABIC excel function (#1343)
Updated changelog

Updated docprops

Fixed styleci
2020-02-11 22:59:19 +01:00
oleibman 082266aacd Conditionals - Extend Support for (NOT)CONTAINSBLANKS (#1278)
Support for the CONTAINSBLANKS conditional style was added a while ago.
However, that support was on write only; any cells which used
CONTAINSBLANKS on a file being read would drop that style.

I am also adding support for NOTCONTAINSBLANKS, on read and write.
2020-01-04 18:50:04 +01:00
oleibman afd070a756 Handle ConditionalStyle NumberFormat When Reading Xlsx File (#1296)
* Handle ConditionalStyle NumberFormat When Reading Xlsx File

ReadStyle in Reader/Xlsx/Styles.php expects numberFormat to be a string.
However, when reading conditional style in Xlsx file, NumberFormat
   is actually a SimpleXMLElement, so is not handled correctly.
While testing this change, it turned out that reader always expects
   that there is a "SharedString" portion of the XML, which is not
   true for spreadsheets with no string data, which causes a
   run-time message.
Likewise, when conditional number format is not one of the built-in
   formats, a run-time message is issued because 'isset' is used
   to determine existence rather than 'array_key_exists'.
The new workbook added to the testing data demonstrates both those
   problems (prior to the code changes).
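
As background on the last point, the two checks differ in general when a key exists but holds null. The array below is made-up data for illustration and is not taken from the reader:

```php
<?php

// Illustrative data: 'custom' is present as a key but its value is null.
$formats = ['General' => 0, 'custom' => null];

var_dump(isset($formats['custom']));            // bool(false): isset also requires a non-null value
var_dump(array_key_exists('custom', $formats)); // bool(true): only the key's presence matters
```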

* Move Comment to Resolve Conflict

Github reports conflict involving placement of one comment statement.

* Respond to Scrutinizer Style Suggestion

Change detection for empty SimpleXMLElement.
2020-01-04 00:10:41 +01:00
Mark Baker e19228ecb0
Additional cell datatype unit tests (#1301) 2020-01-03 23:29:53 +01:00
Mark Baker 0417c8cc2b
Fix date tests without specified year for current year 2020 (#1302) 2020-01-03 23:24:45 +01:00
Ikko Ashimine cc92c6648e
FLOOR() function accepts negative numbers and negative significance
Closes #1245
2019-11-30 15:18:04 +01:00