2016-10-23 19:18:59 +00:00
|
|
|
<?php
|
|
|
|
|
Improve Coverage for HTML Reader
Reader/Html is now covered except for 1 statement.
There is some coverage of RichText when you know in advance that the
html will expand into a single cell.
It is a tougher nut, one that I have not yet cracked,
to try to handle rich text while converting unknown html to multiple cells.
The original author left this as a TODO, and so for now must I.
It made sense to restructure some of the code. There are some changes.
- Issue #1532 is fixed (links are now saved when using rowspan).
- Colors can now be specified as html color name. To accomplish this,
Helper/Html function colourNameLookup was changed from protected
to public, and changed to static.
- Superfluous empty lines were eliminated in a number of places, e.g.
<ul><li>A</li><li>B</li><li>C</li></ul>
had formerly caused a wrapped cell to be created with 2 empty lines
followed by A, B, and C on separate lines; it will now just have the
3 A/B/C lines, which seems like a more sensible interpretation.
- Img alt attribute, which had been cast to float, is now used as a string.
Private member "encoding" is not used. Functions getEncoding and setEncoding
have therefore been marked deprecated. In fact, I was unable to get
SecurityScanner to pass *any* html which is not UTF-8. There are
possibly ways of getting around this (in Reader/Html - I have no
intention of messing with Security Scanner), as can be seen in my
companion pull request for Excel2003 Xml Reader. Doing this would be
easier for ASCII-compatible character sets (like ISO-8859-1),
than for non-compatible charsets (like UTF-16). I am not
convinced that the effort is worth it, but am willing to investigate
further.
I added a number of tests, creating an Html directory, and moving
HtmlTest to that directory.
2020-06-26 05:42:38 +00:00
|
|
|
namespace PhpOffice\PhpSpreadsheetTests\Reader\Html;
|
2016-10-23 19:18:59 +00:00
|
|
|
|
2020-06-26 05:42:38 +00:00
|
|
|
use PhpOffice\PhpSpreadsheet\Reader\Exception as ReaderException;
|
2017-01-22 08:39:23 +00:00
|
|
|
use PhpOffice\PhpSpreadsheet\Reader\Html;
|
2019-02-16 22:11:16 +00:00
|
|
|
use PhpOffice\PhpSpreadsheet\Style\Alignment;
|
2020-07-26 04:51:13 +00:00
|
|
|
use PhpOffice\PhpSpreadsheet\Style\Border;
|
2019-02-16 22:11:16 +00:00
|
|
|
use PhpOffice\PhpSpreadsheet\Style\Font;
|
2017-11-08 15:48:01 +00:00
|
|
|
use PHPUnit\Framework\TestCase;
|
2016-10-23 19:18:59 +00:00
|
|
|
|
2017-11-08 15:48:01 +00:00
|
|
|
class HtmlTest extends TestCase
|
2016-10-23 19:18:59 +00:00
|
|
|
{
|
2020-05-18 04:49:57 +00:00
|
|
|
public function testCsvWithAngleBracket(): void
|
2016-10-23 19:18:59 +00:00
|
|
|
{
|
2020-05-17 09:35:55 +00:00
|
|
|
$filename = 'tests/data/Reader/HTML/csv_with_angle_bracket.csv';
|
2017-08-02 21:13:08 +00:00
|
|
|
$reader = new Html();
|
2017-09-20 05:55:42 +00:00
|
|
|
self::assertFalse($reader->canRead($filename));
|
2016-10-23 19:18:59 +00:00
|
|
|
}
|
2017-12-11 02:08:41 +00:00
|
|
|
|
2020-06-26 05:42:38 +00:00
|
|
|
public function testBadHtml(): void
|
|
|
|
{
|
|
|
|
$this->expectException(ReaderException::class);
|
|
|
|
$filename = 'tests/data/Reader/HTML/badhtml.html';
|
|
|
|
$reader = new Html();
|
|
|
|
self::assertTrue($reader->canRead($filename));
|
2020-06-26 06:11:30 +00:00
|
|
|
$reader->load($filename);
|
2020-06-26 05:42:38 +00:00
|
|
|
self::assertTrue(false);
|
|
|
|
}
|
|
|
|
|
|
|
|
public function testNonHtml(): void
|
|
|
|
{
|
|
|
|
$this->expectException(ReaderException::class);
|
|
|
|
$filename = __FILE__;
|
|
|
|
$reader = new Html();
|
|
|
|
self::assertFalse($reader->canRead($filename));
|
2020-06-26 06:11:30 +00:00
|
|
|
$reader->load($filename);
|
2020-06-26 05:42:38 +00:00
|
|
|
self::assertTrue(false);
|
|
|
|
}
|
|
|
|
|
|
|
|
public function testInvalidFilename(): void
|
|
|
|
{
|
|
|
|
$reader = new Html();
|
|
|
|
self::assertEquals(0, $reader->getSheetIndex());
|
|
|
|
self::assertFalse($reader->canRead(''));
|
|
|
|
}
|
|
|
|
|
2017-12-11 02:08:41 +00:00
|
|
|
public function providerCanReadVerySmallFile(): array
|
|
|
|
{
|
|
|
|
$padding = str_repeat('a', 2048);
|
|
|
|
|
|
|
|
return [
|
|
|
|
[true, ' <html> ' . $padding . ' </html> '],
|
|
|
|
[true, ' <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <html>' . $padding . '</html>'],
|
|
|
|
[true, '<html></html>'],
|
|
|
|
[false, ''],
|
|
|
|
];
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* @dataProvider providerCanReadVerySmallFile
|
|
|
|
*
|
2019-02-11 12:06:39 +00:00
|
|
|
* @param bool $expected
|
2017-12-11 02:08:41 +00:00
|
|
|
* @param string $content
|
|
|
|
*/
|
2020-05-18 04:49:57 +00:00
|
|
|
public function testCanReadVerySmallFile($expected, $content): void
|
2017-12-11 02:08:41 +00:00
|
|
|
{
|
2020-06-26 05:42:38 +00:00
|
|
|
$filename = HtmlHelper::createHtml($content);
|
2017-12-11 02:08:41 +00:00
|
|
|
$reader = new Html();
|
|
|
|
$actual = $reader->canRead($filename);
|
|
|
|
|
|
|
|
self::assertSame($expected, $actual);
|
2019-02-16 22:11:16 +00:00
|
|
|
|
|
|
|
unlink($filename);
|
2017-12-11 02:08:41 +00:00
|
|
|
}
|
2018-12-22 19:32:02 +00:00
|
|
|
|
2020-05-18 04:49:57 +00:00
|
|
|
public function testBackgroundColorInRanding(): void
|
2018-12-22 19:32:02 +00:00
|
|
|
{
|
|
|
|
$html = '<table>
|
|
|
|
<tr>
|
2020-06-26 05:42:38 +00:00
|
|
|
<td style="background-color: #0000FF;color: #FFFFFF">Blue background</td>
|
|
|
|
<td style="background-color: unknown1;color: unknown2">Unknown fore/background</td>
|
2018-12-22 19:32:02 +00:00
|
|
|
</tr>
|
|
|
|
</table>';
|
2020-06-26 05:42:38 +00:00
|
|
|
$filename = HtmlHelper::createHtml($html);
|
|
|
|
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true);
|
2018-12-22 19:32:02 +00:00
|
|
|
$firstSheet = $spreadsheet->getSheet(0);
|
|
|
|
$style = $firstSheet->getCell('A1')->getStyle();
|
|
|
|
self::assertEquals('FFFFFF', $style->getFont()->getColor()->getRGB());
|
2020-06-26 05:42:38 +00:00
|
|
|
self::assertEquals('0000FF', $style->getFill()->getStartColor()->getRGB());
|
|
|
|
self::assertEquals('0000FF', $style->getFill()->getEndColor()->getRGB());
|
2019-02-16 22:11:16 +00:00
|
|
|
$style = $firstSheet->getCell('B1')->getStyle();
|
2020-06-26 05:42:38 +00:00
|
|
|
self::assertEquals('000000', $style->getFont()->getColor()->getRGB());
|
|
|
|
self::assertEquals('000000', $style->getFill()->getEndColor()->getRGB());
|
|
|
|
self::assertEquals('FFFFFF', $style->getFill()->getStartColor()->getRGB());
|
2019-02-16 22:11:16 +00:00
|
|
|
}
|
|
|
|
|
2020-05-18 04:49:57 +00:00
|
|
|
public function testCanApplyInlineFontStyles(): void
|
2019-02-16 22:11:16 +00:00
|
|
|
{
|
|
|
|
$html = '<table>
|
|
|
|
<tr>
|
|
|
|
<td style="font-size: 16px;">16px</td>
|
|
|
|
<td style="font-family: \'Times New Roman\'">Times New Roman</td>
|
|
|
|
<td style="font-weight: bold;">Bold</td>
|
|
|
|
<td style="font-style: italic;">Italic</td>
|
|
|
|
<td style="text-decoration: underline;">Underline</td>
|
|
|
|
<td style="text-decoration: line-through;">Line through</td>
|
|
|
|
</tr>
|
|
|
|
</table>';
|
2020-06-26 05:42:38 +00:00
|
|
|
$filename = HtmlHelper::createHtml($html);
|
|
|
|
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true);
|
2019-02-16 22:11:16 +00:00
|
|
|
$firstSheet = $spreadsheet->getSheet(0);
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('A1')->getStyle();
|
|
|
|
self::assertEquals(16, $style->getFont()->getSize());
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('B1')->getStyle();
|
|
|
|
self::assertEquals('Times New Roman', $style->getFont()->getName());
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('C1')->getStyle();
|
|
|
|
self::assertTrue($style->getFont()->getBold());
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('D1')->getStyle();
|
|
|
|
self::assertTrue($style->getFont()->getItalic());
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('E1')->getStyle();
|
|
|
|
self::assertEquals(Font::UNDERLINE_SINGLE, $style->getFont()->getUnderline());
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('F1')->getStyle();
|
|
|
|
self::assertTrue($style->getFont()->getStrikethrough());
|
|
|
|
}
|
|
|
|
|
2020-05-18 04:49:57 +00:00
|
|
|
public function testCanApplyInlineWidth(): void
|
2019-02-16 22:11:16 +00:00
|
|
|
{
|
|
|
|
$html = '<table>
|
|
|
|
<tr>
|
|
|
|
<td width="50">50px</td>
|
|
|
|
<td style="width: 100px;">100px</td>
|
|
|
|
</tr>
|
|
|
|
</table>';
|
2020-06-26 05:42:38 +00:00
|
|
|
$filename = HtmlHelper::createHtml($html);
|
|
|
|
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true);
|
2019-02-16 22:11:16 +00:00
|
|
|
$firstSheet = $spreadsheet->getSheet(0);
|
|
|
|
|
|
|
|
$dimension = $firstSheet->getColumnDimension('A');
|
|
|
|
self::assertEquals(50, $dimension->getWidth());
|
|
|
|
|
|
|
|
$dimension = $firstSheet->getColumnDimension('B');
|
|
|
|
self::assertEquals(100, $dimension->getWidth());
|
|
|
|
}
|
|
|
|
|
2020-05-18 04:49:57 +00:00
|
|
|
public function testCanApplyInlineHeight(): void
|
2019-02-16 22:11:16 +00:00
|
|
|
{
|
|
|
|
$html = '<table>
|
|
|
|
<tr>
|
|
|
|
<td height="50">1</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td style="height: 100px;">2</td>
|
|
|
|
</tr>
|
|
|
|
</table>';
|
2020-06-26 05:42:38 +00:00
|
|
|
$filename = HtmlHelper::createHtml($html);
|
|
|
|
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true);
|
2019-02-16 22:11:16 +00:00
|
|
|
$firstSheet = $spreadsheet->getSheet(0);
|
|
|
|
|
|
|
|
$dimension = $firstSheet->getRowDimension(1);
|
|
|
|
self::assertEquals(50, $dimension->getRowHeight());
|
|
|
|
|
|
|
|
$dimension = $firstSheet->getRowDimension(2);
|
|
|
|
self::assertEquals(100, $dimension->getRowHeight());
|
|
|
|
}
|
|
|
|
|
2020-05-18 04:49:57 +00:00
|
|
|
public function testCanApplyAlignment(): void
|
2019-02-16 22:11:16 +00:00
|
|
|
{
|
|
|
|
$html = '<table>
|
|
|
|
<tr>
|
|
|
|
<td align="center">Center align</td>
|
|
|
|
<td valign="center">Center valign</td>
|
|
|
|
<td style="text-align: center;">Center align</td>
|
|
|
|
<td style="vertical-align: center;">Center valign</td>
|
|
|
|
<td style="text-indent: 10px;">Text indent</td>
|
|
|
|
<td style="word-wrap: break-word;">Wraptext</td>
|
|
|
|
</tr>
|
|
|
|
</table>';
|
2020-06-26 05:42:38 +00:00
|
|
|
$filename = HtmlHelper::createHtml($html);
|
|
|
|
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true);
|
2019-02-16 22:11:16 +00:00
|
|
|
$firstSheet = $spreadsheet->getSheet(0);
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('A1')->getStyle();
|
|
|
|
self::assertEquals(Alignment::HORIZONTAL_CENTER, $style->getAlignment()->getHorizontal());
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('B1')->getStyle();
|
|
|
|
self::assertEquals(Alignment::VERTICAL_CENTER, $style->getAlignment()->getVertical());
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('C1')->getStyle();
|
|
|
|
self::assertEquals(Alignment::HORIZONTAL_CENTER, $style->getAlignment()->getHorizontal());
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('D1')->getStyle();
|
|
|
|
self::assertEquals(Alignment::VERTICAL_CENTER, $style->getAlignment()->getVertical());
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('E1')->getStyle();
|
|
|
|
self::assertEquals(10, $style->getAlignment()->getIndent());
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('F1')->getStyle();
|
|
|
|
self::assertTrue($style->getAlignment()->getWrapText());
|
|
|
|
}
|
|
|
|
|
2020-05-18 04:49:57 +00:00
|
|
|
public function testCanApplyInlineDataFormat(): void
|
2019-02-16 22:11:16 +00:00
|
|
|
{
|
|
|
|
$html = '<table>
|
|
|
|
<tr>
|
|
|
|
<td data-format="mmm-yy">2019-02-02 12:34:00</td>
|
|
|
|
</tr>
|
|
|
|
</table>';
|
2020-06-26 05:42:38 +00:00
|
|
|
$filename = HtmlHelper::createHtml($html);
|
|
|
|
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true);
|
2019-02-16 22:11:16 +00:00
|
|
|
$firstSheet = $spreadsheet->getSheet(0);
|
|
|
|
|
|
|
|
$style = $firstSheet->getCell('A1')->getStyle();
|
|
|
|
self::assertEquals('mmm-yy', $style->getNumberFormat()->getFormatCode());
|
2018-12-22 19:32:02 +00:00
|
|
|
}
|
2019-02-16 22:11:16 +00:00
|
|
|
|
2020-05-18 04:49:57 +00:00
|
|
|
public function testCanApplyCellWrapping(): void
|
2019-07-12 05:52:03 +00:00
|
|
|
{
|
|
|
|
$html = '<table>
|
|
|
|
<tr>
|
|
|
|
<td>Hello World</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td>Hello<br />World</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td>Hello<br>World</td>
|
|
|
|
</tr>
|
|
|
|
</table>';
|
2020-06-26 05:42:38 +00:00
|
|
|
$filename = HtmlHelper::createHtml($html);
|
|
|
|
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true);
|
2019-10-17 19:04:18 +00:00
|
|
|
$firstSheet = $spreadsheet->getSheet(0);
|
|
|
|
|
|
|
|
$cellStyle = $firstSheet->getStyle('A1');
|
|
|
|
self::assertFalse($cellStyle->getAlignment()->getWrapText());
|
|
|
|
|
|
|
|
$cellStyle = $firstSheet->getStyle('A2');
|
|
|
|
self::assertTrue($cellStyle->getAlignment()->getWrapText());
|
|
|
|
$cellValue = $firstSheet->getCell('A2')->getValue();
|
2020-05-18 04:49:57 +00:00
|
|
|
self::assertStringContainsString("\n", $cellValue);
|
2019-10-17 19:04:18 +00:00
|
|
|
|
|
|
|
$cellStyle = $firstSheet->getStyle('A3');
|
|
|
|
self::assertTrue($cellStyle->getAlignment()->getWrapText());
|
|
|
|
$cellValue = $firstSheet->getCell('A3')->getValue();
|
2020-05-18 04:49:57 +00:00
|
|
|
self::assertStringContainsString("\n", $cellValue);
|
2019-02-16 22:11:16 +00:00
|
|
|
}
|
2019-02-11 12:06:39 +00:00
|
|
|
|
2020-05-18 04:49:57 +00:00
|
|
|
public function testRowspanInRendering(): void
|
2019-02-11 12:06:39 +00:00
|
|
|
{
|
2020-05-17 09:35:55 +00:00
|
|
|
$filename = 'tests/data/Reader/HTML/rowspan.html';
|
2019-02-11 12:06:39 +00:00
|
|
|
$reader = new Html();
|
|
|
|
$spreadsheet = $reader->load($filename);
|
|
|
|
|
|
|
|
$actual = $spreadsheet->getActiveSheet()->getMergeCells();
|
|
|
|
self::assertSame(['A2:C2' => 'A2:C2'], $actual);
|
|
|
|
}
|
2019-11-21 10:04:55 +00:00
|
|
|
|
2020-05-18 04:49:57 +00:00
|
|
|
public function testTextIndentUseRowspan(): void
|
2019-11-21 10:04:55 +00:00
|
|
|
{
|
|
|
|
$html = '<table>
|
|
|
|
<tr>
|
|
|
|
<td>1</td>
|
|
|
|
<td rowspan="2" style="vertical-align: center;">Center Align</td>
|
|
|
|
<td>Row</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td>2</td>
|
|
|
|
<td style="text-indent:10px">Text Indent</td>
|
|
|
|
</tr>
|
|
|
|
</table>';
|
2020-06-26 05:42:38 +00:00
|
|
|
$filename = HtmlHelper::createHtml($html);
|
|
|
|
$spreadsheet = HtmlHelper::loadHtmlIntoSpreadsheet($filename, true);
|
2019-11-21 10:04:55 +00:00
|
|
|
$firstSheet = $spreadsheet->getSheet(0);
|
|
|
|
$style = $firstSheet->getCell('C2')->getStyle();
|
|
|
|
self::assertEquals(10, $style->getAlignment()->getIndent());
|
|
|
|
}
|
2020-05-13 08:26:49 +00:00
|
|
|
|
|
|
|
public function testBorderWithRowspanAndColspan(): void
|
|
|
|
{
|
|
|
|
$html = '<table>
|
|
|
|
<tr>
|
|
|
|
<td style="border: 1px solid black;">NOT SPANNED</td>
|
|
|
|
<td rowspan="2" colspan="2" style="border: 1px solid black;">SPANNED</td>
|
|
|
|
</tr>
|
|
|
|
<tr>
|
|
|
|
<td style="border: 1px solid black;">NOT SPANNED</td>
|
|
|
|
</tr>
|
|
|
|
</table>';
|
|
|
|
|
|
|
|
$reader = new Html();
|
|
|
|
$spreadsheet = $reader->loadFromString($html);
|
|
|
|
$firstSheet = $spreadsheet->getSheet(0);
|
|
|
|
$style = $firstSheet->getStyle('B1:C2');
|
|
|
|
|
|
|
|
$borders = $style->getBorders();
|
|
|
|
|
|
|
|
$totalBorders = [
|
|
|
|
$borders->getTop(),
|
|
|
|
$borders->getLeft(),
|
|
|
|
$borders->getBottom(),
|
|
|
|
$borders->getRight(),
|
|
|
|
];
|
|
|
|
|
|
|
|
foreach ($totalBorders as $border) {
|
|
|
|
self::assertEquals(Border::BORDER_THIN, $border->getBorderStyle());
|
|
|
|
}
|
|
|
|
}
|
2016-10-23 19:18:59 +00:00
|
|
|
}
|