Commit 97a80f3

Improve HTML Writer (#1464)
There are a number of situations where HTML Writer was producing HTML which could not be validated. These include:

- inconsistent use of a trailing slash to terminate META, IMG, and COL tags
- @page style rules in the body rather than in the head. Aside from being non-standard, HTML Reader treats those as spreadsheet data.
- <div style="page-break-before:always" />, a construct which is usually better handled through CSS anyhow
- no alt attribute for images (drawings and charts)

Other problems:

- Windows file names not handled correctly for images
- Memory drawings not handled in extendRowsForChartsAndImages
- No handling of different values for showing gridlines for screen and print
- Mpdf and Dompdf do not require the use of inline CSS. Tcpdf remains a holdout in the use of this inferior approach.
- no need to chunk base64 encoding of embedded images
- support for colors in number format was buggy (HTML tags run through htmlspecialchars)

Code has been refactored when practical to reduce the number of very large functions. Coverage is now 100% for the entire HTML Writer module, up from 75% of lines and 39% of methods beforehand. All functions dealing only with charts are bypassed for coverage because the version of Jpgraph available in Composer is not suitable for PHP7. The code will, nevertheless, run successfully, but with warning messages. I have confirmed that the code is entirely covered, without warnings, when the current version of Jpgraph is used in lieu of the one available in Composer. I will be glad to revisit this when the Jpgraph problem is resolved.

Directory PhpSpreadsheetTests/Writer/Html was created to house the new tests. It seemed logical to move HtmlCommentsTest to the new directory from PhpSpreadsheetTests/Functional.

A function to generate all the HTML is useful, especially for testing, but also in lieu of the multiple other generate* functions. I have added and documented generateHTMLAll.

The documented example for the generate* functions (a) produces invalid HTML, (b) produces HTML which cannot be handled correctly by HTML Reader, and (c) even if those were correct, does not actually affect the display of the spreadsheet. The documentation has been replaced by a valid, and more instructive, example.

The (undocumented) useEmbeddedCss property, and the functions to test and set it, are no longer needed. Rather than breaking existing code by deleting them, I marked the functions deprecated.

This change borrows a change to LocaleFloatsTest from pull request 1456, submitted a little over a week before this one.

## Improve NumberFormat Support

The first phase of this change included correcting NumberFormat handling in HTML Writer. Certain complex formats could not be handled without changes to Style/NumberFormat, and I did not wish to combine those changes. Once the original change had been pushed, I took this part of it back up. HTML Writer can now handle conditions in formats like:

[Blue][>=3000.5]$#,##0.00;[Red][<0]$#,##0.00;$#,##0.00

In testing, I discovered several errors and omissions in the handling of some other formats. These are now corrected, and tests added.
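As a rough illustration of the conditional format quoted above, the following standalone sketch pulls the colour and the condition out of each section. It is not the code from this commit: `parseSection` and `SKETCH_COLORS` are hypothetical names, the colour list is a small assumed subset, and the sections are split with a plain `explode` rather than a quote-aware split.

```php
<?php
// Hypothetical subset of colour names, for illustration only.
const SKETCH_COLORS = ['Black', 'Blue', 'Green', 'Red', 'Yellow'];

/**
 * Split one format section into its colour, condition operator,
 * condition operand, and the remaining format mask.
 */
function parseSection(string $section): array
{
    $colorRegex = '/\[(' . implode('|', SKETCH_COLORS) . ')\]/';
    // Two-character operators are listed first so ">=" is not read as ">".
    $condRegex = '/\[(>=|<=|<>|>|<|=)([+-]?\d+(?:\.\d+)?)\]/';

    $color = '';
    $op = '';
    $operand = null;
    if (preg_match($colorRegex, $section, $m)) {
        $color = $m[1];
        $section = preg_replace($colorRegex, '', $section);
    }
    if (preg_match($condRegex, $section, $m)) {
        $op = $m[1];
        $operand = (float) $m[2];
        $section = preg_replace($condRegex, '', $section);
    }

    return ['color' => $color, 'op' => $op, 'operand' => $operand, 'mask' => $section];
}

$format = '[Blue][>=3000.5]$#,##0.00;[Red][<0]$#,##0.00;$#,##0.00';
// Simplification: this split would break on a quoted ";" inside a section.
$parsed = array_map('parseSection', explode(';', $format));
foreach ($parsed as $p) {
    printf("%-5s %-2s %-7s %s\n", $p['color'], $p['op'], var_export($p['operand'], true), $p['mask']);
}
```

Each section ends up with the same mask (`$#,##0.00`) once the bracketed colour and condition are removed, which is why the colour has to be carried separately to whatever renders the value.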
1 parent 4f6d4af commit 97a80f3

19 files changed: +1835 -628 lines changed

docs/topics/reading-and-writing-to-file.md

+9 -18
@@ -681,35 +681,26 @@ Supported methods:
 - `generateStyles()`
 - `generateSheetData()`
 - `generateHTMLFooter()`
+- `generateHTMLAll()`
 
 Here's an example which retrieves all parts independently and merges
 them into a resulting HTML page:
 
 ``` php
-<?php
 $writer = new \PhpOffice\PhpSpreadsheet\Writer\Html($spreadsheet);
-echo $writer->generateHTMLHeader();
-?>
-
-<style>
-<!--
+$hdr = $writer->generateHTMLHeader();
+$sty = $writer->generateStyles(false); // do not write <style> and </style>
+$newstyle = <<<EOF
+<style type='text/css'>
+$sty
 html {
-    font-family: Times New Roman;
-    font-size: 9pt;
-    background-color: white;
+    background-color: yellow;
 }
-
-<?php
-echo $writer->generateStyles(false); // do not write <style> and </style>
-?>
-
--->
 </style>
-
-<?php
+EOF;
+echo preg_replace('@</head>@', "$newstyle\n</head>", $hdr);
 echo $writer->generateSheetData();
 echo $writer->generateHTMLFooter();
-?>
 ```
 
 #### Writing UTF-8 HTML files

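The revised documentation example splices an extra style block into the generated header just before `</head>`. A minimal sketch of that splice, runnable without PhpSpreadsheet, might look like the following; `$hdr` and `$sty` are hard-coded stand-ins for the writer's output, not what the writer actually returns.

```php
<?php
// Stand-in for the string returned by generateHTMLHeader(); the real header is longer.
$hdr = "<!DOCTYPE html>\n<html>\n<head>\n<title>Sheet</title>\n</head>\n<body>\n";
// Stand-in for generateStyles(false), i.e. CSS rules without the <style> wrapper.
$sty = "td { border: 1px solid black; }";

$newstyle = "<style type='text/css'>\n$sty\nhtml { background-color: yellow; }\n</style>";

// Insert the block before </head>; the limit of 1 restricts the change to the first match.
$page = preg_replace('@</head>@', "$newstyle\n</head>", $hdr, 1);
echo $page;
```

Note that `preg_replace` interprets sequences such as `$1` or `\1` in the replacement string, so if the injected CSS could ever contain them, `str_replace` would be the safer choice for a fixed search string like this one.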
samples/Basic/17a_Html.php

+14
@@ -0,0 +1,14 @@
+<?php
+
+use PhpOffice\PhpSpreadsheet\Writer\Html;
+
+require __DIR__ . '/../Header.php';
+$spreadsheet = require __DIR__ . '/../templates/sampleSpreadsheet.php';
+
+$filename = $helper->getFilename(__FILE__, 'html');
+$writer = new Html($spreadsheet);
+
+$callStartTime = microtime(true);
+$writer->setEmbedImages(true);
+$writer->save($filename);
+$helper->logWrite($writer, $filename, $callStartTime);

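The sample above turns on image embedding, and the commit message notes that the base64 payload no longer needs to be chunked. Assuming the image is embedded as a `data:` URI, a sketch of the difference could look like this; the bytes, the `image/png` label, and the `alt` text are placeholders rather than anything taken from the writer.

```php
<?php
// 256 placeholder bytes standing in for image data (not a real image).
$bytes = str_repeat("\x00\x01\x02\x03", 64);

// One continuous base64 string, as is sufficient for a data: URI.
$oneLine = base64_encode($bytes);

// The chunked form for comparison: 76-character lines separated by "\r\n".
$chunked = chunk_split($oneLine);

$src = 'data:image/png;base64,' . $oneLine;
$img = '<img src="' . $src . '" alt="sample" />';

printf("unchunked: %d chars, chunked: %d chars\n", strlen($oneLine), strlen($chunked));
```

Both forms decode to the same bytes; the unchunked one is simply shorter and keeps the attribute value free of line breaks.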
samples/Basic/25_In_memory_image.php

+1
@@ -35,6 +35,7 @@
 $drawing->setMimeType(MemoryDrawing::MIMETYPE_DEFAULT);
 $drawing->setHeight(36);
 $drawing->setWorksheet($spreadsheet->getActiveSheet());
+$drawing->setCoordinates('C5');
 
 // Save
 $helper->write($spreadsheet, __FILE__, ['Xlsx', 'Html']);

src/PhpSpreadsheet/Style/NumberFormat.php

+100 -44
@@ -537,7 +537,15 @@ private static function formatAsFraction(&$value, &$format)
         $adjustedDecimalPart = $decimalPart / $GCD;
         $adjustedDecimalDivisor = $decimalDivisor / $GCD;
 
-        if ((strpos($format, '0') !== false) || (strpos($format, '#') !== false) || (substr($format, 0, 3) == '? ?')) {
+        if ((strpos($format, '0') !== false)) {
+            $value = "$sign$integerPart $adjustedDecimalPart/$adjustedDecimalDivisor";
+        } elseif ((strpos($format, '#') !== false)) {
+            if ($integerPart == 0) {
+                $value = "$sign$adjustedDecimalPart/$adjustedDecimalDivisor";
+            } else {
+                $value = "$sign$integerPart $adjustedDecimalPart/$adjustedDecimalDivisor";
+            }
+        } elseif ((substr($format, 0, 3) == '? ?')) {
             if ($integerPart == 0) {
                 $integerPart = '';
             }
@@ -653,9 +661,12 @@ private static function formatStraightNumericValue($value, $format, array $match
 
     private static function formatAsNumber($value, $format)
     {
-        if ($format === self::FORMAT_CURRENCY_EUR_SIMPLE) {
-            return 'EUR ' . sprintf('%1.2f', $value);
-        }
+        // The "_" in this string has already been stripped out,
+        // so this test is never true. Furthermore, testing
+        // on Excel shows this format uses Euro symbol, not "EUR".
+        //if ($format === self::FORMAT_CURRENCY_EUR_SIMPLE) {
+        //    return 'EUR ' . sprintf('%1.2f', $value);
+        //}
 
         // Some non-number strings are quoted, so we'll get rid of the quotes, likewise any positional * symbols
         $format = str_replace(['"', '*'], '', $format);
@@ -717,6 +728,89 @@ private static function formatAsNumber($value, $format)
         return $value;
     }
 
+    private static function splitFormatCompare($value, $cond, $val, $dfcond, $dfval)
+    {
+        if (!$cond) {
+            $cond = $dfcond;
+            $val = $dfval;
+        }
+        switch ($cond) {
+            case '>':
+                return $value > $val;
+
+            case '<':
+                return $value < $val;
+
+            case '<=':
+                return $value <= $val;
+
+            case '<>':
+                return $value != $val;
+
+            case '=':
+                return $value == $val;
+        }
+
+        return $value >= $val;
+    }
+
+    private static function splitFormat($sections, $value)
+    {
+        // Extract the relevant section depending on whether number is positive, negative, or zero?
+        // Text not supported yet.
+        // Here is how the sections apply to various values in Excel:
+        // 1 section: [POSITIVE/NEGATIVE/ZERO/TEXT]
+        // 2 sections: [POSITIVE/ZERO/TEXT] [NEGATIVE]
+        // 3 sections: [POSITIVE/TEXT] [NEGATIVE] [ZERO]
+        // 4 sections: [POSITIVE] [NEGATIVE] [ZERO] [TEXT]
+        $cnt = count($sections);
+        $color_regex = '/\\[(' . implode('|', Color::NAMED_COLORS) . ')\\]/';
+        $cond_regex = '/\\[(>|>=|<|<=|=|<>)([+-]?\\d+([.]\\d+)?)\\]/';
+        $colors = ['', '', '', '', ''];
+        $condops = ['', '', '', '', ''];
+        $condvals = [0, 0, 0, 0, 0];
+        for ($idx = 0; $idx < $cnt; ++$idx) {
+            if (preg_match($color_regex, $sections[$idx], $matches)) {
+                $colors[$idx] = $matches[0];
+                $sections[$idx] = preg_replace($color_regex, '', $sections[$idx]);
+            }
+            if (preg_match($cond_regex, $sections[$idx], $matches)) {
+                $condops[$idx] = $matches[1];
+                $condvals[$idx] = $matches[2];
+                $sections[$idx] = preg_replace($cond_regex, '', $sections[$idx]);
+            }
+        }
+        $color = $colors[0];
+        $format = $sections[0];
+        $absval = $value;
+        switch ($cnt) {
+            case 2:
+                $absval = abs($value);
+                if (!self::splitFormatCompare($value, $condops[0], $condvals[0], '>=', 0)) {
+                    $color = $colors[1];
+                    $format = $sections[1];
+                }
+
+                break;
+            case 3:
+            case 4:
+                $absval = abs($value);
+                if (!self::splitFormatCompare($value, $condops[0], $condvals[0], '>', 0)) {
+                    if (self::splitFormatCompare($value, $condops[1], $condvals[1], '<', 0)) {
+                        $color = $colors[1];
+                        $format = $sections[1];
+                    } else {
+                        $color = $colors[2];
+                        $format = $sections[2];
+                    }
+                }
+
+                break;
+        }
+
+        return [$color, $format, $absval];
+    }
+
     /**
      * Convert a value in a pre-defined format to a PHP string.
      *
@@ -745,50 +839,12 @@ public static function toFormattedString($value, $format, $callBack = null)
         // Get the sections, there can be up to four sections, separated with a semi-colon (but only if not a quoted literal)
         $sections = preg_split('/(;)(?=(?:[^"]|"[^"]*")*$)/u', $format);
 
-        // Extract the relevant section depending on whether number is positive, negative, or zero?
-        // Text not supported yet.
-        // Here is how the sections apply to various values in Excel:
-        // 1 section: [POSITIVE/NEGATIVE/ZERO/TEXT]
-        // 2 sections: [POSITIVE/ZERO/TEXT] [NEGATIVE]
-        // 3 sections: [POSITIVE/TEXT] [NEGATIVE] [ZERO]
-        // 4 sections: [POSITIVE] [NEGATIVE] [ZERO] [TEXT]
-        switch (count($sections)) {
-            case 1:
-                $format = $sections[0];
-
-                break;
-            case 2:
-                $format = ($value >= 0) ? $sections[0] : $sections[1];
-                $value = abs($value); // Use the absolute value
-                break;
-            case 3:
-                $format = ($value > 0) ?
-                    $sections[0] : (($value < 0) ?
-                        $sections[1] : $sections[2]);
-                $value = abs($value); // Use the absolute value
-                break;
-            case 4:
-                $format = ($value > 0) ?
-                    $sections[0] : (($value < 0) ?
-                        $sections[1] : $sections[2]);
-                $value = abs($value); // Use the absolute value
-                break;
-            default:
-                // something is wrong, just use first section
-                $format = $sections[0];
-
-                break;
-        }
+        [$colors, $format, $value] = self::splitFormat($sections, $value);
 
         // In Excel formats, "_" is used to add spacing,
         // The following character indicates the size of the spacing, which we can't do in HTML, so we just use a standard space
         $format = preg_replace('/_./', ' ', $format);
 
-        // Save format with color information for later use below
-        $formatColor = $format;
-        // Strip colour information
-        $color_regex = '/\[(' . implode('|', Color::NAMED_COLORS) . ')\]/';
-        $format = preg_replace($color_regex, '', $format);
         // Let's begin inspecting the format and converting the value to a formatted string
 
         // Check for date/time characters (not inside quotes)
@@ -809,7 +865,7 @@ public static function toFormattedString($value, $format, $callBack = null)
         // Additional formatting provided by callback function
         if ($callBack !== null) {
             [$writerInstance, $function] = $callBack;
-            $value = $writerInstance->$function($value, $formatColor);
+            $value = $writerInstance->$function($value, $colors);
         }
 
         return $value;

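The section-selection rules introduced in `splitFormat` can be hard to follow inside a diff, so here is a standalone sketch of the same idea: each section may carry an explicit condition, and a section without one falls back to a default comparison against zero. The function names `compareWithDefault` and `pickSection` are hypothetical, and the sketch returns only a section index rather than the colour, format, and absolute value the real method returns.

```php
<?php
/** Compare $value with $operand; an empty operator falls back to the given default. */
function compareWithDefault(float $value, string $op, float $operand, string $defaultOp, float $defaultOperand): bool
{
    if ($op === '') {
        $op = $defaultOp;
        $operand = $defaultOperand;
    }
    switch ($op) {
        case '>':
            return $value > $operand;
        case '<':
            return $value < $operand;
        case '<=':
            return $value <= $operand;
        case '<>':
            return $value != $operand;
        case '=':
            return $value == $operand;
    }

    return $value >= $operand; // '>=' is the remaining operator
}

/**
 * Pick the zero-based section index for a numeric value.
 * $conds holds one [operator, operand] pair per section; ['', 0.0] means "no condition".
 */
function pickSection(float $value, array $conds): int
{
    $count = count($conds);
    if ($count === 2) {
        // Default: first section for values >= 0, second for negatives.
        return compareWithDefault($value, $conds[0][0], $conds[0][1], '>=', 0.0) ? 0 : 1;
    }
    if ($count === 3 || $count === 4) {
        // Defaults: first for > 0, second for < 0, third otherwise.
        if (compareWithDefault($value, $conds[0][0], $conds[0][1], '>', 0.0)) {
            return 0;
        }

        return compareWithDefault($value, $conds[1][0], $conds[1][1], '<', 0.0) ? 1 : 2;
    }

    return 0; // one section, or an unexpected count: use the first section
}

$plain = [['', 0.0], ['', 0.0], ['', 0.0]];             // three sections, no conditions
$conditional = [['>=', 3000.5], ['<', 0.0], ['', 0.0]]; // conditions from the commit message's example
echo pickSection(5.0, $plain), pickSection(-5.0, $plain), pickSection(0.0, $plain), "\n";
echo pickSection(4000.0, $conditional), pickSection(-1.0, $conditional), pickSection(100.0, $conditional), "\n";
```

With the conditional format, a value such as 100 satisfies neither `>=3000.5` nor `<0`, so it lands in the third section, which is the behaviour the plain positive/negative/zero rules could not express.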