Commit a0a9b2b

HyperlinkBase Property, and Html Handling of Properties (#3589)
* HyperlinkBase Property, and Html Handling of Properties

Fix #3573. The original issue concerned non-support of Document Properties in Xml spreadsheets. However, most of the Properties mentioned there were already supported. But the investigation revealed some gaps in Html coverage.

HyperlinkBase is the one property mentioned in the issue that was not supported for Xml, nor indeed for any other format. All the other document properties are 'meta'; but HyperlinkBase is functional: if you supply a relative address for a link, Excel will use HyperlinkBase, if supplied, to convert it to an absolute address. (The default is the directory where the spreadsheet is located.)

Here's a summary of how this PR will handle this property for various formats:

- Support is added for Xlsx read and write.
- Support is added for Xml read (there is no Xml writer). Ironically, Excel messes up this processing when reading an Xml spreadsheet; however, PhpSpreadsheet will get it right.
- HyperlinkBase is supported for Xls, but I have no idea how to read or write this property. For now, when writing hyperlinked cells, PhpSpreadsheet will be changed to convert any relative addresses that it can detect to absolute references by adding HyperlinkBase to the relative address. In a similar vein, Xls supports custom properties, but PhpSpreadsheet does not know how to read or write those.
- Gnumeric has no equivalent property, so nothing needs to be done to its reader. Since we don't have a Gnumeric writer, that's not really a problem for us.
- Ods has no equivalent property, so nothing needs to be done to its reader. The Ods writer does not have any special logic for hyperlinks, so, at least for now, will remain unchanged.
- Csv has no equivalent property, so nothing needs to be done to its reader. The Csv writer does not have any special logic for hyperlinks, so, at least for now, will remain unchanged.
- Html allows for an equivalent `base` tag in the head section. Support for this is added to Html reader and writer.

Html Writer was only handling 8 of the 11 'core' properties. Support is added for `created`, `modified`, and `lastModifiedBy`. Custom properties were not supported at all, and now are. Html Reader did not support any properties. It will now support all of them.

* Scrutinizer

Remove one dead reference.
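
The Xls fallback described above (prefixing detectable relative addresses with HyperlinkBase) can be sketched in isolation. This is a minimal illustration, not the writer's actual code: `resolveHyperlink` is a hypothetical helper, the `sheet:` prefix test is a simplification of how internal links are detected, and `https://example.com/` is a made-up base. The drive-letter and slash pattern is the one used in the Xls writer change.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of PhpSpreadsheet): prefix a relative link
 * target with the workbook's HyperlinkBase and leave other targets alone.
 */
function resolveHyperlink(string $url, string $hyperlinkBase): string
{
    // Internal sheet references and recognised URL schemes are already absolute.
    if (preg_match('/^(sheet:|http:|https:|ftp:|mailto:)/', $url) === 1) {
        return $url;
    }
    // With no base there is nothing to add; targets starting with a slash,
    // a backslash, or a drive letter plus separator are treated as absolute.
    if ($hyperlinkBase === '' || preg_match('~^([A-Za-z]:)?[/\\\\]~', $url) === 1) {
        return $url;
    }

    return $hyperlinkBase . $url;
}

echo resolveHyperlink('docs/page.html', 'https://example.com/'), PHP_EOL;
echo resolveHyperlink('/abs/path.html', 'https://example.com/'), PHP_EOL;
```

The base is concatenated as-is, so under this sketch a base without a trailing separator would run straight into the relative part; the caller is assumed to supply a base that ends in `/`.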
1 parent 3aab263 commit a0a9b2b

9 files changed: +445 -10 lines changed
src/PhpSpreadsheet/Document/Properties.php

+14
@@ -107,6 +107,8 @@ class Properties
      */
     private $customProperties = [];

+    private string $hyperlinkBase = '';
+
     /**
      * Create a new Document Properties instance.
      */
@@ -534,4 +536,16 @@ public static function convertPropertyType(string $propertyType): string
     {
         return self::PROPERTY_TYPE_ARRAY[$propertyType] ?? self::PROPERTY_TYPE_UNKNOWN;
     }
+
+    public function getHyperlinkBase(): string
+    {
+        return $this->hyperlinkBase;
+    }
+
+    public function setHyperlinkBase(string $hyperlinkBase): self
+    {
+        $this->hyperlinkBase = $hyperlinkBase;
+
+        return $this;
+    }
 }

src/PhpSpreadsheet/Reader/Html.php

+88-1
@@ -8,6 +8,7 @@
 use DOMText;
 use PhpOffice\PhpSpreadsheet\Cell\Coordinate;
 use PhpOffice\PhpSpreadsheet\Cell\DataType;
+use PhpOffice\PhpSpreadsheet\Document\Properties;
 use PhpOffice\PhpSpreadsheet\Helper\Dimension as CssDimension;
 use PhpOffice\PhpSpreadsheet\Reader\Security\XmlScanner;
 use PhpOffice\PhpSpreadsheet\Spreadsheet;
@@ -685,10 +686,94 @@ public function loadIntoExisting($filename, Spreadsheet $spreadsheet)
         if ($loaded === false) {
             throw new Exception('Failed to load ' . $filename . ' as a DOM Document', 0, $e ?? null);
         }
+        self::loadProperties($dom, $spreadsheet);

         return $this->loadDocument($dom, $spreadsheet);
     }

+    private static function loadProperties(DOMDocument $dom, Spreadsheet $spreadsheet): void
+    {
+        $properties = $spreadsheet->getProperties();
+        foreach ($dom->getElementsByTagName('meta') as $meta) {
+            $metaContent = (string) $meta->getAttribute('content');
+            if ($metaContent !== '') {
+                $metaName = (string) $meta->getAttribute('name');
+                switch ($metaName) {
+                    case 'author':
+                        $properties->setCreator($metaContent);
+
+                        break;
+                    case 'category':
+                        $properties->setCategory($metaContent);
+
+                        break;
+                    case 'company':
+                        $properties->setCompany($metaContent);
+
+                        break;
+                    case 'created':
+                        $properties->setCreated($metaContent);
+
+                        break;
+                    case 'description':
+                        $properties->setDescription($metaContent);
+
+                        break;
+                    case 'keywords':
+                        $properties->setKeywords($metaContent);
+
+                        break;
+                    case 'lastModifiedBy':
+                        $properties->setLastModifiedBy($metaContent);
+
+                        break;
+                    case 'manager':
+                        $properties->setManager($metaContent);
+
+                        break;
+                    case 'modified':
+                        $properties->setModified($metaContent);
+
+                        break;
+                    case 'subject':
+                        $properties->setSubject($metaContent);
+
+                        break;
+                    case 'title':
+                        $properties->setTitle($metaContent);
+
+                        break;
+                    default:
+                        if (preg_match('/^custom[.](bool|date|float|int|string)[.](.+)$/', $metaName, $matches) === 1) {
+                            switch ($matches[1]) {
+                                case 'bool':
+                                    $properties->setCustomProperty($matches[2], (bool) $metaContent, Properties::PROPERTY_TYPE_BOOLEAN);
+
+                                    break;
+                                case 'float':
+                                    $properties->setCustomProperty($matches[2], (float) $metaContent, Properties::PROPERTY_TYPE_FLOAT);
+
+                                    break;
+                                case 'int':
+                                    $properties->setCustomProperty($matches[2], (int) $metaContent, Properties::PROPERTY_TYPE_INTEGER);
+
+                                    break;
+                                case 'date':
+                                    $properties->setCustomProperty($matches[2], $metaContent, Properties::PROPERTY_TYPE_DATE);
+
+                                    break;
+                                default: // string
+                                    $properties->setCustomProperty($matches[2], $metaContent, Properties::PROPERTY_TYPE_STRING);
+                            }
+                        }
+                }
+            }
+        }
+        if (!empty($dom->baseURI)) {
+            $properties->setHyperlinkBase($dom->baseURI);
+        }
+    }
+
     private static function replaceNonAscii(array $matches): string
     {
         return '&#' . mb_ord($matches[0], 'UTF-8') . ';';
@@ -719,8 +804,10 @@ public function loadFromString($content, ?Spreadsheet $spreadsheet = null): Spre
         if ($loaded === false) {
             throw new Exception('Failed to load content as a DOM Document', 0, $e ?? null);
         }
+        $spreadsheet = $spreadsheet ?? new Spreadsheet();
+        self::loadProperties($dom, $spreadsheet);

-        return $this->loadDocument($dom, $spreadsheet ?? new Spreadsheet());
+        return $this->loadDocument($dom, $spreadsheet);
     }

     /**
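
The reader change above recognises custom properties by a `custom.<type>.<name>` naming scheme on `meta` tags. The following standalone sketch shows how that pattern splits a name and casts the content. `parseCustomMeta` and the sample names (`Revision`, `Approved`) are hypothetical and exist only for illustration; the regular expression and the casts mirror the ones in the diff.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: split a meta name such as "custom.int.Revision"
 * into its property name and a value cast according to the type qualifier.
 *
 * @return null|array{name: string, value: bool|float|int|string}
 */
function parseCustomMeta(string $metaName, string $metaContent): ?array
{
    if (preg_match('/^custom[.](bool|date|float|int|string)[.](.+)$/', $metaName, $matches) !== 1) {
        return null; // not a custom-property meta tag
    }
    switch ($matches[1]) {
        case 'bool':
            $value = (bool) $metaContent;

            break;
        case 'float':
            $value = (float) $metaContent;

            break;
        case 'int':
            $value = (int) $metaContent;

            break;
        default: // date and string are kept as text in this sketch
            $value = $metaContent;
    }

    return ['name' => $matches[2], 'value' => $value];
}

var_dump(parseCustomMeta('custom.int.Revision', '42'));
var_dump(parseCustomMeta('custom.bool.Approved', '0'));
var_dump(parseCustomMeta('author', 'Jane'));
```

Because the name part is matched with `(.+)`, a property name that itself contains dots is kept whole.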

src/PhpSpreadsheet/Reader/Xlsx/Properties.php

+3
@@ -73,6 +73,9 @@ public function readExtendedProperties(string $propertyData): void
             if (isset($xmlCore->Manager)) {
                 $this->docProps->setManager((string) $xmlCore->Manager);
             }
+            if (isset($xmlCore->HyperlinkBase)) {
+                $this->docProps->setHyperlinkBase((string) $xmlCore->HyperlinkBase);
+            }
         }
     }

src/PhpSpreadsheet/Reader/Xml/Properties.php

+11-8
@@ -92,6 +92,10 @@ protected function processStandardProperty(
             case 'Manager':
                 $docProps->setManager($stringValue);

+                break;
+            case 'HyperlinkBase':
+                $docProps->setHyperlinkBase($stringValue);
+
                 break;
             case 'Keywords':
                 $docProps->setKeywords($stringValue);
@@ -110,17 +114,10 @@ protected function processCustomProperty(
         ?SimpleXMLElement $propertyValue,
         SimpleXMLElement $propertyAttributes
     ): void {
-        $propertyType = DocumentProperties::PROPERTY_TYPE_UNKNOWN;
-
         switch ((string) $propertyAttributes) {
-            case 'string':
-                $propertyType = DocumentProperties::PROPERTY_TYPE_STRING;
-                $propertyValue = trim((string) $propertyValue);
-
-                break;
             case 'boolean':
                 $propertyType = DocumentProperties::PROPERTY_TYPE_BOOLEAN;
-                $propertyValue = (bool) $propertyValue;
+                $propertyValue = (bool) (string) $propertyValue;

                 break;
             case 'integer':
@@ -134,9 +131,15 @@ protected function processCustomProperty(

                 break;
             case 'dateTime.tz':
+            case 'dateTime.iso8601tz':
                 $propertyType = DocumentProperties::PROPERTY_TYPE_DATE;
                 $propertyValue = trim((string) $propertyValue);

+                break;
+            default:
+                $propertyType = DocumentProperties::PROPERTY_TYPE_STRING;
+                $propertyValue = trim((string) $propertyValue);
+
                 break;
         }

src/PhpSpreadsheet/Writer/Html.php

+41-1
@@ -7,9 +7,11 @@
 use PhpOffice\PhpSpreadsheet\Cell\Cell;
 use PhpOffice\PhpSpreadsheet\Cell\Coordinate;
 use PhpOffice\PhpSpreadsheet\Chart\Chart;
+use PhpOffice\PhpSpreadsheet\Document\Properties;
 use PhpOffice\PhpSpreadsheet\RichText\RichText;
 use PhpOffice\PhpSpreadsheet\RichText\Run;
 use PhpOffice\PhpSpreadsheet\Settings;
+use PhpOffice\PhpSpreadsheet\Shared\Date;
 use PhpOffice\PhpSpreadsheet\Shared\Drawing as SharedDrawing;
 use PhpOffice\PhpSpreadsheet\Shared\File;
 use PhpOffice\PhpSpreadsheet\Shared\Font as SharedFont;
@@ -342,13 +344,21 @@ public function writeAllSheets()

     private static function generateMeta(?string $val, string $desc): string
     {
-        return $val
+        return ($val || $val === '0')
             ? (' <meta name="' . $desc . '" content="' . htmlspecialchars($val, Settings::htmlEntityFlags()) . '" />' . PHP_EOL)
             : '';
     }

     public const BODY_LINE = ' <body>' . PHP_EOL;

+    private const CUSTOM_TO_META = [
+        Properties::PROPERTY_TYPE_BOOLEAN => 'bool',
+        Properties::PROPERTY_TYPE_DATE => 'date',
+        Properties::PROPERTY_TYPE_FLOAT => 'float',
+        Properties::PROPERTY_TYPE_INTEGER => 'int',
+        Properties::PROPERTY_TYPE_STRING => 'string',
+    ];
+
     /**
      * Generate HTML header.
      *
@@ -374,6 +384,36 @@ public function generateHTMLHeader($includeStyles = false)
         $html .= self::generateMeta($properties->getCategory(), 'category');
         $html .= self::generateMeta($properties->getCompany(), 'company');
         $html .= self::generateMeta($properties->getManager(), 'manager');
+        $html .= self::generateMeta($properties->getLastModifiedBy(), 'lastModifiedBy');
+        $date = Date::dateTimeFromTimestamp((string) $properties->getCreated());
+        $date->setTimeZone(Date::getDefaultOrLocalTimeZone());
+        $html .= self::generateMeta($date->format(DATE_W3C), 'created');
+        $date = Date::dateTimeFromTimestamp((string) $properties->getModified());
+        $date->setTimeZone(Date::getDefaultOrLocalTimeZone());
+        $html .= self::generateMeta($date->format(DATE_W3C), 'modified');
+
+        $customProperties = $properties->getCustomProperties();
+        foreach ($customProperties as $customProperty) {
+            $propertyValue = $properties->getCustomPropertyValue($customProperty);
+            $propertyType = $properties->getCustomPropertyType($customProperty);
+            $propertyQualifier = self::CUSTOM_TO_META[$propertyType] ?? null;
+            if ($propertyQualifier !== null) {
+                if ($propertyType === Properties::PROPERTY_TYPE_BOOLEAN) {
+                    $propertyValue = $propertyValue ? '1' : '0';
+                } elseif ($propertyType === Properties::PROPERTY_TYPE_DATE) {
+                    $date = Date::dateTimeFromTimestamp((string) $propertyValue);
+                    $date->setTimeZone(Date::getDefaultOrLocalTimeZone());
+                    $propertyValue = $date->format(DATE_W3C);
+                } else {
+                    $propertyValue = (string) $propertyValue;
+                }
+                $html .= self::generateMeta($propertyValue, "custom.$propertyQualifier.$customProperty");
+            }
+        }
+
+        if (!empty($properties->getHyperlinkBase())) {
+            $html .= ' <base href="' . $properties->getHyperlinkBase() . '" />' . PHP_EOL;
+        }

         $html .= $includeStyles ? $this->generateStyles(true) : $this->generatePageDeclarations(true);
379419

src/PhpSpreadsheet/Writer/Xls/Worksheet.php

+7
@@ -503,6 +503,8 @@ public function close(): void
         $this->writeMergedCells();

         // Hyperlinks
+        $phpParent = $phpSheet->getParent();
+        $hyperlinkbase = ($phpParent === null) ? '' : $phpParent->getProperties()->getHyperlinkBase();
         foreach ($phpSheet->getHyperLinkCollection() as $coordinate => $hyperlink) {
             [$column, $row] = Coordinate::indexesFromString($coordinate);

@@ -513,6 +515,11 @@ public function close(): void
                 $url = str_replace('sheet://', 'internal:', $url);
             } elseif (preg_match('/^(http:|https:|ftp:|mailto:)/', $url)) {
                 // URL
+            } elseif (!empty($hyperlinkbase) && preg_match('~^([A-Za-z]:)?[/\\\\]~', $url) !== 1) {
+                $url = "$hyperlinkbase$url";
+                if (preg_match('/^(http:|https:|ftp:|mailto:)/', $url) !== 1) {
+                    $url = 'external:' . $url;
+                }
             } else {
                 // external (local file)
                 $url = 'external:' . $url;

src/PhpSpreadsheet/Writer/Xlsx/DocProps.php

+3
@@ -93,6 +93,9 @@ public function writeDocPropsApp(Spreadsheet $spreadsheet)
         // SharedDoc
         $objWriter->writeElement('SharedDoc', 'false');

+        // HyperlinkBase
+        $objWriter->writeElement('HyperlinkBase', $spreadsheet->getProperties()->getHyperlinkBase());
+
         // HyperlinksChanged
         $objWriter->writeElement('HyperlinksChanged', 'false');
98101
