You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
* Xlsx Reader Better Namespace Handling Phase 1 Try2
This is a replacement for #2088, which has run into merge conflicts. I will close that PR in the near future, however the comments in that PR may prove useful for this one. While that PR has been in draft status all along, I am marking this one as ready. I will gladly add additional tests (and, of course, make code changes) that anyone has to suggest, but, with my most recent test files which I will describe in a separate comment, I have no further ideas on useful additions.
As mentioned in the earlier ticket, this is a risky change. But, as has been demonstrated, delaying it comes with its own set of risks. It would be helpful to have a temporary moratorium on changes to Reader/Xlsx until this change is merged.
The original commit message follows.
There have been a number of issues concerning the handling of legitimate but unexpected namespace prefixes in Xlsx spreadsheets created by software other than Excel and PhpSpreadsheet/PhpExcel.I have studied them, but, till now, have not had a good idea on how to act on them. A recent comment #860 (comment) in issue #860 by @IMSoP has triggered an idea about how to proceed.
Gnumeric Reader was recently changed to handle namespaces better. Using that as a model, this PR begins the process of doing the same for Xlsx. Xlsx is much larger and more complicated than Gnumeric, hence the need to tackle it in multiple phases. I believe that this PR handles all of:
- listWorkSheetNames
- listWorkSheetInfo. Note that there was a bug in this function which would cause it to count only used columns rather than all columns. That bug is corrected.
- active sheet
- selected cell and top left cell
- cell content (formulas, numbers, text)
- hyperlinks
- comments (partial - see below)
This PR does not address:
- styles
- images and charts
- VBA and ribbons
- many other items, I'm sure
The issue for non-standard namespacing till now has been the use of unexpected prefixes. While I was working on this change, @Lambik introduced issue #2067 PR #2068 which introduced a completely different problem - the use of unexpected URLs. That PR and the issue associated with it were quite well documented, including the supplying of a test file and tests for it. I asked if I could take a look to see if it could be integrated with my change, and the result seems to be yes, so those changes are also part of this PR.
While adding a comment to my test file, I discovered that Microsoft had added "threaded comments" as a new feature. I believe these are not yet supported by PhpSpreadsheet, and I am not going to add it, at least not now. I believe that, among other things, this will make identifying the author of a comment more difficult.
Although there are a number of Phpstan baseline changes as part of this PR, I did not attempt to resolve all Phpstan reports for Reader/Xlsx. Nor did I do anything to increase coverage. This change is already large and complex enough without those efforts.
message:"#^Cannot call method addChart\\(\\) on PhpOffice\\\\PhpSpreadsheet\\\\Worksheet\\\\Worksheet\\|null\\.$#"
3130
3025
count:1
@@ -3265,11 +3160,6 @@ parameters:
3265
3160
count:1
3266
3161
path:src/PhpSpreadsheet/Reader/Xlsx.php
3267
3162
3268
-
-
3269
-
message:"#^Offset 'id' does not exist on SimpleXMLElement\\|null\\.$#"
3270
-
count:1
3271
-
path:src/PhpSpreadsheet/Reader/Xlsx.php
3272
-
3273
3163
-
3274
3164
message:"#^Method PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Xlsx\\:\\:readFormControlProperties\\(\\) has parameter \\$dir with no typehint specified\\.$#"
3275
3165
count:1
@@ -3700,66 +3590,6 @@ parameters:
3700
3590
count:1
3701
3591
path:src/PhpSpreadsheet/Reader/Xlsx/PageSetup.php
3702
3592
3703
-
-
3704
-
message:"#^Property PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Xlsx\\\\Properties\\:\\:\\$securityScanner has no typehint specified\\.$#"
message:"#^Method PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Xlsx\\\\Properties\\:\\:extractPropertyData\\(\\) has parameter \\$propertyData with no typehint specified\\.$#"
message:"#^Method PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Xlsx\\\\Properties\\:\\:readCoreProperties\\(\\) has parameter \\$propertyData with no typehint specified\\.$#"
message:"#^Method PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Xlsx\\\\Properties\\:\\:readExtendedProperties\\(\\) has parameter \\$propertyData with no typehint specified\\.$#"
message:"#^Method PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Xlsx\\\\Properties\\:\\:readCustomProperties\\(\\) has parameter \\$propertyData with no typehint specified\\.$#"
message:"#^Method PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Xlsx\\\\Properties\\:\\:getArrayItem\\(\\) has parameter \\$key with no typehint specified\\.$#"
0 commit comments