When using Mpdf as a writer exported PDF contains CSS #2432

sime15 · 2021-12-02T15:57:33Z

This is:

- [x] a bug report

What is the expected behavior?

When exporting to PDF with Mpdf as the writer, the exported file should contain only the data that needs to be exported, not CSS or any other HTML.

What is the current behavior?

The exported file contains CSS.

What are the steps to reproduce?

Try to export about 1000 or more rows to PDF. CSS is generated per row, so it will run to more than 1000 lines. The exported file then includes broken CSS because of this code in the Mpdf class:

$html = $this->generateHTMLAll();
foreach (\array_chunk(\explode(PHP_EOL, $html), 1000) as $lines) {
    $pdf->WriteHTML(\implode(PHP_EOL, $lines));
}

Which versions of PhpSpreadsheet and PHP are affected?

1.20.0


oleibman · 2021-12-02T17:43:36Z

What exactly is the problem you are seeing? I regularly test by generating a file which consists of > 4000 lines of HTML (so 5 chunks are passed in turn to Mpdf), and I don't see a problem with "broken CSS" or with the look of the PDF file. The CSS (and HTML) absolutely needs to be passed to Mpdf so that its WriteHTML method can format the output correctly.

sime15 · 2021-12-03T11:40:48Z

Of course the CSS and HTML need to be passed to Mpdf, but in my opinion the issue is with the chunks of 1000 lines each. If the CSS contains more than 1000 lines, it will be split across two chunks and the WriteHTML method won't process it properly. When the exported file contains a table with around 1000 rows (961 rows in my case), the methods generateHTMLAll(), buildCss(), buildCssPerSheet() and buildCssRowHeights() will produce more than 1000 lines of CSS.
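To make the failure mode concrete, here is a minimal sketch that applies the same 1000-line chunking as the loop quoted above to a synthetic document. The helpers `buildSampleHtml` and `firstChunk` are hypothetical names for illustration and are not part of PhpSpreadsheet; the CSS rules are made up and only stand in for per-row styles.

```php
<?php

/**
 * Build a synthetic HTML document whose <style> block has $cssRules lines.
 * Hypothetical helper for illustration; this is not PhpSpreadsheet output.
 */
function buildSampleHtml(int $cssRules): string
{
    $lines = ['<html>', '<head>', '<style>'];
    for ($i = 0; $i < $cssRules; ++$i) {
        $lines[] = "table.sheet0 tr.row{$i} { height: 15pt }";
    }
    array_push($lines, '</style>', '</head>', '<body>', '<table></table>', '</body>', '</html>');

    return implode(PHP_EOL, $lines);
}

/** Return the first chunk that a line-based chunking loop would hand over. */
function firstChunk(string $html, int $chunkSize = 1000): string
{
    $chunks = array_chunk(explode(PHP_EOL, $html), $chunkSize);

    return implode(PHP_EOL, $chunks[0]);
}

$small = firstChunk(buildSampleHtml(100));   // whole document fits in one chunk
$large = firstChunk(buildSampleHtml(1200));  // style block alone exceeds one chunk

var_dump(strpos($small, '<body') !== false); // first chunk includes the body tag
var_dump(strpos($large, '<body') !== false); // first chunk ends inside the style block
```

With 1200 style lines, the first chunk stops partway through the `<style>` block, so the opening `<body>` tag only arrives in the second chunk.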

This is what the first page of my exported PDF looks like:

This is what the generated HTML looks like:

oleibman · 2021-12-03T16:23:13Z

Interesting - thank you for the extra detail. When I generate a spreadsheet whose head section extends past the first chunk, I do see a problem. Chunking anywhere else does not seem to be a problem. This seems like a bug in Mpdf to me. At any rate, I can investigate further to see if there's anything that can be done. I do not know why the chunking is needed, although I suspect it's memory-related - I already had to override my default memory_limit just to get my script to run, even with chunking.

sime15 · 2021-12-03T16:39:52Z

I also believe the chunking is there because of memory. One solution might be to first pass the whole <head> to Mpdf (explode by </head> or something like that) and then chunk everything else. I don't see why the CSS would exhaust memory no matter how long it is. Just a suggestion; I don't doubt that you will get to the best solution. Thank you for your cooperation.

oleibman · 2021-12-03T16:40:51Z

This appears to be the relevant section in the Mpdf documentation:

If <body> tags are found, all $html outside these tags are discarded, and the rest is parsed as content for the document.

If no <body> tags are found, all remaining $html is parsed as content.

So, Mpdf is operating as designed here. It might be possible to get things working within that constraint.

Fix PHPOffice#2432. Probably for memory reasons, PhpSpreadsheet divides its data into chunks when writing to Mpdf. However, if the first chunk has so many styles that the `body` tag is not included in the chunk, Mpdf will not handle it correctly. Code is changed to ensure that the first chunk always contains the body tag. Because this error becomes evident only when opening the PDF file itself, it is difficult to write a test case. I have instead added a new sample file which meets the conditions which would have led to the error, and which can be examined to show that it is created correctly.
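The approach described above can be sketched in plain PHP. This is only an illustration under stated assumptions, not the code of the merged change: `chunkWithBodyFirst` is a hypothetical helper, the input is synthetic, and the 1000-line chunk size is taken from the loop quoted in the report.

```php
<?php

/**
 * Split $html into pieces for successive WriteHTML calls, making sure the
 * first piece runs through the opening <body ...> tag.
 * Hypothetical helper; the merged change may be implemented differently.
 *
 * @return string[]
 */
function chunkWithBodyFirst(string $html, int $chunkSize = 1000): array
{
    $pieces = [];
    $bodyStart = stripos($html, '<body');
    $bodyEnd = $bodyStart === false ? false : strpos($html, '>', $bodyStart);
    if ($bodyEnd !== false) {
        // Everything up to and including the <body ...> tag goes out in one piece.
        $pieces[] = substr($html, 0, $bodyEnd + 1);
        $html = substr($html, $bodyEnd + 1);
    }
    // The remainder is chunked by line count, as before.
    foreach (array_chunk(explode(PHP_EOL, $html), $chunkSize) as $lines) {
        $pieces[] = implode(PHP_EOL, $lines);
    }

    return $pieces;
}

// Synthetic input: 1500 CSS lines in the head, then a small body.
$css = implode(PHP_EOL, array_fill(0, 1500, 'td.style0 { color: #000000 }'));
$html = '<html><head><style>' . PHP_EOL . $css . PHP_EOL . '</style></head>' . PHP_EOL
    . '<body>' . PHP_EOL . '<table><tr><td>1</td></tr></table>' . PHP_EOL . '</body></html>';

$pieces = chunkWithBodyFirst($html);

// With a real writer, each piece would be passed in turn: $pdf->WriteHTML($piece);
var_dump(count($pieces));                            // head-through-body piece plus the rest
var_dump(substr($pieces[0], -6) === '<body>');       // first piece ends at the body tag
```

The first piece can be much longer than 1000 lines when there are many styles; the sketch accepts that in exchange for Mpdf always seeing the `body` tag in the first call.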


oleibman mentioned this issue Dec 4, 2021

Mpdf With Very Many Styles #2434

Merged

5 tasks

oleibman closed this as completed in #2434 Dec 7, 2021
