-
Notifications
You must be signed in to change notification settings - Fork 3.6k
"Allowed memory size of XXXXXXXXX bytes exhausted" when reading some XLS files #2781
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Maybe your file is created by wpf or other office softwares? |
The exported file can be opened without problems with any Excel version, so I assume this library must open it also without problems. Anyway, the attached example is a XLS file created with Excel, and it fails. |
I have a similar issue, I am able to load between 10K and 20K rows but not more. |
Can you explain how you are using the filter? I created a filter but didn't worked for me. Whole spreadsheet is readed... |
I start reading chunks from row 2, so the first chunk goes as it is, // create reader /** Loop to read our worksheet in "chunk size" blocks / |
Hi have same issue. I've build 2 functionallity, one for export some data and another for import this data. If I remove one line from the file and try to import, when I call to $worksheet->getActiveSheet()->toArray() I get the same error and not make sense for me because I have only 5 rows, not a big large file. It's a bit weird this error because if I edit the file adding some values, or adding new rows and import, everything work fine, the error olny happen when I remove rows. Any ideas? |
@jkdc
This tells PhpSpreadsheet that when you call Rather than using Using the following code to load the file and copy the data into a PHP array: $inputFileType = 'Xlsx';
$inputFileName = __DIR__ . '/../medidas.xlsx';
$callStartTime = microtime(true);
$reader = IOFactory::createReader($inputFileType);
$spreadsheet = $reader->load($inputFileName);
$callEndTime = microtime(true);
$loadCallTime = $callEndTime - $callStartTime;
echo PHP_EOL;
echo 'Call time to load spreadsheet file was ' , sprintf('%.4f', $loadCallTime) , ' seconds' , PHP_EOL;
$worksheet = $spreadsheet->getActiveSheet();
echo "Worksheet Name: {$worksheet->getTitle()}", PHP_EOL;
echo "Range Dimensions specified in the xlsx file {$worksheet->calculateWorksheetDimension()} that will be used by the toArray() method", PHP_EOL;
echo "Range of cells that contain actual data {$worksheet->calculateWorksheetDataDimension()} that can be passed to the rangeToArray() method", PHP_EOL;
$data = $worksheet->rangeToArray($worksheet->calculateWorksheetDataDimension(), null, true, true, true);
// Echo memory usage
echo ' Current memory usage: ' , (memory_get_usage(true) / 1024) , ' KB' , PHP_EOL;
echo ' Peak memory usage: ' , (memory_get_peak_usage(true) / 1024) , ' KB' , PHP_EOL; The result:
|
@MarkBaker thanks a lot!! |
In my case i had no issues with data range detection, see full code for preview of 500 rows of data $inputFileType=''; // default
$rowsPreview=500;
$chunkSize = $rowsPreview;
$startRow=0;
$tstart=microtime(true);
$worksheetname='';
if(file_exists($file)){
echo "date modified: " . date ("F d Y H:i:s.", filemtime($file)). '</br>';
$inputFileType = \PhpOffice\PhpSpreadsheet\IOFactory::identify($file);
echo '<p>Detected file type is <kbd>'.$inputFileType.'</kbd></p>';
if ($inputFileType=='Xlsx' || $inputFileType=='Xls') {
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
//$reader->setReadDataOnly(true);
$reader->getReadEmptyCells(false);
}
else if ($inputFileType=='Csv'){
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);
}
else {
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType); //$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load($file);
}
$s=0;
//$reader->setReadDataOnly(true);
$reader->getReadEmptyCells(false);
echo 'File data information<br>';
$worksheetData = $reader->listWorksheetInfo($file);
foreach ($worksheetData as $worksheet) {
echo "Tab $s - <kbd>".$worksheet['worksheetName']. '</kbd> Rows: <kbd>', $worksheet['totalRows'],
'</kbd> Columns: <kbd>', $worksheet['totalColumns'], '</kbd> Cell Range: <kbd>A1:', $worksheet['lastColumnLetter'], $worksheet['totalRows'].'</kbd>'. ($s==$tab?'<- active':'') .'<br>';
if($s==$tab){$worksheetname=$worksheet['worksheetName'];}
$s++;
}
$reader->setLoadSheetsOnly($worksheetname);
/** Instantiate a new Spreadsheet object manually **/
//$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();
$chunkFilter = new \PhpOffice\PhpSpreadsheet\Reader\ChunkReadFilter();
$reader->setReadFilter($chunkFilter);//->setContiguous(true);
$chunkFilter->setRows($startRow,$chunkSize);
//$spreadsheet = $reader->loadIntoExisting($file,$spreadsheet);
$spreadsheet = $reader->load($file);
if(is_object($spreadsheet)){
$sheetData = $spreadsheet->getActiveSheet()->toArray(null, false, true, true);
//$sheetData = $spreadsheet->getActiveSheet()->rangeToArray($spreadsheet->getActiveSheet()->calculateWorksheetDataDimension(),null, true, true, true); // not working, leads to timeout
$count = count($sheetData) - 1;
echo "<p> Preview <kbd>$worksheetname</kbd> sheet. A number of lines: <kbd> $rowsPreview </kbd></p>";
echo '</div>';
$valuesData='';
$recordCount=0;
echo '<table class="table table-striped table-sm table-hover table-responsive table-bordered" >';
echo '<thead class="thead-light">'; // a header
$counter=0;
foreach($sheetData as $k => $v) {
$counter=$counter+1;
if ($counter>$rowsPreview) {break 1;}
if($k == 1) {// table data
echo '<tr>';
echo '<th>row</th>';
foreach($v as $key => $value) {
echo '<th>' . $key . '</th>';
}
echo '</tr>';
}
echo '<tr><td>'.$k.'</td>';
foreach($v as $key => $value) {
if($k > 1) {
echo '<td>' . $value . '</td>';
}
else {
echo '<th>' . $value . '</th>';
}
}
echo '</tr>';
if($k == 1) {// table data
echo '</thead>';
}
} // end for each row
echo '</table>';
}// sheet is object
else {echo 'Failed reading';}
$tend=microtime(true);
echo 'It took '.round($tend - $tstart,2).' seconds <br>'; |
Whether you're using If you want to iterate over the rows and columns of worksheet data, and are hitting memory limits doing so, then I'd recommend using iterators rather than pulling it all into an array; and the row/column iterators allow you to specify start and end row/column. |
The conversation has led to a different problem than the one originally reported. I'm reporting an error when calling |
@vaites |
@MarkBaker, of course the library should trust There is anything more I can do to provide more information? |
There's no logical reason why using |
It doesn't make sense to me about Let me evaluate the problem further to try to determine the exact point where it fails and if I can suggest any changes. |
I know that I could do a refactoring of how we store the data validations, so that it's stored it at the worksheet level rather than for each individual cell; but that would be a major bc break, and would entail changes to every Reader and Writer for formats that supports data validation. |
This also caused issues for me, with this simple excel file: |
$inputFileType = 'Xlsx';
$inputFileName = __DIR__ . '/../Bad.File.xlsx';
$callStartTime = microtime(true);
$reader = IOFactory::createReader($inputFileType);
$spreadsheet = $reader->load($inputFileName);
$callEndTime = microtime(true);
$loadCallTime = $callEndTime - $callStartTime;
echo PHP_EOL;
echo 'Call time to load spreadsheet file was ' , sprintf('%.4f', $loadCallTime) , ' seconds' , PHP_EOL;
echo ' Current memory usage: ' , (memory_get_usage(true) / 1024) , ' KB' , PHP_EOL;
$worksheet = $spreadsheet->getActiveSheet();
echo "Worksheet Name: {$worksheet->getTitle()}", PHP_EOL;
echo "Range Dimensions specified in the xlsx file {$worksheet->calculateWorksheetDimension()} that will be used by the toArray() method", PHP_EOL;
echo "Range of cells that contain actual data {$worksheet->calculateWorksheetDataDimension()} that can be passed to the rangeToArray() method", PHP_EOL;
$data = $worksheet->rangeToArray($worksheet->calculateWorksheetDataDimension(), null, true, true, true);
// Echo memory usage
echo ' Current memory usage: ' , (memory_get_usage(true) / 1024) , ' KB' , PHP_EOL;
echo ' Peak memory usage: ' , (memory_get_peak_usage(true) / 1024) , ' KB' , PHP_EOL; Result:
I don't know what you're doing to "import" rows from the file because you don't explain; but check the actual data size determined by |
I am using the Laravel-Excel extension which employs PhpSpreadsheet, so given your example above I assume they just use the |
But there are column styles defined up to column WVI <col customWidth="1" bestFit="1" width="15.88671875" max="16129" min="16129"/> The code that populates the stored value retrieved by a call to For most spreadsheet files, using this value isn't a problem; the value is loaded with the file load, and stored in the Worksheet object, and is generally accurate enough (it might be inaccurate as a measure of cells containing data by a few rows or columns, but rarely enough to create any problem)... files where the value is extremely incorrect and high are the exception. It doesn't take "chunking" into account, and It isn't changed if you add/delete rows/columns either. |
That is abnormal, I am importing files with 20K plus rows in 10 sec.
You can Send me your file I ll test on my setup.
On Mon, 20 Jun 2022 at 05:54, Max ***@***.***> wrote:
This also caused issues for me, with this simple excel file:
Bad File.xlsx
<https://github.com/PHPOffice/PhpSpreadsheet/files/8937383/Bad.File.xlsx>
The entire file is formatted in Arial/10 and for each row you add it seems
to get exponentially worse. Just trying to import 20 rows from this file
was 5-10 seconds.
—
Reply to this email directly, view it on GitHub
<#2781 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AH7OOBRJ2D72YQGRNHTLIOLVP7TPJANCNFSM5ULV2K4A>
.
You are receiving this because you commented.Message ID:
***@***.***>
--
Met vriendelijke groet,
Maxim
+31 6228 015 28 | ***@***.*** | Gele Rijdersplein 17E - 2R
| 6811 AP Arnhem | The Netherlands
|
Is
Don't worry @maximnl, @MarkBaker has explained everything. |
It should be, but isn't. The problem file in this thread posted by jkdc had a |
I have a similar issue. The culprit is the auto filter. I have a spreadsheet with ~10 worksheets each of them ~1000 rows. With autofilter the memory usage is ~2000MB, without ~40MB. I do not have yet enough internal knowledge of this project to find the reason why the memory is not released after the filter evaluation. |
If you're doing an autofilter evaluation, and memory isn't being released, then that's a very specific issue. Can you provide an example spreadsheet that demonstrates the problem for testing? |
i did investigate a little more. the issue is that my range is defined as that allocates a |
|
Calculating the highest row solves the issue. I think that it should be mentioned somewhere (docs?) that |
I've read better the description for this ticket and seems something different, however I have created the fix for the issue I've encountered #3527 |
Uh oh!
There was an error while loading. Please reload this page.
This is:
I'm working with a file generated by another software that I need to fill: read the file, fill some cells and save it again. I use a reader to get the
Spreadsheet
instance (so I can set the desired cell values) and a writer to save it.What is the expected behavior?
Excel file loaded without issues
What is the current behavior?
PHP runs ou of memory
What are the steps to reproduce?
Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:
After digging into the reader code I found that the range cell is huge (function
shrinkRangeToFit
returns E9:GU5420) so the reader thinks the file big (hundred/thousands of empty cells) but the error can be avoided if:$reader->setReadDataOnly(true)
is used (need the formatting, so no valid for me)strlen($this->data)
is replaced withmb_strlen($this->data)
onsrc/PhpSpreadsheet/Reader/Xls.php
(don't understand why)I tried this, without success:
(seems to be ignored)
$reader->setReadEmptyCells(false)
(no effect)I even but the
What features do you think are causing the issue
Does an issue affect all spreadsheet file formats? If not, which formats are affected?
Tested with XLS and XLSX (attached file)
Which versions of PhpSpreadsheet and PHP are affected?
As far as I know, all of v1 versions (tried with 1.21, 1.22 and 1.23). I tried both on Windows 10 and Ubuntu 20.04 LTS using PHP 8.1.
The text was updated successfully, but these errors were encountered: