Memory leak #2092
A real memory leak would be a bug, but PhpSpreadsheet doesn't suffer from that. Rather, an "in memory" spreadsheet uses a lot of memory, and just because a file that you load is small doesn't mean that its "in memory" representation is small: it all depends on the number of worksheets, rows, and columns. We provide a series of approaches for working with larger spreadsheets, from loading only specific worksheets, to loading only part of a worksheet, to loading only the raw data in the cells without formatting information; we also provide options for caching cell data to reduce memory requirements. These are all documented.
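As a sketch of how those documented reader options are typically combined (the file name 'large.xlsx' and sheet name 'Data' are placeholders, not from this thread):

<?php
use PhpOffice\PhpSpreadsheet\IOFactory;
require 'vendor/autoload.php';

// Configure the reader before loading, instead of using IOFactory::load()
$reader = IOFactory::createReader('Xlsx');
// Load only the named worksheet instead of the whole workbook
$reader->setLoadSheetsOnly(['Data']);
// Read cell values only, skipping all styling/formatting information
$reader->setReadDataOnly(true);
$spreadsheet = $reader->load('large.xlsx');

A read filter (see the ChunkReadFilter snippets further down in this thread) covers the "only part of a worksheet" case.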
The memory requirement is not the problem. What is a problem is that each opened Excel file increases the memory usage without ever releasing the memory.
Edit: it seems to stick at around 400 MB of RAM usage, even though everything should have been cleared from memory, so usage should go back to the same amount as at the beginning.
At the line where memory_get_usage() is called, the spreadsheet is not "in memory".
It looks to me like your PHP garbage collection is not taking place immediately (some details: https://stackoverflow.com/a/584982/4712173). Maybe it's something else, but it's worth a look.
I tried that after unset, but it still uses more and more memory, until reaching around 400 MB. That must be the size necessary to load my biggest file: it reached 400 MB when my xlsx file is 20 MB.

gc_collect_cycles();
sleep(1);
This also better supports image cloning, with a proper clone of the underlying GD resource. #2092
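For illustration, one common way to deep-clone a GD image (a sketch; cloneGdImage() is a hypothetical helper, not necessarily what the commit does):

<?php
// Deep-clone a GD image by copying its pixels into a fresh canvas,
// preserving the alpha channel (assumes a truecolor source image)
function cloneGdImage($gd)
{
    $width = imagesx($gd);
    $height = imagesy($gd);
    $clone = imagecreatetruecolor($width, $height);
    imagealphablending($clone, false);
    imagesavealpha($clone, true);
    imagecopy($clone, $gd, 0, 0, 0, 0, $width, $height);

    return $clone;
}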
Three things:
I used the following script to load pretty much all files existing in, or generated by, the project. It reads each file twice: first to warm up the caches, then a second time for the real measurement. Before running the script, the normal unit tests should be run once to generate all the files in /tmp/phpspreadsheet.

<?php
use PhpOffice\PhpSpreadsheet\IOFactory;
require 'vendor/autoload.php';
// Measure memory after forcing collection of cyclic garbage
function memory(): int
{
    gc_collect_cycles();

    return memory_get_usage();
}
// Load a spreadsheet, then break its internal circular references
// so that it can actually be garbage-collected
function load(string $fileName): void
{
    $spreadsheet = IOFactory::load($fileName);
    $spreadsheet->disconnectWorksheets();
}
// Run a callable and report how much memory it leaves behind
function monitorMemory(callable $callable): int
{
    $before = memory();
    $callable();
    $after = memory();
    $diff = $after - $before;

    echo number_format($before) . PHP_EOL;
    echo number_format($after) . PHP_EOL;
    echo '+ ' . number_format($diff) . PHP_EOL;
    echo PHP_EOL;

    return $diff;
}
$files = [
    ...glob('/tmp/phpspreadsheet/*.xls'),
    ...glob('/tmp/phpspreadsheet/*.xlsx'),
    ...glob('/tmp/phpspreadsheet/*.csv'),
    ...glob('tests/data/Reader/CSV/*'),
    ...glob('tests/data/Reader/Gnumeric/*'),
    ...glob('tests/data/Reader/Ods/*'),
    ...glob('tests/data/Reader/XLS/*'),
    ...glob('tests/data/Reader/XLSX/*'),
];
$except = [
    'tests/data/Reader/CSV/empty.csv',
    'tests/data/Reader/Ods/corruptMeta.ods',
    'tests/data/Reader/XLSX/double_attr_drawing.xlsx',
];
$files = array_diff($files, $except);
foreach ($files as $fileName) {
    $do = function () use ($fileName): void {
        load($fileName);
    };

    echo $fileName . PHP_EOL;

    // Warmup caches
    $do();

    // Do it for real while monitoring memory
    $diff = monitorMemory($do);
    if ($diff > 0) {
        throw new Exception('Memory leak found !');
    }
}
echo count($files) . ' files loaded' . PHP_EOL;
echo 'with a peak memory usage of : ' . number_format(memory_get_peak_usage()) . PHP_EOL;
echo 'with a final memory usage of: ' . number_format(memory()) . PHP_EOL;

This would end up with:
How do I clear the internal caches? I don't think any script is supposed to keep things in memory once everything has been unset. Thanks for the fixes; once they're published I will test whether I see more leaks.
When creating a spreadsheet, writing it to Xlsx, then to Xls, and then reading the Xls back in, memory would leak during the read. Fixes #2092
We use a ChunkReadFilter; it loads only a range of rows and lets you limit the size of the arrays. Still, I am able to load only about 1K rows every few seconds, so large files (40K+ rows) simply time out after 2-3 minutes.
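For context, a chunk read filter along the lines of the PhpSpreadsheet documentation usually looks like this (a sketch; exact readCell() signatures vary between library versions):

<?php
use PhpOffice\PhpSpreadsheet\Reader\IReadFilter;

// Read only the heading row plus a sliding window of data rows
class ChunkReadFilter implements IReadFilter
{
    private $startRow = 0;
    private $endRow = 0;

    // Select which rows the next load() call should read
    public function setRows($startRow, $chunkSize)
    {
        $this->startRow = $startRow;
        $this->endRow = $startRow + $chunkSize;
    }

    public function readCell($columnAddress, $row, $worksheetName = '')
    {
        return $row === 1 || ($row >= $this->startRow && $row < $this->endRow);
    }
}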
* Memory Leak in Sample35: All but 6 chart samples can be rendered by Sample35. Of those 6, 3 of the problems are because the script runs out of memory processing them. Adopting a suggestion from @MAKS-dev in issue #2092, adding a call to gc_collect_cycles after the charts from each spreadsheet are rendered appears to make it possible to include those 3 spreadsheets in Sample35 after all. Also take advantage of this opportunity to correct a number (hopefully all) of Scrutinizer problems with JpGraphRendererBase.
* Minor Fix: Problem running 8.1 unit tests.
* Resolve Problems with Pie 3D Charts: Minor fix, leaving only one spreadsheet unusable in Sample35. The reasons for its unusability are now documented in the code.
* Mitoteam Made Changes: Discussing this problem with them, they decided they should make a change for Pie3D rather than forcing us to use the workaround pushed earlier. Change to require mitoteam 10.2.3, revert workaround.
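A minimal sketch of that mitigation (renderChartsFor() is a hypothetical stand-in for Sample35's rendering step):

// Collect cyclic garbage after each spreadsheet's charts are rendered,
// so chart/worksheet object graphs can actually be freed
foreach ($spreadsheetFiles as $file) {
    $spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load($file);
    renderChartsFor($spreadsheet); // hypothetical rendering step
    $spreadsheet->disconnectWorksheets();
    unset($spreadsheet);
    gc_collect_cycles();
}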
This is pretty frustrating behaviour if you're using PhpSpreadsheet in a web server thread that gets reused to process Excel file uploads. Users are unlikely to upload the same Excel file as each other, so the cache is useless, but the memory can't be cleared.
I'm having similar problems. I'm using a ChunkReadFilter to process a large (250k rows) spreadsheet in chunks. However, memory continues to grow even as I unset variables when I am done with them. Here's a code snippet:

// self::$_CHUNK_SIZE = 25000;
for ($row = 2; $row <= $nb_rows; $row += self::$_CHUNK_SIZE) {
    $chunk_filter->setRows($row, self::$_CHUNK_SIZE);
    $reader->setReadFilter($chunk_filter);
    // NB: load() returns a Spreadsheet object, despite the variable name
    $sheet = $reader->load($filepath);
    $raw_data = $sheet->getActiveSheet()->toArray();
    unset($sheet);
    $upload_string = $this->prepareDataForBulkUpload($raw_data);
    unset($raw_data);
    // do something with $upload_string
    unset($upload_string);
}

I would expect that the unset() calls would free that memory. @PowerKiKi - What can be done to free up the memory usage?
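Judging by the load() helper in the measurement script earlier in this thread, one likely remedy is to break the circular references explicitly before unsetting (a sketch adapting the snippet above, not a confirmed fix for this exact case):

// Break internal circular references before unsetting, then force
// collection, so each chunk's memory is actually released
$sheet = $reader->load($filepath);
$raw_data = $sheet->getActiveSheet()->toArray();
$sheet->disconnectWorksheets();
unset($sheet);
gc_collect_cycles();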
Suggestion #1
Suggestion #2
@MarkBaker thanks for the prompt reply!
This is: a bug report.

What is the expected behavior?
No memory leak.

What is the current behavior?
Memory leak.

What are the steps to reproduce?
The memory doesn't stop growing. My files are xls and xlsx, varying from 30 kB to 21 MB.
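A minimal reproduction along those lines might look like this (the glob path is a placeholder for the reporter's files):

<?php
use PhpOffice\PhpSpreadsheet\IOFactory;
require 'vendor/autoload.php';

// Load each file, drop every reference to it, force garbage collection,
// and print how much memory is still in use afterwards
foreach (glob('/path/to/files/*.xls*') as $file) {
    $spreadsheet = IOFactory::load($file);
    unset($spreadsheet);
    gc_collect_cycles();
    echo $file . ': ' . number_format(memory_get_usage()) . PHP_EOL;
}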