I'm using PHPExcel to read .xls files. After a short while I run into this error:
Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 730624 bytes) in Excel\PHPExcel\Shared\OLERead.php on line 93
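After some googling I tried the chunk reader to prevent this (it is even mentioned on the PHPExcel home page), but I am still stuck with the same error.
My thinking was that with the chunk reader I would read the file piece by piece, so memory would not overflow. But is there some serious memory leak? Or am I freeing memory incorrectly? I even tried raising the memory available to PHP to 1GB. The file I am trying to read is about 700k, which is not that much (I also read ~20MB pdf, xlsx, docx, doc, etc. files without problems). So I assume I am overlooking something trivial.
The code looks like this: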
    function parseXLS($fileName)
    {
        // dirname() returns no trailing slash, so the appended path starts with '/'
        require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/IOFactory.php';
        require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/ChunkReadFilter.php';

        $inputFileType = 'Excel5';

        /** Create a new Reader of the type defined in $inputFileType **/
        $objReader = PHPExcel_IOFactory::createReader($inputFileType);

        /** Define how many rows we want to read for each "chunk" **/
        $chunkSize = 20;

        /** Create a new Instance of our Read Filter **/
        $chunkFilter = new chunkReadFilter();

        /** Tell the Reader that we want to use the Read Filter that we've Instantiated **/
        $objReader->setReadFilter($chunkFilter);

        /** Loop to read our worksheet in "chunk size" blocks **/
        /** $startRow is set to 2 initially because we always read the headings in row #1 **/
        for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) {
            /** Tell the Read Filter the limits on which rows we want to read this iteration **/
            $chunkFilter->setRows($startRow, $chunkSize);

            /** Load only the rows that match our filter from $fileName to a PHPExcel Object **/
            $objPHPExcel = $objReader->load($fileName);

            // Do some processing here

            // Free up some of the memory
            $objPHPExcel->disconnectWorksheets();
            unset($objPHPExcel);
        }
    }
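To check whether each pass of a loop like the one above really gives its memory back, one option is to sample `memory_get_usage()` around every iteration. The sketch below is only an illustration: `measureIterations()` is a hypothetical helper, not part of PHPExcel, and the closure merely stands in for the `load()` / `unset()` pair.

```php
<?php
// Hypothetical helper (not part of PHPExcel): runs $work once per iteration
// and records how many bytes are still held after each run.
function measureIterations($work, $iterations)
{
    $samples = array();
    for ($i = 0; $i < $iterations; $i++) {
        $before = memory_get_usage();
        call_user_func($work, $i);
        gc_collect_cycles(); // also reclaim objects caught in reference cycles
        $samples[] = memory_get_usage() - $before;
    }
    return $samples;
}

// Stand-in for "$objPHPExcel = $objReader->load(...); ... unset($objPHPExcel);"
$samples = measureIterations(function ($i) {
    $data = range(1, 10000); // allocate something sizeable
    unset($data);            // and release it again
}, 5);

foreach ($samples as $i => $delta) {
    echo "iteration $i: retained $delta bytes\n";
}
echo 'peak: ' . memory_get_peak_usage() . " bytes\n";
```

If the retained figure keeps growing from one iteration to the next, something is still holding references; if it stays flat but the peak is close to the limit, a single `load()` call is the expensive part.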
And here is the code for the chunk reader:
    class chunkReadFilter implements PHPExcel_Reader_IReadFilter
    {
        private $_startRow = 0;
        private $_endRow = 0;

        /** Set the list of rows that we want to read */
        public function setRows($startRow, $chunkSize)
        {
            $this->_startRow = $startRow;
            $this->_endRow = $startRow + $chunkSize;
        }

        public function readCell($column, $row, $worksheetName = '')
        {
            // Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow
            if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
                return true;
            }
            return false;
        }
    }
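To make the windowing concrete, here is a small sketch of the row ranges such a filter would accept on successive iterations. `chunkRanges()` is a hypothetical helper, and unlike the loop in the question it assumes the last row is known up front, so it stops there instead of running to 65536.

```php
<?php
// Hypothetical helper: list the [start, end) row windows that a chunked
// loop would pass to setRows() for a sheet whose last row is $lastRow.
function chunkRanges($firstRow, $lastRow, $chunkSize)
{
    $ranges = array();
    for ($start = $firstRow; $start <= $lastRow; $start += $chunkSize) {
        // The end is exclusive, matching "$row < $this->_endRow" in readCell().
        $ranges[] = array($start, min($start + $chunkSize, $lastRow + 1));
    }
    return $ranges;
}

foreach (chunkRanges(2, 50, 20) as $range) {
    echo 'rows ' . $range[0] . ' to ' . ($range[1] - 1) . "\n";
}
// rows 2 to 21
// rows 22 to 41
// rows 42 to 50
```

Row 1 is not part of any window because the filter always lets the heading row through separately.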
I then found an interesting solution in the question "How to read large worksheets from large Excel files (27MB+) with PHPExcel?", posted there as Addendum 3.
EDIT1: Even with that solution I ran into my favourite error message, but I found something about caching, so I implemented this:
    $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
    // the key must not contain surrounding spaces, otherwise the setting is ignored
    $cacheSettings = array('memoryCacheSize' => '8MB');
    PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
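So far I have tested it only with xls files smaller than 10MB, but it seems to work (I also set $objReader->setReadDataOnly(true);), and it seems to strike a reasonable balance between speed and memory consumption. (If possible, I will keep following this thorny path.)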
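For intuition about what this cache method does: as far as I understand it, cache_to_phpTemp is built on PHP's `php://temp` stream, which keeps data in RAM up to a `maxmemory` threshold and transparently spills to a temporary file beyond it. The sketch below shows only that underlying stream behaviour in plain PHP. It is not PHPExcel's actual implementation, and the variable names are made up for illustration.

```php
<?php
// Data written to php://temp stays in RAM until the stream grows past
// maxmemory (in bytes); beyond that PHP moves it to a temporary file.
$limitBytes = 8 * 1024 * 1024; // 8MB, mirroring the memoryCacheSize setting
$handle = fopen('php://temp/maxmemory:' . $limitBytes, 'w+');

// Store a serialized value and remember where it lives in the stream.
$payload = serialize(array('A1' => 'heading', 'B1' => 42));
$offset = ftell($handle);
fwrite($handle, $payload);

// Later: seek back and restore the value on demand.
fseek($handle, $offset);
$restored = unserialize(fread($handle, strlen($payload)));
fclose($handle);

var_dump($restored === array('A1' => 'heading', 'B1' => 42)); // bool(true)
```

The trade-off is the one described above: values not currently in use cost a seek and an unserialize instead of permanent space on the PHP heap.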
EDIT2:
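I did some further research and found that the chunk reader is not needed for my approach. (In my case the memory problem was the same with and without the chunk reader.) So my final answer to my own question is the code below, which reads .xls files (cell data only, without formatting, and with formulas filtered out). Using cache_to_phpTemp I am able to read xls files (tested up to 10MB) with about 10k rows and many columns in a matter of seconds and without memory problems.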
    function parseXLS($fileName)
    {
        /** PHPExcel_IOFactory */
        // dirname() returns no trailing slash, so the appended path starts with '/'
        require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel/IOFactory.php';
        require_once dirname(__FILE__) . '/sphider_design/include/Excel/PHPExcel.php';

        $inputFileName = $fileName;
        $fileContent = "";

        // get inputFileType (most of the time Excel5)
        $inputFileType = PHPExcel_IOFactory::identify($inputFileName);

        // initialize the cell cache so PHPExcel does not exhaust memory
        $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
        $cacheSettings = array('memoryCacheSize' => '8MB');
        PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

        // initialize the reader for the detected file type
        $objReader = PHPExcel_IOFactory::createReader($inputFileType);
        // read only data (without formatting) for memory and time performance
        $objReader->setReadDataOnly(true);

        // load the file into a PHPExcel object
        $objPHPExcel = $objReader->load($inputFileName);

        // loop over all sheets in the workbook
        foreach ($objPHPExcel->getWorksheetIterator() as $worksheet) {
            // loop over the rows of each sheet
            foreach ($worksheet->getRowIterator() as $row) {
                $cellIterator = $row->getCellIterator();
                // false = also visit cells that hold no value, not only existing ones
                $cellIterator->setIterateOnlyExistingCells(false);

                foreach ($cellIterator as $cell) {
                    // check if the cell exists
                    if (!is_null($cell)) {
                        // get the raw value (without formatting)
                        $rawValue = $cell->getValue();
                        $text = trim((string) $rawValue);
                        // keep the value if the cell is not empty and is not a formula
                        // (substr($text, 0, 1) is the first character)
                        if ($text !== "" && substr($text, 0, 1) !== "=") {
                            $fileContent .= $rawValue . " ";
                        }
                    }
                }
            }
        }

        return $fileContent;
    }
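One detail of the formula filter is worth spelling out: a formula is recognised by its first character, which is `substr($text, 0, 1)`, whereas `substr($text, 1)` returns everything after the first character. A minimal sketch, with `looksLikeFormula()` being a hypothetical helper name rather than anything from PHPExcel:

```php
<?php
// Hypothetical helper: true when a raw cell value looks like a formula.
function looksLikeFormula($rawValue)
{
    $text = trim((string) $rawValue);
    // substr($text, 0, 1) is the first character;
    // substr($text, 1) would be the rest of the string after it.
    return substr($text, 0, 1) === '=';
}

var_dump(looksLikeFormula('=SUM(A1:A3)')); // bool(true)
var_dump(looksLikeFormula(' total = 3 ')); // bool(false)
var_dump(looksLikeFormula(''));            // bool(false)
```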