Parsing Large Files in Node.js
"JavaScript heap out of memory" — this is the error you will see when you try to parse a large file by reading it entirely into memory in Node.js. The default heap limit has historically been around 1.5 GB on 64-bit systems (newer Node.js versions size it based on available memory), and if your file approaches that limit, your process will crash.
The Wrong Way
const fs = require('fs');

// Don't do this with large files!
const data = fs.readFileSync('huge-file.csv', 'utf-8');
const lines = data.split('\n');

lines.forEach(line => {
  // process each line
});
This reads the entire file into memory at once. For a 2 GB CSV file it will fail: either the heap runs out of memory, or V8 refuses outright because the decoded text exceeds its maximum string length (roughly half a billion characters in current 64-bit builds).
The Right Way: Use Streams
const fs = require('fs');
const readline = require('readline');

const fileStream = fs.createReadStream('huge-file.csv');

const rl = readline.createInterface({
  input: fileStream,
  crlfDelay: Infinity,
});

rl.on('line', (line) => {
  // Process each line here
  // Only one line is in memory at a time
});

rl.on('close', () => {
  console.log('File processing complete');
});
With readline on top of createReadStream, Node.js reads the file in small chunks and emits one line at a time. Memory usage stays roughly constant regardless of file size. (The crlfDelay: Infinity option ensures that \r\n is always treated as a single line break.)
If you genuinely need a larger heap for other reasons, you can raise the limit with the --max-old-space-size flag (the value is in megabytes):
node --max-old-space-size=4096 your-script.js
But streams are almost always the better solution for large file processing.