Resolving OOM Errors and Build Timeouts When Optimizing Static Site Generation for 10k+ Pages
When scaling internal documentation platforms beyond 10,000 markdown files, static site generators frequently encounter heap exhaustion (FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed) or CI/CD pipeline timeouts. This guide provides a targeted resolution path for platform engineers facing these bottlenecks. By adjusting runtime memory allocation, implementing incremental compilation, and restructuring asset pipelines, teams can reduce build times by 60–80% while stabilizing memory consumption. This approach aligns with established patterns in modern Developer Portal Architecture & Frameworks and ensures predictable deployment cycles for enterprise-scale knowledge bases.
Context: Identifying the SSG Bottleneck at Scale
Static site generation (SSG) pipelines typically load the entire document graph into memory before rendering. At 10k+ pages, AST traversal, link validation, and template hydration phases trigger garbage collection thrashing. Symptoms include exponential build time growth, OOM crashes on standard CI runners (4GB–8GB), and failed incremental deployments.
Before applying fixes, isolate the bottleneck:
NODE_OPTIONS=--trace-gc npm run build
Monitor peak RSS during the compilation phase. For teams standardizing on Python-based generators, MkDocs deployments for internal docs exhibit similar memory constraints, requiring parallelized processing and strict cache-invalidation rules.
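Peak RSS can be captured with a small polling loop around the build process. A minimal sketch (variable names are illustrative, and `sleep 2` stands in for `npm run build` so the snippet is self-contained):

```shell
# Background the target command, then poll ps until it exits, tracking the peak
# resident set size (reported by ps in KB on Linux).
sleep 2 & BUILD_PID=$!   # replace "sleep 2" with: npm run build

PEAK=0
while kill -0 "$BUILD_PID" 2>/dev/null; do
  RSS=$(ps -o rss= -p "$BUILD_PID" 2>/dev/null | tr -d ' ')
  [ -n "$RSS" ] && [ "$RSS" -gt "$PEAK" ] && PEAK=$RSS
  sleep 0.2
done
echo "peak rss: ${PEAK} KB"
```

A build whose peak approaches the runner's physical memory is the first confirmation that heap limits, not CPU, are the bottleneck.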
Exact Fix & Configuration
Apply the following three-tier configuration to stabilize builds and enforce memory boundaries.
1. Runtime Memory & Concurrency Tuning
Override default V8 limits to prevent premature OOM while capping allocation to avoid host machine thrashing.
export NODE_OPTIONS="--max-old-space-size=4096 --max-semi-space-size=64"
export CI=true
npm run build -- --incremental --concurrency=4
2. Framework Configuration
Disable full-graph rebuilds. Configure your SSG to track file hashes and only re-render modified nodes. Enable worker threads for template hydration and disable source map generation in production builds to reduce I/O overhead.
{
  "build": {
    "workers": 4,
    "sourceMaps": false,
    "cacheDir": ".ssg-cache",
    "linkValidation": "warn",
    "parallelHydration": true
  }
}
3. Asset Pipeline Optimization
Pre-bundle static assets and serve them via CDN. Flatten nested directory structures to reduce path resolution overhead. Implement a routing manifest instead of dynamic route generation at build time to decouple compilation from filesystem traversal.
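The routing-manifest idea can be sketched in a few lines: generate the manifest once, ahead of the build, so the compiler reads one JSON file instead of walking thousands of directories. Paths and the `route-manifest.json` name are assumptions for illustration:

```shell
# Create a tiny docs tree to demonstrate the transform.
mkdir -p docs/guides
echo "# Home" > docs/index.md
echo "# Setup" > docs/guides/setup.md

# Map every markdown source to its output route and emit a JSON manifest.
{
  echo '{ "routes": ['
  find docs -name '*.md' | sort \
    | sed 's|^docs/\(.*\)\.md$|    "/\1.html",|' \
    | sed '$ s/,$//'
  echo '] }'
} > route-manifest.json
cat route-manifest.json
```

Because the manifest is produced before compilation starts, the expensive filesystem traversal happens exactly once and can itself be cached between builds.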
Validation & Performance Metrics
After applying the configuration, validate the pipeline using deterministic benchmarks.
- Capture Build Metrics: Note that `$!` expands to the PID of the most recent backgrounded job, so the build must be backgrounded before sampling:
npm run build & BUILD_PID=$!
ps -o rss,vsz -p "$BUILD_PID"   # sample periodically while the build runs
time wait "$BUILD_PID"
- Target Thresholds:
  - Build time: < 8 minutes on a 4-core runner
  - Peak RSS: < 3.5GB
  - GC pauses: 0 pauses exceeding 500ms
- Incremental Verification: Run a link validator against the output directory. Touch a single markdown file and trigger a rebuild. A properly configured pipeline should complete in < 10 seconds and only output modified HTML files.
- Rollback & Audit: If metrics deviate, audit the worker thread pool size and ensure the cache directory resides on a high-throughput SSD or tmpfs mount. Revert to --concurrency=1 temporarily to isolate CPU contention.
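The "only output modified HTML files" check can be automated with a timestamp marker: record a marker file before the rebuild, then list output files newer than it. A self-contained sketch (the `dist` directory and file names are illustrative, and the final `echo` stands in for the incremental rebuild):

```shell
# Simulate a prior full build output.
mkdir -p dist
echo "stale" > dist/unchanged.html
sleep 1

# Record the marker, then perform the (simulated) incremental rebuild.
touch build-marker
sleep 1
echo "fresh" > dist/changed.html   # stands in for: npm run build

# Only files written after the marker should appear.
find dist -name '*.html' -newer build-marker
```

If the listing includes pages unrelated to the touched source file, the cache is being invalidated too aggressively and full-graph rebuilds are still occurring.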
Edge Cases & CI/CD Constraints
Large-scale SSG pipelines often fail due to runner timeouts or network I/O limits.
- Strict CI Timeouts (e.g., 15 min): Split the build into parallel matrix jobs by directory prefix.
- Orphaned Links: Configure the SSG to log warnings instead of failing the pipeline.
- Monorepo Contention: Isolate documentation builds from application builds to prevent resource contention.
- Disk I/O Bottlenecks: Monitor write latency on ephemeral runners; switch to tmpfs for intermediate build artifacts if latency exceeds 50ms.
- CDN Staleness: Configure cache headers to bypass stale-while-revalidate during deployment windows to prevent partial HTML delivery.
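The matrix-split strategy above amounts to partitioning top-level doc directories round-robin across N jobs. A minimal sketch (the shard count and directory names are illustrative; in CI, each matrix job would build only the directories assigned to its shard index):

```shell
SHARDS=4

# Create a sample docs tree to demonstrate the partitioning.
mkdir -p docs/api docs/faq docs/guides docs/reference docs/tutorials

# Assign each top-level directory to a shard, round-robin.
i=0
for dir in docs/*/; do
  echo "shard $((i % SHARDS)) builds $dir"
  i=$((i + 1))
done
```

Round-robin assignment keeps shards roughly balanced only when directories are similar in size; for skewed trees, partition by page count instead.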
Common Pitfalls
- Leaving source maps enabled in production, which doubles I/O and memory overhead during asset bundling.
- Relying on default Node.js heap limits (typically ~1.5GB) for large graph traversals.
- Running full rebuilds on every commit instead of implementing hash-based incremental caching.
- Ignoring CI runner disk I/O limits when writing thousands of HTML files simultaneously.
- Failing to pre-compile or CDN-host static assets, causing template hydration bottlenecks.
Frequently Asked Questions
Why does my SSG crash with ‘Ineffective mark-compacts near heap limit’ at 10k pages?
The default V8 heap limit (~1.5GB) is insufficient for loading and traversing a 10k+ node AST. The garbage collector cannot reclaim memory fast enough during template hydration, triggering an OOM. Increase --max-old-space-size and enable incremental builds to reduce peak memory pressure.
Can I run parallel builds on a standard GitHub Actions runner?
Yes, but you must cap concurrency to 2–4 workers on a 2-core runner to avoid CPU thrashing. Use matrix strategies to split documentation directories across separate jobs if build times exceed runner timeouts.
How do I verify that incremental builds are actually working?
Enable verbose logging and check the cache directory for hash mismatches. Touch a single markdown file and run the build command. A properly configured incremental pipeline should complete in under 10 seconds and only output modified HTML files.