How to Take Full Page Screenshots with Puppeteer

Jonathan Geiger
Tags: puppeteer, screenshots, web-automation, javascript, full-page

While Puppeteer offers a simple fullPage: true option for taking full page screenshots, the reality is often more complex. In this guide, we'll explore how to reliably capture full-page screenshots with Puppeteer, addressing common challenges and offering practical solutions.

Basic Full Page Screenshot with Puppeteer

Let's start with the simplest approach – using Puppeteer's built-in fullPage option:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.setViewport({ width: 1280, height: 800 });
    
    await page.goto('https://github.com', { 
      waitUntil: ['load', 'domcontentloaded'] 
    });
    
    await page.screenshot({
      path: 'full-page-screenshot.png',
      fullPage: true
    });
    
    console.log('Screenshot captured successfully!');
  } catch (error) {
    console.error('Error capturing screenshot:', error);
  } finally {
    await browser.close();
  }
})();

This code should work for simple websites, but modern sites often present challenges that this basic approach doesn't handle.
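One useful variation before moving on: full-page PNGs of long pages can get large, and Puppeteer's screenshot method also supports JPEG output with a quality setting. A quick sketch (80 is just an illustrative size/fidelity trade-off):

await page.screenshot({
  path: 'full-page-screenshot.jpg',
  fullPage: true,
  type: 'jpeg',
  quality: 80, // 0-100; only applies to JPEG/WebP, ignored for PNG
});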

Common Challenges with Full Page Screenshots

When taking full-page screenshots, you might encounter several issues:

  1. Lazy-loaded images and content - Elements that load only when they enter the viewport
  2. Sticky headers and footers - Elements that may appear multiple times in the screenshot
  3. Animations triggered by scrolling - Visual effects that only appear during user interaction
  4. Very long pages - Websites where Puppeteer's built-in fullPage option might fail
  5. Dynamic height content - Pages that change height as they load or as the user interacts

Let's address these challenges one by one.
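Before diving in, a quick aside on challenge 2, since the snippets below don't cover it directly: sticky headers and footers tend to repeat when a page is captured in stitched sections. One pragmatic workaround is to force position: fixed and position: sticky elements back into the normal flow before capturing. This is a generic DOM sketch, not a Puppeteer built-in, and it permanently restyles the page, so run it only right before the capture:

async function hideStickyElements(page) {
	await page.evaluate(() => {
		for (const el of document.querySelectorAll('*')) {
			const position = getComputedStyle(el).position;
			if (position === 'fixed' || position === 'sticky') {
				// Put the element back into normal flow so it appears only once
				el.style.position = 'static';
			}
		}
	});
}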

Handling Lazy-Loaded Content

The most common issue with full-page screenshots is lazy-loaded images and content that only load when scrolled into view. Here's a technique to trigger lazy loading by scrolling through the entire page before capturing:

const puppeteer = require('puppeteer');

async function scrollPageToBottom(page) {
	await page.evaluate(async () => {
		await new Promise((resolve) => {
			let totalHeight = 0;
			const distance = 100;
			const timer = setInterval(() => {
				const scrollHeight = document.body.scrollHeight;
				window.scrollBy(0, distance);
				totalHeight += distance;

				if (totalHeight >= scrollHeight) {
					clearInterval(timer);
					window.scrollTo(0, 0); // Scroll back to top
					resolve();
				}
			}, 100);
		});
	});
}

(async () => {
	const browser = await puppeteer.launch();
	try {
		const page = await browser.newPage();
		await page.setViewport({ width: 1280, height: 800 });

		await page.goto('https://github.com', {
			waitUntil: 'networkidle2',
		});

		// Scroll to trigger lazy loading
		await scrollPageToBottom(page);

		// Wait a moment for images to finish loading
		// (plain delay; page.waitForTimeout was removed in newer Puppeteer versions)
		await new Promise((resolve) => setTimeout(resolve, 1000));

		// Now take the full page screenshot
		await page.screenshot({
			path: 'full-page-with-lazy-content.png',
			fullPage: true,
		});

		console.log('Full page screenshot with lazy-loaded content captured!');
	} catch (error) {
		console.error('Error:', error);
	} finally {
		await browser.close();
	}
})();

This approach scrolls the page gradually, triggering lazy-loaded content to appear, and then captures the full page.
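The fixed one-second delay usually works, but you can make the wait deterministic by checking that every image has actually finished loading. Here's a minimal helper built on plain DOM properties (img.complete and the load/error events), not a dedicated Puppeteer API:

async function waitForImages(page) {
	await page.evaluate(async () => {
		await Promise.all(
			Array.from(document.images).map((img) =>
				img.complete
					? Promise.resolve()
					: new Promise((resolve) => {
							img.addEventListener('load', resolve, { once: true });
							img.addEventListener('error', resolve, { once: true });
					  })
			)
		);
	});
}

Call it after scrollPageToBottom(page) in place of the delay.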

For more details on ensuring pages fully load before capturing, check out our guide on how to wait for page to load in Puppeteer.

Advanced Technique: Section-by-Section Capture

For more complex cases, especially with animations that play during scrolling or very long pages, a more reliable approach is to capture the page in sections and merge them together:

const puppeteer = require('puppeteer');
const merge = require('merge-img');
const Jimp = require('jimp');

(async () => {
	const browser = await puppeteer.launch();
	try {
		const page = await browser.newPage();
		await page.setViewport({ width: 1280, height: 800 });
		await page.goto('https://github.com', { waitUntil: 'networkidle2' });

		const path = 'sectioned-full-page.png';

		// Calculate sections needed based on page height
		const { pages, extraHeight, viewport } = await page.evaluate(() => {
			window.scrollTo(0, 0);
			const pageHeight = Math.max(
				document.body.scrollHeight,
				document.documentElement.scrollHeight,
				document.body.offsetHeight,
				document.documentElement.offsetHeight,
				document.body.clientHeight,
				document.documentElement.clientHeight
			);

			return {
				pages: Math.ceil(pageHeight / window.innerHeight),
				extraHeight:
					(pageHeight % window.innerHeight) * window.devicePixelRatio,
				viewport: {
					height: window.innerHeight * window.devicePixelRatio,
					width: window.innerWidth * window.devicePixelRatio,
				},
			};
		});

		console.log(`Taking screenshot in ${pages} sections`);

		// Measure the viewport height once, then scroll one section at a time
		const innerHeight = await page.evaluate(() => window.innerHeight);

		// Take screenshots of each section
		const sectionScreenshots = [];
		for (let i = 0; i < pages; i++) {
			// Scroll to the top of section i
			await page.evaluate(
				(i, innerHeight) => window.scrollTo(0, i * innerHeight),
				i,
				innerHeight
			);

			// Wait for any animations or content to settle
			// (plain delay; page.waitForTimeout was removed in newer Puppeteer)
			await new Promise((resolve) => setTimeout(resolve, 400));

			// Capture this section
			const screenshot = await page.screenshot({
				type: 'png',
				captureBeyondViewport: false,
			});

			sectionScreenshots.push(screenshot);
		}

		// Handle single-section case
		if (pages === 1) {
			const screenshot = await Jimp.read(sectionScreenshots[0]);
			await screenshot.writeAsync(path);
			console.log('Single section screenshot saved');
			return;
		}

		// Handle final section if it's partial
		if (extraHeight > 0) {
			const cropped = await Jimp.read(sectionScreenshots.pop())
				.then((image) =>
					image.crop(
						0,
						viewport.height - extraHeight,
						viewport.width,
						extraHeight
					)
				)
				.then((image) => image.getBufferAsync(Jimp.AUTO));

			sectionScreenshots.push(cropped);
		}

		// Merge all screenshots
		const result = await merge(sectionScreenshots, { direction: true });

		// Save the final image
		await new Promise((resolve) => {
			result.write(path, () => {
				console.log('Full sectioned screenshot saved successfully!');
				resolve();
			});
		});
	} catch (error) {
		console.error('Error:', error);
	} finally {
		await browser.close();
	}
})();

This approach:

  1. Divides the page into viewport-sized sections
  2. Captures each section separately
  3. Processes the final section if it's not a complete viewport
  4. Merges all sections into a single image

You'll need to install the following packages:

npm install puppeteer merge-img jimp
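merge-img and Jimp get the job done, but merge-img hasn't seen much maintenance lately. If sharp is already in your stack, the vertical stitch can be expressed there instead. This is a sketch, not the article's method, and it assumes every section buffer is a PNG of the same width (install with npm install sharp):

const sharp = require('sharp');

// Stack PNG section buffers vertically onto a white canvas.
async function stitchVertically(buffers) {
	const metas = await Promise.all(buffers.map((b) => sharp(b).metadata()));
	const width = Math.max(...metas.map((m) => m.width));
	const height = metas.reduce((sum, m) => sum + m.height, 0);

	// Position each section below the previous one
	let top = 0;
	const layers = metas.map((meta, i) => {
		const layer = { input: buffers[i], top, left: 0 };
		top += meta.height;
		return layer;
	});

	return sharp({
		create: { width, height, channels: 4, background: '#ffffff' },
	})
		.composite(layers)
		.png()
		.toBuffer();
}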

Ultimate Technique: Smart Scrolling with Viewport Measurement

For the most reliable results across a wide range of websites, we can implement a comprehensive solution that:

  1. Precisely measures the total page height
  2. Implements gradual scrolling to trigger all lazy-loaded content
  3. Waits for animations to complete
  4. Returns to the top before capturing the full screenshot

const puppeteer = require('puppeteer');

(async () => {
	const browser = await puppeteer.launch({ headless: true });
	try {
		const page = await browser.newPage();

		// Set a reasonable viewport
		await page.setViewport({ width: 1280, height: 800 });

		// Navigate to the target page with appropriate wait conditions
		await page.goto('https://github.com', {
			waitUntil: ['load', 'domcontentloaded', 'networkidle2'],
		});

		// Calculate the true height of the page
		const pageHeight = await page.evaluate(() => {
			return Math.max(
				document.body.scrollHeight,
				document.documentElement.scrollHeight,
				document.body.offsetHeight,
				document.documentElement.offsetHeight,
				document.body.clientHeight,
				document.documentElement.clientHeight
			);
		});

		// Get the viewport height
		const viewportHeight = await page.evaluate(() => window.innerHeight);

		console.log(
			`Page height: ${pageHeight}px, Viewport height: ${viewportHeight}px`
		);

		// Implement smart scrolling to trigger lazy loading
		console.log('Scrolling page to trigger lazy loading...');
		await page.evaluate(
			async (pageHeight, scrollDuration) => {
				// Start at the top
				window.scrollTo(0, 0);
				await new Promise((resolve) => setTimeout(resolve, 100));

				// Scroll to bottom gradually to trigger lazy loading
				const steps = 20;
				const stepSize = pageHeight / steps;
				const stepDelay = scrollDuration / steps;

				for (let i = 1; i <= steps; i++) {
					window.scrollTo(0, i * stepSize);
					await new Promise((resolve) => setTimeout(resolve, stepDelay));
				}

				// Ensure we reach the bottom
				window.scrollTo(0, pageHeight);
				await new Promise((resolve) => setTimeout(resolve, 300));

				// Return to top
				window.scrollTo(0, 0);
				await new Promise((resolve) => setTimeout(resolve, 300));
			},
			pageHeight,
			2000 // Scroll duration in ms
		);

		// Wait for everything to settle after scrolling
		console.log('Waiting for page to settle...');
		// (plain delay; page.waitForTimeout was removed in newer Puppeteer)
		await new Promise((resolve) => setTimeout(resolve, 1000));

		// Finally, capture the full page screenshot
		console.log('Taking full page screenshot...');
		await page.screenshot({
			path: 'advanced-full-page.png',
			fullPage: true,
		});

		console.log('Full page screenshot captured successfully!');
	} catch (error) {
		console.error('Error capturing screenshot:', error);
	} finally {
		await browser.close();
	}
})();

This approach provides reliable results for most websites, but extremely long pages can still be a problem: Chromium may clip or reject full-page captures that exceed its maximum texture size (commonly around 16,384 pixels), which is where the section-by-section technique above becomes the safer choice.

For a simpler approach to taking screenshots with Puppeteer, check out our basic guide on how to take screenshots with Puppeteer.

Simplified Solution: Using a Screenshot API

Setting up Puppeteer for reliable full-page screenshots requires handling numerous edge cases. If you need a faster, more reliable solution without the complexity, CaptureKit Screenshot API offers a simple alternative:

curl "https://api.capturekit.dev/capture?url=https://github.com&full_page=true&access_key=YOUR_ACCESS_KEY"

With CaptureKit, you can control the lazy-loading behavior and scroll duration:

curl "https://api.capturekit.dev/capture?url=https://github.com&full_page=true&full_page_scroll=true&full_page_scroll_duration=800&access_key=YOUR_ACCESS_KEY"

Benefits of Using CaptureKit for Full-Page Screenshots

  • No browser management - Forget about setting up and maintaining Puppeteer
  • Smart lazy-loading - Automatically handles lazy-loaded content and animations
  • Reliable capture - Works with even the most complex websites and layouts
  • Fine-tuned control - Configure scroll behavior and timing as needed

Available Options for Full-Page Screenshots

Parameter                 | Type    | Description
full_page                 | boolean | Capture the entire page instead of just the visible viewport (default: false)
full_page_scroll          | boolean | Scroll the page to fully load lazy-loaded elements before capturing (default: true)
full_page_scroll_duration | number  | Time in milliseconds for scrolling before capturing (default: 400)

Request:

https://api.capturekit.dev/capture?url=https://github.com&full_page=true

Response:

[Image: GitHub full-page screenshot]

Conclusion

Taking reliable full-page screenshots with Puppeteer requires understanding and addressing several challenges, especially with modern websites that use lazy loading and dynamic content. The techniques in this guide will help you capture complete, accurate screenshots in most scenarios.

For production use cases where reliability and simplicity are priorities, consider using CaptureKit API to eliminate the complexities of browser automation and focus on your core tasks instead.

Happy capturing! 📸

If you found this post useful, I write a lot about scraping and Puppeteer, so you may find these Puppeteer scraping tutorials helpful as well: