
Markdown Extraction Support

Jonathan Geiger
product-updates

We've just released a new CaptureKit feature based on user feedback: Markdown extraction. The /content endpoint can now return a page's content as clean, structured Markdown alongside its HTML.

📝 Markdown Extraction (in /content API)

You can now extract the Markdown representation of web pages using the /content endpoint. This makes it easier to work with the textual content of web pages in a format that's both human-readable and machine-processable.

Example Request

GET https://api.capturekit.dev/content?access_key=<your-access-key>&url=https://capturekit.dev&include_markdown=true
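If you build this request in code, it helps to percent-encode the query parameters so that a target URL containing its own `?` or `&` is not misread as part of the API query string. Here is a minimal Python sketch using only the standard library. The helper name `build_content_url` is our own for illustration, and sending `"false"` for a disabled flag is an assumption (the example above only shows `true`).

```python
from urllib.parse import urlencode

# Endpoint taken from the example request above.
API_BASE = "https://api.capturekit.dev/content"


def build_content_url(access_key: str, page_url: str, include_markdown: bool = True) -> str:
    """Return a /content request URL with percent-encoded query parameters.

    Hypothetical helper for illustration; not part of an official client.
    """
    params = {
        "access_key": access_key,
        "url": page_url,
        # The example request uses the literal string "true";
        # "false" for the opposite case is an assumption.
        "include_markdown": "true" if include_markdown else "false",
    }
    # urlencode escapes reserved characters such as ":", "/", "?" and "&".
    return f"{API_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_content_url("YOUR_ACCESS_KEY", "https://capturekit.dev"))
```

The resulting string can be passed to any HTTP client as a plain GET request.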

Example Response

{
  "success": true,
  "data": {
    "metadata": { ... },
    "links": { ... },
    "html": "<html><body><h1>Hello, world!</h1></body></html>",
    "markdown": "# Hello, world!"
  }
}
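To pull the Markdown out of a response like the one above, something along these lines should work. This is a sketch under assumptions: the function name `extract_markdown` is ours, and the post does not document what a failed response looks like or whether the `markdown` field is omitted when `include_markdown` is not set, so the error handling here is a guess at reasonable behavior.

```python
import json


def extract_markdown(response_text: str) -> str:
    """Return the Markdown string from a /content JSON response body.

    Hypothetical helper; the error cases are assumptions, since only the
    success shape is shown in the example response.
    """
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CaptureKit request was not successful")
    markdown = payload.get("data", {}).get("markdown")
    if markdown is None:
        # Assumed: the field is absent unless include_markdown=true was sent.
        raise KeyError("no 'markdown' field; was include_markdown=true set?")
    return markdown


# A trimmed-down copy of the example response, without the elided fields.
sample = '{"success": true, "data": {"html": "<h1>Hello, world!</h1>", "markdown": "# Hello, world!"}}'

if __name__ == "__main__":
    print(extract_markdown(sample))
```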

Parameters

  • url (string, required): URL of the webpage
  • access_key (string, required): Your API key
  • include_markdown (boolean, optional): Set to true to include Markdown data (defaults to false)

Why Markdown?

Markdown provides several advantages over raw HTML:

  1. Readability: Markdown is cleaner and easier to read than HTML
  2. Simplicity: Converting to Markdown drops styling and layout markup, leaving the content and its structure
  3. Portability: Plain-text Markdown is easy to use across applications and platforms
  4. Text Processing: Markdown is well suited to content analysis, summarization, and AI processing

Use Cases

  • Content Management: Import web content directly into your CMS
  • AI Processing: Feed web content to LLMs and other AI systems in a clean format
  • Documentation: Extract documentation from websites for offline use
  • Knowledge Bases: Build internal knowledge repositories from web content
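For the AI processing and knowledge base use cases, a common next step is to break the returned Markdown into sections before passing it on. The sketch below splits on ATX headings (`#` through `######`). It is our own illustration rather than anything CaptureKit provides, and it is deliberately simple: it does not account for `#` lines inside fenced code blocks or for Setext-style (underlined) headings.

```python
import re

# Matches ATX headings such as "# Title" or "### Subsection".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_headings(markdown: str) -> list[dict]:
    """Split Markdown text into sections, one per heading.

    Text before the first heading is returned as a section with an empty
    title and level 0. Hypothetical helper for illustration only.
    """
    sections = []
    title, level, body = "", 0, []

    def flush():
        text = "\n".join(body).strip()
        # Skip an empty preamble, but keep headings that have no body.
        if title or text:
            sections.append({"title": title, "level": level, "text": text})

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, level, body = match.group(2).strip(), len(match.group(1)), []
        else:
            body.append(line)
    flush()
    return sections


if __name__ == "__main__":
    doc = "# Title\n\nIntro text.\n\n## Details\n\nMore text."
    for section in split_by_headings(doc):
        print(section["level"], section["title"], "->", section["text"])
```

Each section can then be stored, indexed, or sent to a model on its own, with the heading kept as context.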

Final Notes

This feature was developed in direct response to user feedback. We're committed to building CaptureKit to meet your real-world needs.

Have ideas for more features? Let us know!

Thanks for being part of the journey!