Puppeteer vision

Created 9 months ago

MCP server for scraping webpages and converting them to markdown using Puppeteer.

Tags: development, location, documentation, public, scraping, AI

What is Puppeteer vision?

Uses Puppeteer to browse a webpage and return high-quality Markdown. AI vision capabilities handle cookie banners, CAPTCHAs, and other interactive elements automatically.

Documentation

Puppeteer vision MCP Server

This Model Context Protocol (MCP) server provides a tool for scraping webpages and converting them to markdown format using Puppeteer, Readability, and Turndown. It features AI-driven interaction capabilities to handle cookies, captchas, and other interactive elements automatically.

Features

Scrapes webpages using Puppeteer with stealth mode
Uses AI-powered interaction to automatically handle:
  Cookie consent banners
  CAPTCHAs
  Newsletter or subscription prompts
  Paywalls and login walls
  Age verification prompts
  Interstitial ads
  Any other interactive elements blocking content
Extracts main content with Mozilla's Readability
Converts HTML to well-formatted Markdown
Special handling for code blocks, tables, and other structured content
Accessible via the Model Context Protocol
Option to view browser interaction in real-time by disabling headless mode
Runs directly as an npx package

Quick Start with NPX

The recommended way to use this server is via npx, which ensures you're running the latest version without needing to clone or manually install.

Prerequisites: Ensure you have Node.js and npm installed.
Environment Setup: The server requires an OPENAI_API_KEY. You can provide this and other optional configurations in two ways:

.env file: Create a .env file in the directory where you will run the npx command.
Shell Environment Variables: Export the variables in your terminal session.

Run the Server: Open your terminal and run:
```
npx -y puppeteer-vision-mcp-server
```

Using as an MCP Tool with NPX

This server is designed to be integrated as a tool within an MCP-compatible LLM orchestrator.

Environment Configuration Details

Regardless of how you run the server (NPX or local development), it uses the following environment variables:

OPENAI_API_KEY: (Required) Your API key for accessing the vision model.
VISION_MODEL: (Optional) The model to use for vision analysis.
API_BASE_URL: (Optional) Custom API endpoint URL.
TRANSPORT_TYPE: (Optional) The transport protocol to use.
PORT: (Optional) The port for the HTTP server in SSE or HTTP mode.
DISABLE_HEADLESS: (Optional) Set to true to run the browser in visible mode.
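
As an illustration of how these variables could be read and validated at startup, here is a minimal sketch. The `ServerConfig` shape and the `loadConfig` name are hypothetical and not taken from the project's source. Optional values are left undefined when unset, because the defaults are not stated above.

```typescript
// Hypothetical typed view of the documented environment variables.
interface ServerConfig {
  openaiApiKey: string;      // OPENAI_API_KEY (required)
  visionModel?: string;      // VISION_MODEL
  apiBaseUrl?: string;       // API_BASE_URL
  transportType?: string;    // TRANSPORT_TYPE
  port?: number;             // PORT (only meaningful in SSE or HTTP mode)
  disableHeadless: boolean;  // DISABLE_HEADLESS
}

type Env = Record<string, string | undefined>;

// Reads the variables from an env-like object so the function stays easy to test.
function loadConfig(env: Env): ServerConfig {
  const openaiApiKey = env.OPENAI_API_KEY;
  if (!openaiApiKey) {
    throw new Error("OPENAI_API_KEY is required");
  }

  let port: number | undefined;
  if (env.PORT !== undefined && env.PORT !== "") {
    port = Number(env.PORT);
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
      throw new Error(`PORT must be an integer between 1 and 65535, got "${env.PORT}"`);
    }
  }

  return {
    openaiApiKey,
    // Empty strings are treated the same as unset variables.
    visionModel: env.VISION_MODEL || undefined,
    apiBaseUrl: env.API_BASE_URL || undefined,
    transportType: env.TRANSPORT_TYPE || undefined,
    port,
    // Only the literal string "true" enables the visible browser.
    disableHeadless: env.DISABLE_HEADLESS === "true",
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`, so a missing key fails fast rather than at the first vision request.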

Communication Modes

The server supports three communication modes:

stdio (Default): Communicates via standard input/output.
SSE mode: Communicates via Server-Sent Events over HTTP.
HTTP mode: Communicates via Streamable HTTP transport with session management.
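
The mode selection can be pictured as a small mapping from TRANSPORT_TYPE to one of the three modes. This is a sketch only: the accepted values `"stdio"`, `"sse"`, and `"http"` are assumptions based on the mode names above, and `resolveTransport` and `needsPort` are hypothetical helpers, not the server's own functions.

```typescript
// Assumed value set for TRANSPORT_TYPE; the exact strings are not confirmed above.
type TransportType = "stdio" | "sse" | "http";

// Maps the raw environment value to a mode, falling back to the documented default.
function resolveTransport(raw: string | undefined): TransportType {
  const value = (raw ?? "").trim().toLowerCase();
  if (value === "") return "stdio"; // stdio is the documented default
  if (value === "stdio" || value === "sse" || value === "http") return value;
  throw new Error(`Unsupported TRANSPORT_TYPE: "${raw}"`);
}

// PORT is documented as applying to the SSE and HTTP modes only.
function needsPort(transport: TransportType): boolean {
  return transport !== "stdio";
}
```

Rejecting unknown values, rather than silently falling back to stdio, makes a typo in the variable visible at startup.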

Tool Usage (MCP Invocation)

The server provides a scrape-webpage tool.
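
MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request, sent after the usual initialize handshake. The sketch below builds such a request for scrape-webpage. The `url` argument name is an assumption, since the tool's input schema is not listed here, and `buildScrapeRequest` and `serializeForStdio` are hypothetical helpers.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Builds a request for the scrape-webpage tool.
// ASSUMPTION: the tool accepts a "url" argument; check the tool's schema via tools/list.
function buildScrapeRequest(id: number, url: string): ToolCallRequest {
  if (!/^https?:\/\//i.test(url)) {
    throw new Error(`Expected an http(s) URL, got "${url}"`);
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "scrape-webpage", arguments: { url } },
  };
}

// The stdio transport frames each message as a single line of JSON.
function serializeForStdio(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}
```

In practice an MCP client library handles this framing. The sketch only shows what travels over the wire.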

How It Works

The system uses vision-capable AI models to analyze screenshots of web pages and decide on actions like clicking, typing, or scrolling to bypass overlays and consent forms.
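
The loop described above can be sketched as: take a screenshot, ask the vision model for the next action, perform it, and repeat until the model reports that nothing blocks the content or an attempt limit is reached. This is an illustrative outline, not the project's implementation. `PageDriver`, `Analyzer`, `Action`, and `clearBlockers` are hypothetical names, and the attempt limit of 3 is an arbitrary assumption.

```typescript
// Actions the vision model may request; "done" means nothing blocks the content.
type Action =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "scroll"; deltaY: number }
  | { kind: "done" };

type PageAction = Exclude<Action, { kind: "done" }>;

// Hypothetical abstraction over the browser page (Puppeteer in the real server).
interface PageDriver {
  screenshot(): Promise<string>; // base64-encoded image
  perform(action: PageAction): Promise<void>;
}

// Hypothetical abstraction over the vision model call.
type Analyzer = (screenshotBase64: string) => Promise<Action>;

interface LoopResult {
  cleared: boolean;  // true if the model confirmed the page is unblocked
  attempts: number;  // number of actions actually performed
}

async function clearBlockers(
  page: PageDriver,
  analyze: Analyzer,
  maxAttempts = 3, // assumed limit, chosen only for illustration
): Promise<LoopResult> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const shot = await page.screenshot();
    const action = await analyze(shot);
    if (action.kind === "done") {
      return { cleared: true, attempts: attempt - 1 };
    }
    await page.perform(action);
  }
  // Ran out of attempts without the model confirming the page is clear.
  return { cleared: false, attempts: maxAttempts };
}
```

Keeping the browser and the model behind small interfaces lets the loop be exercised with fakes, without launching a browser or calling an API.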

Installation & Development (for Modifying the Code)

If you wish to contribute, modify the server, or run a local development version:

Clone the Repository:

git clone https://github.com/djannot/puppeteer-vision-mcp.git
cd puppeteer-vision-mcp

Install Dependencies:
```
npm install
```
Build the Project:
```
npm run build
```
Set Up Environment: Create a .env file in the project's root directory with your OPENAI_API_KEY and any other desired configurations.
Run for Development:
```
npm start
```
Or, for automatic rebuilding on changes:
```
npm run dev
```

Customization (for Developers)

You can modify the behavior of the scraper by editing:

src/ai/vision-analyzer.ts
src/ai/page-interactions.ts
src/scrapers/webpage-scraper.ts
src/utils/markdown-formatters.ts

Dependencies

Key dependencies include:

@modelcontextprotocol/sdk
puppeteer, puppeteer-extra
@mozilla/readability, jsdom
turndown, sanitize-html
openai (or compatible API for vision models)
express (for SSE mode)
zod

Server Config

{
  "mcpServers": {
    "puppeteer-vision-server": {
      "command": "npx",
      "args": [
        "-y",
        "puppeteer-vision-mcp-server"
      ]
    }
  }
}

Links & Status

Repository: https://github.com/djannot/puppeteer-vision-mcp

Hosted: No

Global: No

Official: No

Project Info

Created At: May 23, 2025

Updated At: Aug 07, 2025

Author: djannot

Category: community

License: MIT
