What is ComputerVision-based sorcery of local image recognition and editing tools for AI assistants?
ImageSorcery empowers AI assistants with powerful image processing capabilities, allowing users to crop, resize, and rotate images, draw text and shapes, add logos and watermarks, detect objects, and extract text from images using OCR, all locally without sending images to servers.
Documentation
🪄 ImageSorcery MCP
ComputerVision-based 🪄 sorcery of local image recognition and editing tools for AI assistants
🪄 ImageSorcery empowers AI assistants with powerful image processing capabilities:
✅ Crop, resize, and rotate images with precision
✅ Draw text and shapes on images
✅ Add logos and watermarks
✅ Detect objects using state-of-the-art models
✅ Extract text from images with OCR
✅ Use a wide range of pre-trained models for object detection, OCR, and more
✅ Do all of this locally, without sending your images to any servers
Just ask your AI to help with image tasks:
"Copy photos with pets from folder photos to folder pets"
"Find a cat in photo.jpg and crop the image to half its height and width so that the cat is centered"
😉 Hint: Use the full path to your files.
"Number the form fields on this form.jpg with the foduucom/web-form-ui-field-detection model and fill form.md with a list of the described fields"
😉 Hint: Specify the model and the confidence.
😉 Hint: Add "use imagesorcery" to make sure it uses the proper tool.
Your AI assistant will combine several of the tools listed below to achieve your goal.
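To make the second prompt concrete, here is a minimal sketch of the arithmetic an assistant might do between a detection step and a crop step. It is an illustration only, not ImageSorcery's code: the function name `centered_crop_box` and the `(x1, y1, x2, y2)` bounding-box layout are assumptions made for this example.

```python
def centered_crop_box(image_w, image_h, bbox, scale=0.5):
    """Return (x1, y1, x2, y2) for a crop window that is `scale` times the
    image size and centered on `bbox` as far as the image bounds allow.

    `bbox` is a hypothetical (x1, y1, x2, y2) box for the detected object.
    """
    bx1, by1, bx2, by2 = bbox
    crop_w, crop_h = int(image_w * scale), int(image_h * scale)

    # Center of the detected object.
    cx, cy = (bx1 + bx2) // 2, (by1 + by2) // 2

    # Place the window around the center, then clamp it inside the image.
    x1 = min(max(cx - crop_w // 2, 0), image_w - crop_w)
    y1 = min(max(cy - crop_h // 2, 0), image_h - crop_h)
    return x1, y1, x1 + crop_w, y1 + crop_h


# A cat detected at (600, 300)-(800, 500) in a 1000x800 photo.
print(centered_crop_box(1000, 800, (600, 300, 800, 500)))
```

When the object sits near a border, the window is shifted back inside the image, so the object may end up off-center rather than the crop being smaller than requested.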
🛠️ Available Tools
| Tool | Description | Example Prompt |
| --- | --- | --- |
| blur | Blurs specified rectangular or polygonal areas of an image using OpenCV. Can also invert the provided areas, e.g. to blur the background. | "Blur the area from (150, 100) to (250, 200) with a blur strength of 21 in my image 'test_image.png' and save it as 'output.png'" |
| change_color | Changes the color palette of an image | "Convert my image 'test_image.png' to sepia and save it as 'output.png'" |
| crop | Crops an image using OpenCV's NumPy slicing approach | "Crop my image 'input.png' from coordinates (10,10) to (200,200) and save it as 'cropped.png'" |
| detect | Detects objects in an image using models from Ultralytics. Can return segmentation masks/polygons. | "Detect objects in my image 'photo.jpg' with a confidence threshold of 0.4" |
| draw_arrows | Draws arrows on an image using OpenCV | "Draw a red arrow from (50,50) to (150,100) on my image 'photo.jpg'" |
| draw_circles | Draws circles on an image using OpenCV | "Draw a red circle with center (100,100) and radius 50 on my image 'photo.jpg'" |
| draw_lines | Draws lines on an image using OpenCV | "Draw a red line from (50,50) to (150,100) on my image 'photo.jpg'" |
| draw_rectangles | Draws rectangles on an image using OpenCV | "Draw a red rectangle from (50,50) to (150,100) and a filled blue rectangle from (200,150) to (300,250) on my image 'photo.jpg'" |
| draw_texts | Draws text on an image using OpenCV | "Add text 'Hello World' at position (50,50) and 'Copyright 2023' at the bottom right corner of my image 'photo.jpg'" |
| fill | Fills specified rectangular or polygonal areas of an image with a color and opacity, or makes them transparent. Can also invert the provided areas, e.g. to remove the background. | "Fill the area from (150, 100) to (250, 200) with semi-transparent red in my image 'test_image.png'" |
| find | Finds objects in an image based on a text description. Can return segmentation masks/polygons. | "Find all dogs in my image 'photo.jpg' with a confidence threshold of 0.4" |
| get_metainfo | Gets metadata information about an image file | "Get metadata information about my image 'photo.jpg'" |
| ocr | Performs Optical Character Recognition (OCR) on an image using EasyOCR | "Extract text from my image 'document.jpg' using OCR with English language" |
| overlay | Overlays one image on top of another, handling transparency | "Overlay 'logo.png' on top of 'background.jpg' at position (10, 10)" |
| resize | Resizes an image using OpenCV | "Resize my image 'photo.jpg' to 800x600 pixels and save it as 'resized_photo.jpg'" |
| rotate | Rotates an image using imutils.rotate_bound function | "Rotate my image 'photo.jpg' by 45 degrees and save it as 'rotated_photo.jpg'" |
😉 Hint: detailed information and usage instructions for each tool can be found in the tool's /src/imagesorcery_mcp/tools/README.md.
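Because the rotate tool is described as using `imutils.rotate_bound`, the output canvas grows so that no corner of the image is cut off. The sketch below reproduces only that size arithmetic in plain Python; the helper name `rotated_canvas_size` is made up for this example, and the tool's actual output should be checked with `get_metainfo` rather than assumed from this calculation.

```python
import math


def rotated_canvas_size(width, height, angle_degrees):
    """Estimate the canvas size needed to hold the whole image after rotation.

    Mirrors the bounding-box arithmetic used by rotate_bound-style rotation:
    the new width and height come from projecting both sides of the image
    onto the rotated axes.
    """
    cos = abs(math.cos(math.radians(angle_degrees)))
    sin = abs(math.sin(math.radians(angle_degrees)))
    new_w = int(height * sin + width * cos)
    new_h = int(height * cos + width * sin)
    return new_w, new_h


# An 800x600 photo rotated by 45 degrees needs a larger, square canvas.
print(rotated_canvas_size(800, 600, 45))
```

This is why a rotated image is usually larger than the original: a follow-up resize or crop may be needed if the original dimensions matter.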
📚 Available Resources
| Resource URI | Description | Example Prompt |
| --- | --- | --- |
| models://list | Lists all available models in the models directory | "Which models are available in ImageSorcery?" |
😉 Hint: detailed information and usage instructions for each resource can be found in the resource's /src/imagesorcery_mcp/resources/README.md.
🚀 Getting Started
Requirements
Python 3.10 or higher
ffmpeg, libsm6, libxext6, libgl1-mesa-glx - system libraries required by OpenCV
Claude.app, Cline, or another MCP client
The system libraries listed above are typically installed together with OpenCV and don't require separate installation, but they may be missing in minimal environments such as Docker containers.
1. Load MCP documentation if you have the tool for it. It will give you the path to the *MCP-configuration file* and other needed instructions.
2. Run command `python -m venv imagesorcery-mcp`
3. Run command `source imagesorcery-mcp/bin/activate`
For Windows bash, run command `source imagesorcery-mcp/Scripts/activate`
For Windows cmd, run command `imagesorcery-mcp\Scripts\activate`
4. Run command `pip install imagesorcery-mcp`
5. Run command `imagesorcery-mcp --post-install`
6. Create or modify *MCP-configuration file*.
7. Add the `imagesorcery-mcp` server configuration to the `mcpServers` object in the *MCP-configuration file*
`"imagesorcery-mcp": {"command": "/full/path/to/venv/bin/imagesorcery-mcp","timeout": 100}`
8. Access available models using `models://list` resource from `imagesorcery-mcp`
9. Attempt completion, indicating that the installation and configuration are complete.
Create and activate a virtual environment (Strongly Recommended):
For reliable installation of all components, especially the clip package (installed via the post-install script), it is strongly recommended to use Python's built-in venv module instead of uv venv.
```bash
python -m venv imagesorcery-mcp
source imagesorcery-mcp/bin/activate  # For Linux/macOS
# source imagesorcery-mcp/Scripts/activate  # For Windows (bash)
# imagesorcery-mcp\Scripts\activate  # For Windows (cmd)
```
Install the package into the activated virtual environment:
You can use pip or uv pip.
```bash
pip install imagesorcery-mcp
# OR, if you prefer using uv for installation into the venv:
# uv pip install imagesorcery-mcp
```
Run the post-installation script:
This step is crucial. It downloads the required models and attempts to install the clip Python package from GitHub into the active virtual environment.
```bash
imagesorcery-mcp --post-install
```
You can run this process anytime to restore the default models and attempt clip installation.
⚙️ Configuration MCP client
Add these settings to your MCP client.
If imagesorcery-mcp is in your system's PATH after installation, you can use imagesorcery-mcp directly as the command. Otherwise, you'll need to provide the full path to the executable.
For the default STDIO transport:
```json
"mcpServers": {
    "imagesorcery-mcp": {
      "command": "imagesorcery-mcp", // Or /full/path/to/venv/bin/imagesorcery-mcp if installed in a venv
      "transportType": "stdio",
      "autoApprove": ["blur", "change_color", "crop", "detect", "draw_arrows", "draw_circles", "draw_lines", "draw_rectangles", "draw_texts", "fill", "find", "get_metainfo", "ocr", "overlay", "resize", "rotate"],
      "timeout": 100
    }
}
```
For an HTTP-based transport:
```json
"mcpServers": {
    "imagesorcery-mcp": {
      "url": "http://127.0.0.1:8000/mcp", // Use your custom host, port, and path if specified
      "transportType": "http",
      "autoApprove": ["blur", "change_color", "crop", "detect", "draw_arrows", "draw_circles", "draw_lines", "draw_rectangles", "draw_texts", "fill", "find", "get_metainfo", "ocr", "overlay", "resize", "rotate"],
      "timeout": 100
    }
}
```
For Windows:
```json
"mcpServers": {
    "imagesorcery-mcp": {
      "command": "imagesorcery-mcp.exe", // Or C:\\full\\path\\to\\venv\\Scripts\\imagesorcery-mcp.exe if installed in a venv
      "transportType": "stdio",
      "autoApprove": ["blur", "change_color", "crop", "detect", "draw_arrows", "draw_circles", "draw_lines", "draw_rectangles", "draw_texts", "fill", "find", "get_metainfo", "ocr", "overlay", "resize", "rotate"],
      "timeout": 100
    }
}
```
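If you prefer to generate the STDIO entry instead of typing it by hand, a small script can build it from the location of your virtual environment. This is a sketch under assumptions: `build_server_entry` is a hypothetical helper, the venv paths are placeholders, and only keys that appear in the configuration snippets of this section are emitted. Your MCP client may expect additional or different keys.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def build_server_entry(venv_dir, windows=False, timeout=100):
    """Build an `mcpServers` entry pointing at the executable inside a venv."""
    if windows:
        # Windows venvs keep executables under Scripts\ with an .exe suffix.
        exe = PureWindowsPath(venv_dir) / "Scripts" / "imagesorcery-mcp.exe"
    else:
        exe = PurePosixPath(venv_dir) / "bin" / "imagesorcery-mcp"
    return {
        "mcpServers": {
            "imagesorcery-mcp": {
                "command": str(exe),
                "transportType": "stdio",
                "timeout": timeout,
            }
        }
    }


# json.dumps escapes backslashes, so Windows paths come out as valid JSON.
print(json.dumps(build_server_entry("/full/path/to/venv"), indent=2))
print(json.dumps(build_server_entry(r"C:\venvs\sorcery", windows=True), indent=2))
```

Merge the printed object into the existing configuration file rather than overwriting it, since other servers may already be registered under `mcpServers`.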
📦 Additional Models
Some tools require specific models to be available in the models directory:
When downloading models, the script automatically updates the models/model_descriptions.json file:
For Ultralytics models: Descriptions are predefined in src/imagesorcery_mcp/scripts/create_model_descriptions.py and include detailed information about each model's purpose, size, and characteristics.
For Hugging Face models: Descriptions are automatically extracted from the model card on Hugging Face Hub. The script attempts to use the model name from the model index or the first line of the description.
After downloading models, it's recommended to check the descriptions in models/model_descriptions.json and adjust them if needed to provide more accurate or detailed information about the models' capabilities and use cases.
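One way to do that review is to list the entries whose descriptions look too short to be useful. The sketch below assumes the file is a flat JSON object mapping each model name to a description string; that layout is an assumption for illustration, so check the actual file before relying on it. The helper name `find_thin_descriptions` and the 40-character threshold are likewise invented for this example.

```python
import json
from pathlib import Path


def find_thin_descriptions(path, min_length=40):
    """Return model names whose description is shorter than `min_length`.

    Assumes (hypothetically) a flat {"model name": "description"} JSON object.
    """
    descriptions = json.loads(Path(path).read_text(encoding="utf-8"))
    return sorted(
        name
        for name, text in descriptions.items()
        if len(str(text).strip()) < min_length
    )


if __name__ == "__main__":
    descriptions_file = Path("models/model_descriptions.json")
    if descriptions_file.exists():
        for model_name in find_thin_descriptions(descriptions_file):
            print(f"Consider expanding the description of {model_name}")
```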
Running the Server
ImageSorcery MCP server can be run in different modes:
STDIO - default
Streamable HTTP - for web-based deployments
Server-Sent Events (SSE) - for web-based deployments that rely on SSE
STDIO Mode (Default) - This is the standard mode for local MCP clients:
--transport: Choose between "stdio" (default), "streamable-http", or "sse"
--host: Specify host for HTTP-based transports (default: 127.0.0.1)
--port: Specify port for HTTP-based transports (default: 8000)
--path: Specify endpoint path for HTTP-based transports (default: /mcp)
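The options above can be combined into a command line as in the following sketch. It uses only the flags and defaults listed in this section; the helper names `server_command` and `client_url` are made up for the example, and the URL format follows the HTTP configuration snippet shown earlier (`http://127.0.0.1:8000/mcp`).

```python
import shlex

TRANSPORTS = ("stdio", "streamable-http", "sse")


def server_command(transport="stdio", host="127.0.0.1", port=8000, path="/mcp"):
    """Return the argument list for starting the server with a given transport."""
    if transport not in TRANSPORTS:
        raise ValueError(f"unknown transport: {transport!r}")
    if transport == "stdio":
        # STDIO is the default, so no extra flags are needed.
        return ["imagesorcery-mcp"]
    return [
        "imagesorcery-mcp",
        "--transport", transport,
        "--host", host,
        "--port", str(port),
        "--path", path,
    ]


def client_url(host="127.0.0.1", port=8000, path="/mcp"):
    """Return the URL an MCP client would use for an HTTP-based transport."""
    return f"http://{host}:{port}{path}"


print(shlex.join(server_command("streamable-http", port=4200)))
print(client_url(port=4200))
```

The argument list can be passed directly to `subprocess.run` or `subprocess.Popen`; joining it with `shlex.join` is only for display.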
🤝 Contributing
Directory Structure
This repository is organized as follows:
```
.
├── .gitignore                 # Specifies intentionally untracked files that Git should ignore.
├── pyproject.toml             # Configuration file for Python projects, including build system, dependencies, and tool settings.
├── pytest.ini                 # Configuration file for the pytest testing framework.
├── README.md                  # The main documentation file for the project.
├── setup.sh                   # A shell script for quick setup (legacy, for reference or local use).
├── models/                    # This directory stores pre-trained models used by tools like `detect` and `find`. It is typically ignored by Git due to the large file sizes.
│   ├── model_descriptions.json  # Contains descriptions of the available models.
│   ├── settings.json            # Contains settings related to model management and training runs.
│   └── *.pt                     # Pre-trained model.
├── src/                       # Contains the source code for the 🪄 ImageSorcery MCP server.
│   └── imagesorcery_mcp/       # The main package directory for the server.
│       ├── README.md            # High-level overview of the core architecture (server and middleware).
│       ├── __init__.py          # Makes `imagesorcery_mcp` a Python package.
│       ├── __main__.py          # Entry point for running the package as a script.
│       ├── logging_config.py    # Configures the logging for the server.
│       ├── server.py            # The main server file, responsible for initializing FastMCP and registering tools.
│       ├── middleware.py        # Custom middleware for improved validation error handling.
│       ├── logs/                # Directory for storing server logs.
│       ├── scripts/             # Contains utility scripts for model management.
│       │   ├── README.md        # Documentation for the scripts.
│       │   ├── __init__.py      # Makes `scripts` a Python package.
│       │   ├── create_model_descriptions.py  # Script to generate model descriptions.
│       │   ├── download_clip.py              # Script to download CLIP models.
│       │   ├── post_install.py               # Script to run post-installation tasks.
│       │   └── download_models.py            # Script to download other models (e.g., YOLO).
│       ├── tools/               # Contains the implementation of individual MCP tools.
│       │   ├── README.md        # Documentation for the tools.
│       │   ├── __init__.py      # Makes `tools` a Python package.
│       │   └── *.py             # Implements the tool.
│       └── resources/           # Contains the implementation of individual MCP resources.
│           ├── README.md        # Documentation for the resources.
│           ├── __init__.py      # Makes `resources` a Python package.
│           └── *.py             # Implements the resource.
└── tests/                     # Contains test files for the project.
    ├── test_server.py         # Tests for the main server functionality.
    ├── data/                  # Contains test data, likely image files used in tests.
    ├── tools/                 # Contains tests for individual tools.
    └── resources/             # Contains tests for individual resources.
```
Development Setup
Clone the repository:
```bash
git clone https://github.com/sunriseapps/imagesorcery-mcp.git  # Or your fork
cd imagesorcery-mcp
```
(Recommended) Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate  # For Linux/macOS
# venv\Scripts\activate  # For Windows
```
Install the package in editable mode along with development dependencies:
```bash
pip install -e ".[dev]"
```
This will install imagesorcery-mcp and all dependencies from [project.dependencies] and [project.optional-dependencies].dev (including build and twine).
Rules
These rules apply to all contributors: humans and AI.
Read all the README.md files in the project. Understand the project structure and purpose. Understand the guidelines for contributing. Think through how they relate to your task and how to make changes accordingly.
Read pyproject.toml.
Pay attention to the sections [tool.ruff], [tool.ruff.lint], [project.optional-dependencies] and [project].dependencies.
Strictly follow the code style defined in pyproject.toml.
Stick to the stack defined in the pyproject.toml dependencies and do not add any new dependencies without a good reason.
Write your code in new and existing files.
If new dependencies are needed, update pyproject.toml and install them via pip install -e . or pip install -e ".[dev]". Do not install them directly via pip install.
Check out the existing source code for examples (e.g. src/imagesorcery_mcp/server.py, src/imagesorcery_mcp/tools/crop.py). Stick to the code style, naming conventions, input and output data formats, code structure, architecture, etc. of the existing code.
Update related README.md files with your changes.
Stick to the format and structure of the existing README.md files.
Write tests for your code.
Check out existing tests for examples (e.g. tests/test_server.py, tests/tools/test_crop.py).
Stick to the code style, naming conventions, input and output data formats, code structure, architecture, etc. of the existing tests.
Run tests and linter to ensure everything works:
```bash
pytest
ruff check .
```
If anything fails, fix the code and tests. All new code must comply with the linter rules and pass all tests.
Coding hints
Use type hints where appropriate
Use pydantic for data validation and serialization
📝 Questions?
If you have any questions, issues, or suggestions regarding this project, feel free to reach out to:
You can also open an issue in the repository for bug reports or feature requests.
📜 License
This project is licensed under the MIT License. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the MIT License.