What is the OpenAlex Author Disambiguation MCP Server?
This MCP server is specifically designed for AI agents with optimized data structures and enhanced functionality, focusing on author disambiguation, institution resolution, academic work retrieval, citation analysis, and ORCID integration.
Documentation
OpenAlex Author Disambiguation MCP Server
A streamlined Model Context Protocol (MCP) server for author disambiguation and academic research using the OpenAlex.org API. Specifically designed for AI agents with optimized data structures and enhanced functionality.
Key Features
Core Capabilities
Advanced Author Disambiguation: Handles complex career transitions and name variations
Institution Resolution: Current and past affiliations with transition tracking
Academic Work Retrieval: Journal articles, letters, and research papers
Citation Analysis: H-index, citation counts, and impact metrics
ORCID Integration: Highest accuracy matching with ORCID identifiers
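For reference, the h-index listed among the citation metrics is the largest h such that an author has h works with at least h citations each. The server reports metrics as supplied by OpenAlex; the helper below is only an illustrative sketch of the definition, and the name `h_index` is hypothetical rather than part of this package.

```python
def h_index(citation_counts):
    """Return the largest h such that h works have at least h citations each."""
    # Rank works from most to least cited, then find the last rank
    # whose citation count is still at least as large as the rank.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank
        else:
            break
    return h


print(h_index([10, 8, 5, 4, 3]))  # -> 4
```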
AI Agent Optimized
Streamlined Data: Focused on essential information for disambiguation
Fast Processing: Optimized data structures for rapid analysis
Smart Filtering: Enhanced filtering options for targeted queries
Clean Output: Structured responses optimized for AI reasoning
Agent Integration
Multiple Candidates: Ranked results for automated decision-making
Structured Responses: Clean, parseable output optimized for LLMs
Error Handling: Graceful degradation with informative messages
Enhanced Filtering: Journal-only, citation thresholds, and temporal filters
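To illustrate how an agent might act on a list of candidates, the sketch below re-scores candidate records on the client side. It assumes each candidate is a dict with a `display_name` and optional `orcid` and `cited_by_count` keys; the function `rank_candidates` and the scoring order (ORCID match, then name similarity, then citations) are hypothetical choices, not the server's own ranking logic. The ORCID value in the usage lines is a placeholder.

```python
from difflib import SequenceMatcher


def rank_candidates(query_name, candidates, known_orcid=None):
    """Order candidate author dicts from most to least likely match."""

    def score(candidate):
        # An exact ORCID match outranks any name-based evidence.
        orcid_match = (
            known_orcid is not None and candidate.get("orcid") == known_orcid
        )
        # Case-insensitive string similarity between query and candidate name.
        name_similarity = SequenceMatcher(
            None, query_name.lower(), candidate["display_name"].lower()
        ).ratio()
        # Citation count only breaks ties between otherwise equal candidates.
        return (orcid_match, name_similarity, candidate.get("cited_by_count", 0))

    return sorted(candidates, key=score, reverse=True)


candidates = [
    {"display_name": "John Abreu", "cited_by_count": 900},
    {"display_name": "Jorge Abreu Vicente", "orcid": "0000-0000-0000-0001", "cited_by_count": 120},
    {"display_name": "J. Abreu", "cited_by_count": 15},
]
best = rank_candidates("J. Abreu", candidates)[0]
print(best["display_name"])
```

Returning the full sorted list rather than a single winner lets the calling agent inspect the runners-up before committing to a match.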
Professional Grade
MCP Best Practices: Built with FastMCP following official guidelines
Tool Annotations: Proper MCP tool annotations for optimal client integration
Resource Management: Efficient HTTP client management and cleanup
Rate Limiting: Respectful API usage with proper delays
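The exact delay policy used by the server is not shown here. As a rough sketch of what spacing out API calls can look like, the class below enforces a minimum interval between consecutive calls. The class name `MinIntervalLimiter` and the 0.1 second interval are assumptions made for illustration.

```python
import time


class MinIntervalLimiter:
    """Block so that consecutive calls are at least `min_interval` seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable for testing
        self._sleep = sleep  # injectable for testing
        self._last_call = None

    def wait(self):
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                # Too soon after the previous call: pause for the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


limiter = MinIntervalLimiter(min_interval=0.1)
for _ in range(3):
    limiter.wait()  # an API request would follow each wait()
```

Passing the clock and sleep functions in as parameters keeps the limiter testable without real delays.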
Quick Start
Prerequisites
Python 3.10 or higher
MCP-compatible client (e.g., Claude Desktop)
Email address (for OpenAlex API courtesy)
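The email address is used as a courtesy identifier: OpenAlex accepts a `mailto` query parameter on API requests. How this server forwards the address internally is not shown here, so the snippet below is only a sketch of building such a request URL by hand. The helper name `build_author_search_url` is hypothetical; `OPENALEX_MAILTO` is the environment variable used in the agent configuration in this README.

```python
import os
from urllib.parse import urlencode

OPENALEX_API = "https://api.openalex.org"


def build_author_search_url(name, mailto=None):
    """Build an OpenAlex author-search URL, adding `mailto` when an address is known."""
    params = {"search": name}
    # Fall back to the environment variable when no address is passed in.
    mailto = mailto or os.environ.get("OPENALEX_MAILTO")
    if mailto:
        params["mailto"] = mailto
    return f"{OPENALEX_API}/authors?{urlencode(params)}"


print(build_author_search_url("J. Abreu", mailto="you@example.com"))
```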
Installation
For detailed installation instructions, see INSTALL.md.
Clone the repository:
git clone https://github.com/drAbreu/alex-mcp.git
cd alex-mcp
Create a virtual environment:
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Replace /path/to/alex-mcp with the actual path to the repository on your system.
Using with AI Agents
OpenAI Agents Integration
You can load this MCP server in your OpenAI agent workflow using the agents.mcp.MCPServerStdio interface:
import asyncio
from agents.mcp import MCPServerStdio

async def main():
    # Launch the server over stdio via uvx and list the tools it exposes.
    async with MCPServerStdio(
        name="OpenAlex MCP For Author disambiguation and works",
        cache_tools_list=True,
        params={
            "command": "uvx",
            "args": [
                "--from", "git+https://github.com/drAbreu/[email protected]",
                "alex-mcp"
            ],
            "env": {
                "OPENALEX_MAILTO": "[email protected]"
            }
        },
        client_session_timeout_seconds=10
    ) as alex_mcp:
        await alex_mcp.connect()
        tools = await alex_mcp.list_tools()
        print(f"Available tools: {[tool.name for tool in tools]}")

asyncio.run(main())
Academic Research Agent Integration
This MCP server is specifically optimized for academic research workflows:
from alex_agent import run_author_research
# Enhanced functionality with streamlined data
result = await run_author_research(
    "Find J. Abreu at EMBO with recent publications"
)
# Clean, structured output for AI processing
print(f"Success: {result['workflow_metadata']['success']}")
print(f"Quality: {result['research_result']['metadata']['result_analysis']['quality_score']}/100")
Features: Comprehensive work data with flexible filtering for targeted queries
Data Optimization
Focused Information Architecture
This MCP server provides focused, structured data specifically designed for AI agent consumption:
Author Data Features
Identity Resolution: Names, ORCID, alternatives for disambiguation
Affiliation Tracking: Current and historical institutional connections
Impact Metrics: Citation counts, h-index, and scholarly impact
Research Context: Fields, concepts, and domain expertise
Career Analysis: Temporal affiliation changes and transitions
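Because ORCID is the strongest identity signal in the list above, it can help to reject malformed identifiers before querying. An ORCID iD ends in a check character computed with ISO 7064 MOD 11-2. Whether this server validates iDs locally is not stated here, so the function below is only a sketch, and the name `is_valid_orcid` is hypothetical.

```python
import re

ORCID_PATTERN = re.compile(r"^([0-9]{4})-([0-9]{4})-([0-9]{4})-([0-9]{3}[0-9X])$")


def is_valid_orcid(orcid):
    """Check the format and ISO 7064 MOD 11-2 check character of an ORCID iD."""
    # Accept both the bare iD and the https://orcid.org/ URL form.
    bare = orcid.strip().removeprefix("https://orcid.org/")
    match = ORCID_PATTERN.match(bare)
    if not match:
        return False
    digits = "".join(match.groups())
    # Fold the first 15 digits into a running total, doubling at each step.
    total = 0
    for char in digits[:15]:
        total = (total + int(char)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[15] == expected


print(is_valid_orcid("0000-0002-1825-0097"))  # -> True
print(is_valid_orcid("0000-0002-1825-0098"))  # -> False
```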
Work Data Features
Publication Metadata: Title, DOI, venue, and publication details
Impact Assessment: Citation counts and scholarly influence
Access Information: Open access status and availability
Authorship Details: Complete author lists and institutional affiliations
Research Classification: Topics, concepts, and domain categorization
Enhanced Filtering
works = await retrieve_author_works(
    author_id="https://openalex.org/A123456789",
    type="journal-article",  # Focus on journal publications
    open_access_is_oa=True,  # Open access only
    order_by="citations",  # Most cited first
    limit=15
)
# Career transition analysis
authors = await search_authors(
    name="J. Abreu",
    institution="EMBO",  # Current institution
    topic="Machine Learning",  # Research focus
    limit=10
)
Example Usage
Author Disambiguation
from alex_mcp.server import search_authors_core
# Comprehensive author search
results = search_authors_core(
    name="J Abreu Vicente",
    institution="EMBO",
    topic="Machine Learning",
    limit=20
)
print(f"Found {results.total_count} candidates")
for author in results.results:
    print(f"- {author.display_name}")
    if author.affiliations:
        current_inst = author.affiliations[0].institution.display_name
        print(f" Institution: {current_inst}")
    print(f" Metrics: {author.cited_by_count} citations, h-index {author.summary_stats.h_index}")
    if author.x_concepts:
        fields = [c.display_name for c in author.x_concepts[:3]]
        print(f" Research: {', '.join(fields)}")
Academic Work Analysis
from alex_mcp.server import retrieve_author_works_core
# Comprehensive work retrieval
works = retrieve_author_works_core(
    author_id="https://openalex.org/A5058921480",
    type="journal-article",  # Academic focus
    order_by="citations",  # Impact-based ordering
    limit=20
)
print(f"Found {works.total_count} publications")
for work in works.results:
    print(f"- {work.title}")
    if work.locations:
        journal = work.locations[0].source.display_name
        print(f" Published in: {journal} ({work.publication_year})")
    print(f" Impact: {work.cited_by_count} citations")
    if work.open_access and work.open_access.is_oa:
        print(" Open Access")
Institution and Field Analysis
from alex_mcp.server import search_authors_core

def analyze_career_path(author_result):
    # Affiliation history, ordered by the first year at each institution
    affiliations = author_result.affiliations
    if len(affiliations) > 1:
        print("Career path:")
        for aff in sorted(affiliations, key=lambda x: min(x.years)):
            years = f"{min(aff.years)}-{max(aff.years)}"
            print(f" {years}: {aff.institution.display_name}")
    # Research evolution
    if author_result.x_concepts:
        print("Research areas:")
        for concept in author_result.x_concepts[:5]:
            print(f" {concept.display_name} (score: {concept.score:.2f})")

# Usage
results = search_authors_core("Jorge Abreu Vicente")
if results.results:
    analyze_career_path(results.results[0])
Development & Testing
Project Structure
alex-mcp/
├── src/alex_mcp/
│   ├── server.py              # Main MCP server
│   ├── data_objects.py        # Data models and structures
│   └── utils.py               # Utility functions
├── examples/
│   ├── basic_usage.py         # Simple examples
│   ├── advanced_queries.py    # Complex query examples
│   └── integration_demo.py    # AI agent integration
├── tests/
│   ├── test_server.py         # Server functionality tests
│   └── test_integration.py    # Integration tests
└── docs/
    └── api_reference.md       # Detailed API documentation
Running Tests
pip install -e ".[test]"
# Run functionality tests
pytest tests/test_server.py -v
# Test with real queries
python examples/basic_usage.py
# Test AI agent integration
python examples/integration_demo.py
Development Examples
python examples/basic_usage.py --query "J. Abreu" --institution "EMBO"
# Test work retrieval
python examples/advanced_queries.py --author-id "A123456789" --type "journal-article"
# Test integration patterns
python examples/integration_demo.py --workflow "career-analysis"
Integration Examples
Academic Research Workflows
The server can supply structured author and work data to an AI-powered research agent:
from alex_agent import AcademicResearchAgent
agent = AcademicResearchAgent(
    mcp_servers=[alex_mcp],  # Streamlined data processing
    model="gpt-4.1-2025-04-14"
)
# Complex research queries with structured data
result = await agent.research_author(
    "Find J. Abreu at EMBO with machine learning publications"
)
# Rich, structured output for AI reasoning
print(f"Quality Score: {result.quality_score}/100")
print(f"Author disambiguation: {result.confidence}")
print(f"Research fields: {result.research_domains}")
Multi-Agent Systems
async def research_collaboration_network(seed_author):
    # Find primary author
    authors = await alex_mcp.search_authors(seed_author)
    primary = authors['results'][0]
    # Get their works
    works = await alex_mcp.retrieve_author_works(
        primary['id'],
        type="journal-article"
    )
    # Analyze co-authors and build network
    collaborators = set()
    for work in works['results']:
        for authorship in work.get('authorships', []):
            collaborators.add(authorship['author']['display_name'])
    return {
        'primary_author': primary,
        'publication_count': len(works['results']),
        'collaborator_network': list(collaborators),
        'research_impact': sum(w['cited_by_count'] for w in works['results'])
    }
Contributing
We welcome contributions to improve functionality and add new features:
Fork the repository
Create a feature branch: git checkout -b feature/enhanced-filtering
Add tests: Ensure your changes maintain data quality and structure
Submit a pull request: Include examples and documentation
Development Priorities
Enhanced filtering capabilities
Additional data enrichment
Performance optimizations
Integration examples
Documentation improvements
License
This project is licensed under the MIT License. See LICENSE for details.