A peer-reviewed pub crawl through the literature

MCP server giving LLMs access to PubMed, FDA drug labelling, UK medicines data, and ClinicalTrials.gov. Built by PharmaTools.AI.

v2.1.1 · TypeScript · MCP Protocol · MIT
12 MCP Tools · 6 External APIs · ~2.1k Lines of Code · 27 Type Definitions

Repository Structure

Source layout and module organisation

  pubcrawl/
  • 📁 src/
    • ⚡ index.ts 50 lines
    • 🔷 types.ts 168 lines
    • 📁 lib/
      • 📄 ncbi.ts 221 lines
      • 📄 xml-parser.ts 296 lines
      • 📄 clinicaltrials.ts 261 lines
      • 📄 emc.ts 241 lines
      • 📄 openfda.ts 117 lines
      • 📄 dailymed.ts 78 lines
      • 📄 cache.ts 66 lines
      • 📄 label-mapping.ts 33 lines
    • 📁 tools/
      • 🔧 search.ts 83 lines
      • 🔧 abstract.ts 133 lines
      • 🔧 fulltext.ts 117 lines
      • 🔧 cite.ts 225 lines
      • 🔧 trending.ts 122 lines
      • 🔧 related.ts 76 lines
      • 🔧 uspi.ts 141 lines
      • 🔧 smpc.ts 66 lines
      • 🔧 compare-labels.ts 154 lines
      • 🔧 search-indication.ts 94 lines
      • 🔧 trials-search.ts 78 lines
      • 🔧 trials-detail.ts 51 lines
Config & Meta
  • 📦 package.json (@pharmatools/pubcrawl)
  • ⚙️ tsconfig.json (ES2022 / NodeNext)
  • 🔌 server.json (MCP manifest)
  • 📄 README.md
  • 📄 CLAUDE.md (agent hints)
  • 🔒 .env.example (NCBI_API_KEY)
  • ⚖️ LICENSE (MIT)
Dependencies
@modelcontextprotocol/sdk ^1.0.0
cheerio ^1.2.0
dotenv ^16.0.0
fast-xml-parser ^4.3.0
typescript ^5.3.0
zod via MCP SDK

Architecture

Module relationships and data flow through the system

Entry
  • index.ts: MCP server init, registers 12 tools via StdioServerTransport
Tools
  • search, abstract, fulltext, related, cite, trending
  • uspi, smpc, compare-labels, search-indication
  • trials-search, trials-detail
Libraries
  • ncbi: esearch, efetch, elink, esummary, pmidToPmcid
  • xml-parser: PubMed XML, JATS, SPL parsing
  • emc: eMC search + SmPC HTML scraping
  • dailymed: SPL XML retrieval
  • openfda: drug label search by indication
  • clinicaltrials: ClinicalTrials.gov v2 API
  • cache: LRU cache (500 entries, TTL-based)
  • label-mapping: US/UK section mapping (LOINC ↔ SmPC)
External
  • NCBI E-Utilities
  • PubMed Central OA
  • DailyMed / NLM
  • OpenFDA
  • eMC (medicines.org.uk)
  • ClinicalTrials.gov v2
Types
  • types.ts: 27 interfaces, including PubMedArticle, FullAbstract, USPIResult, SmPCResult, ClinicalTrialDetail, …

Data Flow

How a request moves through PubCrawl

🤖 LLM Client (Claude, Cursor, etc.)
  → MCP → ⚡ StdioTransport (JSON-RPC over stdin/out)
  → 🔧 Tool Handler (Zod schema validation)
  → 📚 Library (rate-limited fetch + cache)
  → 🌐 External API (PubMed, FDA, eMC, CT.gov)
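The first hops of that path can be sketched as a JSON-RPC `tools/call` request being routed to a handler. This is a minimal illustration, not PubCrawl's code: the argument names (`query`, `max_results`) are assumptions, a hand-written validator stands in for the Zod schema, and the library call is stubbed so the example runs offline.

```typescript
// Shape of an MCP tool invocation as it arrives over stdio (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical input type for search_pubmed; field names are assumptions.
interface SearchInput {
  query: string;
  max_results: number;
}

// Stand-in for Zod validation: reject bad input before any network call.
function parseSearchInput(args: Record<string, unknown>): SearchInput {
  const query = args["query"];
  const max = args["max_results"] ?? 10;
  if (typeof query !== "string" || query.trim() === "") {
    throw new Error("query must be a non-empty string");
  }
  if (typeof max !== "number" || !Number.isInteger(max) || max < 1) {
    throw new Error("max_results must be a positive integer");
  }
  return { query, max_results: max };
}

type Handler = (args: Record<string, unknown>) => string;

// Registry keyed by tool name; the real handler would call into lib/ here.
const handlers: Record<string, Handler> = {
  search_pubmed: (args) => {
    const input = parseSearchInput(args);
    return `would search PubMed for "${input.query}" (max ${input.max_results})`;
  },
};

// Route a request to its handler, failing loudly on unknown tool names.
function dispatch(req: ToolCallRequest): string {
  const handler = handlers[req.params.name];
  if (!handler) throw new Error(`unknown tool: ${req.params.name}`);
  return handler(req.params.arguments);
}
```

Validating before dispatching to the library layer means malformed input never consumes a rate-limited request slot.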

Tools Reference

12 MCP tools across three domains

search_pubmed Literature

Search PubMed with filters for date range, article type, and sort order. Returns PMIDs, titles, authors, journals, and DOIs.

📄 tools/search.ts → lib/ncbi.ts
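As a rough sketch of how these filters might map onto an E-utilities `esearch` request: the option names and the `buildEsearchUrl` helper below are hypothetical, while the query parameters (`db`, `term`, `retmax`, `sort`, `datetype`, `mindate`, `maxdate`, `api_key`) are standard E-utilities parameters.

```typescript
// Hypothetical option names; only the URL parameters come from E-utilities.
interface SearchOptions {
  query: string;
  articleType?: string; // e.g. "Review", "Clinical Trial"
  maxResults?: number;
  sort?: "relevance" | "pub_date";
  minDate?: string; // YYYY/MM/DD
  maxDate?: string; // YYYY/MM/DD
  apiKey?: string;
}

const ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi";

function buildEsearchUrl(opts: SearchOptions): string {
  // Article type is expressed inside the query using a PubMed field tag.
  const term = opts.articleType
    ? `(${opts.query}) AND "${opts.articleType}"[Publication Type]`
    : opts.query;

  const params = new URLSearchParams({
    db: "pubmed",
    term,
    retmode: "json",
    retmax: String(opts.maxResults ?? 10),
    sort: opts.sort ?? "relevance",
  });

  // Date range filters on publication date; both bounds are required together.
  if (opts.minDate && opts.maxDate) {
    params.set("datetype", "pdat");
    params.set("mindate", opts.minDate);
    params.set("maxdate", opts.maxDate);
  }
  if (opts.apiKey) params.set("api_key", opts.apiKey);

  return `${ESEARCH}?${params.toString()}`;
}
```

`esearch` returns only PMIDs, so titles, authors, journals, and DOIs would need a follow-up call such as `esummary`.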

get_abstract Literature

Full structured abstract: labeled sections (background, methods, results, conclusions) with keywords and MeSH terms.

📄 tools/abstract.ts → lib/ncbi.ts, lib/xml-parser.ts

get_full_text Literature

Full text of open-access PMC articles: parsed sections, figure and table captions, and reference counts.

📄 tools/fulltext.ts → lib/ncbi.ts, lib/xml-parser.ts

find_related Literature

Find similar articles using PubMed's neighbor algorithm, ranked by relevance score.

📄 tools/related.ts → lib/ncbi.ts

format_citation Literature

Generate formatted citations in APA, Vancouver, Harvard, or BibTeX style from a PMID.

📄 tools/cite.ts → lib/ncbi.ts, lib/xml-parser.ts
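To give a feel for the formatting step, here is a deliberately simplified sketch of two of the styles. The `CitationSource` shape, its `year` field, and the assumption that authors arrive as "Surname Initials" are all illustrative; volume, issue, and pages are omitted, so the output is not a complete Vancouver reference.

```typescript
// Assumed input shape; not taken from types.ts.
interface CitationSource {
  pmid: string;
  title: string;
  authors: string[]; // assumed "Surname Initials", e.g. "Smith JA"
  journal: string;
  year: number;
  doi?: string;
}

// Simplified Vancouver: list at most six authors, then "et al".
function formatVancouver(a: CitationSource): string {
  const names =
    a.authors.length > 6 ? a.authors.slice(0, 6).concat("et al") : a.authors;
  const title = a.title.replace(/\.$/, ""); // avoid a doubled full stop
  const doi = a.doi ? ` doi:${a.doi}` : "";
  return `${names.join(", ")}. ${title}. ${a.journal}. ${a.year}.${doi}`;
}

// BibTeX @article entry keyed on the PMID.
function formatBibtex(a: CitationSource): string {
  const fields = [
    `  author = {${a.authors.join(" and ")}}`,
    `  title = {${a.title}}`,
    `  journal = {${a.journal}}`,
    `  year = {${a.year}}`,
  ];
  if (a.doi) fields.push(`  doi = {${a.doi}}`);
  return `@article{pmid${a.pmid},\n${fields.join(",\n")}\n}`;
}
```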

trending_papers Literature

Recent papers on a topic, optionally filtered to high-impact journals (Nature, NEJM, Lancet, JAMA, etc.).

📄 tools/trending.ts → lib/ncbi.ts
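One plausible way to express "recent, optionally high-impact" as a PubMed query is to combine a `[Journal]` field filter with the E-utilities `reldate` parameter. The journal list and helper names below are hypothetical; the tool's actual list and query construction are not shown in this overview.

```typescript
// Hypothetical journal list, using PubMed journal title abbreviations.
const HIGH_IMPACT_JOURNALS = ["Nature", "N Engl J Med", "Lancet", "JAMA"];

// Build the search term, optionally restricting to the journal list.
function buildTrendingTerm(topic: string, highImpactOnly: boolean): string {
  if (!highImpactOnly) return `(${topic})`;
  const journals = HIGH_IMPACT_JOURNALS
    .map((j) => `"${j}"[Journal]`)
    .join(" OR ");
  return `(${topic}) AND (${journals})`;
}

// esearch parameters: reldate limits results to the last N days by publication date.
function buildTrendingParams(
  topic: string,
  days: number,
  highImpactOnly: boolean
): URLSearchParams {
  return new URLSearchParams({
    db: "pubmed",
    term: buildTrendingTerm(topic, highImpactOnly),
    reldate: String(days),
    datetype: "pdat",
    sort: "pub_date",
    retmode: "json",
  });
}
```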

get_uspi Drug Label

US Prescribing Information from DailyMed: indications, dosing, warnings, contraindications. Parsed from FDA SPL XML.

📄 tools/uspi.ts → lib/dailymed.ts, lib/xml-parser.ts

get_smpc Drug Label

UK Summary of Product Characteristics from eMC: numbered SmPC sections via HTML scraping.

📄 tools/smpc.ts → lib/emc.ts

compare_labels Drug Label

Side-by-side comparison of the US label (USPI) and the UK label (SmPC), for spotting regulatory differences in indications, warnings, and dosing.

📄 tools/compare-labels.ts → lib/dailymed.ts, lib/emc.ts, lib/label-mapping.ts
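The comparison depends on aligning US SPL sections (identified by LOINC codes) with the numbered SmPC sections. A minimal sketch of such a mapping follows; the table contents, the `LabelSection` and `LabelComparison` shapes, and the `compareLabels` helper are assumptions for illustration, not the contents of label-mapping.ts.

```typescript
// One row per clinical topic: SPL LOINC section code and SmPC section number.
interface SectionMapping {
  topic: string;
  loinc: string;
  smpc: string;
}

// Illustrative subset; the real mapping may cover more or different sections.
const SECTION_MAP: SectionMapping[] = [
  { topic: "indications", loinc: "34067-9", smpc: "4.1" },
  { topic: "dosing", loinc: "34068-7", smpc: "4.2" },
  { topic: "contraindications", loinc: "34070-3", smpc: "4.3" },
  { topic: "warnings", loinc: "43685-7", smpc: "4.4" },
  { topic: "interactions", loinc: "34073-7", smpc: "4.5" },
  { topic: "adverse_reactions", loinc: "34084-4", smpc: "4.8" },
];

// Assumed shapes: id is a LOINC code for US sections, a section number for UK.
interface LabelSection {
  id: string;
  title: string;
  text: string;
}

interface LabelComparison {
  topic: string;
  us?: LabelSection;
  uk?: LabelSection;
}

// Pair up sections topic by topic; a missing side stays undefined so the
// caller can report "present in one label only".
function compareLabels(us: LabelSection[], uk: LabelSection[]): LabelComparison[] {
  return SECTION_MAP.map((m) => ({
    topic: m.topic,
    us: us.find((s) => s.id === m.loinc),
    uk: uk.find((s) => s.id === m.smpc),
  }));
}
```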

search_by_indication Drug Label

Find drugs approved for a medical condition. Searches FDA labels via OpenFDA, then cross-references UK availability on eMC.

📄 tools/search-indication.ts → lib/openfda.ts, lib/emc.ts
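The OpenFDA half of this lookup can be approximated as a phrase search on the `indications_and_usage` field of the drug label endpoint. The helper names below are hypothetical and the response type is trimmed to the two `openfda` fields used; the eMC cross-reference is not shown.

```typescript
const OPENFDA_LABEL = "https://api.fda.gov/drug/label.json";

// Build a phrase query against the indications section of FDA labels.
function buildIndicationUrl(condition: string, limit: number = 10): string {
  const phrase = condition.replace(/"/g, ""); // keep the quoted phrase well-formed
  const params = new URLSearchParams({
    search: `indications_and_usage:"${phrase}"`,
    limit: String(limit),
  });
  return `${OPENFDA_LABEL}?${params.toString()}`;
}

// Trimmed view of one label record in the `results` array.
interface OpenFdaLabel {
  openfda?: { brand_name?: string[]; generic_name?: string[] };
}

// Collect one name per label (generic preferred), de-duplicated and sorted.
function extractDrugNames(results: OpenFdaLabel[]): string[] {
  const names = new Set<string>();
  for (const r of results) {
    const name = r.openfda?.generic_name?.[0] ?? r.openfda?.brand_name?.[0];
    if (name) names.add(name.toLowerCase());
  }
  return Array.from(names).sort();
}
```

Some label records carry an empty `openfda` object, which is why every field access above is optional.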

search_trials Trials

Search ClinicalTrials.gov, filtering by condition, intervention, status, and phase. Returns NCT IDs, sponsors, and enrollment.

📄 tools/trials-search.ts → lib/clinicaltrials.ts

get_trial Trials

Full trial details: eligibility, study design, arms, primary/secondary outcomes, locations, associated PMIDs.

📄 tools/trials-detail.ts → lib/clinicaltrials.ts
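For the two trial tools, the sketch below shows how a ClinicalTrials.gov v2 study record might be flattened into the summary fields listed later in this document. The helper names and the string type chosen for `phase` are assumptions, and phase filtering is left out; the nested `protocolSection` paths follow the v2 API's study structure.

```typescript
const CTGOV_STUDIES = "https://clinicaltrials.gov/api/v2/studies";

// Search URL using v2 query and filter parameters.
function buildTrialsSearchUrl(condition: string, status?: string, pageSize: number = 10): string {
  const params = new URLSearchParams({
    "query.cond": condition,
    pageSize: String(pageSize),
  });
  if (status) params.set("filter.overallStatus", status); // e.g. "RECRUITING"
  return `${CTGOV_STUDIES}?${params.toString()}`;
}

// Detail URL for a single study by NCT ID.
function buildTrialDetailUrl(nctId: string): string {
  return `${CTGOV_STUDIES}/${encodeURIComponent(nctId)}`;
}

// Trimmed view of a v2 study record; only the fields read below.
interface RawStudy {
  protocolSection?: {
    identificationModule?: { nctId?: string };
    statusModule?: { overallStatus?: string };
    designModule?: { phases?: string[] };
    armsInterventionsModule?: { interventions?: { name?: string }[] };
  };
}

interface ClinicalTrialSummary {
  nct_id: string;
  status: string;
  phase: string; // assumed: phases joined with "/"
  interventions: string[];
}

// Flatten the nested record, tolerating missing modules.
function toSummary(study: RawStudy): ClinicalTrialSummary {
  const p = study.protocolSection;
  return {
    nct_id: p?.identificationModule?.nctId ?? "",
    status: p?.statusModule?.overallStatus ?? "UNKNOWN",
    phase: (p?.designModule?.phases ?? []).join("/") || "N/A",
    interventions: (p?.armsInterventionsModule?.interventions ?? [])
      .map((i) => i.name ?? "")
      .filter((n) => n !== ""),
  };
}
```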

External Data Sources

Six APIs powering the tools, all rate-limited and cached

  • NCBI E-Utilities: PubMed search, fetch, link, summary
  • PubMed Central: open-access full text (JATS XML)
  • DailyMed / NLM: FDA structured product labels (SPL)
  • OpenFDA: drug label search by indication
  • eMC: UK medicines SmPC (HTML scraping)
  • ClinicalTrials.gov: trial search & detail (v2 API)
Cache TTLs
  • SEARCH: 1 hour
  • ABSTRACT: 24 hours
  • FULLTEXT: 24 hours
  • RELATED: 1 hour
  • LABEL: 24 hours
  • TRIAL: 4 hours
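A small sketch of how an LRU cache with per-entry TTLs could be put together, using the TTL values above and the `CacheEntry<T>` fields (`value`, `timestamp`, `ttl`) listed under Type Definitions. The class name, method names, and injectable clock are assumptions; cache.ts may be organised differently.

```typescript
interface CacheEntry<T> {
  value: T;
  timestamp: number; // ms since epoch when stored
  ttl: number; // lifetime in ms
}

const HOUR = 60 * 60 * 1000;

// TTLs from the list above, in milliseconds.
const TTL_MS = {
  SEARCH: 1 * HOUR,
  ABSTRACT: 24 * HOUR,
  FULLTEXT: 24 * HOUR,
  RELATED: 1 * HOUR,
  LABEL: 24 * HOUR,
  TRIAL: 4 * HOUR,
} as const;

class LruCache<T> {
  // Map preserves insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly maxEntries: number;
  private readonly now: () => number;

  constructor(maxEntries: number = 500, now: () => number = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.timestamp > entry.ttl) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert to mark this key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: T, ttl: number): void {
    this.entries.delete(key);
    this.entries.set(key, { value, timestamp: this.now(), ttl });
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```

A caller would key entries on the tool name plus its arguments, for example `cache.set("search:statins", result, TTL_MS.SEARCH)`. Passing the clock in as a function keeps expiry testable without waiting for real time to pass.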

Type Definitions

Key interfaces from types.ts

Interface | Purpose | Key Fields
PubMedArticle | Search result summary | pmid, title, authors[], journal, doi
FullAbstract | Complete abstract with metadata | abstract_sections[], keywords[], mesh_terms[]
FullTextResult | PMC open-access full text | sections[], figure_captions[], table_captions[]
RelatedArticle | extends PubMedArticle | + relevance_score
USPIResult | US prescribing information | drug_name, setid, sections: LabelSection[]
SmPCResult | UK SmPC | drug_name, product_id, sections: LabelSection[]
CompareLabelsResult | US/UK comparison | comparisons: LabelComparison[]
IndicationSearchResult | Drugs by condition | condition, drugs: DrugApprovalEntry[]
ClinicalTrialSummary | Trial search result | nct_id, status, phase, interventions[]
ClinicalTrialDetail | extends Summary | + eligibility, design, arms[], outcomes[]
CacheEntry<T> | LRU cache wrapper | value: T, timestamp, ttl
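To make the "extends" rows concrete, a few of these interfaces might be declared roughly as follows. Only the field names come from the table; the field types, the optional `doi`, the body of `LabelSection`, and the `rankRelated` helper are assumptions.

```typescript
interface PubMedArticle {
  pmid: string;
  title: string;
  authors: string[];
  journal: string;
  doi?: string; // assumed optional: not every record has a DOI
}

// Inherits every PubMedArticle field and adds the neighbor score.
interface RelatedArticle extends PubMedArticle {
  relevance_score: number;
}

// Assumed minimal shape shared by US and UK label sections.
interface LabelSection {
  title: string;
  text: string;
}

interface USPIResult {
  drug_name: string;
  setid: string; // DailyMed set ID
  sections: LabelSection[];
}

interface SmPCResult {
  drug_name: string;
  product_id: string; // eMC product ID
  sections: LabelSection[];
}

// Hypothetical helper: highest relevance first, without mutating the input.
function rankRelated(articles: RelatedArticle[]): RelatedArticle[] {
  return articles.slice().sort((a, b) => b.relevance_score - a.relevance_score);
}
```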

Rate Limiting

Per-API request throttling to respect service limits

API | Delay | Timeout | Notes
NCBI | 300 ms (100 ms with key) | 15 s | API key raises the limit from 3 req/s to 10 req/s
DailyMed | 200 ms | 20 s | NLM services
eMC | 1000 ms | 20 s | HTML scraping, polite crawling
OpenFDA | 300 ms | 15 s | No key required
ClinicalTrials.gov | 1200 ms | 15 s | ~50 req/min
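A minimal sketch of per-API throttling based on a fixed minimum delay between requests, using the delays and timeouts from the table above. The `Throttle` class, its method names, and the `limitsFor` helper are hypothetical; timeouts are carried as configuration only and are not enforced here.

```typescript
interface ApiLimits {
  delayMs: number; // minimum gap between consecutive requests
  timeoutMs: number; // per-request timeout
}

// Values from the table above; the NCBI delay depends on whether a key is set.
function limitsFor(hasNcbiKey: boolean): Record<string, ApiLimits> {
  return {
    ncbi: { delayMs: hasNcbiKey ? 100 : 300, timeoutMs: 15000 },
    dailymed: { delayMs: 200, timeoutMs: 20000 },
    emc: { delayMs: 1000, timeoutMs: 20000 },
    openfda: { delayMs: 300, timeoutMs: 15000 },
    clinicaltrials: { delayMs: 1200, timeoutMs: 15000 },
  };
}

class Throttle {
  private readonly delayMs: number;
  private nextAllowed = 0; // earliest time (ms) the next request may start

  constructor(delayMs: number) {
    this.delayMs = delayMs;
  }

  // Reserve the next slot and return how long the caller must wait (ms).
  reserve(now: number = Date.now()): number {
    const start = Math.max(now, this.nextAllowed);
    this.nextAllowed = start + this.delayMs;
    return start - now;
  }

  // Wait for the reserved slot, then run the task.
  async run<T>(task: () => Promise<T>): Promise<T> {
    const wait = this.reserve();
    if (wait > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, wait));
    }
    return task();
  }
}
```

With one `Throttle` per API, concurrent tool calls queue up behind each other rather than bursting. Because slots are reserved synchronously, two calls made in the same tick still end up `delayMs` apart.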