Color guide
Pipeline — how each monthly digest is built
Models in use
Monthly digest — verification (S3) and summarization (S4). Fast and cheap; sufficient for structured fact-checking and short summaries.
Quarterly highlights — curation and narrative generation. More capable model used where output quality matters most.
All calls go through the Anthropic API. No fine-tuning; standard prompt engineering only.
Model versions are pinned in shared/digest_guardrails.py.
Data sources
- arXiv — preprints (cs, eess, q-bio, stat)
- Google Scholar via SerpAPI — papers, citations, news mentions
- PubMed — biomedical publications
- EurekAlert — university press releases
- Faculty websites — scraped per-person (author_specific)
- LinkedIn via Lix API — posts, announcements (credit-limited)
SerpAPI and Lix credits are consumed on every run. To conserve Lix credits,
pass the --skip-linkedin flag to skip LinkedIn enrichment.
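A minimal sketch of how a --skip-linkedin flag could gate the credit-limited source. Only the flag name comes from this page; the function names build_parser and sources_for_run and the source identifiers are hypothetical, and the real pipeline may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI for a digest run; only --skip-linkedin is documented."""
    parser = argparse.ArgumentParser(description="Build the monthly digest (sketch).")
    parser.add_argument(
        "--skip-linkedin",
        action="store_true",
        help="Skip LinkedIn enrichment to conserve Lix credits.",
    )
    return parser


def sources_for_run(skip_linkedin: bool) -> list:
    """Return the data sources to query; identifiers are illustrative."""
    sources = ["arxiv", "google_scholar", "pubmed", "eurekalert", "faculty_websites"]
    if not skip_linkedin:
        # LinkedIn is credit-limited, so it is the only optional source here.
        sources.append("linkedin")
    return sources


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(sources_for_run(args.skip_linkedin))
```

Keeping the flag as a plain boolean means every other source still runs unchanged when LinkedIn is skipped.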
Item provenance badges
Each item in the digest carries a badge showing where it was found:
- author_specific: highest-trust source.
- scraped_webpage
- general: requires stronger LLM verification.
Hover any item's badges to see the LLM's verify_reason.
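As an illustration, a digest item with its provenance and verify_reason could be modelled as below. The three provenance values and the verify_reason field name come from this page; the DigestItem class, its other fields, and the helper functions are assumptions made for the sketch.

```python
from dataclasses import dataclass

# Provenance values named on this page.
KNOWN_PROVENANCE = {"author_specific", "scraped_webpage", "general"}


@dataclass
class DigestItem:
    """Hypothetical shape of one digest item."""
    title: str
    provenance: str
    verify_reason: str = ""

    def __post_init__(self) -> None:
        if self.provenance not in KNOWN_PROVENANCE:
            raise ValueError(f"unknown provenance: {self.provenance!r}")


def needs_stronger_verification(item: DigestItem) -> bool:
    """Items found via general search get the stricter LLM check."""
    return item.provenance == "general"


def badge_tooltip(item: DigestItem) -> str:
    """Text shown on hover; falls back to a note when no reason was stored."""
    return item.verify_reason or "No verification reason recorded."
```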
Known limitations
Pipeline — how each faculty profile is built and maintained
Models in use
All profile pipeline steps — web enrichment, theme extraction, keyword extraction, bio rewriting. Fast and cost-effective for structured extraction tasks.
Faculty profiles run on a weekly schedule and on-demand via "↺ Update Profile." Incremental mode skips LLM steps if the source data hasn't changed.
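One common way to implement this kind of incremental skip is to fingerprint the source data and compare it with the fingerprint stored from the previous run. The sketch below assumes that approach; the function names and the idea of storing a SHA-256 digest are illustrative, not a description of the actual implementation.

```python
import hashlib
import json
from typing import Callable, Optional, Tuple


def fingerprint(source_data: dict) -> str:
    """Stable hash of the source data (key order does not matter)."""
    canonical = json.dumps(source_data, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def update_profile(
    source_data: dict,
    stored_fingerprint: Optional[str],
    run_llm_steps: Callable[[dict], None],
) -> Tuple[str, bool]:
    """Run the LLM steps only when the source data has changed.

    Returns the current fingerprint and whether the LLM steps ran.
    """
    current = fingerprint(source_data)
    if current == stored_fingerprint:
        # Nothing changed since the last run: skip the LLM steps.
        return current, False
    run_llm_steps(source_data)
    return current, True
```

A caller would persist the returned fingerprint next to the profile and pass it back on the next scheduled or on-demand run.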
Known limitations
Pipeline — how a synergies match is computed
Models in use
Material compression when inputs are large. Fast, cheap, used only when needed.
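The "only when needed" behaviour can be sketched as a size check in front of the compression call. The threshold value, the character-based size measure, and the prepare_materials name are all assumptions for illustration; the compress argument stands in for the call to the faster model.

```python
from typing import Callable, Tuple

# Hypothetical threshold; the real limit is not stated on this page.
MAX_CHARS = 20_000


def prepare_materials(
    text: str,
    compress: Callable[[str], str],
    max_chars: int = MAX_CHARS,
) -> Tuple[str, bool]:
    """Return the materials to send on, and whether compression was applied."""
    if len(text) <= max_chars:
        # Small inputs pass through untouched, so no model call is made.
        return text, False
    return compress(text), True
```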
Entity extraction and executive summary — the core matching intelligence. Highest-capability model; costs reflect this (~$0.05–$2 per project depending on materials size).
Cost and CO₂ estimates are shown on each project card after a run.
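A per-project cost figure of this kind is typically derived from token counts and per-token prices. The sketch below shows that arithmetic under stated assumptions: the function name is hypothetical, and the prices are parameters the caller must supply rather than values taken from this page.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> float:
    """Estimate the cost of one run from token usage and caller-supplied prices."""
    input_cost = input_tokens / 1_000_000 * usd_per_million_input
    output_cost = output_tokens / 1_000_000 * usd_per_million_output
    return input_cost + output_cost
```

Because input size dominates for large materials, this is consistent with the cost range above widening as the project materials grow.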