Design Lab Faculty Intelligence

UC San Diego Design Lab
Faculty Profiles
All 58 Design Lab faculty — search, filter, and manage profiles

Faculty Highlights
Publications, news, and recognition across all Design Lab faculty
Synergies Finder
Match Design Lab faculty against external collaborators, partners, and funding calls
AI Model Card
How AI is used across Faculty Profiles, Faculty Highlights, and Synergies Finder

Pipeline — how each monthly digest is built

S1 · Discover
rule-based · free
Crawls arXiv, Google Scholar (via SerpAPI), PubMed, EurekAlert press releases, faculty personal websites, and LinkedIn (via Lix API). Produces a raw candidate list for every faculty member.
S2 · Filter
rule-based · free
Keyword matching against faculty name, UCSD affiliation tokens, and topic keywords. Drops obvious false positives before any LLM call.
S3 · Verify
claude-haiku-4-5 · ~$0.0003/item
LLM checks each candidate: Is this actually about the right person? Is the date within the coverage window? Assigns high / medium / low confidence and writes a verify_reason (visible on item hover).
S4 · Summarize
claude-haiku-4-5 · ~$0.0005/item
LLM writes a 1–2 sentence plain-English description of each verified item. Results cached per-item so reruns skip already-summarized items.
Q · Quarterly
claude-sonnet-4-5 · higher quality
Aggregates 3 months of digest data. A second LLM pass (Sonnet) curates highlights, selects representative items, and generates the downloadable .docx report.
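
The S2 filter above can be pictured as a plain keyword check that runs before any LLM call. The following is a minimal sketch, not the project's code: the Candidate fields, the affiliation tokens, and the rule itself (the name must appear, plus either an affiliation token or a topic keyword) are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical affiliation tokens; the real list is not given in this document.
AFFILIATION_TOKENS = ("ucsd", "uc san diego", "university of california san diego")


@dataclass
class Candidate:
    """A raw item from S1 (field names are illustrative)."""
    title: str
    text: str


def passes_filter(item: Candidate, faculty_name: str, topic_keywords: list[str]) -> bool:
    """Return True if the item should go on to LLM verification."""
    haystack = f"{item.title} {item.text}".lower()
    # The faculty member's name must appear somewhere in the item.
    if faculty_name.lower() not in haystack:
        return False
    # Assumed rule: also require an affiliation token or a topic keyword.
    has_affiliation = any(tok in haystack for tok in AFFILIATION_TOKENS)
    has_topic = any(kw.lower() in haystack for kw in topic_keywords)
    return has_affiliation or has_topic
```

Because this step is free, a filter of this kind can afford to be permissive: anything it lets through is still checked by the LLM in S3.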

Models in use

claude-haiku-4-5

Monthly digest — verification (S3) and summarization (S4). Fast and cheap; sufficient for structured fact-checking and short summaries.

claude-sonnet-4-5

Quarterly highlights — curation and narrative generation. More capable model used where output quality matters most.

All calls go through the Anthropic API. No fine-tuning; standard prompt engineering only. Model versions are pinned in shared/digest_guardrails.py.
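
As a rough worked example of the per-item prices quoted for S3 and S4, the sketch below estimates the LLM cost of one monthly digest. The item counts are made up, and the assumption that every filtered item is verified while only verified items are summarized is an illustration, not something this page states.

```python
VERIFY_COST_PER_ITEM = 0.0003     # S3, approximate figure quoted above
SUMMARIZE_COST_PER_ITEM = 0.0005  # S4, approximate figure quoted above


def estimate_monthly_cost(items_to_verify: int, items_verified: int) -> float:
    """Approximate USD cost of S3 + S4 for one monthly run."""
    return (items_to_verify * VERIFY_COST_PER_ITEM
            + items_verified * SUMMARIZE_COST_PER_ITEM)


# Hypothetical volume: 2,000 candidates survive S2, 600 pass verification.
print(f"${estimate_monthly_cost(2000, 600):.2f}")  # prints $0.90
```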

Data sources

  • arXiv — preprints (cs, eess, q-bio, stat)
  • Google Scholar via SerpAPI — papers, citations, news mentions
  • PubMed — biomedical publications
  • EurekAlert — university press releases
  • Faculty websites — scraped per-person (author_specific)
  • LinkedIn via Lix API — posts, announcements (credit-limited)

SerpAPI and Lix credits are consumed per run. LinkedIn enrichment may be skipped to conserve credits (--skip-linkedin flag).
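
The credit-saving flag mentioned above might be wired up roughly as follows. Only the --skip-linkedin flag name comes from this page; the parser, the sources_for_run helper, and the source labels are hypothetical.

```python
import argparse

# Illustrative labels for the six sources listed above.
ALL_SOURCES = ["arxiv", "scholar", "pubmed", "eurekalert", "faculty_sites", "linkedin"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Monthly digest run (illustrative)")
    parser.add_argument(
        "--skip-linkedin",
        action="store_true",
        help="Skip LinkedIn enrichment to conserve Lix credits",
    )
    return parser


def sources_for_run(args: argparse.Namespace) -> list[str]:
    """Return the sources to crawl, dropping LinkedIn when the flag is set."""
    if args.skip_linkedin:
        return [s for s in ALL_SOURCES if s != "linkedin"]
    return list(ALL_SOURCES)
```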

Item provenance badges

Each item in the digest carries a badge showing where it was found:

📍 Faculty site Found on the faculty member's own website or profiles (author_specific). Highest-trust source.
🌐 Web Discovered by scraping a specific known page (scraped_webpage).
🔍 Search Found via general web or Scholar search (general). Requires stronger LLM verification.
⚠ Medium confidence Verified by the LLM, but flagged with lower certainty. Hover for reasoning.

Hover any item's badges to see the LLM's verify_reason.
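
The badge rules above amount to a small lookup from source type to label, plus a confidence flag. A minimal sketch, assuming an item record with these three fields (only the source-type identifiers and verify_reason are named on this page; how low-confidence items are displayed is not described, so the sketch leaves them without an extra badge):

```python
from dataclasses import dataclass

# Source-type identifiers as named in the badge list above.
SOURCE_BADGES = {
    "author_specific": "📍 Faculty site",
    "scraped_webpage": "🌐 Web",
    "general": "🔍 Search",
}


@dataclass
class DigestItem:
    """Illustrative item record; field names other than verify_reason are assumed."""
    source_type: str
    confidence: str      # "high", "medium", or "low"
    verify_reason: str   # shown on hover


def badges_for(item: DigestItem) -> list[str]:
    """Return the badges to render for one digest item."""
    badges = [SOURCE_BADGES[item.source_type]]
    if item.confidence == "medium":
        badges.append("⚠ Medium confidence")
    return badges
```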

Known limitations

🔍 Coverage gaps. Items outside indexed sources (conference websites, institutional repositories, social media beyond LinkedIn) will be missed.
💳 LinkedIn is credit-limited. Not every faculty member is enriched every run; LinkedIn-sourced items may appear inconsistently.
📅 Date attribution. Items are dated by publication, not discovery. A paper published in March but indexed in May may appear in the May digest.
🤖 LLM summaries. Descriptions are AI-generated. Verify against the source link before citing or forwarding to faculty.
🔁 No deduplication across months. An item that spans a month boundary may appear in two consecutive digests.

Pipeline — how each faculty profile is built and maintained

Collect
rule-based · free
Pulls structured data from UCSD Profiles (bio, title, department), ORCID (publication IDs), and faculty personal websites. Produces a raw profile record for each faculty member.
Enrich
claude-haiku-4-5 · per-run
Scrapes and parses each faculty member's website. LLM extracts structured information: current projects, lab name, student roster, and recent highlights that aren't in UCSD Profiles.
Theme extraction
claude-haiku-4-5 · per-run
LLM reads the full profile text and assigns concise research theme tags (e.g., "Human-Computer Interaction", "AI for Health"). Used for the color-coded faculty map and cross-faculty filtering.
Keyword extraction
claude-haiku-4-5 · per-run
A second LLM pass extracts fine-grained research keywords from the bio, publications, and website content. Keywords power the search bar and Synergies matching.
Bio rewrite
claude-haiku-4-5 · per-run
LLM rewrites raw scraped bios into a consistent, readable style. Original source text is retained; the rewrite is used for display only.

Models in use

claude-haiku-4-5

All profile pipeline steps — web enrichment, theme extraction, keyword extraction, bio rewriting. Fast and cost-effective for structured extraction tasks.

Faculty profiles run on a weekly schedule and on-demand via "↺ Update Profile." Incremental mode skips LLM steps if the source data hasn't changed.
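
One common way to implement an incremental mode like the one described above is to fingerprint the collected source text and skip the LLM steps when the fingerprint matches the previous run. This is a sketch of that general technique, not the project's implementation; the function names and the in-memory `seen` store are hypothetical.

```python
import hashlib


def content_fingerprint(source_text: str) -> str:
    """Stable SHA-256 fingerprint of the collected source text."""
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()


def needs_llm_pass(faculty_id: str, source_text: str, seen: dict[str, str]) -> bool:
    """Return True if the source changed since the last run, recording the new fingerprint."""
    fingerprint = content_fingerprint(source_text)
    if seen.get(faculty_id) == fingerprint:
        return False  # unchanged: skip enrichment, themes, keywords, bio rewrite
    seen[faculty_id] = fingerprint
    return True
```

In a real deployment the `seen` mapping would need to persist between runs, for example in a file or database.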

Known limitations

🌐 Website coverage. Faculty without a personal website or with difficult-to-parse pages (heavy JS, PDFs) will have thinner profiles.
🤖 LLM-generated content. Bios, themes, and keywords are AI-generated from scraped text. Verify before using in official communications.
📅 Staleness. Profiles reflect the last run date. Changes that faculty make to their websites between runs won't show up until the next run.
👤 UCSD Profiles dependency. Title, department, and affiliation fields come from UCSD Profiles. Errors there propagate here.

Pipeline — how a synergies match is computed

Ingest
rule-based · free
Reads uploaded materials (PDF, DOCX, PPTX) or URLs you provide. Extracts plain text from each source.
Compress
claude-haiku-4-5 · fast pre-pass
If materials exceed the context window, a fast LLM pass summarizes them down to key entities, goals, and technical terms before the main match.
Entity extraction
claude-opus-4-5 · deep analysis
Opus reads the project materials (or their compressed form, when the Compress step ran) and extracts structured entities: research goals, technical requirements, domain expertise needed, and keywords.
Faculty matching
keyword + embedding · free tier
Extracted entities are matched against faculty keyword profiles using keyword overlap scoring. Returns a ranked candidate list with match rationale.
Executive summary
claude-opus-4-5 · highest quality
Opus writes the final report: why each matched faculty member is relevant, specific overlap with the project, and a synthesis of cross-faculty synergies. Exported as a .docx downloadable from the project card.
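
The keyword overlap scoring in the Faculty matching step can be sketched as below. The exact formula is not given on this page; this version scores each faculty member by the share of project keywords found in their profile, which is one plausible choice among several. Function names are hypothetical, and keywords are assumed to be normalized (for example, lowercased) beforehand.

```python
def overlap_score(project_keywords: set[str], faculty_keywords: set[str]) -> float:
    """Fraction of project keywords that also appear in the faculty profile."""
    if not project_keywords:
        return 0.0
    return len(project_keywords & faculty_keywords) / len(project_keywords)


def rank_faculty(
    project_keywords: set[str], profiles: dict[str, set[str]]
) -> list[tuple[str, float, list[str]]]:
    """Rank faculty by overlap; each row is (name, score, shared keywords)."""
    rows = []
    for name, keywords in profiles.items():
        shared = sorted(project_keywords & keywords)
        if shared:  # drop faculty with no overlap at all
            rows.append((name, overlap_score(project_keywords, keywords), shared))
    # Highest score first; ties broken alphabetically by name.
    return sorted(rows, key=lambda row: (-row[1], row[0]))
```

The list of shared keywords returned with each row is the simplest possible match rationale; the prose rationale in the final report is written by the LLM.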

Models in use

claude-haiku-4-5

Material compression when inputs are large. Fast, cheap, used only when needed.

claude-opus-4-5

Entity extraction and executive summary — the core matching intelligence. Highest-capability model; costs reflect this (~$0.05–$2 per project depending on materials size).

Cost and CO₂ estimates are shown on each project card after a run.

Known limitations

📄 Material quality matters. Vague or generic project descriptions produce weak matches. The more specific the materials, the better the output.
👤 Profile staleness. Matching is only as good as the faculty keyword profiles. Faculty with outdated or thin profiles may be under-matched.
🤖 LLM match rationale. The reasoning in the report is AI-generated. Review matches against the faculty's actual publications before making introductions.
💰 Cost per run. Opus is expensive relative to Haiku. Large uploads (multi-PDF decks) can cost $1–2 per run. The cost of each run is shown on its project card.

Source: github.com/UCSD-DesignLab/dlab-faculty-intelligence