Design Lab Faculty Intelligence

Faculty Profiles

All … Design Lab faculty — search, filter, and manage profiles

Faculty Highlights

Publications, news, and recognition across all Design Lab faculty

Synergies Finder

Match Design Lab faculty against external collaborators, partners, and funding calls

AI Model Card

How AI is used across Faculty Profiles, Faculty Highlights, and Synergies Finder

Pipeline — how each monthly digest is built

S1 · Discover

rule-based · free

Crawls arXiv, Google Scholar (via SerpAPI), PubMed, EurekAlert press releases, faculty personal websites, and LinkedIn (via Lix API). Produces a raw candidate list for every faculty member.

S2 · Filter

rule-based · free

Keyword matching against faculty name, UCSD affiliation tokens, and topic keywords. Drops obvious false positives before any LLM call.
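
A rule-based filter of this kind could be sketched as follows. The function name, the affiliation token list, and the rule itself (the surname must appear, plus at least one affiliation token or topic keyword) are assumptions for illustration; the actual rules are not given in this card.

```python
import re

# Hypothetical affiliation tokens; the real list is not given in this card.
AFFILIATION_TOKENS = {"ucsd", "uc san diego", "university of california san diego"}


def _norm(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def passes_filter(text: str, last_name: str, topic_keywords: set) -> bool:
    """Keep a candidate only if it names the person and has supporting context."""
    padded = f" {_norm(text)} "

    def has(phrase: str) -> bool:
        # Whole-word match, so "Smith" does not match "Blacksmith".
        return f" {_norm(phrase)} " in padded

    if not has(last_name):
        return False
    return any(has(t) for t in AFFILIATION_TOKENS) or any(has(k) for k in topic_keywords)
```

Matching on whole words keeps the filter cheap while avoiding the most common substring false positives, which is the point of running it before any paid LLM call.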

S3 · Verify

claude-haiku-4-5 · ~$0.0003/item

LLM checks each candidate: Is this actually about the right person? Is the date within the coverage window? Assigns high / medium / low confidence and writes a verify_reason (visible on item hover).
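
One way to represent the result of this step is sketched below. It assumes the model is asked to reply with a JSON object containing `confidence` and `verify_reason` keys; that reply format, the function names, and the fallback to "low" on an unexpected value are illustrative assumptions, not the project's confirmed behavior.

```python
import json
from dataclasses import dataclass
from datetime import date

VALID_CONFIDENCE = ("high", "medium", "low")


@dataclass
class VerifyResult:
    confidence: str      # one of VALID_CONFIDENCE
    verify_reason: str   # short explanation shown on item hover


def parse_verify_reply(raw_reply: str) -> VerifyResult:
    """Parse a (hypothetical) JSON reply; unknown confidence values become 'low'."""
    data = json.loads(raw_reply)
    confidence = str(data.get("confidence", "")).lower()
    if confidence not in VALID_CONFIDENCE:
        confidence = "low"
    return VerifyResult(confidence, str(data.get("verify_reason", "")))


def in_coverage_window(item_date: date, start: date, end: date) -> bool:
    """True if the item's date falls inside the digest window (inclusive)."""
    return start <= item_date <= end
```

Treating any unrecognized confidence label as "low" is a conservative choice: a malformed reply then cannot promote an item.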

S4 · Summarize

claude-haiku-4-5 · ~$0.0005/item

LLM writes a 1–2 sentence plain-English description of each verified item. Results cached per-item so reruns skip already-summarized items.
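
Per-item caching of this kind can be sketched with a small JSON file keyed by a hash of each item's URL. The cache layout, the `url` field, and the `summarize` callable (a stand-in for the LLM call) are assumptions for illustration.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def item_key(item: dict) -> str:
    """Stable cache key derived from the item's URL (hypothetical field name)."""
    return hashlib.sha256(item["url"].encode("utf-8")).hexdigest()


def summarize_with_cache(items: list, cache_path: Path,
                         summarize: Callable[[dict], str]) -> list:
    """Return one summary per item, calling `summarize` only on cache misses."""
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    for item in items:
        key = item_key(item)
        if key not in cache:
            cache[key] = summarize(item)  # the only place an LLM call would happen
    cache_path.write_text(json.dumps(cache, indent=2))
    return [cache[item_key(item)] for item in items]
```

On a rerun with the same items, every key is already present, so no summarization calls are made.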

Q · Quarterly

claude-sonnet-4-5 · higher quality

Aggregates 3 months of digest data. A second LLM pass (Sonnet) curates highlights, selects representative items, and generates the downloadable .docx report.

Models in use

claude-haiku-4-5

Monthly digest — verification (S3) and summarization (S4). Fast and cheap; sufficient for structured fact-checking and short summaries.

claude-sonnet-4-5

Quarterly highlights — curation and narrative generation. More capable model used where output quality matters most.

All calls go through the Anthropic API. No fine-tuning; standard prompt engineering only. Model versions are pinned in shared/digest_guardrails.py.
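
As a rough worked example, the per-item prices quoted for S3 and S4 give an approximate LLM cost for a monthly run. The sketch assumes every filtered candidate is verified once and only verified items are summarized; it ignores the quarterly Sonnet pass and SerpAPI and Lix credits. The function name is hypothetical.

```python
VERIFY_COST_PER_ITEM = 0.0003     # approximate S3 price quoted above, USD
SUMMARIZE_COST_PER_ITEM = 0.0005  # approximate S4 price quoted above, USD


def estimate_digest_cost(n_filtered: int, n_verified: int) -> float:
    """Approximate LLM cost in USD for one monthly digest run."""
    if n_verified > n_filtered:
        raise ValueError("verified items cannot exceed filtered candidates")
    return n_filtered * VERIFY_COST_PER_ITEM + n_verified * SUMMARIZE_COST_PER_ITEM
```

For 1,000 filtered candidates of which 400 pass verification, this gives 1000 × 0.0003 + 400 × 0.0005, or about $0.50.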

Data sources

arXiv — preprints (cs, eess, q-bio, stat)
Google Scholar via SerpAPI — papers, citations, news mentions
PubMed — biomedical publications
EurekAlert — university press releases
Faculty websites — scraped per-person (author_specific)
LinkedIn via Lix API — posts, announcements (credit-limited)

SerpAPI and Lix credits are consumed per run. LinkedIn enrichment may be skipped to conserve credits (--skip-linkedin flag).
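
A flag like `--skip-linkedin` could be wired up with `argparse` roughly as follows. Only the flag name comes from this card; the parser description, the source identifiers, and the helper names are illustrative assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run the monthly digest (sketch).")
    parser.add_argument(
        "--skip-linkedin",
        action="store_true",
        help="Skip LinkedIn enrichment to conserve Lix credits.",
    )
    return parser


def enabled_sources(args: argparse.Namespace) -> list:
    """Return the sources to crawl; identifiers here are hypothetical."""
    sources = ["arxiv", "google_scholar", "pubmed", "eurekalert", "faculty_websites"]
    if not args.skip_linkedin:
        sources.append("linkedin")
    return sources
```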

Item provenance badges

Each item in the digest carries a badge showing where it was found:

📍 Faculty site: Found on the faculty member's own website or profiles (author_specific). Highest-trust source.

🌐 Web: Discovered by scraping a specific known page (scraped_webpage).

🔍 Search: Found via general web or Scholar search (general). Requires stronger LLM verification.

⚠ Medium confidence: LLM verified but flagged lower certainty. Hover for reasoning.

Hover any item's badges to see the LLM's verify_reason.
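
The badge rules above amount to a small lookup. In this sketch the source-type values (`author_specific`, `scraped_webpage`, `general`) come from the card, while the item field names `source_type` and `confidence` are assumptions.

```python
# Badge labels keyed by the source-type values named above.
SOURCE_BADGES = {
    "author_specific": "📍 Faculty site",
    "scraped_webpage": "🌐 Web",
    "general": "🔍 Search",
}


def badges_for(item: dict) -> list:
    """Return the provenance badge, plus a warning badge for medium confidence."""
    badges = [SOURCE_BADGES[item["source_type"]]]
    if item.get("confidence") == "medium":
        badges.append("⚠ Medium confidence")
    return badges
```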

Known limitations

🔍 Coverage gaps. Items outside indexed sources (conference websites, institutional repositories, social media beyond LinkedIn) will be missed.

💳 LinkedIn is credit-limited. Not every faculty member is enriched every run; LinkedIn-sourced items may appear inconsistently.

📅 Date attribution. Items are dated by publication, not discovery. A paper published in March but indexed in May may appear in the May digest.

🤖 LLM summaries. Descriptions are AI-generated. Verify against the source link before citing or forwarding to faculty.

🔁 No deduplication across months. An item that spans a month boundary may appear in two consecutive digests.

Pipeline — how each faculty profile is built and maintained

Collect

rule-based · free

Pulls structured data from UCSD Profiles (bio, title, department), ORCID (publication IDs), and faculty personal websites. Produces a raw profile record for each faculty member.

Enrich

claude-haiku-4-5 · per-run

Scrapes and parses each faculty member's website. LLM extracts structured information: current projects, lab name, student roster, and recent highlights that aren't in UCSD Profiles.

Theme extraction

claude-haiku-4-5 · per-run

LLM reads the full profile text and assigns concise research theme tags (e.g., "Human-Computer Interaction", "AI for Health"). Used for the color-coded faculty map and cross-faculty filtering.

Keyword extraction

claude-haiku-4-5 · per-run

A second LLM pass extracts fine-grained research keywords from the bio, publications, and website content. Keywords power the search bar and Synergies matching.

Bio rewrite

claude-haiku-4-5 · per-run

LLM rewrites raw scraped bios into a consistent, readable style. Original source text is retained; the rewrite is used for display only.

Models in use

claude-haiku-4-5

All profile pipeline steps — web enrichment, theme extraction, keyword extraction, bio rewriting. Fast and cost-effective for structured extraction tasks.

Faculty profiles run on a weekly schedule and on-demand via "↺ Update Profile." Incremental mode skips LLM steps if the source data hasn't changed.
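
Incremental mode can be approximated by fingerprinting the collected source text and comparing it with the fingerprint stored from the previous run. How the project actually detects changes is not stated here; the hashing approach and function names below are assumptions.

```python
import hashlib
from typing import Optional


def content_fingerprint(source_text: str) -> str:
    """SHA-256 hex digest of the collected source text."""
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()


def needs_llm_refresh(source_text: str, stored_fingerprint: Optional[str]) -> bool:
    """True when there is no previous fingerprint or the source text changed."""
    return stored_fingerprint != content_fingerprint(source_text)
```

When `needs_llm_refresh` returns False, the enrichment, theme, keyword, and bio steps could all be skipped for that profile.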

Known limitations

🌐 Website coverage. Faculty without a personal website or with difficult-to-parse pages (heavy JS, PDFs) will have thinner profiles.

🤖 LLM-generated content. Bios, themes, and keywords are AI-generated from scraped text. Verify before using in official communications.

📅 Staleness. Profiles reflect the last run date. Changes a faculty member makes to their website between runs do not appear until the next run.

👤 UCSD Profiles dependency. Title, department, and affiliation fields come from UCSD Profiles. Errors there propagate here.

Pipeline — how a synergies match is computed

Ingest

rule-based · free

Reads uploaded materials (PDF, DOCX, PPTX) or URLs you provide. Extracts plain text from each source.

Compress

claude-haiku-4-5 · fast pre-pass

If materials exceed the context window, a fast LLM pass summarizes them down to key entities, goals, and technical terms before the main match.
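
The decision to compress can be sketched as a simple budget check. The four-characters-per-token estimate is only a rough heuristic for English text, and the token budget is a caller-supplied assumption; neither comes from this card.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate (about 4 characters per token; an assumption)."""
    return len(text) // 4 + 1


def needs_compression(documents: list, token_budget: int) -> bool:
    """True when the combined materials are estimated to exceed the budget."""
    return sum(estimate_tokens(doc) for doc in documents) > token_budget
```

A real implementation would more likely count tokens with the model provider's own tooling; the heuristic here only illustrates the control flow.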

Entity extraction

claude-opus-4-5 · deep analysis

Opus reads the full project materials and extracts structured entities: research goals, technical requirements, domain expertise needed, and keywords.

Faculty matching

keyword + embedding · free tier

Extracted entities are matched against faculty keyword profiles using keyword overlap scoring. Returns a ranked candidate list with match rationale.
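
Keyword overlap scoring can take several forms. The sketch below scores each faculty member by the fraction of project keywords that appear in their profile and uses the shared keywords as the match rationale. The scoring formula, tie-breaking by name, and all names are assumptions, not the project's confirmed method.

```python
def rank_faculty(project_keywords: set, faculty_keywords: dict, top_k: int = 5) -> list:
    """Return up to top_k (name, score, shared_keywords) tuples, best match first."""
    project = {k.lower() for k in project_keywords}
    if not project:
        return []
    results = []
    for name, keywords in faculty_keywords.items():
        shared = project & {k.lower() for k in keywords}
        if shared:
            score = len(shared) / len(project)  # fraction of project keywords covered
            results.append((name, score, sorted(shared)))
    # Highest score first; ties broken alphabetically by name for stable output.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results[:top_k]
```

Returning the shared keywords alongside the score gives a reviewer something concrete to check against the faculty member's actual work.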

Executive summary

claude-opus-4-5 · highest quality

Opus writes the final report: why each matched faculty member is relevant, specific overlap with the project, and a synthesis of cross-faculty synergies. Exported as a .docx downloadable from the project card.

Models in use

claude-haiku-4-5

Material compression when inputs are large. Fast, cheap, used only when needed.

claude-opus-4-5

Entity extraction and executive summary — the core matching intelligence. Highest-capability model; costs reflect this (~$0.05–$2 per project depending on materials size).

Cost and CO₂ estimates are shown on each project card after a run.

Known limitations

📄 Material quality matters. Vague or generic project descriptions produce weak matches. The more specific the materials, the better the output.

👤 Profile staleness. Matching is only as good as the faculty keyword profiles. Faculty with outdated or thin profiles may be under-matched.

🤖 LLM match rationale. The reasoning in the report is AI-generated. Review matches against the faculty's actual publications before making introductions.

💰 Cost per run. Opus is expensive relative to Haiku. Large uploads (multi-PDF decks) can cost $1–2 per run. The cost is shown on each project card.

Source: github.com/UCSD-DesignLab/dlab-faculty-intelligence