Faction Against Cancer: An OSS Movement for Community-Driven Oncology Research

| May 18, 2026 | 8 min read | 324 views

Agent co-authors: Arthur, Lyra, Thoro, Ion, Echo

Published: May 17, 2026
Author: AFR Research Lab
Tags: OSS, Cancer Research, Multi-Agent, Community Science, Plugin, Architecture

Faction Against Cancer (FAC) turns the Faction multi-agent ADE into an oncology-research collaboration surface. Six Research Fellows, a mandatory citation gate, a public-source allowlist, and a markdown-first artifact pipeline. OSS now, OSS forever.

The premise

FAC opens with a dedication line in its README: In memory of Liv Perrotto. That is the inspiration. Everything past it is the tool.

FAC is a community-driven, AI-assisted open-source plugin that helps families, researchers, and curious minds collaborate on cancer research by synthesizing public literature, surfacing open questions, and generating testable hypotheses as markdown artifacts anyone can fork, critique, and improve. It augments oncology research. It does not replace it. Every synthesis cites primary sources, every hypothesis is a starting point for qualified professionals, and every output is explicitly not medical advice.

The plugin is OSS today. The workflow it supports is real today. The roadmap tracks what is incremental versus aspirational, and tracks it openly.

What FAC is

A Faction plugin pack that adds:

A /fac slash command for onboarding, status, mission, disclaimers, and contribute prompts.
Six Research Fellows with distinct temperaments and voices.
One research team (@fac-research) that runs in hybrid mode: the Lead decomposes a question, four specialists run in parallel, the Evidence Verifier reviews every citation, then the Lead integrates a final markdown artifact with a canonical disclaimer footer.
Seventeen agent tools across seven groups: artifact rendering, fellow coordination, literature discovery, dataset discovery, reference lookup, view filters, and cache management.
A handoff signal (fellow_handoff) emitted on the turn bus whenever one fellow invites another.
Safety guardrails baked into the protocol: public data only, mandatory citation, canonical disclaimers on every artifact.

What FAC is not

Not medical advice. Not a diagnosis or prognosis tool.
Not a handler of patient data or personal health information. Under any circumstances.
Not a replacement for oncologists, pathologists, researchers, or peer review.
Not Pro-gated. OSS only, forever.

The full scope statement lives in DISCLAIMERS.md.

The Research Fellows

Each fellow is a persona (temperament plus voice) attached to skills that carry the domain. The split is intentional and follows Faction's authoring rules: personas hold voice and judgment, skills hold knowledge and patterns. Swap the skill and you keep the voice. Swap the voice and you keep the work.

Fellow	Role
Arthur	Lead Research Fellow. Decomposes questions, dispatches the team, integrates the final artifact. Speaks for FAC.
Lyra	First-principles reasoner. Strips assumptions, rebuilds claims from the ground up.
Thoro	Adjacent-field lens. Pulls patterns from outside oncology that may transfer.
Ion	Data and computational specialist. Quantitative grounding, schema literacy on accumulated workspace data.
Echo	Epidemiology and surveillance specialist. Population-level patterns, dataset-schema reasoning.
Corvin	Evidence Verifier. Reviews every citation. Tags Evidence Levels (A/B/C). Blocks claims without primary sources.

Personas live as JSON in the plugin's .agents/personas/ directory and are copied into the user's workspace on first /fac run. They are intended to be edited, forked, and improved.

Hybrid orchestration

One team, one mode. @fac-research runs in hybrid orchestration because the work itself fits hybrid: one decomposition pass, parallel specialist work, one verification gate, one integration pass.

A run unfolds in four phases:

Decompose. Arthur reads the question, breaks it into sub-questions, names which fellows take which sub-question.
Synthesize in parallel. Lyra, Thoro, Ion, and Echo each produce a draft synthesis for their sub-question, citing primary sources as they go.
Verify. Corvin reviews every citation. Tags Evidence Levels A (multiple primary sources, peer-reviewed), B (one primary source or preprint), or C (review article or contested). Flags contested findings. Blocks any claim without a primary source.
Integrate. Arthur stitches the verified drafts into a single markdown artifact. The canonical FAC disclaimer footer is injected by the artifact tool, not by Arthur. Callers cannot override it.

Corvin can refuse to approve. When that happens, the rejected claims are routed back to the specialists with the reason attached. The artifact does not ship until the gap is closed.
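The four phases and the rejection loop can be sketched in a few lines of Python. This is a minimal illustration, not FAC's implementation: every function name here (decompose, synthesize, verify, revise, run) is hypothetical, the specialist work is stubbed out, and the "doi:example/..." strings are placeholders rather than real references.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

SPECIALISTS = ("Lyra", "Thoro", "Ion", "Echo")


@dataclass
class Claim:
    text: str
    primary_sources: list[str] = field(default_factory=list)


def decompose(question: str) -> dict[str, str]:
    # Phase 1 (hypothetical): the Lead assigns one sub-question per specialist.
    return {name: f"{question} [{name} angle]" for name in SPECIALISTS}


def synthesize(fellow: str, sub_question: str) -> Claim:
    # Phase 2 stand-in: Thoro's first draft deliberately lacks a primary source.
    sources = [] if fellow == "Thoro" else [f"doi:example/{fellow.lower()}"]
    return Claim(f"{fellow}: draft for {sub_question}", sources)


def verify(claims: list[Claim]):
    # Phase 3: split claims into approved and rejected-with-reason.
    approved = [c for c in claims if c.primary_sources]
    rejected = [(c, "no primary source") for c in claims if not c.primary_sources]
    return approved, rejected


def revise(claim: Claim, reason: str) -> Claim:
    # Stand-in for a specialist closing the gap named in `reason`.
    return Claim(claim.text, ["doi:example/revised"])


def run(question: str, max_rounds: int = 3) -> str:
    assignments = decompose(question)
    with ThreadPoolExecutor() as pool:  # specialists draft in parallel
        drafts = list(pool.map(lambda kv: synthesize(*kv), assignments.items()))
    for _ in range(max_rounds):
        approved, rejected = verify(drafts)
        if not rejected:
            # Phase 4: integrate only once every claim has passed the gate.
            return "\n".join(c.text for c in approved)
        drafts = approved + [revise(c, why) for c, why in rejected]
    raise RuntimeError("artifact blocked: unverified claims remain")
```

The property worth noticing is that integration sits behind the verification gate: run either returns a fully verified artifact or raises, and never returns a partial one.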

The seventeen agent tools

Any fellow can invoke any tool. Tools are grouped by what they touch, not by which fellow owns them.

Artifact and source tools (Pydantic-backed; render canonical markdown with timestamped disclaimers that callers cannot override):

fac_list_sources returns the allowed-sources list.
fac_validate_artifact validates a synthesis against the canonical schema.
fac_draft_artifact renders a canonical markdown artifact.
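To make the "callers cannot override" property concrete, here is a small sketch using standard-library dataclasses rather than Pydantic. The field names, the validation rules, and the footer wording are assumptions for illustration (the footer text is borrowed from the closing line of this post as a stand-in; the canonical text lives in the plugin).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Stand-in footer text; the real canonical disclaimer is defined by FAC itself.
DISCLAIMER_STAND_IN = (
    "AI-assisted community synthesis, not medical advice. "
    "Verify every claim against primary sources."
)


@dataclass(frozen=True)
class Synthesis:
    # Hypothetical schema: deliberately has no footer or disclaimer field.
    title: str
    body: str
    citations: tuple[str, ...]


def validate(s: Synthesis) -> list[str]:
    errors = []
    if not s.title.strip():
        errors.append("title is empty")
    if not s.citations:
        errors.append("at least one citation is required")
    return errors


def draft_artifact(s: Synthesis, now: Optional[datetime] = None) -> str:
    errors = validate(s)
    if errors:
        raise ValueError("; ".join(errors))
    stamp = (now or datetime.now(timezone.utc)).isoformat(timespec="seconds")
    refs = "\n".join(f"- {c}" for c in s.citations)
    # The footer is appended here, inside the renderer, from a module constant.
    return (
        f"# {s.title}\n\n{s.body}\n\n## Sources\n\n{refs}\n\n"
        f"---\n{DISCLAIMER_STAND_IN} (generated {stamp})\n"
    )
```

Because the input type carries no footer field and the renderer takes no footer argument, there is simply no parameter through which a caller could change the disclaimer.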

Coordination tools (Rust-native; write through to ~/.faction/logs/fac.jsonl):

fac_invite_fellow is the structured handoff between fellows. Emits fellow_handoff on the turn bus via FacHandoffHook.
fac_recommend_loadout reads the local config and renders a vendor-neutral per-fellow capability-need table.
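The coordination tools are described as Rust-native, so the following Python is only a shape sketch of a handoff: one event, appended to a JSONL log and delivered to subscribers. The field names (from_fellow, to_fellow, reason, ts) and the emit function are assumptions; only the signal name fellow_handoff comes from the text above.

```python
import io
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, Optional, TextIO


@dataclass(frozen=True)
class FellowHandoff:
    from_fellow: str
    to_fellow: str
    reason: str
    signal: str = "fellow_handoff"


def emit(
    event: FellowHandoff,
    subscribers: Iterable[Callable[[dict], None]],
    log: TextIO,
    ts: Optional[str] = None,
) -> dict:
    record = asdict(event)
    record["ts"] = ts or datetime.now(timezone.utc).isoformat(timespec="seconds")
    # Write through to the log first: one JSON object per line.
    log.write(json.dumps(record, sort_keys=True) + "\n")
    # Then deliver the same record to every subscriber on the (hypothetical) bus.
    for callback in subscribers:
        callback(record)
    return record


# Usage: an in-memory log stands in for ~/.faction/logs/fac.jsonl.
received = []
log = io.StringIO()
emit(
    FellowHandoff("Arthur", "Corvin", "verify citations"),
    [received.append],
    log,
    ts="2026-05-17T00:00:00+00:00",
)
```

Logging before dispatch means the log still records the handoff even if a subscriber raises.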

Literature discovery tools (live; append to a session workspace at ~/.faction/fac/workspace/literature.jsonl):

fac_search_pubmed queries NCBI E-utilities for peer-reviewed biomedical literature.
fac_search_semantic_scholar queries the Semantic Scholar corpus (around 220M papers, peer-reviewed plus preprints, with biomedical-tuned relevance and TLDR summaries).
fac_search_openalex queries OpenAlex (around 260M works, CC0 catalog). The cross-reference partner for Semantic Scholar.
fac_fetch_biorxiv fetches a bioRxiv or medRxiv preprint by DOI for authoritative server metadata.
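For readers who have not used NCBI E-utilities, this is roughly what a PubMed search request looks like. The sketch only builds the ESearch URL and makes no network call; the helper name pubmed_search_url is hypothetical, and how fac_search_pubmed actually assembles its requests is not shown in this post.

```python
from typing import Optional
from urllib.parse import urlencode

# Public ESearch endpoint of NCBI E-utilities.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(
    term: str,
    retmax: int = 20,
    mindate: Optional[str] = None,
    maxdate: Optional[str] = None,
) -> str:
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    if mindate and maxdate:
        # Restrict by publication date; E-utilities expects both bounds together.
        params.update({"datetype": "pdat", "mindate": mindate, "maxdate": maxdate})
    return f"{ESEARCH}?{urlencode(params)}"


url = pubmed_search_url("early-onset colorectal cancer microbiome", mindate="2024", maxdate="2026")
```

ESearch returns a list of PubMed IDs; fetching titles and abstracts is a second call (ESummary or EFetch) that this sketch leaves out.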

Dataset discovery tools (live; each writes through to its own per-dataset workspace file):

fac_query_clinicaltrials queries clinicaltrials.gov v2.
fac_query_tcga queries TCGA / GDC cases (public tier only).
fac_query_who queries WHO Global Health Observatory indicators.

Reference lookup tool (stateless; no workspace accumulation):

fac_lookup_seer queries the SEER API for schema and vocabulary lookups. Requires a free SEER_API_KEY.

View tools (read the accumulated workspace, the substrate Ion and Echo work over):

fac_consult_literature filters the literature workspace by source, term, and date range.
fac_query_dataset filters any dataset workspace by dataset-specific filters.
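A view over the accumulated workspace amounts to a filter. The sketch below shows filtering by source, term, and date range over plain dictionaries; the record fields (source, title, published) and the function name consult are assumptions, and the sample rows are made-up placeholders, not real papers.

```python
from datetime import date
from typing import Optional


def consult(
    records: list[dict],
    source: Optional[str] = None,
    term: Optional[str] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> list[dict]:
    needle = term.lower() if term else None
    matches = []
    for record in records:
        if source and record["source"] != source:
            continue
        if needle and needle not in record["title"].lower():
            continue
        published = date.fromisoformat(record["published"])
        if start and published < start:
            continue
        if end and published > end:
            continue
        matches.append(record)
    return matches


# Placeholder rows for illustration only.
workspace = [
    {"source": "pubmed", "title": "Microbiome placeholder A", "published": "2024-03-01"},
    {"source": "openalex", "title": "Microbiome placeholder B", "published": "2025-06-15"},
    {"source": "pubmed", "title": "Unrelated placeholder", "published": "2023-01-10"},
]
recent_pubmed = consult(workspace, source="pubmed", term="microbiome", start=date(2024, 1, 1))
```

Every filter is optional, so calling consult(workspace) with no arguments returns the whole workspace unchanged.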

Cache management tools (inspect or clear the API response cache; do not touch workspace data):

fac_cache_status returns per-source entry counts, oldest and newest ages, TTL defaults.
fac_cache_invalidate clears cached responses for one source or all.

Slash equivalents exist for cache operations: /fac cache status and /fac cache clear [source].
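The cache behavior described above (per-source TTLs, a status report, and invalidation by source) can be sketched with an in-memory dictionary. The class and method names are hypothetical, and FAC's actual cache is SQLite-backed rather than in-memory.

```python
import time
from typing import Any, Callable, Optional


class ResponseCache:
    def __init__(
        self,
        ttl_by_source: dict[str, float],
        default_ttl: float = 3600.0,
        clock: Callable[[], float] = time.time,
    ):
        self._ttl = dict(ttl_by_source)
        self._default_ttl = default_ttl
        self._clock = clock
        self._entries: dict[tuple[str, str], tuple[float, Any]] = {}

    def put(self, source: str, key: str, value: Any) -> None:
        self._entries[(source, key)] = (self._clock(), value)

    def get(self, source: str, key: str) -> Optional[Any]:
        hit = self._entries.get((source, key))
        if hit is None:
            return None
        stored_at, value = hit
        if self._clock() - stored_at > self._ttl.get(source, self._default_ttl):
            del self._entries[(source, key)]  # expired: evict and report a miss
            return None
        return value

    def status(self) -> dict[str, dict[str, float]]:
        now = self._clock()
        ages: dict[str, list[float]] = {}
        for (source, _key), (stored_at, _value) in self._entries.items():
            ages.setdefault(source, []).append(now - stored_at)
        return {
            s: {"entries": len(a), "oldest_age": max(a), "newest_age": min(a)}
            for s, a in ages.items()
        }

    def invalidate(self, source: Optional[str] = None) -> int:
        # Clears one source, or everything when source is None.
        doomed = [k for k in self._entries if source is None or k[0] == source]
        for k in doomed:
            del self._entries[k]
        return len(doomed)
```

Injecting the clock keeps the expiry logic testable without sleeping, which is a design choice of this sketch rather than something the post states about FAC.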

Storage substrate

Workspace and cache live at ~/.faction/fac/workspace/workspace.sqlite. A single SQLite file accumulates literature, datasets, and cache entries across a session.

When faction-plugin-db is installed, point it at this file (db_connect ~/.faction/fac/workspace/workspace.sqlite) and fellows gain SQL access via the database-engineering skill. Ion's data-visualization and Echo's schema-literacy skills both operate over this substrate.
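To give a feel for what SQL access over the workspace buys, here is a sketch using Python's built-in sqlite3 module against an in-memory database. The table name and columns are invented for illustration; the real schema of workspace.sqlite is not described in this post.

```python
import sqlite3


def open_workspace(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # Hypothetical schema: one row per literature record, deduplicated per source.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS literature (
            source      TEXT NOT NULL,
            external_id TEXT NOT NULL,
            title       TEXT NOT NULL,
            published   TEXT,
            UNIQUE (source, external_id)
        )
        """
    )
    return conn


def add_records(conn: sqlite3.Connection, rows: list[tuple]) -> None:
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT OR IGNORE INTO literature VALUES (?, ?, ?, ?)", rows)


def counts_by_source(conn: sqlite3.Connection) -> dict[str, int]:
    query = "SELECT source, COUNT(*) FROM literature GROUP BY source ORDER BY source"
    return dict(conn.execute(query))


conn = open_workspace()
add_records(conn, [
    ("pubmed", "id-1", "Placeholder A", "2024-03-01"),
    ("pubmed", "id-1", "Placeholder A (duplicate)", "2024-03-01"),
    ("openalex", "id-2", "Placeholder B", "2025-06-15"),
])
summary = counts_by_source(conn)
```

The UNIQUE constraint plus INSERT OR IGNORE is one simple way to let repeated searches accumulate into the same file without duplicating rows.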

The source allowlist

FAC ingests only from the following public sources. Any source not on this list raises fac_source_unallowed and blocks the operation. Expanding the list requires a design PR against FAC_ARCHITECTURE.md; it is not a runtime choice.

PubMed / NCBI
arXiv
bioRxiv, medRxiv
Semantic Scholar
OpenAlex
clinicaltrials.gov
WHO classifications
cancer.gov
TCGA (public tier only)

The allowlist is an architecture decision, not a config. It is intentionally hard to change at runtime.
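One way to get that "hard to change at runtime" property is to make the list an immutable constant and check every outbound URL against it. In the sketch below, the mapping from the source names above to domain suffixes is my assumption, as are the exception class and function name; only the error code fac_source_unallowed comes from the text.

```python
from urllib.parse import urlparse

# Assumed domain suffixes for the listed sources. A frozenset cannot be mutated in place.
ALLOWED_DOMAINS = frozenset({
    "ncbi.nlm.nih.gov",    # PubMed / NCBI
    "arxiv.org",
    "biorxiv.org",
    "medrxiv.org",
    "semanticscholar.org",
    "openalex.org",
    "clinicaltrials.gov",
    "who.int",
    "cancer.gov",          # assumed to cover cancer.gov subdomains, including GDC
})


class SourceUnallowed(Exception):
    code = "fac_source_unallowed"


def require_allowed(url: str) -> str:
    host = (urlparse(url).hostname or "").lower()
    # Match the domain itself or a true subdomain, never a lookalike suffix.
    if not any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS):
        raise SourceUnallowed(f"{SourceUnallowed.code}: {host or url}")
    return host
```

The "." prefix in the suffix check matters: without it, a host such as evilarxiv.org would pass as arxiv.org.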

A worked example

Here is the shape of a representative run. The question:

What does the 2024-2026 literature say about early-onset colorectal cancer and the gut microbiome? Identify open questions and hypotheses worth testing.

What happens:

Arthur reads and decomposes. He routes a literature-coverage sub-question to Lyra and Echo, an adjacent-field lens to Thoro, and a quantitative angle (incidence trends, demographic correlations) to Ion.
The four specialists run in parallel. Each fellow calls fac_search_pubmed, fac_search_semantic_scholar, or fac_search_openalex for primary sources, then fac_consult_literature to filter the accumulated workspace. Drafts are written with citations attached inline.
Corvin verifies. Each citation is checked for a primary source. A transfer-pattern claim that cites only a review article is tagged Evidence C and routed back for a primary source. A finding about microbial-diversity correlations across two studies that disagree on direction is tagged "contested"; the artifact will say so explicitly.
Arthur integrates. Verified drafts are stitched into one markdown artifact: a one-paragraph summary, sections for each sub-question, an Open Questions section that surfaces uncertainty rather than hiding it, a Contested Findings section, and a Hypotheses Worth Testing section. The artifact tool injects the canonical FAC disclaimer footer with an ISO timestamp. Arthur cannot edit the footer.

The output is a markdown file. Anyone can read it. Anyone can fork it. Anyone can open an issue against any specific claim with a counter-citation. That is the point.
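Corvin's verification step in this example can be read as a small decision rule. The sketch below encodes one reading of the Evidence Levels described earlier, not the plugin's actual logic: the Citation and Verdict types are hypothetical, and treating a preprint as sufficient to avoid a route-back is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Citation:
    ref: str
    kind: str  # "primary", "preprint", or "review"
    peer_reviewed: bool = True


@dataclass(frozen=True)
class Verdict:
    level: Optional[str]  # "A", "B", "C", or None when nothing is cited
    route_back: bool      # True when the claim returns to the specialist


def assess(citations: list[Citation], contested: bool = False) -> Verdict:
    if not citations:
        return Verdict(None, True)
    has_primary = any(c.kind in ("primary", "preprint") for c in citations)
    if not has_primary:
        # Review-only support: tagged C and sent back for a primary source.
        return Verdict("C", True)
    if contested:
        return Verdict("C", False)
    reviewed = [c for c in citations if c.kind == "primary" and c.peer_reviewed]
    if len(reviewed) >= 2:
        return Verdict("A", False)
    return Verdict("B", False)


# The two cases from the example above, with placeholder references.
transfer_claim = assess([Citation("review-1", "review")])
diversity_claim = assess(
    [Citation("study-1", "primary"), Citation("study-2", "primary")],
    contested=True,
)
```

Here transfer_claim comes back as level C with route_back set, while diversity_claim is level C without a route-back, matching the two outcomes described in the run.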

Roadmap

Shipped (commits in the Faction monorepo, OSS):

Phase 0: plugin scaffold, starter personas, team, safety docs, mission text.
Phase 0.7-0.8: Research Fellow onboarding, tips and contribute subcommands, the full six-fellow band.
Stage 2: five agent tools and FacHandoffHook.
Stage 3-4: first-person Arthur onboarding, DAST scenario refinement.
Phase 1: live ingestion across PubMed, Semantic Scholar, OpenAlex, bioRxiv/medRxiv, clinicaltrials.gov, plus session workspace and view tools.
Stage 5: three principle-only data skills (evidence-visualization, data-pipeline-reasoning, dataset-schema-literacy) attached to Ion and Echo.
Phase 1.1: TCGA / GDC, WHO GHO, SEER reference lookup, dataset workspace dispatch.
Phase 1.2: shared retry_get (exponential backoff plus Retry-After) across all eight live clients, with @pytest.mark.live drift-canary tests gated by FAC_LIVE_TESTS=1.
Phase 1.3: SQLite-backed workspace plus API response cache (per-source TTL); /fac cache status|clear; database-engineering skill companion via faction-plugin-db.
Phase 1.4a: faction-plugin-db remote Postgres and MongoDB drivers; Remote connection source; @alias credential resolution; db_find agent tool.
Phase 1.4b: skymap-aware cache keys (signal context folded into the cache hash).
Phase 1.5a: faction-core::ToolContext trait extension; FAC reads the signal bus directly from context; the prior shared-state workaround is deleted.
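The Phase 1.2 retry behavior (exponential backoff plus Retry-After) is a common pattern, and a generic version looks roughly like this. The signature, the set of retryable status codes, and the injected fetch and sleep callables are assumptions made so the sketch runs without a network; it is not the shared retry_get from the monorepo.

```python
import time
from typing import Callable

RETRYABLE = frozenset({429, 500, 502, 503, 504})  # assumed set of retryable statuses

# fetch(url) is a hypothetical callable returning (status, headers, body).
Fetch = Callable[[str], tuple[int, dict[str, str], str]]


def retry_get(
    fetch: Fetch,
    url: str,
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, str]:
    for attempt in range(max_attempts):
        status, headers, body = fetch(url)
        if status not in RETRYABLE:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)          # the server's hint wins
        else:
            delay = base_delay * 2 ** attempt   # 0.5s, 1s, 2s, ...
        sleep(delay)
    raise RuntimeError(f"gave up on {url} after {max_attempts} attempts")
```

Retry-After can also carry an HTTP date instead of a number of seconds; this sketch handles only the seconds form and falls back to exponential backoff otherwise.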

In the build queue:

Phase 1.5b: remote-DB write path with auth and audit coupling; --write handshake on /db connect.
Phase 2: persistent second-brain (markdown library plus embeddings) so the workspace survives across sessions.
Phase 3: faction-plugin-git integration, PR templates, issue seeding.
Phase 4: public GitHub repo launch at github.com/afresearch/faction-against-cancer.
Phase 5: opt-in decentralized compute sharing.

The architecture doc is the canonical source of truth for what each phase entails and why. The public repo is not live yet; pre-Phase-4 contributions land in the Faction monorepo.

How to participate

Today, pre-Phase 4:

cd plugins/faction-plugin-fac
faction plugin install

First run:

/fac

This prints the mission, installs the personas and team into your workspace .agents/ directory, and lets Arthur welcome you in first person. Subsequent runs show updated status and skip already-installed assets.

Then either talk to a fellow directly:

@arthur I came here because...

Or dispatch the team:

@fac-research What does the 2024-2026 literature say about ...

Contributions to fellow voices, the team configuration, agent tools, mission text, and disclaimers all happen in the Faction monorepo for now. Architecture proposals open as PRs against FAC_ARCHITECTURE.md. Once Phase 4 lands and the public repo is live, contributions (literature, syntheses, hypotheses) move there with a human-maintainer review on every PR. No auto-merge. Ever.

Closing

Cancer research is too big for any one lab, any one institution, any one country, and certainly any one company. The public literature is enormous and noisy. Useful synthesis is expensive in expert time. FAC's bet is that AI-assisted, citation-gated, multi-fellow synthesis can give families, students, and researchers a meaningful collaborative surface to work the public corpus together.

The bet is open-ended. The tool grows with the people who use it. Anyone can fork it, critique it, send a PR, or just run a question through the team and see what comes back.

AI-assisted community synthesis, not medical advice. Verify every claim against primary sources. Consult qualified oncology professionals for any medical decision.

Faction Against Cancer is open source under the MIT license. Install via faction plugin install faction-plugin-fac. Build your agent teams at faction.build.

Published with Faction for VS Code

Research published directly from the editor to faction.build