GIANT¶
Gigapixel Image Agent for Navigating Tissue
GIANT is an agentic system that uses large language models to autonomously navigate whole-slide images (WSIs) for pathology analysis. The agent iteratively examines regions of a gigapixel image, zooming in on areas of interest until it can answer a question about the slide.
Key Features¶
- Autonomous Navigation - LLM decides where to look next based on visual content
- Multi-Scale Analysis - Examines both tissue architecture and cellular detail
- Provider Agnostic - Supports OpenAI and Anthropic providers (Gemini planned)
- Benchmark Evaluation - Reproduce results on MultiPathQA (GTEx, TCGA, PANDA)
- Trajectory Visualization - Interactive viewer for agent reasoning
Quick Start¶
```bash
# Install dependencies
uv sync

# Activate virtual environment
source .venv/bin/activate

# Configure API key
export OPENAI_API_KEY=sk-...

# Run on a single WSI
giant run /path/to/slide.svs -q "What tissue is this?"

# Run benchmark
giant benchmark gtex --provider openai
```
How It Works¶
```
┌─────────────────────────────────────────────┐
│ 1. Load WSI + Generate Thumbnail            │
│    ┌─────────────────┐                      │
│    │ ███████████████ │  + Axis Guides       │
│    │ █   Tissue    █ │                      │
│    │ ███████████████ │                      │
│    └─────────────────┘                      │
├─────────────────────────────────────────────┤
│ 2. LLM Analyzes + Selects Region            │
│    "I see suspicious area at (45K, 32K)     │
│     Let me zoom in for cellular detail..."  │
├─────────────────────────────────────────────┤
│ 3. Extract High-Res Crop                    │
│    ┌───────┐                                │
│    │░░░░░░░│  1000x1000 @ high resolution   │
│    │░░░░░░░│                                │
│    └───────┘                                │
├─────────────────────────────────────────────┤
│ 4. Repeat Until Answer                      │
│    "Based on cellular morphology,           │
│     this is adenocarcinoma."                │
└─────────────────────────────────────────────┘
```
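The loop above can be sketched in a few lines of Python. This is a minimal illustration, not GIANT's actual API: `Slide`, `crop_bounds`, `navigate`, and the action tuples are hypothetical names, and the decision function here is a stub standing in for the LLM call.

```python
from dataclasses import dataclass

CROP_SIZE = 1000  # high-res crops are 1000x1000 pixels (step 3 above)

@dataclass
class Slide:
    width: int   # level-0 width in pixels
    height: int  # level-0 height in pixels

def crop_bounds(slide: Slide, cx: int, cy: int, size: int = CROP_SIZE):
    """Top-left corner of a size x size crop centered at (cx, cy),
    clamped so the crop stays inside the slide."""
    x = min(max(cx - size // 2, 0), slide.width - size)
    y = min(max(cy - size // 2, 0), slide.height - size)
    return x, y, size, size

def navigate(slide: Slide, decide, max_steps: int = 10):
    """Iterate: ask the decision function (the LLM in GIANT) for either
    ('zoom', cx, cy) or ('answer', text), extracting crops until done."""
    history = []
    for _ in range(max_steps):
        action = decide(history)
        if action[0] == "answer":
            return action[1], history
        _, cx, cy = action
        history.append(crop_bounds(slide, cx, cy))
    return None, history  # ran out of steps without an answer

# Stub "LLM": zoom once at (45000, 32000), as in the diagram, then answer.
def stub_decide(history):
    if not history:
        return ("zoom", 45000, 32000)
    return ("answer", "adenocarcinoma")

answer, crops = navigate(Slide(width=80000, height=60000), stub_decide)
```

In the real agent, `decide` sends the thumbnail and prior crops to the provider's vision model; everything else (coordinate clamping, the bounded step budget) is ordinary bookkeeping.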
Benchmark Results¶
| Benchmark | Task | Our Result | Paper (GIANT x1) | Paper (GIANT x5) | Thumbnail Baseline |
|---|---|---|---|---|---|
| GTEx | Organ Classification (20-way) | 70.3% | 53.7% ± 3.4% | 60.7% ± 3.2% | 36.5% ± 3.4% |
| ExpertVQA | Pathologist-Authored (128 Q) | 60.1% | 57.0% ± 4.5% | 62.5% ± 4.4% | 50.0% ± 4.4% |
| SlideBench | Visual QA (197 Q) | 51.8% | 58.9% ± 3.5% | 59.4% ± 3.4% | 54.8% ± 3.5% |
| TCGA | Cancer Diagnosis (30-way) | 26.2% | 32.3% ± 3.5% | 29.3% ± 3.3% | 9.2% ± 1.9% |
| PANDA | Prostate Grading (6-way) | 20.3% | 23.2% ± 2.3% | 25.4% ± 2.0% | 12.2% ± 2.2% |
- GTEx (70.3%): Exceeds the paper's GIANT x1 (53.7%) and x5 (60.7%) results.
- ExpertVQA (60.1%): Exceeds the paper's GIANT x1 (57.0%), approaching x5 (62.5%).
- SlideBench (51.8%): Below the paper's GIANT x1 (58.9%) and slightly below the thumbnail baseline (54.8%).
- TCGA (26.2%): Below paper's GIANT x1 (32.3%), but 3× above thumbnail baseline (9.2%).
- PANDA (20.3%): Rescored from saved artifacts after extraction fixes.
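The ± figures in the paper columns are consistent with binomial standard errors, sqrt(p(1-p)/n), over the question count — an interpretation the table does not state explicitly, so treat this as a sanity check rather than the paper's definition:

```python
import math

def binomial_se(p: float, n: int) -> float:
    """Standard error of an accuracy p estimated from n questions."""
    return math.sqrt(p * (1 - p) / n)

# Check the two rows whose question counts appear in the table:
expertvqa = binomial_se(0.570, 128)   # table reports +/- 4.5%
slidebench = binomial_se(0.589, 197)  # table reports +/- 3.5%
```

Both come out within a tenth of a percentage point of the reported values (≈4.4% and ≈3.5%).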
Supported Models¶
| Provider | Model | Status |
|---|---|---|
| OpenAI | gpt-5.2 | Default |
| Anthropic | claude-sonnet-4-5-20250929 | Supported |
| Google | gemini-3-pro-preview | Planned (not yet implemented) |
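The table can be read as a small provider-to-model registry. The dict and function below are illustrative only — they mirror the table, not GIANT's actual Model Registry code (see the Reference docs for that):

```python
# Illustrative only: mirrors the table above, not GIANT's registry implementation.
MODEL_REGISTRY = {
    "openai": {"model": "gpt-5.2", "status": "default"},
    "anthropic": {"model": "claude-sonnet-4-5-20250929", "status": "supported"},
    "google": {"model": "gemini-3-pro-preview", "status": "planned"},
}

def resolve_model(provider: str) -> str:
    """Return the model for a provider, rejecting providers whose status
    is still 'planned' (mirrors the table's Status column)."""
    entry = MODEL_REGISTRY.get(provider)
    if entry is None:
        raise ValueError(f"unknown provider: {provider}")
    if entry["status"] == "planned":
        raise NotImplementedError(f"{provider} support is planned, not implemented")
    return entry["model"]
```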
Documentation¶
Getting Started¶
- Installation - Set up your development environment
- Quickstart - Run your first inference
- First Benchmark - Reproduce paper results
Understanding GIANT¶
- What is GIANT? - High-level overview
- Architecture - System design
- Navigation Algorithm - Core algorithm explanation
- LLM Integration - Provider implementations
How-To Guides¶
- Running Inference - Single WSI analysis
- Running Benchmarks - Full benchmark evaluation
- Configuring Providers - API key setup
- Visualizing Trajectories - Inspect agent behavior
Reference¶
- CLI Reference - Command-line options
- Configuration - All configuration options
- Project Structure - Codebase organization
- Model Registry - Approved models
Development¶
- Contributing - How to contribute
- Testing - Testing practices
- Specifications - Implementation specs (Spec-01 to Spec-12)
Data Requirements¶
The MultiPathQA benchmark requires 862 unique WSI files (~500+ GiB):

- TCGA: 474 `.svs` files (~472 GiB)
- GTEx: 191 `.tiff` files
- PANDA: 197 `.tiff` files
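Before a long benchmark run, it can help to verify the downloads are complete. The sketch below uses the counts from the list above; the `data/<benchmark>/` directory layout is a hypothetical convention for illustration, not one GIANT prescribes:

```python
from pathlib import Path

# Expected per-benchmark counts from the list above; the layout
# data/<benchmark>/ is an assumed convention, not GIANT's actual one.
EXPECTED = {"tcga": (474, ".svs"), "gtex": (191, ".tiff"), "panda": (197, ".tiff")}

def check_downloads(root: Path) -> dict:
    """Count slide files per benchmark and report how many are missing."""
    report = {}
    for name, (expected, ext) in EXPECTED.items():
        found = len(list((root / name).glob(f"*{ext}")))
        report[name] = {"found": found, "missing": expected - found}
    return report

# The per-benchmark counts sum to the stated total of 862 files.
assert sum(n for n, _ in EXPECTED.values()) == 862
```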
See Data Acquisition for download instructions.