GIANT¶
Gigapixel Image Agent for Navigating Tissue
GIANT is an agentic system that uses large language models to autonomously navigate whole-slide images (WSIs) for pathology analysis. The agent iteratively examines regions of a gigapixel image, zooming in on areas of interest until it can answer a question about the slide.
Key Features¶
- Autonomous Navigation - LLM decides where to look next based on visual content
- Multi-Scale Analysis - Examines both tissue architecture and cellular detail
- Provider Agnostic - Supports OpenAI and Anthropic providers (Gemini planned)
- Benchmark Evaluation - Reproduce results on MultiPathQA (GTEx, TCGA, PANDA)
- Trajectory Visualization - Interactive viewer for agent reasoning
Quick Start¶
```bash
# Install dependencies
uv sync

# Activate virtual environment
source .venv/bin/activate

# Configure API key
export OPENAI_API_KEY=sk-...

# Run on a single WSI
giant run /path/to/slide.svs -q "What tissue is this?"

# Run benchmark
giant benchmark gtex --provider openai
```
How It Works¶
```
┌─────────────────────────────────────────────┐
│ 1. Load WSI + Generate Thumbnail            │
│    ┌─────────────────┐                      │
│    │ ███████████████ │  + Axis Guides       │
│    │ █   Tissue    █ │                      │
│    │ ███████████████ │                      │
│    └─────────────────┘                      │
├─────────────────────────────────────────────┤
│ 2. LLM Analyzes + Selects Region            │
│    "I see suspicious area at (45K, 32K)     │
│     Let me zoom in for cellular detail..."  │
├─────────────────────────────────────────────┤
│ 3. Extract High-Res Crop                    │
│    ┌───────┐                                │
│    │░░░░░░░│  1000x1000 @ high resolution   │
│    │░░░░░░░│                                │
│    └───────┘                                │
├─────────────────────────────────────────────┤
│ 4. Repeat Until Answer                      │
│    "Based on cellular morphology,           │
│     this is adenocarcinoma."                │
└─────────────────────────────────────────────┘
```
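The loop above can be sketched in a few lines of Python. This is a minimal illustration, not GIANT's actual API: `Slide`, `crop_bounds`, `navigate`, and the action tuples are hypothetical names, and the decision function here is a stub standing in for the LLM call.

```python
from dataclasses import dataclass

CROP_SIZE = 1000  # high-res crops are 1000x1000 pixels (step 3 above)

@dataclass
class Slide:
    width: int   # level-0 width in pixels
    height: int  # level-0 height in pixels

def crop_bounds(slide: Slide, cx: int, cy: int, size: int = CROP_SIZE):
    """Top-left corner of a size x size crop centered at (cx, cy),
    clamped so the crop stays inside the slide."""
    x = min(max(cx - size // 2, 0), slide.width - size)
    y = min(max(cy - size // 2, 0), slide.height - size)
    return x, y, size, size

def navigate(slide: Slide, decide, max_steps: int = 10):
    """Iterate: ask the decision function (the LLM in GIANT) for either
    ('zoom', cx, cy) or ('answer', text), extracting crops until done."""
    history = []
    for _ in range(max_steps):
        action = decide(history)
        if action[0] == "answer":
            return action[1], history
        _, cx, cy = action
        history.append(crop_bounds(slide, cx, cy))
    return None, history  # ran out of steps without an answer

# Stub "LLM": zoom once at (45000, 32000), as in the diagram, then answer.
def stub_decide(history):
    if not history:
        return ("zoom", 45000, 32000)
    return ("answer", "adenocarcinoma")

answer, crops = navigate(Slide(width=80000, height=60000), stub_decide)
```

In the real agent, `decide` sends the thumbnail and prior crops to the provider's vision model; everything else (coordinate clamping, the bounded step budget) is ordinary bookkeeping.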
Benchmark Results¶
| Benchmark | Task | Our Result | Paper (GIANT x1) | Paper (GIANT x5) | Thumbnail Baseline |
|---|---|---|---|---|---|
| GTEx | Organ Classification (20-way) | 70.3% | 53.7% ± 3.4% | 60.7% ± 3.2% | 36.5% ± 3.4% |
| ExpertVQA | Pathologist-Authored (128 Q) | 60.1% | 57.0% ± 4.5% | 62.5% ± 4.4% | 50.0% ± 4.4% |
| SlideBench | Visual QA (197 Q) | 51.8% | 58.9% ± 3.5% | 59.4% ± 3.4% | 54.8% ± 3.5% |
| TCGA | Cancer Diagnosis (30-way) | 26.2% | 32.3% ± 3.5% | 29.3% ± 3.3% | 9.2% ± 1.9% |
| PANDA | Prostate Grading (6-way) | 20.3% | 23.2% ± 2.3% | 25.4% ± 2.0% | 12.2% ± 2.2% |
- GTEx (70.3%): Exceeds the paper's GIANT x1 (53.7%) and x5 (60.7%) results.
- ExpertVQA (60.1%): Exceeds the paper's GIANT x1 (57.0%), approaching x5 (62.5%).
- SlideBench (51.8%): Below the paper's GIANT x1 (58.9%) and slightly below the thumbnail baseline (54.8%).
- TCGA (26.2%): Below paper's GIANT x1 (32.3%), but 3× above thumbnail baseline (9.2%).
- PANDA (20.3%): Rescored from saved artifacts after extraction fixes.
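The ± figures in the paper columns are consistent with binomial standard errors, sqrt(p(1-p)/n), over the question count — an interpretation the table does not state explicitly, so treat this as a sanity check rather than the paper's definition:

```python
import math

def binomial_se(p: float, n: int) -> float:
    """Standard error of an accuracy p estimated from n questions."""
    return math.sqrt(p * (1 - p) / n)

# Check the two rows whose question counts appear in the table:
expertvqa = binomial_se(0.570, 128)   # table reports +/- 4.5%
slidebench = binomial_se(0.589, 197)  # table reports +/- 3.5%
```

Both come out within a tenth of a percentage point of the reported values (≈4.4% and ≈3.5%).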
Supported Models¶
| Provider | Model | Status |
|---|---|---|
| OpenAI | gpt-5.2 | Default |
| Anthropic | claude-sonnet-4-5-20250929 | Supported |
| Google | gemini-3-pro-preview | Planned (not yet implemented) |
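The table can be read as a small provider-to-model registry. The dict and function below are illustrative only — they mirror the table, not GIANT's actual Model Registry code (see the Reference docs for that):

```python
# Illustrative only: mirrors the table above, not GIANT's registry implementation.
MODEL_REGISTRY = {
    "openai": {"model": "gpt-5.2", "status": "default"},
    "anthropic": {"model": "claude-sonnet-4-5-20250929", "status": "supported"},
    "google": {"model": "gemini-3-pro-preview", "status": "planned"},
}

def resolve_model(provider: str) -> str:
    """Return the model for a provider, rejecting providers whose status
    is still 'planned' (mirrors the table's Status column)."""
    entry = MODEL_REGISTRY.get(provider)
    if entry is None:
        raise ValueError(f"unknown provider: {provider}")
    if entry["status"] == "planned":
        raise NotImplementedError(f"{provider} support is planned, not implemented")
    return entry["model"]
```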
Documentation¶
Getting Started¶
- Installation - Set up your development environment
- Quickstart - Run your first inference
- First Benchmark - Reproduce paper results
Understanding GIANT¶
- What is GIANT? - High-level overview
- Architecture - System design
- Navigation Algorithm - Core algorithm explanation
- LLM Integration - Provider implementations
How-To Guides¶
- Running Inference - Single WSI analysis
- Running Benchmarks - Full benchmark evaluation
- Configuring Providers - API key setup
- Visualizing Trajectories - Inspect agent behavior
Reference¶
- CLI Reference - Command-line options
- Configuration - All configuration options
- Project Structure - Codebase organization
- Model Registry - Approved models
Development¶
- Contributing - How to contribute
- Testing - Testing practices
- Specifications - Implementation specs (Spec-01 to Spec-12)
Data Requirements¶
The MultiPathQA benchmark requires 862 unique WSI files (~500+ GiB):

- TCGA: 474 `.svs` files (~472 GiB)
- GTEx: 191 `.tiff` files
- PANDA: 197 `.tiff` files
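Before a long benchmark run, it can help to verify the downloads are complete. The sketch below uses the counts from the list above; the `data/<benchmark>/` directory layout is a hypothetical convention for illustration, not one GIANT prescribes:

```python
from pathlib import Path

# Expected per-benchmark counts from the list above; the layout
# data/<benchmark>/ is an assumed convention, not GIANT's actual one.
EXPECTED = {"tcga": (474, ".svs"), "gtex": (191, ".tiff"), "panda": (197, ".tiff")}

def check_downloads(root: Path) -> dict:
    """Count slide files per benchmark and report how many are missing."""
    report = {}
    for name, (expected, ext) in EXPECTED.items():
        found = len(list((root / name).glob(f"*{ext}")))
        report[name] = {"found": found, "missing": expected - found}
    return report

# The per-benchmark counts sum to the stated total of 862 files.
assert sum(n for n, _ in EXPECTED.values()) == 862
```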
See Data Acquisition for download instructions.