GIANT

Gigapixel Image Agent for Navigating Tissue

GIANT is an agentic system that uses large language models to autonomously navigate whole-slide images (WSIs) for pathology analysis. The agent iteratively examines regions of a gigapixel image, zooming in on areas of interest until it can answer a question about the slide.

Key Features

  • Autonomous Navigation - LLM decides where to look next based on visual content
  • Multi-Scale Analysis - Examines both tissue architecture and cellular detail
  • Provider Agnostic - Supports OpenAI and Anthropic providers (Gemini planned)
  • Benchmark Evaluation - Reproduce results on MultiPathQA (GTEx, TCGA, PANDA)
  • Trajectory Visualization - Interactive viewer for agent reasoning

Quick Start

# Install dependencies
uv sync

# Activate virtual environment
source .venv/bin/activate

# Configure API key
export OPENAI_API_KEY=sk-...

# Run on a single WSI
giant run /path/to/slide.svs -q "What tissue is this?"

# Run benchmark
giant benchmark gtex --provider openai

How It Works

┌─────────────────────────────────────────────┐
│  1. Load WSI + Generate Thumbnail           │
│     ┌─────────────────┐                     │
│     │ ███████████████ │ + Axis Guides       │
│     │ █   Tissue    █ │                     │
│     │ ███████████████ │                     │
│     └─────────────────┘                     │
├─────────────────────────────────────────────┤
│  2. LLM Analyzes + Selects Region           │
│     "I see suspicious area at (45K, 32K)    │
│      Let me zoom in for cellular detail..." │
├─────────────────────────────────────────────┤
│  3. Extract High-Res Crop                   │
│     ┌───────┐                               │
│     │░░░░░░░│ 1000x1000 @ high resolution   │
│     │░░░░░░░│                               │
│     └───────┘                               │
├─────────────────────────────────────────────┤
│  4. Repeat Until Answer                     │
│     "Based on cellular morphology,          │
│      this is adenocarcinoma."               │
└─────────────────────────────────────────────┘
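The loop above can be sketched in a few lines of Python. This is a minimal illustration only: `Region`, `slide.thumbnail()`, `slide.crop()`, and the `llm.step()` action protocol are hypothetical stand-ins, not GIANT's actual API.

```python
from dataclasses import dataclass

@dataclass
class Region:
    """A square region of the slide, in level-0 pixel coordinates."""
    x: int
    y: int
    size: int

def navigate(slide, llm, question, max_steps=5):
    """Iteratively zoom until the LLM commits to an answer.

    `slide` and `llm` are hypothetical stand-ins for a WSI reader
    and an LLM client; any real implementation will differ.
    """
    view = slide.thumbnail()  # step 1: start from the slide overview
    for _ in range(max_steps):
        # Step 2: the LLM either answers or names the next region.
        action = llm.step(question, view)
        if action["type"] == "answer":
            return action["text"]
        # Step 3: extract a high-res crop of the chosen region.
        region = Region(**action["region"])
        view = slide.crop(region)
    # Step-budget exhausted: force a final answer from the last view.
    return llm.finalize(question, view)
```

The key design point is that the LLM, not a fixed tiling schedule, decides where to look next, so easy questions can terminate after one or two crops.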

Benchmark Results

| Benchmark | Task | Our Result | Paper (GIANT x1) | Paper (GIANT x5) | Thumbnail Baseline |
|---|---|---|---|---|---|
| GTEx | Organ Classification (20-way) | 70.3% | 53.7% ± 3.4% | 60.7% ± 3.2% | 36.5% ± 3.4% |
| ExpertVQA | Pathologist-Authored (128 Q) | 60.1% | 57.0% ± 4.5% | 62.5% ± 4.4% | 50.0% ± 4.4% |
| SlideBench | Visual QA (197 Q) | 51.8% | 58.9% ± 3.5% | 59.4% ± 3.4% | 54.8% ± 3.5% |
| TCGA | Cancer Diagnosis (30-way) | 26.2% | 32.3% ± 3.5% | 29.3% ± 3.3% | 9.2% ± 1.9% |
| PANDA | Prostate Grading (6-way) | 20.3% | 23.2% ± 2.3% | 25.4% ± 2.0% | 12.2% ± 2.2% |
  • GTEx (70.3%): Exceeds the paper's GIANT x1 (53.7%) and x5 (60.7%) results.
  • ExpertVQA (60.1%): Exceeds the paper's GIANT x1 (57.0%), approaching x5 (62.5%).
  • SlideBench (51.8%): Below paper's GIANT x1 (58.9%), but comparable to thumbnail baseline.
  • TCGA (26.2%): Below paper's GIANT x1 (32.3%), but 3× above thumbnail baseline (9.2%).
  • PANDA (20.3%): Below the paper's GIANT x1 (23.2%); rescored from saved artifacts after extraction fixes.

Supported Models

| Provider | Model | Status |
|---|---|---|
| OpenAI | gpt-5.2 | Default |
| Anthropic | claude-sonnet-4-5-20250929 | Supported |
| Google | gemini-3-pro-preview | Planned (not yet implemented) |
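A provider-agnostic design is typically just a registry keyed by provider name. The sketch below is illustrative only: the registry, decorator, and `(model, prompt) -> str` call shape are assumptions, not GIANT's real interfaces.

```python
from typing import Callable, Dict

# Hypothetical backend signature: each provider takes (model, prompt)
# and returns the model's text reply.
PROVIDERS: Dict[str, Callable[[str, str], str]] = {}

def register(name: str):
    """Decorator that adds a provider backend to the registry."""
    def wrap(fn: Callable[[str, str], str]) -> Callable[[str, str], str]:
        PROVIDERS[name] = fn
        return fn
    return wrap

@register("openai")
def call_openai(model: str, prompt: str) -> str:
    # A real backend would call the OpenAI API here.
    raise NotImplementedError("placeholder backend")

def complete(provider: str, model: str, prompt: str) -> str:
    """Dispatch a completion request to the named provider."""
    try:
        backend = PROVIDERS[provider]
    except KeyError:
        raise ValueError(
            f"Unknown provider {provider!r}; available: {sorted(PROVIDERS)}"
        )
    return backend(model, prompt)
```

With this shape, adding Gemini support later means registering one more backend function; callers keep using `complete("google", ...)` unchanged.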

Documentation

  • Getting Started
  • Understanding GIANT
  • How-To Guides
  • Reference
  • Development

Data Requirements

The MultiPathQA benchmark requires 862 unique WSI files (over 500 GiB in total):

  • TCGA: 474 .svs files (~472 GiB)
  • GTEx: 191 .tiff files
  • PANDA: 197 .tiff files
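Before launching a full benchmark run, it can save time to verify the download is complete. A small stdlib-only check, using the counts from the list above; the per-dataset directory names are assumptions about your local layout, not a layout GIANT requires.

```python
from pathlib import Path

# Expected file counts per dataset (from the list above). The
# subdirectory names ("tcga", "gtex", "panda") are assumptions.
EXPECTED = {
    "tcga": ("*.svs", 474),
    "gtex": ("*.tiff", 191),
    "panda": ("*.tiff", 197),
}

def check_datasets(root: Path) -> dict:
    """Return {dataset: (found, expected)} for each benchmark subset."""
    report = {}
    for name, (pattern, expected) in EXPECTED.items():
        found = len(list((root / name).glob(pattern)))
        report[name] = (found, expected)
    return report

def missing(report: dict) -> list:
    """Names of datasets with fewer files than expected."""
    return [name for name, (found, want) in report.items() if found < want]
```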

See Data Acquisition for download instructions.