metadata
title: CorridorKey
emoji: 🎬
colorFrom: yellow
colorTo: yellow
sdk: gradio
sdk_version: 6.9.0
app_file: app.py
python_version: '3.10'
pinned: false
tags:
  - green-screen
  - background-removal
  - video-matting
  - alpha-matting
  - vfx
  - corridor-digital
  - transparency
  - onnx
  - pytorch
  - zerogpu
  - mcp-server
short_description: Remove green/blue screen from video, even glass & hair

CorridorKey Green/Blue Screen Matting

Remove green or blue screen backgrounds from video. Handles transparent objects (glass, water, cloth) that traditional chroma key cannot.

Based on CorridorKey by Corridor Digital.

Inference Paths

  • GPU (ZeroGPU H200): PyTorch GreenFormer with batched inference (batch 32 at 1024, batch 16 at 2048)
  • CPU (fallback): ONNX Runtime sequential inference (batch 1)
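The two paths above differ mainly in how many frames go through the model at once. A minimal sketch of that choice, assuming the batch sizes quoted in the list; the names `pick_batch_size` and `plan_batches` are hypothetical and are not taken from `app.py`:

```python
# Batch sizes quoted in the list above; the dictionary layout is an assumption.
GPU_BATCH_SIZE = {1024: 32, 2048: 16}
CPU_BATCH_SIZE = 1  # the ONNX Runtime fallback processes one frame at a time


def pick_batch_size(model_res: int, on_gpu: bool) -> int:
    """Return the batch size for a model resolution on the chosen path."""
    if not on_gpu:
        return CPU_BATCH_SIZE
    if model_res not in GPU_BATCH_SIZE:
        raise ValueError(f"unsupported model resolution: {model_res}")
    return GPU_BATCH_SIZE[model_res]


def plan_batches(num_frames: int, batch_size: int) -> list[tuple[int, int]]:
    """Split frame indices into [start, end) ranges of at most batch_size."""
    return [
        (start, min(start + batch_size, num_frames))
        for start in range(0, num_frames, batch_size)
    ]
```

For an 89-frame clip at 1024, `plan_batches(89, pick_batch_size(1024, on_gpu=True))` gives three ranges (32, 32 and 25 frames), while the CPU path gives 89 single-frame steps.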

Pipeline

  1. BiRefNet - Generates coarse foreground mask (ONNX, or fast classical HSV for green/blue screens)
  2. CorridorKey GreenFormer - Refines alpha matte + extracts clean foreground (PyTorch on GPU, ONNX on CPU)
  3. GPU Postprocessing - Despill, despeckle (connected components), resize — all on GPU via torchvision, single CPU transfer at the end
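To make steps 1 and 3 more concrete, here is a rough per-pixel sketch of a classical HSV screen mask and a simple despill. It is only an illustration: the thresholds, the function names, and the "clamp green to the mean of red and blue" despill rule are assumptions (a common textbook approach), not the code this Space runs, which operates on GPU tensors.

```python
import colorsys


def is_screen_pixel(r, g, b, screen="green", hue_tol=0.11, min_sat=0.3, min_val=0.2):
    """Return True when an RGB pixel (channels in [0, 1]) looks like screen colour.

    The tolerance and minimum saturation/value are illustrative assumptions.
    """
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    target_hue = 1 / 3 if screen == "green" else 2 / 3  # green or blue hue
    return abs(h - target_hue) <= hue_tol and s >= min_sat and v >= min_val


def coarse_mask(pixels, screen="green"):
    """Coarse foreground mask: 1.0 for foreground, 0.0 for screen pixels."""
    return [0.0 if is_screen_pixel(*p, screen=screen) else 1.0 for p in pixels]


def despill_green(r, g, b):
    """Average-limit despill: cap green at the mean of red and blue."""
    return r, min(g, (r + b) / 2), b
```

A hard 0/1 mask like this is only a starting hint; the refinement in step 2 is what produces soft alpha for hair and transparent objects.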

GPU Optimizations

  • Full GPU pipeline: preprocessing (resize + normalize) and postprocessing (despill, clean_matte, resize) stay on device — avoids CPU↔GPU round-trips per batch
  • TF32 tensor cores: torch.set_float32_matmul_precision('high') for FP32 postprocessing ops
  • AOTI compilation (disabled on ZeroGPU): compiling with TorchInductor + Triton CUDA graphs (native CUDA kernels, fused ops, replay of the entire kernel sequence without CPU-GPU sync overhead) does not benefit GreenFormer. Tested max-autotune (118s, 0 Triton kernels) and reduce-overhead (36s compile + 48s graph recording = 84s for a 5% speedup). The small feature maps (112-896 channels) are already cuBLAS-optimal and not Triton-friendly, so eager mode at 0.32s/frame beats 84s+ of overhead. torch.compile is still available for local GPU
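The trade-off in the last bullet can be checked with a quick break-even calculation. This sketch assumes that "5% speedup" means each frame takes 5% less time than the 0.32s eager figure; the function name is hypothetical.

```python
def breakeven_frames(compile_overhead_s, eager_s_per_frame, speedup_fraction):
    """Frames needed before one-off compile overhead is repaid by faster frames."""
    saved_per_frame = eager_s_per_frame * speedup_fraction
    return compile_overhead_s / saved_per_frame


# 36s compile + 48s graph recording, 0.32s/frame eager, 5% faster when compiled
frames = breakeven_frames(36 + 48, 0.32, 0.05)
```

Under that assumption the overhead only pays off after about 5,250 frames in a single session, far more than the 89-frame clip used in the timing figures.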

Pipeline timing (89 frames, batch 32 at 1024px model resolution): CPU mask 22s → GPU load 5s → inference 29s → write 15s → stitch 9s ≈ 80s total, of which 49s is GPU time. The model always processes frames at 1024x1024 or 2048x2048, regardless of input resolution.
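The stage figures above can be turned into per-frame numbers with a few lines of arithmetic. The dictionary keys and the `summarize` helper are illustrative names only.

```python
# Stage durations in seconds, copied from the timing line above.
STAGES_S = {"cpu_mask": 22, "gpu_load": 5, "inference": 29, "write": 15, "stitch": 9}
NUM_FRAMES = 89


def summarize(stages, num_frames):
    """Return total time, per-frame costs, and each stage's share of the total."""
    total = sum(stages.values())
    return {
        "total_s": total,
        "inference_s_per_frame": stages["inference"] / num_frames,
        "end_to_end_s_per_frame": total / num_frames,
        "share": {name: secs / total for name, secs in stages.items()},
    }


summary = summarize(STAGES_S, NUM_FRAMES)
```

Inference works out to roughly 0.33s per frame, in line with the 0.32s/frame eager figure, and it accounts for a little over a third of the end-to-end time; mask generation, writing and stitching make up most of the rest.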

API

REST API

Step 1: Submit request

curl -X POST "https://huggingface.co/proxy/luminia-corridorkey.hf.space/gradio_api/call/process_video" \
  -H "Content-Type: application/json" \
  -d '{"data": ["video.mp4", "1024", 5, "Hybrid (auto)", true, 400]}'

Step 2: Get result

curl "https://huggingface.co/proxy/luminia-corridorkey.hf.space/gradio_api/call/process_video/{event_id}"
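The same two steps can be scripted with only the Python standard library. This is a sketch, not an official client: it assumes the POST response is JSON containing an `event_id` field and that the GET endpoint streams server-sent events whose final event is named `complete` (or `error` on failure). The helper names are hypothetical, and the input list is passed through exactly as in the curl example.

```python
import json
import urllib.request

BASE_URL = (
    "https://huggingface.co/proxy/luminia-corridorkey.hf.space"
    "/gradio_api/call/process_video"
)


def build_request(inputs):
    """Build the Step 1 POST request carrying the positional inputs."""
    body = json.dumps({"data": list(inputs)}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(inputs, timeout=30):
    """Step 1: submit the job and return its event id (assumed response field)."""
    with urllib.request.urlopen(build_request(inputs), timeout=timeout) as resp:
        return json.load(resp)["event_id"]


def parse_sse(lines):
    """Yield (event, data) pairs from server-sent-event text lines."""
    event = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            payload = line[len("data:"):].strip()
            yield event, (json.loads(payload) if payload else None)


def fetch_result(event_id, timeout=600):
    """Step 2: read the event stream until the job completes or fails."""
    with urllib.request.urlopen(f"{BASE_URL}/{event_id}", timeout=timeout) as resp:
        text_lines = (chunk.decode("utf-8") for chunk in resp)
        for event, data in parse_sse(text_lines):
            if event == "complete":
                return data
            if event == "error":
                raise RuntimeError(f"process_video failed: {data}")
    raise RuntimeError("stream ended without a 'complete' event")
```

A call would look like `fetch_result(submit(["video.mp4", "1024", 5, "Hybrid (auto)", True, 400]))`, mirroring the payload in Step 1.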

MCP (Model Context Protocol)

MCP Config:

{
  "mcpServers": {
    "corridorkey": {
      "url": "https://huggingface.co/proxy/luminia-corridorkey.hf.space/gradio_api/mcp/"
    }
  }
}

Credits

  • CorridorKey by Niko Pueringer / Corridor Digital (synced to 8a4e4b4, 2026-05-01)
  • EZ-CorridorKey UI reference by edenaion (synced to 888e032, 2026-04-25)
  • BiRefNet by ZhengPeng7