GPT-2 (ONNX)

ONNX export of GPT-2 (124M parameters) with KV cache support for efficient autoregressive generation.
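To see why the KV cache matters for autoregressive generation, compare the amount of per-step attention work with and without it. This is an illustrative sketch (class and method names are hypothetical, not part of inference4j): without a cache, step t must re-project the keys and values for all t prefix tokens, so total work grows quadratically; with a cache, each step only projects the newest token.

```java
public class KvCacheDemo {
    // Without a KV cache, step t re-projects all t prefix tokens,
    // so generating n tokens costs 1 + 2 + ... + n projections.
    static int projectionsWithoutCache(int steps) {
        int total = 0;
        for (int t = 1; t <= steps; t++) total += t;
        return total;
    }

    // With a KV cache, earlier keys/values are reused, so each step
    // projects only the single newest token.
    static int projectionsWithCache(int steps) {
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(projectionsWithoutCache(100)); // 5050
        System.out.println(projectionsWithCache(100));    // 100
    }
}
```

For 100 generated tokens the cached path does 100 units of projection work instead of 5050, which is the speedup the "with KV cache" export is designed to capture.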

Converted for use with inference4j, an inference-only AI library for Java.

Original Source

  • Repository: OpenAI
  • License: MIT

Usage with inference4j

try (Gpt2TextGenerator gen = Gpt2TextGenerator.builder().build()) {
    GenerationResult result = gen.generate("Once upon a time");
    System.out.println(result.text());
}

Model Details

Property             Value
-------------------  ------------------------------------------------------------
Architecture         GPT-2 (124M parameters, 12 layers, 768 hidden size, 12 heads)
Task                 Text generation
Context length       1024 tokens
Vocabulary           50,257 tokens (byte-pair encoding)
Original framework   PyTorch (transformers)
Export method        Hugging Face Optimum (with KV cache)
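The 1024-token context length is a hard limit of the model: prompts longer than that must be truncated before inference. A minimal sketch of one common policy, keeping only the most recent 1024 token ids (a sliding window; the class and method names here are illustrative, not inference4j API):

```java
import java.util.Arrays;

public class ContextWindow {
    // GPT-2's maximum context length in tokens.
    static final int CONTEXT_LENGTH = 1024;

    // Keep only the most recent CONTEXT_LENGTH token ids,
    // dropping the oldest tokens when the input is too long.
    static int[] truncate(int[] tokenIds) {
        if (tokenIds.length <= CONTEXT_LENGTH) return tokenIds;
        return Arrays.copyOfRange(
                tokenIds, tokenIds.length - CONTEXT_LENGTH, tokenIds.length);
    }

    public static void main(String[] args) {
        int[] longInput = new int[1500];
        for (int i = 0; i < longInput.length; i++) longInput[i] = i;
        int[] window = truncate(longInput);
        System.out.println(window.length); // 1024
        System.out.println(window[0]);     // 476 (the first 476 tokens were dropped)
    }
}
```

Keeping the tail rather than the head preserves the text closest to the generation point, which is usually what matters for continuation tasks.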

License

This model is licensed under the MIT License. Original model by OpenAI.
