Update README.md
README.md CHANGED
@@ -7,7 +7,7 @@ license: cc-by-nc-4.0

# Model Information

-We introduce **UltraLong-8B**, a series of ultra-long context language models designed to process extensive sequences of text (up to 1M, 2M, and 4M tokens) while maintaining competitive performance on standard benchmarks. Built on Llama-3.1, UltraLong-8B leverages a systematic training recipe that combines efficient continued pretraining with instruction tuning to enhance long-context understanding and instruction-following capabilities. This approach enables our models to efficiently scale their context windows without sacrificing general performance.
+We introduce **Nemotron-UltraLong-8B**, a series of ultra-long context language models designed to process extensive sequences of text (up to 1M, 2M, and 4M tokens) while maintaining competitive performance on standard benchmarks. Built on Llama-3.1, UltraLong-8B leverages a systematic training recipe that combines efficient continued pretraining with instruction tuning to enhance long-context understanding and instruction-following capabilities. This approach enables our models to efficiently scale their context windows without sacrificing general performance.

## The UltraLong Models
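
As a rough illustration of how a checkpoint from this series might be run with the Hugging Face `transformers` library, the sketch below loads a model and generates from a long prompt. The repository ID, dtype, and generation settings are placeholder assumptions for illustration, not values specified in this README.

```python
# Minimal usage sketch. Assumptions: the repo ID below is a placeholder for the
# published 1M/2M/4M-context checkpoint; prompt and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/UltraLong-8B"  # hypothetical checkpoint name; substitute the actual variant

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to keep very long contexts within memory
    device_map="auto",           # spread layers across available GPUs
)

# A long input document would go here; the series is described as handling up to
# 1M, 2M, or 4M tokens depending on the checkpoint.
prompt = "Summarize the following document:\n\n" + "<document text>"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

The bfloat16 dtype and `device_map="auto"` are chosen only to make multi-million-token contexts fit on typical multi-GPU hardware; adjust them to the target setup.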