Jan-nano-128k is a model fine-tuned to keep its performance when YaRN scaling is enabled (instead of degrading). It requires an inference engine that supports YaRN scaling.
- It can use tools continuously and repeatedly
- It can perform deep research
- Extremely persistent
GGUF builds can be found at: https://huggingface.co/Menlo/Jan-nano-128k-gguf
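Since the model depends on YaRN support in the inference engine, a minimal sketch of serving the GGUF build with llama.cpp's `llama-server` might look like the following. The quant filename, original context length, and scale factor here are assumptions, not confirmed by this post; check the model card for the recommended values.

```shell
# Hedged sketch: serve the GGUF with llama.cpp's llama-server.
# Assumes a Qwen3-style base context of 40960 tokens and a YaRN
# scale factor of 3.2 to reach ~131072 tokens (40960 * 3.2 = 131072).
# The quant filename below is hypothetical.
llama-server \
  -m Jan-nano-128k-Q8_0.gguf \
  -c 131072 \
  --rope-scaling yarn \
  --rope-scale 3.2 \
  --yarn-orig-ctx 40960
```

Any engine without YaRN support would fall back to plain RoPE at long contexts, which is exactly the degradation this fine-tune is meant to avoid.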
Jan-nano-128k: A 4B Model with a Super-Long Context Window
Submitted 20 hours ago by cm0002@lemmy.world to technology@lemmy.zip
https://huggingface.co/Menlo/Jan-nano-128k