FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages • davanstrien • Jul 8, 2025 • 35
SmolLM3: smol, multilingual, long-context reasoner • eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 775
FineWeb2-C: Help Build Better Language Models in Your Language • davanstrien • Dec 23, 2024 • 21