SLM 1.0
SLM 1.0 is a specialized language model trained by NeuroBrain, optimized for structured output generation, JSON schema compliance, and tool calling capabilities.
Model Details
Model Description
SLM 1.0 is a language model specifically trained to excel at:
- Structured Output: Generating well-formatted, structured responses
- JSON Schema: Producing outputs that strictly adhere to JSON schemas
- Tool Calling: Effectively utilizing and calling external tools and functions
This model has been trained by NeuroBrain to provide reliable, structured responses suitable for production applications requiring precise output formatting.
Model Specifications
- Architecture: SLM1ForCausalLM
- Model Type: Causal Language Model
- Context Length: 32,768 tokens
- Hidden Size: 1,536
- Number of Layers: 28
- Attention Heads: 12
- Vocabulary Size: 151,936
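A few derived quantities follow from these figures by simple arithmetic. The sketch below computes the per-head dimension, the size of the token embedding table, and the room left for generation after a prompt. It assumes the hidden size is split evenly across attention heads, which is the common convention but is not confirmed by this card; all names are illustrative.

```python
# Figures copied from the specification list above.
CONTEXT_LENGTH = 32_768
HIDDEN_SIZE = 1_536
NUM_HEADS = 12
VOCAB_SIZE = 151_936

# Assumption: the hidden size is divided evenly across attention heads.
head_dim = HIDDEN_SIZE // NUM_HEADS

# One HIDDEN_SIZE-wide vector per vocabulary entry.
embedding_params = VOCAB_SIZE * HIDDEN_SIZE


def max_new_tokens_for(prompt_tokens: int) -> int:
    """Tokens left for generation once the prompt is counted against the context window."""
    return max(CONTEXT_LENGTH - prompt_tokens, 0)


print(head_dim)                    # 128
print(embedding_params)            # 233373696
print(max_new_tokens_for(30_000))  # 2768
```

The last helper is useful when choosing a generation limit for long prompts, since prompt and output share the same 32,768-token window.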
Training Information
Trained by: NeuroBrain
Training Method: Trained for structured output, JSON schema compliance, and tool calling
Usage
Basic Usage
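from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sihab/slm-1.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example: generate structured output
prompt = "Generate a JSON object with user information"
inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens limits only the generated text; max_length would also count the prompt
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the echoed prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)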
Structured Output Generation
SLM 1.0 is particularly effective when you need structured outputs:
prompt = """
Generate a JSON object following this schema:
{
"name": "string",
"age": "number",
"email": "string"
}
"""
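inputs = tokenizer(prompt, return_tensors="pt")
# temperature is only applied when sampling is enabled, so set do_sample explicitly
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)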
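Because the generated text is not guaranteed to match the requested shape, it can help to check it before use. The following is a minimal sketch using only the standard library; `EXPECTED_FIELDS` and `validate_user` are hypothetical names, and the type mapping is an assumed reading of the schema sketch in the prompt above, not something the model card defines.

```python
import json

# Hypothetical mapping from the schema sketch above to Python types.
EXPECTED_FIELDS = {"name": str, "age": (int, float), "email": str}


def validate_user(text: str) -> list[str]:
    """Return a list of problems; an empty list means the text matches the sketch."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(obj, dict):
        return ["top-level value is not an object"]
    problems = []
    for field, expected in EXPECTED_FIELDS.items():
        if field not in obj:
            problems.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(obj[field], bool) or not isinstance(obj[field], expected):
            problems.append(f"wrong type for field: {field}")
    return problems


print(validate_user('{"name": "Ada", "age": 36, "email": "ada@example.com"}'))  # []
```

For real JSON Schema documents, a dedicated validation library would be a better fit than this hand-written check.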
Tool Calling
The model is optimized for tool calling scenarios:
prompt = """
Available tools:
- get_weather(location: str)
- send_email(to: str, subject: str, body: str)
User request: Check the weather in Paris and send me an email with the result.
"""
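inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens limits only the generated text; max_length would also count the prompt
outputs = model.generate(**inputs, max_new_tokens=1024)
# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)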
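Once the model has produced a tool call, the application still has to parse it and run the matching function. The card does not document the format the model emits, so the sketch below assumes a JSON object of the form `{"name": ..., "arguments": {...}}`. The two tool bodies are stubs, and `TOOLS` and `dispatch` are hypothetical names.

```python
import json


# Stub implementations standing in for the two tools named in the prompt.
def get_weather(location: str) -> str:
    return f"(stub) weather for {location}"


def send_email(to: str, subject: str, body: str) -> str:
    return f"(stub) email to {to}: {subject}"


TOOLS = {"get_weather": get_weather, "send_email": send_email}


def dispatch(tool_call_text: str) -> str:
    """Parse one tool call and run the matching function.

    Assumes the call looks like {"name": "...", "arguments": {...}};
    the model card does not document the actual output format.
    """
    call = json.loads(tool_call_text)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


print(dispatch('{"name": "get_weather", "arguments": {"location": "Paris"}}'))
# (stub) weather for Paris
```

Looking tools up in an explicit dictionary, rather than evaluating the model's text, keeps the set of callable functions under the application's control.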
Model Performance
SLM 1.0 demonstrates strong performance in:
- JSON schema compliance
- Structured data generation
- Tool calling accuracy
- Function parameter extraction
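The card gives no benchmark figures for these areas. One way to quantify JSON compliance on your own prompts is to measure the fraction of outputs that parse as an object containing the fields you need. The sketch below is one such metric; `json_compliance_rate` is a hypothetical helper, not an evaluation used by NeuroBrain.

```python
import json


def json_compliance_rate(outputs: list[str], required_keys: set[str]) -> float:
    """Fraction of outputs that parse as a JSON object containing every required key."""
    if not outputs:
        return 0.0
    passed = 0
    for text in outputs:
        try:
            obj = json.loads(text)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict) and required_keys.issubset(obj):
            passed += 1
    return passed / len(outputs)


samples = ['{"name": "Ada", "email": "a@b.c"}', '{"name": "Bob"}', "oops", "[1, 2]"]
print(json_compliance_rate(samples, {"name", "email"}))  # 0.25
```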
Limitations
- The model may occasionally require post-processing to ensure strict JSON compliance
- Tool calling accuracy depends on the clarity of tool descriptions provided
- Maximum context length is 32,768 tokens
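The post-processing mentioned in the first limitation often amounts to pulling the JSON object out of surrounding text. A minimal sketch using the standard library's `json.JSONDecoder.raw_decode` is shown below; `extract_first_json_object` is a hypothetical helper, and it assumes the response contains at most some extra prose around an otherwise valid object.

```python
import json


def extract_first_json_object(text: str):
    """Return the first JSON object embedded in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores whatever follows it.
            value, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pass
        else:
            if isinstance(value, dict):
                return value
        # Try the next opening brace.
        start = text.find("{", start + 1)
    return None


reply = 'Here is the result: {"name": "Ada", "age": 36} Hope that helps.'
print(extract_first_json_object(reply))  # {'name': 'Ada', 'age': 36}
```

This recovers objects wrapped in extra prose but does not repair malformed JSON, such as trailing commas or unquoted keys.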
Citation
If you use SLM 1.0 in your research or applications, please cite:
@misc{slm1.0,
title={SLM 1.0: A Language Model for Structured Output and Tool Calling},
author={NeuroBrain},
year={2025},
howpublished={\url{https://huggingface.co/sihab/slm-1.0}}
}
License
This model is licensed under the Apache 2.0 license.
Contact
For questions, issues, or contributions, please contact NeuroBrain.
Model trained by NeuroBrain