vidavox/SKK-Router-1.5B

Version: v1.0 – SKK Router for internal routing
Base model: katanemo/Arch-Router-1.5B (itself built on Qwen2.5-1.5B-Instruct)

SKK-Router-1.5B is a domain-specialized router model fine-tuned from Arch-Router-1.5B for question complexity routing inside an internal SKK agent system.

Instead of routing across many domains and actions, this model focuses on a single domain (SKK upstream oil & gas and related KSMI regulations) and chooses between:

  • a non-reasoning model for basic questions
  • a reasoning model for complex questions

The model outputs a minimal JSON object:

{"route": "basic"}

or

{"route": "complex"}

It is designed for internal orchestration, not for direct end-user text generation.


1. Intended Use

Primary use case

  • Task: Route incoming questions to either a basic or complex LLM path based on question difficulty and reasoning requirements.
  • Domain: SKK internal agent system, with content grounded in KSMI and related SKK upstream O&G documents.
  • Users: Internal systems and engineers building the SKK agent stack. Not intended for general public use.

What the routes mean

  • "basic" route

    • Short, direct, or factoid-style questions.
    • Queries that can be answered with light or no multi-step reasoning.
    • Good for low-latency, low-cost non-reasoning models.
  • "complex" route

    • Multi-step reasoning, multi-constraint, or ambiguous questions.
    • Questions that require combining multiple facts, interpreting regulations, or deeper analysis.
    • Intended for slower, more capable reasoning models.

Out of scope

  • General conversational use outside SKK / KSMI context.
  • Safety-critical routing (e.g., medical, legal, or financial decisions).
  • Direct Q&A: this router only selects models; it does not itself produce the final answer.

2. How It Relates to Arch-Router

Arch-Router-1.5B is a 1.5B-parameter preference-aligned router that maps queries to user-defined domains and actions for flexible multi-model routing.

SKK-Router-1.5B:

  • keeps the same routing prompt format as the original Arch-Router model (including the JSON route output).
  • narrows the routing space to question complexity within the SKK domain.
  • is trained on a bilingual (Indonesian/English) mix of synthetic and manually written Q&A tailored to SKK's internal use.

If you are already familiar with Arch-Router, you can plug this model in as a drop-in replacement for the router, as long as your route configuration reflects the "basic" and "complex" choices used during fine-tuning.


3. Model Architecture

  • Backbone: Qwen2.5-1.5B-Instruct via Arch-Router-1.5B
  • Parameters: ≈1.5B (same as base router)
  • Tokenizer & chat template: inherited from Arch-Router-1.5B.
  • Fine-tune type: PEFT/LoRA fine-tune on Arch-Router-1.5B, followed by merging the adapter into the base weights to form a standalone checkpoint (vidavox/SKK-Router-1.5B).
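To make the "merge the adapter into the base weights" step concrete, here is a toy sketch of the arithmetic behind a LoRA merge, W' = W + (alpha / r) · B · A, in plain Python. The matrices, rank, and alpha are hypothetical values chosen for illustration; the card does not state the LoRA hyperparameters actually used. With the PEFT library this kind of merge is normally done through `merge_and_unload()`, though the exact call used for this checkpoint is not shown here.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) as a new matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Hypothetical toy values: a 2x2 base weight and a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]        # shape: rank x in_features
B = [[0.5], [1.0]]      # shape: out_features x rank
merged = merge_lora(W, A, B, alpha=2, rank=1)

# The merged layer gives the same output as "base + adapter" applied separately.
x = [[1.0], [1.0]]
base_plus_adapter = [
    [b[0] + 2.0 * a[0]] for b, a in zip(matmul(W, x), matmul(B, matmul(A, x)))
]
print(merged, matmul(merged, x), base_plus_adapter)
```

After merging, the adapter matrices are no longer needed at inference time, which is why the checkpoint can be loaded as a standalone model.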

Languages:

  • Indonesian (Bahasa Indonesia)
  • English

4. Training Data

The fine-tune uses a private, domain-specific dataset:

DatasetDict({
    train: 3096 samples
    val:   884 samples
    test:  443 samples
})

Each split has the following fields:

  • instruction: the main user question / request.
  • input: optional auxiliary context (may be empty).
  • route: original label in the data pipeline.
  • output_route: JSON string used as the target, e.g. {"route": "basic"}.
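As an illustration of the schema above, the following sketch turns one row into a user message and its target string. The sample row is hypothetical (the real data is private), and joining instruction and input with a blank line is an assumption, not something the card specifies.

```python
import json


def build_example(record):
    """Turn one dataset row into (conversation, target_text)."""
    user_text = record["instruction"]
    if record.get("input"):
        # Assumption: optional context is appended after a blank line.
        user_text += "\n\n" + record["input"]
    conversation = [{"role": "user", "content": user_text}]
    target = record["output_route"]
    json.loads(target)  # raises if the target is not valid JSON
    return conversation, target


# Hypothetical row that follows the field layout described above.
record = {
    "instruction": "What does the term 'Cadangan A' mean?",
    "input": "",
    "route": "basic",
    "output_route": json.dumps({"route": "basic"}),
}

conversation, target = build_example(record)
print(conversation, target)
```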

Data sources

  • Synthetic conversations and prompts generated to reflect SKK's internal workflows.
  • Manually authored Q&A examples capturing realistic SKK / KSMI questions.
  • All data is private and not released with this model.
  • Domain focus: questions grounded in KSMI and related SKK upstream O&G regulations.

Label space

For this fine-tune, the router is effectively binary:

  • basic – non-reasoning route
  • complex – reasoning route

The original Arch-Router "other" route is present in the base model evaluation but not used as a target in the fine-tuned test set (see evaluation below).


5. Training Details

  • Framework: TRL SFTTrainer with SFTConfig (supervised fine-tuning).
  • Adapter: PEFT / LoRA attached to Arch-Router-1.5B; final model created by merging adapters into base.
  • Hardware: single NVIDIA GeForce RTX 3090 GPU.

Key training configuration (high-level):

  • per_device_train_batch_size = 2
  • per_device_eval_batch_size = 4
  • gradient_accumulation_steps = 8 → effective batch size ≈ 16 (sequence-wise)
  • Early stopping with patience = 1 based on validation loss.
  • Train/val splits above; test used only for the final benchmark.

For full configuration details, see the Router-SFTTrainer.ipynb notebook in this repository.
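The two less obvious settings above, the effective batch size and early stopping with patience = 1, can be sketched in plain Python. This is only an illustration of the logic, not the notebook's code; the validation losses are made-up numbers and the helper names are hypothetical.

```python
def effective_batch_size(per_device, grad_accum, num_devices=1):
    """Sequences contributing to one optimizer step."""
    return per_device * grad_accum * num_devices


def early_stop_index(val_losses, patience=1):
    """Index of the evaluation at which training stops, or None if it never does.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive evaluations.
    """
    best = float("inf")
    bad_evals = 0
    for i, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            bad_evals = 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return i
    return None


batch = effective_batch_size(per_device=2, grad_accum=8)  # single GPU
stop_at = early_stop_index([0.90, 0.70, 0.72, 0.60], patience=1)  # made-up losses
print(batch, stop_at)
```

With patience = 1 the run ends at the first evaluation that does not improve on the best validation loss, so a later recovery (the 0.60 in the toy list) is never reached.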


6. Evaluation

The model was evaluated on a held-out test set of 443 samples, containing only basic and complex routes as the target labels.

6.1 Route distribution

Comparison of how often each model predicts each route:

Route      Target (test data)   Fine-tuned model   Base Arch-Router
Basic      147                  160                201
Complex    296                  283                156
Other      0                    0                  86

Observations:

  • The fine-tuned model routes every test query to basic or complex (none to other) and tracks the target distribution closely (160 vs. 147 basic, 283 vs. 296 complex).

  • The base Arch-Router tends to:

    • over-predict basic, and
    • send many SKK-style queries to the generic other route.
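As a quick sanity check on the route counts reported in this section, the short script below confirms that each column sums to the 443 test samples and converts the counts to percentage shares. The dictionary layout and key names are illustrative only.

```python
# Route counts copied from the route distribution figures in this section.
counts = {
    "target":     {"basic": 147, "complex": 296, "other": 0},
    "fine_tuned": {"basic": 160, "complex": 283, "other": 0},
    "base":       {"basic": 201, "complex": 156, "other": 86},
}

totals = {}
shares = {}
for name, dist in counts.items():
    totals[name] = sum(dist.values())
    # Percentage of test queries sent to each route, rounded to one decimal.
    shares[name] = {
        route: round(100 * n / totals[name], 1) for route, n in dist.items()
    }

for name in counts:
    print(name, totals[name], shares[name])
```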

6.2 Routing accuracy

Accuracy is computed as:

  • A prediction is correct if the chosen "route" matches the output_route label for that sample.

Metric                    Fine-tuned model   Base Arch-Router
Basic route accuracy      91.50%             74.83%
Complex route accuracy    93.10%             45.27%
Overall accuracy          92.55%             55.08%

Improvements (absolute percentage points):

  • Basic route: +16.67 pp
  • Complex route: +47.83 pp
  • Overall: +37.47 pp
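A minimal sketch of how such metrics can be computed is shown below. It assumes that per-route accuracy means the share of samples with a given true label that were routed correctly; the card does not spell out the denominator, so treat that as an assumption. The labels and predictions are hypothetical toy data, not drawn from the private test set.

```python
from collections import defaultdict


def per_route_accuracy(labels, predictions):
    """Return ({route: accuracy}, overall_accuracy) for two parallel lists."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for label, pred in zip(labels, predictions):
        total[label] += 1
        correct[label] += int(pred == label)
    per_route = {route: correct[route] / total[route] for route in total}
    overall = sum(correct.values()) / len(labels)
    return per_route, overall


# Hypothetical toy data for illustration only.
labels      = ["basic", "basic", "complex", "complex", "complex"]
predictions = ["basic", "complex", "complex", "complex", "other"]

per_route, overall = per_route_accuracy(labels, predictions)
print(per_route, overall)
```

Note that a prediction of "other" counts as an error under this definition, which is one reason a base router that often answers "other" scores low on a basic/complex test set.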

In practice, this means:

  • The router is much more reliable at distinguishing between simple and complex SKK queries.
  • Mis-routing complex questions to the non-reasoning path is drastically reduced compared to the base Arch-Router.

Note: These metrics are computed on private, synthetic + manually-authored data tailored to the SKK domain. Performance on other domains may be substantially different.


7. How to Use

⚠️ Important: This model assumes the same overall routing prompt structure as katanemo/Arch-Router-1.5B. For best results, follow the upstream Arch-Router prompt format and simply adapt the route_config to your use case.

7.1 Minimal example

import json
from typing import Any, Dict, List
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "vidavox/SKK-Router-1.5B"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Please use our provided prompt for best performance
TASK_INSTRUCTION = """
You are a helpful assistant designed to find the best suited route.
You are provided with route description within <routes></routes> XML tags:
<routes>

{routes}

</routes>

<conversation>

{conversation}

</conversation>
"""

FORMAT_PROMPT = """
Your task is to decide which route is best suit with user intent on the conversation in <conversation></conversation> XML tags.  Follow the instruction:
1. If the latest intent from user is irrelevant or user intent is full filled, response with other route {"route": "other"}.
2. You must analyze the route descriptions and find the best match route for user latest intent. 
3. You only response the name of the route that best matches the user's request, use the exact name in the <routes></routes>.

Based on your analysis, provide your response in the following JSON formats if you decide to match any route:
{"route": "route_name"} 
"""

# Define route config
route_config = [
    {
        "name": "basic",
        "description": "Answering simple questions that ask for factual information, term meanings, or general knowledge.",
    },
    {
        "name": "complex",
        "description": "Handling specific, complex, or multi (more than one task) questions that require multi-step reasoning and interaction with databases to fetch and process data. For example, answering questions that need calculations, data analysis, or synthesis of information from multiple sources.",
    },
]

# Helper function to create the system prompt for our model
def format_prompt(
    route_config: List[Dict[str, Any]], conversation: List[Dict[str, Any]]
):
    return (
        TASK_INSTRUCTION.format(
            routes=json.dumps(route_config), conversation=json.dumps(conversation)
        )
        + FORMAT_PROMPT
    )

# Define conversations
conversation = [
    {
        "role": "user",
        "content": "Apa pengertian dari Cadangan A dan berapa jumlahnya untuk Lapangan X?",
    }
]
route_prompt = format_prompt(route_config, conversation)
messages = [
    {"role": "user", "content": route_prompt},
]
# 1. Tokenize the prompt with the model's chat template
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# 2. Generate
generated_ids = model.generate(
    input_ids=input_ids,  # or just positional: model.generate(input_ids, ...)
    max_new_tokens=32768,  # upper bound only; generation ends at EOS after the short JSON route
)

# 3. Strip the prompt from each sequence
prompt_lengths = input_ids.shape[1]  # same length for every row here
generated_only = [
    output_ids[prompt_lengths:]  # slice off the prompt tokens
    for output_ids in generated_ids
]

# 4. Decode if you want text
response = tokenizer.batch_decode(generated_only, skip_special_tokens=True)[0]
print(response)

In the actual SKK agent system, this "route" is then used to decide whether to call the basic or reasoning LLM.
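A minimal sketch of that dispatch step is shown below. The handler functions are hypothetical stand-ins for the real model calls, and falling back to the "complex" path when the output is malformed or names an unknown route (such as "other") is an assumption made here for illustration, not a documented behavior of the SKK system.

```python
import json
from typing import Callable, Dict

VALID_ROUTES = {"basic", "complex"}


def parse_route(response: str, default: str = "complex") -> str:
    """Extract the route name from the router's raw text output."""
    try:
        route = json.loads(response.strip()).get("route")
    except (json.JSONDecodeError, AttributeError):
        # Not JSON, or JSON that is not an object: use the fallback route.
        return default
    if isinstance(route, str) and route in VALID_ROUTES:
        return route
    return default


# Hypothetical handlers standing in for the downstream LLM calls.
def call_basic_llm(question: str) -> str:
    return f"[basic model] {question}"


def call_reasoning_llm(question: str) -> str:
    return f"[reasoning model] {question}"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "basic": call_basic_llm,
    "complex": call_reasoning_llm,
}


def dispatch(router_output: str, question: str) -> str:
    """Send the question to the handler selected by the router output."""
    return HANDLERS[parse_route(router_output)](question)


print(dispatch('{"route": "basic"}', "What is KSMI?"))
```

Defaulting to the more capable path trades some latency and cost for a lower risk of under-serving a hard question; a deployment with different priorities might default to "basic" or raise an error instead.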


8. Limitations & Known Failure Modes

Limitations

  • Multi-turn conversations: The model may be less reliable on very long, multi-turn chats with shifting intent. It was primarily trained on shorter, focused interactions.
  • Ambiguous queries: If the question does not clearly indicate complexity (e.g., vague or underspecified prompts), the router may pick an unintuitive route.
  • Out-of-domain content: Questions unrelated to SKK / KSMI / upstream O&G may be routed unpredictably, since the training data is domain-specific.
  • Binary perspective: The router assumes a simple basic vs complex split; if you need multiple levels of reasoning or different tools, you may need to extend the label space and re-train.

Safety considerations

  • Not designed for medical, legal, or financial decision-making.
  • Should not be used in settings where an incorrect routing decision can cause harm or safety-critical failures.
  • Outputs are not explanations; they are discrete labels used for orchestration.

9. Bias & Data Caveats

  • Training data is heavily skewed toward:

    • SKK upstream petroleum / regulatory topics.
    • Text derived from or inspired by KSMI and related technical documents.
  • Language mix:

    • Bilingual Indonesian/English, but primarily focused on expert / technical wording typical for this domain.
  • As a result:

    • The model may over-assume that questions with regulatory or technical phrasing are "complex".
    • It may not behave sensibly on informal, social-media style data or on domains very different from SKK.

Because the underlying data is private and internal, users cannot independently audit its biases or coverage. Treat this model as highly specialized rather than general-purpose.


10. License & Usage

This model is a fine-tuned derivative of katanemo/Arch-Router-1.5B, which is distributed under the Katanemo research license.

  • License on this repo: other – katanemo-research.

  • By using this model, you must comply with:

    • the original Katanemo license for Arch-Router, and
    • any additional internal policies that apply to SKK data and systems.

Intended usage policy

  • Allowed / intended:

    • Research and experimentation on routing for question complexity.
    • Internal use as part of the SKK Internal Agent System.
    • Exploration of routing strategies in similar regulatory or technical domains, provided you have rights to the underlying data.
  • Not recommended / discouraged:

    • Exposing this router directly to end users as a chatbot.
    • Using it as a general-purpose router outside its domain without additional evaluation.
    • Using the model, or any system built with it, as the sole basis for safety-critical decisions.

This description is not legal advice. For any production or commercial deployment, please review the Katanemo research license and your own organizational policies with qualified counsel.


11. Citation

If you use this model or build upon it in academic or technical work, please consider citing the Arch-Router paper:

@article{tran2025archrouter,
  title   = {Arch-Router: Aligning LLM Routing with Human Preferences},
  author  = {Tran, Co and Paracha, Salman and Hafeez, Adil and Chen, Shuguang},
  journal = {arXiv preprint arXiv:2506.16655},
  year    = {2025}
}

And you may also reference this checkpoint as:

vidavox/SKK-Router-1.5B (v1.0 – SKK Router for internal routing), fine-tuned from katanemo/Arch-Router-1.5B on SKK-specific synthetic and manually curated routing data for basic vs complex question routing.
