System Prompt Encoder
Overview
System prompts (also called system instructions) are processed by an LLM before it begins handling user input. They define the model's identity, task, and behavior - essentially establishing who the AI should be and what it should do.
Key characteristics:
- Processed before every user interaction
- Define model behavior and constraints
- Often repeated thousands of times per day
- Can be 500-5,000+ tokens
- Critical for consistent model performance
The more detailed a system prompt is, the better the model typically understands and performs its intended task. However, detailed prompts are token-intensive and costly at scale.
Typical compression: 75-90% token reduction
Prompt Modes
CLM recognizes two distinct types of system prompts, each with specialized compression strategies:
Task Prompts
Task prompts are action-oriented instructions that tell the model what to do for a specific task. They typically include:
- Intent: The primary action (analyze, extract, generate, summarize)
- Target: What to operate on (transcript, document, data)
- Output Format: Expected response structure (JSON, text, table)
- Constraints: Rules, validations, and boundaries
Example:
You are a Betting Analysis system. Analyze soccer matches and provide betting odds.
Return your analysis as: {"win": 0.45, "draw": 0.30, "lose": 0.25}
Compressed:
[REQ:PREDICT:SPECS:BETTING_ODDS][TARGET:REPORT:DOMAIN=BUSINESS][OUT_JSON:{win:FLOAT,draw:FLOAT,lose:FLOAT}]
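The bracketed notation is regular enough to split apart mechanically: each [ ... ] block begins with a semantic key, and everything after the first colon is that key's payload. The parse_segments helper below is a hypothetical illustration of this structure, not part of the clm_core API:
import re
def parse_segments(compressed: str) -> dict[str, str]:
    # Each bracketed block is "KEY:payload"; split on the first colon only,
    # since payloads (like the OUT_JSON schema) can contain colons themselves.
    segments = {}
    for seg in re.findall(r"\[([^\]]+)\]", compressed):
        key, _, payload = seg.partition(":")
        segments[key] = payload
    return segments
print(parse_segments(
    "[REQ:PREDICT:SPECS:BETTING_ODDS][TARGET:REPORT:DOMAIN=BUSINESS]"
    "[OUT_JSON:{win:FLOAT,draw:FLOAT,lose:FLOAT}]"
))
# {'REQ': 'PREDICT:SPECS:BETTING_ODDS', 'TARGET': 'REPORT:DOMAIN=BUSINESS',
#  'OUT_JSON': '{win:FLOAT,draw:FLOAT,lose:FLOAT}'}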
Configuration Prompts
Configuration prompts are template-based instructions that define persistent behavior and can be parameterized at runtime. They typically include:
- Role: The assistant's identity and persona
- Rules: Basic and custom behavioral rules
- Priority: How to handle rule conflicts
- Placeholders: Runtime variables (e.g., {{user_name}}, {{context}})
- Output Format: Structured output requirements
Example:
<role>You are a helpful assistant</role>
<basic_rules>
Follow standard guidelines for clarity and accuracy.
</basic_rules>
<custom_rules>
Always greet the user by name: {{user_name}}
</custom_rules>
Custom instructions are paramount. If there are conflicts, prioritize custom rules.
Compressed:
[PROMPT_MODE:CONFIGURATION][ROLE:HELPFUL_ASSISTANT][RULES:BASIC,CUSTOM][PRIORITY:CUSTOM_OVER_BASIC]
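Placeholders such as {{user_name}} are resolved with runtime values at request time. How CLM performs this injection is covered in the configuration-prompt guide; the fill_placeholders helper below is only a sketch of the substitution step, not a clm_core function:
import re
def fill_placeholders(template: str, values: dict[str, str]) -> str:
    # Replace each {{name}} with its runtime value; unknown placeholders
    # are left intact so missing values are easy to spot.
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: values.get(m.group(1), m.group(0)),
                  template)
print(fill_placeholders("Always greet the user by name: {{user_name}}",
                        {"user_name": "Ada"}))
# Always greet the user by name: Ada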
Learn more about Configuration Prompts
When to Use Each Mode
| Aspect | Task Prompt | Configuration Prompt |
|---|---|---|
| Purpose | Execute a specific task | Define persistent model behavior |
| Structure | Action + Target + Output | Role + Rules + Placeholders |
| Runtime | Static per request | Parameterized with runtime values |
| Use Case | Data extraction, analysis, generation | Chatbots, assistants, configurable systems |
| Compression Focus | Intent, extraction fields, output schema | Role, rules, meta-instructions |
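You normally do not choose the mode yourself: the encoder detects it automatically (see the Quick Start below). As a mental model, detection can hinge on configuration markers such as role/rule tags and {{placeholder}} variables; the heuristic below illustrates that idea and is not CLM's actual detection logic:
import re
def guess_prompt_mode(prompt: str) -> str:
    # Configuration prompts carry template markup; task prompts usually don't.
    has_config_markers = bool(
        re.search(r"</?(role|basic_rules|custom_rules)>", prompt)
        or re.search(r"\{\{\w+\}\}", prompt)
    )
    return "configuration" if has_config_markers else "task"
print(guess_prompt_mode("Analyze call transcripts for sentiment."))        # task
print(guess_prompt_mode("<role>Agent</role> Greet {{user_name}} warmly."))  # configuration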
Why Compress System Prompts?
The Scale Problem
In production environments, system prompts create significant overhead:
| Scenario | Details | Impact |
|---|---|---|
| Contact Center | 1,000 users x 20,000 calls/day | 20M system prompt invocations/day |
| Chatbot Service | 10,000 concurrent users x 50 interactions/day | 500K system prompt invocations/day |
| Enterprise Tool | 500 employees x 100 queries/day | 50K system prompt invocations/day |
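Each invocation resends the full system prompt, so daily token volume is the invocation count multiplied by the prompt's size in tokens. The constants below (prompt size, price, reduction rate) are illustrative assumptions, not vendor pricing:
# Illustrative back-of-the-envelope math; all constants are assumptions.
invocations_per_day = 50_000        # enterprise-tool row above
tokens_per_prompt = 2_000           # a mid-sized system prompt
usd_per_1k_input_tokens = 0.01      # hypothetical input-token price
reduction = 0.80                    # mid-range of the 75-90% figure
daily_tokens = invocations_per_day * tokens_per_prompt
daily_cost = daily_tokens / 1_000 * usd_per_1k_input_tokens
print(f"{daily_tokens:,} prompt tokens/day: "
      f"${daily_cost:,.0f} -> ${daily_cost * (1 - reduction):,.0f}")
# 100,000,000 prompt tokens/day: $1,000 -> $200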
Benefits of Compression
Cost Reduction
- 75-90% fewer tokens = 75-90% cost savings on system prompts
- At scale, this translates to significant monthly savings

Faster Processing
- Smaller prompts = faster inference
- Reduced latency by 30-73%
- Better user experience

Higher Context Window Utilization
- More room for actual conversation
- Longer context histories
- More examples in few-shot learning

Better Scalability
- Handle more concurrent requests
- Lower infrastructure requirements
- Smoother peak load handling
Quick Start
Task Prompt Compression
from clm_core import CLMConfig, CLMEncoder
cfg = CLMConfig(lang="en")
encoder = CLMEncoder(cfg=cfg)
task_prompt = """
You are a customer service quality analyst. Analyze call transcripts
for compliance violations and sentiment issues.
"""
result = encoder.encode(task_prompt)
print(result.compressed)
# [REQ:ANALYZE][TARGET:TRANSCRIPT:DOMAIN=QA][EXTRACT:COMPLIANCE,SENTIMENT,ISSUE]
Configuration Prompt Compression
from clm_core import CLMConfig, CLMEncoder
cfg = CLMConfig(lang="en")
encoder = CLMEncoder(cfg=cfg)
config_prompt = """
<role>You are a helpful customer support agent</role>
<basic_rules>
Be polite and professional.
</basic_rules>
<custom_rules>
Always address the customer as {{customer_name}}.
</custom_rules>
Custom instructions are paramount.
"""
# The encoder automatically detects configuration prompts
result = encoder.encode(config_prompt)
print(result.compressed)
# [PROMPT_MODE:CONFIGURATION][ROLE:CUSTOMER_SUPPORT_AGENT][RULES:BASIC,CUSTOM][PRIORITY:CUSTOM_OVER_BASIC]
Next Steps
- Task Prompt Encoding - Deep dive into task-oriented compression
- Configuration Prompt Encoding - Learn about template-based compression
- Advanced: Token Hierarchy - Understand semantic tokens
- Advanced: CLM Dictionary - Language-specific vocabularies