π Multi-Provider Routing¶
Note
π Hey there! Siyarix is a personal passion project built by a single developer that is growing and under active development. Some of the architectural components and features described on this page might currently be Planned, Work in Progress, or basic implementations. Stay tuned as it evolves! π
Siyarix boasts robust support for 25 AI providers (24 cloud/local + 1 offline registry), all accessible through a unified, OpenAI-compatible adapter located in openai_compat.py.
At the heart of this system is the ProviderManager singleton. Think of it as the air traffic controller for your AI requestsβit handles provider registration, credential pooling, seamless failover, exponential-backoff cooldowns, and smart multi-model ensemble decisions.
π Supported Providers¶
Siyarix integrates with a wide array of top-tier AI providers. Here's a quick look at what's supported out of the box:
| Provider | Type | Env Variable | Default Model | Base URL |
|---|---|---|---|---|
| OpenAI | Cloud | OPENAI_API_KEY |
gpt-5.5 | (default) |
| Anthropic | Cloud | ANTHROPIC_API_KEY |
claude-sonnet-4-6 | (via openai compat) |
| Google Gemini | Cloud | GEMINI_API_KEY |
gemini-3.1-pro-preview | generativelanguage.googleapis.com/v1beta/openai/ |
| DeepSeek | Cloud | DEEPSEEK_API_KEY |
deepseek-v4-flash | api.deepseek.com |
| xAI (Grok) | Cloud | XAI_API_KEY |
grok-4.1 | api.x.ai |
| Perplexity | Cloud | PERPLEXITY_API_KEY |
sonar-pro | api.perplexity.ai |
| Groq | Cloud | GROQ_API_KEY |
llama-4-scout | api.groq.com/openai/v1 |
| Together AI | Cloud | TOGETHER_API_KEY |
Llama-4-Scout | api.together.xyz/v1 |
| OpenRouter | Cloud | OPENROUTER_API_KEY |
openai/gpt-5.5 | openrouter.ai/api/v1 |
| Cerebras | Cloud | CEREBRAS_API_KEY |
gpt-oss-120b | api.cerebras.ai/v1 |
| Fireworks AI | Cloud | FIREWORKS_API_KEY |
kimi-k2.6 | api.fireworks.ai/inference/v1 |
| Mistral AI | Cloud | MISTRAL_API_KEY |
(from profile) | api.mistral.ai |
| Z.AI | Cloud | ZAI_API_KEY |
glm-5.1 | api.z.ai/api/paas/v4 |
| MiniMax | Cloud | MINIMAX_API_KEY |
MiniMax-M3 | api.minimax.io/v1 |
| Moonshot | Cloud | MOONSHOT_API_KEY |
kimi-k2.6 | api.moonshot.ai/v1 |
| NVIDIA NIM | Cloud | NVIDIA_API_KEY |
Nemotron-3-Super | integrate.api.nvidia.com/v1 |
| HuggingFace | Cloud | HUGGINGFACE_API_KEY |
(varies) | api-inference.huggingface.co/v1 |
| Azure OpenAI | Cloud | AZURE_OPENAI_API_KEY |
gpt-5.5 | (user-configured) |
| OpenCodeZen | Cloud | OPENCODE_API_KEY |
deepseek-v4-flash | opencode.ai/zen/v1 |
| Ollama | Local | β | llama3.1 | localhost:11434/v1 |
| LM Studio | Local | β | (varies) | localhost:1234/v1 |
| llama.cpp | Local | β | (varies) | localhost:18080/v1 |
| vLLM | Local | β | (varies) | localhost:8000/v1 |
| LocalAI | Local | β | (varies) | localhost:8080/v1 |
| Registry | Heuristic | β | β | β |
Tip
Local providers like Ollama, LM Studio, and others typically don't require an API key environment variable. Siyarix is smart enough to handle them seamlessly!
ποΈ Architecture¶
Understanding how Siyarix routes a user request can help you debug and optimize your configuration. Here's a simplified flow:
User Input β _execute_instruction()
β
βΌ
ProviderManager.select_provider(preferred)
β
βΌ
ββββββββββββββββ
β Provider A β β preferred (user config or auto-detect)
β (primary) β
ββββββββ¬ββββββββ
β
ββββ Success βββββ Return result
β
ββββ Failure βββββ ProviderManager.classify_error()
βββ AUTH β mark credential "dead"
βββ RATE_LIMIT β exponential backoff
βββ TIMEOUT β retry with backoff
βββ CONTEXT_OVERFLOW β compact and retry
βββ MODEL_NOT_FOUND β fallback model
βββ SERVER_ERROR β record_failure with cooldown
β
βΌ
ProviderStateManager.record_failure()
(persistent cooldown across restarts via JSON)
Note
The ProviderStateManager ensures that failure cooldowns persist even if you restart Siyarix, preventing endless retry loops on failing APIs.
ποΈ Provider Manager (Singleton)¶
The ProviderManager is a thread-safe singleton, meaning there's only ever one instance running, and it safely handles requests from multiple threads. It centralizes all provider logic.
π Registration¶
All 25 providers are registered using individual profile files located in src/siyarix/providers/profiles/. This modular approach makes it super easy to add new providers in the future.
pm.register(ProviderProfile(
name="openai",
display_name="OpenAI",
default_model="gpt-5.5",
api_key_env="OPENAI_API_KEY",
base_url="",
supports_streaming=True,
supports_tools=True,
supports_vision=True,
cost_tier=CostTier.HIGH,
provider_type=ProviderType.CLOUD,
priority=10,
docs_url="https://platform.openai.com/docs/models",
))
Each profile defines its supported models using the ModelInfo dataclass, ensuring Siyarix knows exactly what each model is capable of:
ModelInfo(
name="gpt-5.5",
supports_vision=True,
supports_structured_output=True,
supports_function_calling=True,
context_window=1050000,
cost_tier=CostTier.HIGH,
)
π΅οΈ Auto-Detect¶
If you set model_provider = "auto", Siyarix isn't just guessing. ProviderManager.auto_detect_provider() intelligently scans through profiles based on priority, looking for configured API keys or running local endpoints.
def auto_detect_provider(self) -> str | None:
for profile in self.list_profiles():
if resolve_api_key(profile.name, profile.api_key_env):
return profile.name
if profile.provider_type == ProviderType.LOCAL and profile.base_url:
return profile.name
return None
βοΈ Preference Ordering¶
You can control the priority of providers via your settings.toml file. The list_profiles() function respects this configuration.
Providers are sorted first by their index in your priority list, and then by their default priority score.
π¦ Provider Data Models¶
To keep things organized, all data structures representing providers and models are stored in src/siyarix/providers/types.py.
ProviderProfile¶
This defines everything Siyarix needs to know about a provider.
@dataclass
class ProviderProfile:
name: str # Internal identifier (e.g. "openai")
display_name: str # Human-readable name
models: list[ModelInfo] # Supported models with capability metadata
default_model: str # Fallback model for this provider
api_key_env: str # Environment variable for API key
base_url: str # API base URL
supports_streaming: bool # Streaming support
supports_tools: bool # Function/tool calling
supports_vision: bool # Image input support
supports_structured_output: bool # JSON structured output mode
sdk_dependency: str # Optional SDK package requirement
max_tokens: int # Max output tokens
max_context_tokens: int # Max context window size
priority: int # Preference ordering
cost_tier: CostTier # FREE / LOW / MEDIUM / HIGH
provider_type: ProviderType # CLOUD or LOCAL
fallback_models: list[str] # Alternative models to try on failure
docs_url: str # Link to provider documentation
ProviderCredential¶
This keeps track of API keys, URLs, and the current health status of the credential.
@dataclass
class ProviderCredential:
provider: str
api_key: str = ""
base_url: str = ""
status: str = "active" # "active", "dead", or "cooldown"
cooldown_until: float = 0.0
failure_count: int = 0
last_used: float = 0.0
@property
def is_available(self) -> bool:
# True unless dead, in cooldown, or missing both key and URL
ModelInfo¶
Details the specific capabilities of an individual model.
@dataclass
class ModelInfo:
name: str
supports_vision: bool = False
supports_tools: bool = True
supports_structured_output: bool = False
supports_function_calling: bool = True
context_window: int = 8192
cost_tier: CostTier = CostTier.MEDIUM
π·οΈ Enums¶
We use standardized enums to keep categories consistent:
- FailoverReason: AUTH, RATE_LIMIT, BILLING, TIMEOUT, SERVER_ERROR, CONTEXT_OVERFLOW, MODEL_NOT_FOUND, UNKNOWN
- CostTier: FREE, LOW, MEDIUM, HIGH
- ProviderType: CLOUD, LOCAL
ClassifiedError¶
When something goes wrong, it's classified into an actionable format.
@dataclass
class ClassifiedError:
reason: FailoverReason
retryable: bool = True
should_rotate_credential: bool = False
should_fallback: bool = False
should_compress: bool = False
message: str = ""
π‘οΈ Error Classification & Failover¶
Robust error handling is critical when working with external APIs. Siyarix is designed to handle hiccups gracefully.
Classification Strategy¶
When an API call fails, ProviderManager.classify_error() kicks into action using a multi-pass strategy:
1. HTTP status code: Quickly maps standard errors (like 429 for rate limits) to a FailoverReason.
2. Error message text: Scans the error response for keywords (e.g., "rate limit", "timeout", "401").
3. Credential rotation hints: Detects auth or billing failures to trigger credential rotation.
Failover Reasons & Actions¶
| Reason | HTTP Status | Retryable | Action |
|---|---|---|---|
AUTH |
401, 403 | No | Mark credential dead, rotate |
RATE_LIMIT |
429 | Yes | Exponential backoff (10sβ20sβ40sβ...β3600s) |
BILLING |
402 | No | Mark credential dead |
TIMEOUT |
408 | Yes | Backoff (5sβ10sβ...β300s) |
SERVER_ERROR |
500, 502, 503, 504, 529 | Yes | Backoff (5sβ10sβ...β300s) |
CONTEXT_OVERFLOW |
β | Yes | Compact history, retry |
MODEL_NOT_FOUND |
404 | No | Fall back to alternative model |
UNKNOWN |
β | No | Propagate error |
Warning
If a credential fails due to AUTH or BILLING issues, Siyarix marks it as "dead" to prevent burning through retries and instantly pivots to a fallback provider.
Failure Recording (Circuit Breaking)¶
ProviderManager.record_failure() is Siyarix's built-in circuit breaker:
- AUTH/BILLING: Immediate halt. No further attempts with this credential.
- RATE_LIMIT: Calculates an exponential backoff time (up to an hour) to let the API recover.
- TIMEOUT/SERVER_ERROR: Uses a shorter backoff curve (up to 5 minutes) since these are often temporary glitches.
Per-Session "Skip-Known-Bad" Cache¶
Nobody likes waiting for the same failing model over and over. ProviderStateManager keeps a short-term memory (5 minutes) of failing (provider, model) combos to skip them entirely.
state_manager.mark_skip_candidate(session_id, "openai", "gpt-5.5")
state_manager.is_candidate_skipped(session_id, "openai", "gpt-5.5") # True for 5 min
Availability Checks¶
Need to know who's ready to work?
pm.get_available_providers(preferred=["openai", "gemini"])
# Returns only non-cooldown providers, with preferred ones at the top of the list
πΎ Provider State Manager¶
API state shouldn't be lost when you restart the app. The ProviderStateManager persists cooldown and failure states to a lightweight JSON file (provider_state.json).
This ensures that if you hit an hour-long rate limit, restarting Siyarix won't accidentally hammer the API again. It tracks:
- disabled: Timestamps for when cooldowns expire.
- failure_counts: How many times a provider has failed consecutively.
- last_fail_time: When the most recent failure happened.
state_manager.record_failure(provider, reason) # Saves to JSON automatically
state_manager.record_success(provider) # Clears cooldown status
state_manager.is_disabled(provider) # Checks if still in cooldown
state_manager.cooldown_remaining(provider) # Time left until ready
π Credential Resolution¶
Finding the right API key is handled by resolve_api_key(). It uses a smart, three-tier fallback approach:
- Credential Store: Checks the secure
CredentialStore(CredentialStore.retrieve(provider, "api_key")). - Environment Variable: Looks for standard env vars like
OPENAI_API_KEY. - Empty String: Allows local providers (like Ollama) to proceed without a key.
def resolve_api_key(provider: str, env_var: str | None = None) -> str | None:
# 1. Try credential store
# 2. Try environment variable
# 3. Return None
πͺͺ Model ID Normalization¶
Model names change, and standardizing them is crucial. model_aliases.py ensures that no matter what the user types, Siyarix knows the correct internal name.
from siyarix.model_aliases import normalize_model_id, resolve_alias, list_aliases, register_alias
model = normalize_model_id("anthropic", "claude-opus-4.8") # β "claude-opus-4-8"
model = normalize_model_id("gemini", "gemini-3-pro") # β "gemini-3.1-pro-preview"
model = normalize_model_id("deepseek", "deepseek-v4") # β "deepseek-v4-flash"
π¦ Ollama Utilities¶
Working with local models should be frictionless. ollama_utils.py provides helpers to ensure Ollama is running when you need it.
from siyarix.providers.ollama_utils import ensure_ollama_running
# Launches Ollama in background if configured and not already running
ensure_ollama_running()
Tip
Siyarix can automatically launch Ollama if model_provider is set to "ollama" or if _start_ollama_on_launch is enabled in your settings!
π― Provider Selection¶
Need to ask an AI a question? Here's how Siyarix decides who gets the job.
# Auto-detect the first available provider
provider, model = pm.select_provider(preferred=None)
# Explicitly request a specific provider
provider, model = pm.select_provider(preferred="openai")
Capability-Based Filtering¶
You can also ask Siyarix for providers that meet specific criteria:
# Get all cloud providers supporting function calling
providers = pm.get_providers_by_capability(function_calling=True, local=False, free=False)
# Get only free-tier local providers
free_local = pm.get_providers_by_capability(free=True, local=True)
# Get vision-capable providers
vision_providers = pm.get_providers_by_capability(vision=True)
π Usage Tracking¶
Keep an eye on your API costs! The UsageTracker (found in usage.py) monitors token consumption and estimates costs per provider.
from siyarix.providers import UsageTracker
tracker = UsageTracker()
tracker.record_call("openai", "gpt-5.5", input_tokens=500, output_tokens=150, cost_tier=CostTier.HIGH)
print(tracker.summary())
# LLM calls: 1 | Tokens: 500β 150β | Est. cost: $0.0086
Info
Usage statistics are persisted to JSON, allowing you to track costs and token limits across multiple sessions.
π©Ί Health Check¶
Wondering if your providers are online? Run the health check command:
This command pings all configured providers and reports back on their availability, latency, and any recent errors.
π Provider Statistics¶
For programmatic access to provider health:
stats = pm.stats()
# {
# "total_providers": 25,
# "credentials": {"openai": 1, "anthropic": 0, ...},
# "error_counts": {"openai": 3},
# }
π Related Modules¶
Want to dive deeper into the code? Here is where everything lives:
| Module | Path | Purpose |
|---|---|---|
ProviderManager |
src/siyarix/providers/manager.py |
Singleton provider registry, failover, ensemble, stats |
ProviderStateManager |
src/siyarix/providers/state.py |
Persistent cooldown state (JSON-based), skip-known-bad cache |
UsageTracker |
src/siyarix/providers/usage.py |
Token usage and cost estimation |
ProviderProfile / ModelInfo |
src/siyarix/providers/types.py |
Data models for provider metadata |
openai_compat.py |
src/siyarix/chat/openai_compat.py |
Universal OpenAI-compatible adapter |
normalize_model_id |
src/siyarix/model_aliases.py |
Model ID normalization and alias resolution |
ensure_ollama_running |
src/siyarix/providers/ollama_utils.py |
Ollama background launcher |
profiles/ |
src/siyarix/providers/profiles/ |
25 individual provider profiles |