π§ Memory & State Management¶
Note
π Hey there! Siyarix is a personal passion project built by a single developer that is growing and under active development. Some of the architectural components and features described on this page might currently be Planned, Work in Progress, or basic implementations. Stay tuned as it evolves! π
Welcome to the heart of Siyarix! This document outlines our multi-layered memory and state management system. We designed this architecture to flawlessly handle everything from lightning-fast in-memory processing to reliable SQLite persistence and portable file-based exports.
At a high level, the system comprises several specialized components: - KnowledgeGraph: Connects the dots on infrastructure relationships. - MemoryManager: Powers our semantic memory using embeddings. - ChatSession: Handles conversational history with powerful branching capabilities. - SessionKernel: Persists state across sessions using JSON/JSONL. - CacheManager: Keeps things snappy with LRU (Least Recently Used) and TTL (Time-To-Live) caching. - Context Manager: Carefully optimizes what the LLM sees to maximize context window efficiency. - Continuous Learning System (CLS): Learns new skills dynamically while strictly preserving privacy.
π₯ Memory Layers¶
Siyarix categorizes memory into three distinct, robust layers.
Note
This layered approach ensures that fast, ephemeral data lives in RAM, critical operations persist safely to disk, and shareable insights can be effortlessly exported.
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β‘ In-Memory (Session Runtime) β
β β
β ββββββββββββββββ ββββββββββββββββββββ ββββββββββββββββ β
β β Knowledge β β MemoryManager β β Context β β
β β Graph β β (semantic memory β β Manager β β
β β (entities, β β + embeddings) β β (window β β
β β relations) β β β β build/ β β
β β β β β β compress) β β
β ββββββββββββββββ ββββββββββββββββββββ ββββββββββββββββ β
β β
β ββββββββββββββββ ββββββββββββββββββββ ββββββββββββββββ β
β β CacheManager β β Conversation β β Continuous β β
β β (LRU + TTL) β β History (deque) β β Learning β β
β β β β Session Messages β β System (CLS) β β
β β β β maxlen=300) β β (skill cache)β β
β ββββββββββββββββ ββββββββββββββββββββ ββββββββββββββββ β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β πΎ SQLite (Persistent) β
β β
β ββββββββββββββββ ββββββββββββββββββββ ββββββββββββββββ β
β β OfflineStore β β Continuous β β ProviderStateβ β
β β (scans, β β Learning System β β Manager β β
β β findings, β β .db) β β (cooldown, β β
β β plans) β β β β failures) β β
β ββββββββββββββββ ββββββββββββββββββββ ββββββββββββββββ β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β π File-Based (Export/Import) β
β β
β ββββββββββββββββ ββββββββββββββββββββ ββββββββββββββββ β
β β Reports β β ChatSession β β Knowledge β β
β β (MD/HTML/ β β Exports β β Graph JSON β β
β β JSON/SARIF) β β (JSONL tree fmt, β β Export β β
β β β β PDF, TXT, MD) β β β β
β ββββββββββββββββ ββββββββββββββββββββ ββββββββββββββββ β
β β
β ββββββββββββββββ ββββββββββββββββββββ β
β β SessionKernelβ β Tool Failure β β
β β (JSON files) β β State β β
β β β β (tool_failures β β
β β β β .json) β β
β ββββββββββββββββ ββββββββββββββββββββ β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
πΈοΈ 1. KnowledgeGraph¶
Located at siyarix/knowledge_graph.py, the KnowledgeGraph is a dynamic, in-memory directed graph. It maps out all discovered infrastructure entities and their intricate relationships.
Tip
Think of this as the "brain's map" of the target environment. It allows Siyarix to understand that a specific vulnerability lives on a service, which in turn runs on a particular host.
π’ Node Types¶
| Node | Attributes | Example |
|---|---|---|
HOST |
IP, hostname, OS, MAC | 10.0.0.1 |
PORT |
Number, protocol, state | 80/tcp open |
SERVICE |
Name, version, banner | Apache 2.4.41 |
VULNERABILITY |
CVE ID, severity, CVSS | CVE-2024-1234 |
DOMAIN |
FQDN, registrar, DNS | example.com |
CREDENTIAL |
Username, type, hash | admin:$2y$10$... |
FINDING |
Tool, description, ref | Nmap finding |
π Edge Types¶
| Edge | Source β Target | Meaning |
|---|---|---|
RUNS_ON |
Service β Host | Service runs on host |
HAS_PORT |
Host β Port | Host has open port |
HAS_VULN |
Service β Vulnerability | Service has vulnerability |
RESOLVES_TO |
Domain β Host | Domain resolves to IP |
USES_CRED |
Service β Credential | Service uses credential |
RELATED_TO |
Finding β Finding | Related findings |
π οΈ Key Operations¶
- Pathfinding: BFS (Breadth-First Search) to find the shortest path between any two entities.
- Advanced Querying: Extract subgraphs by node type, attribute, or relationship.
- Real-time Parsing: Instantly inserts new nodes and edges directly from tool parser outputs.
- Persistence: Easily export/import state via JSON (
save_json/load_json) so no context is lost between sessions.
π§ 2. MemoryManager¶
Located at siyarix/memory.py, the MemoryManager handles our semantic, long-term memory utilizing vector embeddings.
Info
Semantic memory empowers Siyarix to recall past learnings contextually, rather than relying on exact keyword matches.
π‘ Core Methods¶
memory = MemoryManager()
# Store a new memory with rich metadata
await memory.store(
content="Host 10.0.0.1 has Apache 2.4.41 running on port 80",
metadata={"source": "nmap", "session_id": "sess-123"}
)
# Search for related concepts
similar = await memory.search_similar("Apache versions", top_k=5)
# Grab all relevant context for a specific target
context = await memory.get_context(target="10.0.0.1")
| Method | Purpose |
|---|---|
store(content, metadata) |
Saves a new memory entry into the semantic vault. |
search_similar(query, top_k) |
Uses embeddings to find the most conceptually similar memories. |
get_context(target) |
Retrieves a consolidated background context for a given target. |
ποΈ 3. Context Manager¶
Located at siyarix/context.py, the Context Manager is the gatekeeper for the LLM. It intelligently builds, compresses, and optimizes the context window so the LLM gets precisely what it needs without overflowing its token budget.
context = ContextManager(memory=memory_manager)
# Log conversation history
context.add_history("User message", "user")
context.add_history("Assistant response", "assistant")
# Build the perfectly sized context payload
history = context.get_history()
context = context.build_context(
conversation_history=history,
knowledge_subgraph=relevant_entities,
session_state={"mode": "autonomous", "target": "10.0.0.1"},
tool_availability=available_tools,
memory_entries=relevant_memories,
max_tokens=8192,
)
ποΈ Compression via CompactionEngine¶
When context gets too large, the CompactionEngine (siyarix/compaction.py) steps in to aggressively yet safely compress the payload.
Warning
Failing to compress context effectively can lead to LLM truncation errors and hallucinations. The CompactionEngine prevents this.
compactor = CompactionEngine()
tokens = compactor.analyze_tokens(raw_context)
compressed = compactor.compress_context(raw_context, target_tokens=4096)
| Strategy | Description | Token Reduction |
|---|---|---|
| Truncation | Drops the oldest, least relevant conversation turns. | 20β40% |
| Summarization | Uses the LLM to summarize older history blocks. | 40β60% |
| KG Pruning | Retains only high-severity or immediately related graph entities. | 30β50% |
| Memory Prioritization | Filters out memories falling below a calculated importance threshold. | 50β70% |
| Deduplication | Strips out redundant tool outputs. | 10β20% |
π¬ 4. ChatSession¶
Located at siyarix/chat/session.py, the ChatSession manages conversation state. It's not just a flat listβit natively supports complex branching via a JSONL tree structure.
πΏ Branching Model¶
Ever wanted to explore a different train of thought without breaking your current conversation? Siyarix supports conversation forks!
Session Root
βββ Branch A (main thread)
β βββ Message 1
β βββ Message 2
β β βββ Branch B (forked from message 2)
β β βββ Message 3
β β βββ Message 4
β βββ Message 5
βββ Branch C (forked from root)
βββ Message 6
βοΈ Session Configuration¶
- Retains a rolling window of history (
maxlen=300). - Messages are robustly tracked using unique
id,parent,role,content,timestamp, andbranchidentifiers.
π€ Export Formats¶
Exporting a session is as simple as calling ChatSession.export().
| Format | Description |
|---|---|
json |
Standard JSON array of messages. |
jsonl |
Advanced JSONL tree format (perfect for reloading). |
pdf |
A polished PDF document for reporting. |
txt |
A simple, raw plain-text transcript. |
md |
Markdown transcript for beautiful rendering. |
html |
An interactive HTML document. |
ποΈ 5. SessionKernel¶
Located at siyarix/compat.py, the SessionKernel is the master controller for overarching session state and operational tracking.
kernel = SessionKernel()
session = kernel.start(
objective="Scan target network",
scope="10.0.0.0/24",
identity="operator-1",
)
# Track tactical operations
op = kernel.add_operation(session, "scan 10.0.0.1", "scan", "medium")
kernel.update_operation(session, op.operation_id, state="completed")
# Persist and Restore
path = kernel.save(session)
restored = kernel.load(session_id)
Note
Unlike other modules that use SQLite, the SessionKernel utilizes JSON-based persistence to easily track operation cards, state, mode, risk tier, and related artifacts.
- Supports distinct persistence tiers:
EPHEMERAL,WORKSPACE, andORG_SHARED.
β±οΈ 6. CacheManager¶
Located at siyarix/cache_manager.py, the CacheManager speeds up operations by temporarily holding onto frequently accessed data.
cache = CacheManager(
max_size=1000,
ttl=300,
persist_path="~/.siyarix/cache.db"
)
# Easily monitor cache health
stats = cache.get_stats()
# Result: CacheStats(hits=450, misses=30, hit_rate=0.94, size=200, evictions=15)
- Implements LRU (Least Recently Used) paired with strict TTL (Time-To-Live).
- Optionally persists to disk to survive reboots.
π 7. Continuous Learning System (CLS)¶
Located at siyarix/learning_system.py, the Continuous Learning System is how Siyarix gets smarter over time. It organically acquires new skills by observing operator behavior.
Danger
Privacy First Guarantee: Real targets are NEVER stored. Every hostname, IP, URL, email, or hash is strictly replaced with a {target} placeholder before any data is saved.
ποΈ Key Design Principles¶
- Separate Store: Learning data is completely isolated inside
learning_store.db. - Zero Dependencies: Relies purely on the Python standard library, employing a BM25-style Jaccard similarity engine over NLP token sets.
- Bayesian Confidence: Skills are rated using a Bayesian-smoothed confidence formula that factors in time decay and operational complexity.
π¦ Data Models¶
@dataclass
class LearnedStep:
tool: str
command_template: str # E.g., "nmap -sS {target}"
description: str
args: dict
@dataclass
class LearnedSkill:
skill_id: str
intent_pattern: str # The anonymised intent
steps: list[LearnedStep]
confidence: float # 0.0 to 1.0 (Bayesian-smoothed)
usage_count: int
success_count: int
tokens: list[str] # NLP tokens for rapid similarity matching
source: str # Origin: 'llm', 'offline', or 'inferred'
π The Learning Flow¶
- Observe: Functions like
observe_llm_action()passively watch the execution. - Anonymize: Scour and scrub the data, replacing real endpoints with
{target}. - Match: Run multi-tier similarity checks (β₯0.60 is strong, <0.35 implies a brand new skill).
- Learn: Adjust confidence, extract parameters, and merge overlapping steps.
- Inject: High-confidence skills get promoted and can be executed automatically.
- Maintain: Constantly prune, decay old skills, and merge redundancies.
π Integration¶
- Integrated Mode: Skills exceeding 80% confidence trigger automatic execution before the LLM is even consulted.
- Offline Mode: Learned skills dramatically enhance the heuristic planner.
- Synonyms: Maps human keywords to specific tools to beef up the NLP engine.
β»οΈ State Lifecycle¶
Ever wonder what happens from the moment Siyarix boots up until it safely shuts down?
π Session Start
β
βββ Load config from ~/.siyarix/settings.toml
βββ Initialize KnowledgeGraph (empty or restore from JSON)
βββ Initialize MemoryManager (load persisted embeddings)
βββ Initialize CacheManager (load disk cache)
βββ Initialize Continuous Learning System (load skill library)
βββ Open OfflineStore (SQLite WAL)
βββ Open ProviderStateManager (JSON file)
β
βΌ
π₯ Session Active
β
βββ KnowledgeGraph populated from tool outputs (real-time)
βββ MemoryManager updated from tool outputs
βββ Conversation history appended (deque maxlen=300)
βββ Continuous Learning System passively observes execution
βββ Findings continuously stored in OfflineStore
βββ Commands meticulously tracked via SessionKernel
βββ Provider state tracked (cooldowns, failures, API costs)
βββ Cache populated/evicted via LRU + TTL strategies
β
βΌ
π Session End
β
βββ Save KnowledgeGraph to JSON (if configured)
βββ Persist MemoryManager embeddings safely to disk
βββ Save comprehensive session via SessionKernel
βββ Flush CacheManager memory to disk
βββ Generate polished post-session reports
βββ Safely close all SQLite connections
βββ Trigger CLS maintenance (prune, decay, merge)
βββ Clear ephemeral in-memory state gracefully
π§© Integration Points¶
Hereβs a quick-reference cheat sheet for how everything connects:
| Component | Role |
|---|---|
| Context Manager | Curates and compresses the LLM context from the KG, memory, and history. |
| MemoryManager | Manages vector-based semantic memory. |
| KnowledgeGraph | Maps real-time entity relationships. |
| ChatSession | Houses branching conversation trees in JSONL. |
| SessionKernel | Masters JSON-based session persistence and restoration. |
| CacheManager | Disk-backed LRU + TTL caching. |
| OfflineStore | Persists offline scans and findings to SQLite. |
| OfflineQueue | Queues requests for disconnected execution. |
| CompactionEngine | Trims context payload to respect LLM token budgets. |
| Continuous Learning System | Siyarix's privacy-first evolving skill library. |
| ProviderStateManager | Tracks API provider health, cooldowns, and failures. |
| ToolCallTracker | Remembers tool failures to avoid repeated mistakes. |
| EventBus | Broadcasts state changes globally (e.g., kg.updated, cache.evicted). |