
AI Trainer (ai-trainer) — Workflow Guide

File Information

| Property | Value |
|---|---|
| Binary Name | ai-trainer |
| Version | 9.0.1 |
| File Size | 2.8 MB |
| Author | Warith Al Maawali (warith@digi77.com) |
| License | LicenseRef-Kodachi-SAN-1.0 |
| Category | AI & Intelligence |
| Description | AI model training and validation for Kodachi OS command intelligence |

SHA256 Checksum

59d5be5a3db52848a9881cff9274ef6e41a9480f791a7533174890a4e6ce125f

What ai-trainer Does

ai-trainer manages the machine learning model lifecycle — downloading ONNX models, training from command metadata, validating accuracy, and managing versioned snapshots. It produces the models that ai-cmd uses for semantic command matching.

Key Capabilities

| Feature | Description |
|---|---|
| Model Download | Fetch the ONNX model and tokenizer for neural embeddings |
| Full Training | Train from a command metadata JSON file |
| Incremental Training | Add new commands without full retraining |
| Validation | Test model accuracy against a test dataset |
| Snapshots | Save and restore versioned model states |
| Export | Export the model to JSON for backup/deployment |

Scenario 1: First-Time Model Setup — Complete Engine Installation

Complete end-to-end setup, from zero to a working trained model with the ONNX neural engine.

# Step 1: Download the ONNX model and tokenizer
sudo ai-trainer download-model

# Step 2: Verify download was successful
sudo ai-trainer status

# Step 3: Train the model from command metadata
sudo ai-trainer train --data ./data/training-data.json

# Step 4: Validate model accuracy
sudo ai-trainer validate --test-data ./data/test-commands.json

# Step 5: Create an initial snapshot for rollback safety
sudo ai-trainer snapshot -v 1.0.0

# Step 6: Test it with ai-cmd using the ONNX engine (Tier 2)
ai-cmd query "check network connectivity" --engine onnx --dry-run
ai-cmd query "rotate tor circuit" --engine onnx --dry-run

# Step 7: If results look good, export for backup
sudo ai-trainer export -o model_export.json --format full

After this, ai-cmd automatically uses the trained ONNX model (Tier 2) for semantic matching instead of TF-IDF alone (Tier 1). The neural engine handles natural-language queries far more accurately.

What Tier 2 enables:

- Semantic similarity matching (understands "check network" = "test connectivity") — see the quick check below
- Context-aware intent recognition
- Better handling of synonyms and variations
- Higher accuracy for complex queries
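
To see the semantic matching for yourself, run two paraphrases of the same intent and confirm they resolve to the same command. This assumes the dry-run output shows which command was matched; the exact output format may differ by version.

# Paraphrase check: both queries should resolve to the same command
# (health-control net-check) if semantic matching is working
ai-cmd query "check network connectivity" --engine onnx --dry-run
ai-cmd query "test internet connection" --engine onnx --dry-run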


Scenario 2: New Binary Installed — Updating the Model

When a new Kodachi binary is deployed, use ai-discovery + ai-trainer to make its commands available via ai-cmd.

# Step 1: ai-discovery detects new binary automatically (if daemon running)
ai-discovery status --verbose

# Or force a reindex
sudo ai-discovery reindex

# Step 2: Create a pre-update snapshot for safety
sudo ai-trainer snapshot -v "pre-update-$(date +%Y%m%d)"

# Step 3: Add new commands incrementally (preserves existing model)
sudo ai-trainer incremental --new-data ./data/training-data.json

# Step 4: Validate the updated model hasn't degraded
sudo ai-trainer validate --test-data ./data/test-commands.json --threshold 0.90

# Step 5: Test new commands work via ai-cmd
ai-cmd preview "use the new service" --alternatives 5
ai-cmd query "new service command" --dry-run

# Step 6: If validation passes, create post-update snapshot
sudo ai-trainer snapshot -v "post-update-$(date +%Y%m%d)"

Why incremental? Full retraining takes minutes. Incremental training takes seconds and preserves the existing model knowledge while adding new commands.
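
To measure the difference on your own dataset, a rough comparison with the shell's time builtin (durations vary with dataset size and hardware):

# Compare full retraining against an incremental update
time sudo ai-trainer train --data ./data/training-data.json
time sudo ai-trainer incremental --new-data ./data/training-data.json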

Cross-binary workflow:

1. ai-discovery — detects the new binary and extracts command metadata
2. ai-trainer — incrementally updates the model with the new commands
3. ai-cmd — immediately queries the new commands with neural matching
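
Chained together, the update becomes a single script. A minimal sketch, assuming each tool exits nonzero on failure so set -e aborts the pipeline at the first error:

#!/usr/bin/env bash
# Sketch: one-shot model update after a new binary is deployed.
# Assumes each command exits nonzero on failure (verify on your version).
set -euo pipefail
sudo ai-discovery reindex
sudo ai-trainer snapshot -v "pre-update-$(date +%Y%m%d)"
sudo ai-trainer incremental --new-data ./data/training-data.json
sudo ai-trainer validate --test-data ./data/test-commands.json --threshold 0.90
sudo ai-trainer snapshot -v "post-update-$(date +%Y%m%d)"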


Scenario 3: Monthly Retraining Cycle

Run a full retraining cycle monthly to incorporate all accumulated feedback and new commands.

# Step 1: Create a pre-retraining snapshot
sudo ai-trainer snapshot -v "monthly-$(date +%Y-%m)-pre"

# Step 2: Check current accuracy baseline with ai-learner
sudo ai-learner analyze --period last-30-days --metric accuracy

# Step 3: Run full retraining with complete dataset
sudo ai-trainer train --data ./data/training-data.json

# Step 4: Validate against test dataset with 90% threshold
sudo ai-trainer validate --test-data ./data/test-commands.json --threshold 0.90

# Step 5: Check post-retraining accuracy against the Step 2 baseline
sudo ai-learner analyze --period last-7-days --metric accuracy

# Step 6: If validation passes, create post-retraining snapshot
sudo ai-trainer snapshot -v "monthly-$(date +%Y-%m)-post"

# Step 7: Export the updated model for backup
sudo ai-trainer export -o model_monthly_$(date +%Y%m).json --format full

# Step 8: Verify ai-cmd works with the retrained model
ai-cmd query "check dns leaks" --engine onnx --dry-run
ai-cmd query "rotate tor" --engine onnx --dry-run

Automate this with ai-scheduler:

# Schedule monthly retraining on the 1st at midnight
ai-scheduler add --name "monthly-retrain" \
  --command "ai-trainer train --data ./data/training-data.json" \
  --cron "0 0 1 * *"

# Schedule post-training validation
ai-scheduler add --name "monthly-validate" \
  --command "ai-trainer validate --test-data ./data/test-commands.json --threshold 0.90" \
  --cron "0 1 1 * *"

Cross-binary workflow:

1. ai-trainer — snapshots, trains, and validates the model
2. ai-learner — analyzes accuracy trends before and after
3. ai-cmd — verifies query performance with the updated model
4. ai-scheduler — automates the monthly execution
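
Steps 4 and 6 can also be combined so the post-retraining snapshot is created only when validation passes. A sketch, assuming validate exits nonzero when accuracy falls below the threshold:

# Gate the post-retraining snapshot on validation success
if sudo ai-trainer validate --test-data ./data/test-commands.json --threshold 0.90; then
  sudo ai-trainer snapshot -v "monthly-$(date +%Y-%m)-post"
else
  echo "Validation below threshold; keeping the pre-retraining snapshot" >&2
fi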


Scenario 4: Model Rollback After Bad Training

If a training run degrades accuracy, preserve the broken state for diagnosis, then recover by retraining from a known-good dataset, using snapshots as reference points.

# Step 1: Detect the problem — accuracy dropped after training
sudo ai-learner analyze --period last-7-days --metric accuracy
# Shows: 65% accuracy (was 85%)

# Step 2: See available snapshots
sudo ai-trainer list-snapshots

# Output:
# v1.0.0 (2026-01-15)
# monthly-2026-01-post (2026-02-01)
# pre-update-20260209 (2026-02-09)   <-- last known good

# Step 3: Create a "broken" snapshot for diagnosis later
sudo ai-trainer snapshot -v "broken-$(date +%Y%m%d)"

# Step 4: Backup current database before recovery
ai-admin db backup --output ./backup/before-rollback.db

# Step 5: Retrain from the original clean dataset
sudo ai-trainer train --data ./data/training-data.json

# Step 6: Validate recovered accuracy
sudo ai-trainer validate --test-data ./data/test-commands.json --threshold 0.85

# Step 7: Verify ai-cmd works correctly
ai-cmd query "check network" --engine onnx --dry-run
ai-cmd preview "rotate tor" --alternatives 5

# Step 8: If recovered, create recovery snapshot
sudo ai-trainer snapshot -v "recovery-$(date +%Y%m%d)"

Cross-binary workflow:

1. ai-learner — detects the accuracy degradation
2. ai-trainer — creates a diagnostic snapshot, retrains, and validates
3. ai-admin — backs up the database state for recovery
4. ai-cmd — verifies query performance post-recovery

Diagnosis tips:

- Compare the "broken" snapshot export with a "good" one (see the diff sketch below)
- Check the training data for corrupted or malformed entries
- Review the incremental training logs for anomalies
- Test with a smaller dataset to isolate problematic commands
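
For the first tip, a minimal sketch: export once from the broken state and once after recovery, then diff the key-sorted JSON. This assumes --format full output is stable enough to compare across exports.

# jq -S sorts keys so the diff only shows real differences
sudo ai-trainer export -o broken_export.json --format full      # while still broken
# ... run the recovery steps above ...
sudo ai-trainer export -o recovered_export.json --format full   # after recovery
diff <(jq -S . broken_export.json) <(jq -S . recovered_export.json) | head -50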


Scenario 5: Training Data Management

How to prepare and maintain training data quality for optimal model performance.

Training Data Format

Training data lives in ./data/training-data.json:

{
  "metadata": {
    "version": "1.0.0",
    "generated_at": "2026-02-09T12:00:00Z",
    "total_commands": 150,
    "categories": ["network", "security", "system", "privacy"]
  },
  "commands": [
    {
      "command_id": "health-control_net-check",
      "service": "health-control",
      "command": "net-check",
      "description": "Check network connectivity and status",
      "category": "network",
      "examples": [
        "check network",
        "test internet connection",
        "is my network working",
        "verify connectivity"
      ],
      "keywords": ["network", "connectivity", "internet", "connection"],
      "typical_queries": [
        "check network status",
        "verify internet connection",
        "test network connectivity"
      ]
    },
    {
      "command_id": "tor-switch_status",
      "service": "tor-switch",
      "command": "status",
      "description": "Show current Tor network status",
      "category": "privacy",
      "examples": [
        "tor status",
        "check tor connection",
        "is tor running",
        "show tor state"
      ],
      "keywords": ["tor", "anonymity", "privacy", "status"],
      "typical_queries": [
        "check tor status",
        "verify tor connection",
        "show tor network state"
      ]
    }
  ]
}
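
Before training on edited data, a quick structural check with jq can catch missing fields. The field names are taken from the schema above; jq -e sets a nonzero exit code when the check fails, which is useful in scripts.

# Verify every command entry carries the fields shown in the schema above
jq -e '.commands | all(has("command_id") and has("service") and has("command")
                       and has("examples") and has("keywords"))' \
  ./data/training-data.json && echo "training data looks structurally sound"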

Adding New Command Patterns

# Step 1: Edit training data to add new patterns
nano ./data/training-data.json

# Step 2: Validate JSON format
jq . ./data/training-data.json

# Step 3: Update model incrementally (fast, preserves existing knowledge)
sudo ai-trainer incremental --new-data ./data/training-data.json

# Step 4: Validate the update
sudo ai-trainer validate --test-data ./data/test-commands.json

# Step 5: Test via ai-cmd with preview mode
ai-cmd preview "new command phrase" --alternatives 5

# Step 6: If satisfied, create snapshot
sudo ai-trainer snapshot -v "added-new-patterns-$(date +%Y%m%d)"

Data Quality Tips

DO:

- Keep examples concise (5-10 words each)
- Include diverse phrasing variations for each command (4-6 examples minimum)
- Use natural language users would actually type
- Balance training data across all categories (checked by the jq sketch below)
- Add new commands as they're discovered by ai-discovery

DON'T:

- Overlap intent examples across different commands (it confuses the model)
- Use overly technical jargon (users type casual phrases)
- Include deprecated or obsolete commands
- Create imbalanced datasets (e.g., 100 network commands but only 5 privacy commands)
- Use identical phrasing for different commands
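
Two of these checks are easy to script with jq, using the field names from the training data schema above:

# Count commands per category to spot imbalance
jq -r '.commands[].category' ./data/training-data.json | sort | uniq -c | sort -rn

# Flag commands with fewer than 4 examples
jq -r '.commands[] | select((.examples | length) < 4) | .command_id' ./data/training-data.json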

Test Data Format

Test data (./data/test-commands.json) validates model accuracy:

{
  "test_queries": [
    {
      "query": "check my internet connection",
      "expected_command_id": "health-control_net-check",
      "expected_service": "health-control"
    },
    {
      "query": "is tor working",
      "expected_command_id": "tor-switch_status",
      "expected_service": "tor-switch"
    }
  ]
}
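
Every expected_command_id in the test set must correspond to a command_id in the training data, or validation will count that query as a miss. A quick consistency check built from the two schemas above:

# Prints test-set command IDs that are missing from the training data
comm -23 \
  <(jq -r '.test_queries[].expected_command_id' ./data/test-commands.json | sort -u) \
  <(jq -r '.commands[].command_id' ./data/training-data.json | sort -u)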

Validation workflow:

# Test with different accuracy thresholds
sudo ai-trainer validate --test-data ./data/test-commands.json --threshold 0.85
sudo ai-trainer validate --test-data ./data/test-commands.json --threshold 0.90
sudo ai-trainer validate --test-data ./data/test-commands.json --threshold 0.95

# Get detailed JSON output for analysis
sudo ai-trainer validate --test-data ./data/test-commands.json --json | jq .
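
The same threshold sweep can be run in one loop; this assumes validate exits nonzero when accuracy falls below the threshold (verify against your version's behavior):

# Record pass/fail at each threshold level
for t in 0.85 0.90 0.95; do
  if sudo ai-trainer validate --test-data ./data/test-commands.json --threshold "$t" > /dev/null; then
    echo "threshold $t: PASS"
  else
    echo "threshold $t: FAIL"
  fi
done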


Scenario 6: Engine Download and Model Management

Comprehensive ONNX model download and management workflow.

Initial Model Download

# Step 1: Download ONNX model and tokenizer to default location
sudo ai-trainer download-model

# Output:
# Downloading ONNX model (all-MiniLM-L6-v2)...
# Downloading tokenizer files...
# Model saved to: /opt/kodachi/ai-models/
# Download complete.

# Step 2: Check download status and model info
sudo ai-trainer status

# Output shows:
# - Model location
# - Model version
# - Tokenizer files
# - Training status
# - Last validation metrics

Custom Download Location

# Download to custom directory (useful for network shares or custom setups)
sudo ai-trainer download-model --output-dir /mnt/models/kodachi

# Note: You'll need to configure ai-cmd to use this custom location
# via environment variable or config file
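# For example, something like the following — the variable name here is
# hypothetical; check the ai-cmd documentation for the actual setting:
#   export KODACHI_AI_MODEL_DIR=/mnt/models/kodachi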

Force Re-download (Recovery)

# If model files are corrupted or outdated, force fresh download
sudo ai-trainer download-model --force

# With custom location and force
sudo ai-trainer download-model --output-dir ./custom-models --force

# With JSON output for scripting
sudo ai-trainer download-model --force --json | jq .

Complete Setup Workflow with Custom Location

# Step 1: Download to custom directory
sudo ai-trainer download-model --output-dir /opt/kodachi-custom/models

# Step 2: Verify download with JSON output
sudo ai-trainer download-model --output-dir /opt/kodachi-custom/models --json | jq .

# Step 3: Check status
sudo ai-trainer status --json | jq .

# Step 4: Train with the downloaded model
sudo ai-trainer train --data ./data/training-data.json

# Step 5: Validate training results
sudo ai-trainer validate --test-data ./data/test-commands.json --threshold 0.90

# Step 6: Export model for backup/deployment
sudo ai-trainer export -o model_backup.json --format full

# Step 7: Test with ai-cmd using ONNX engine
ai-cmd query "check network" --engine onnx --dry-run

What the Downloaded Model Enables

Before download (TF-IDF only):

ai-cmd query "check network"
# Uses: Tier 1 (TF-IDF keyword matching)
# Accuracy: 70-75%

After download + training (ONNX neural):

ai-cmd query "check network" --engine onnx
# Uses: Tier 2 (Neural semantic embeddings)
# Accuracy: 90-95%
# Understands: "test connectivity" = "check network"

Model Management Best Practices

# 1. Always verify download success
sudo ai-trainer download-model
sudo ai-trainer status  # Confirm model files exist

# 2. Create backup after initial download
sudo ai-trainer export -o model_fresh_download.json --format full

# 3. Snapshot after successful training
sudo ai-trainer train --data ./data/training-data.json
sudo ai-trainer snapshot -v "post-download-train"

# 4. Periodic re-validation
sudo ai-trainer validate --test-data ./data/test-commands.json --threshold 0.90

# 5. Export for deployment to other systems
sudo ai-trainer export -o model_production_$(date +%Y%m%d).json --format full

Troubleshooting Downloads

Download fails or times out:

# Check internet connectivity
curl -I https://huggingface.co

# Try force re-download with verbose output
sudo ai-trainer download-model --force --json | tee download.log

# Check disk space
df -h /opt/kodachi/ai-models/

Model files corrupted:

# Force fresh download
sudo ai-trainer download-model --force

# Verify integrity
sudo ai-trainer status

# Test with simple query
ai-cmd query "test" --engine onnx --dry-run

Custom location not working:

# Ensure directory exists and has correct permissions
sudo mkdir -p /custom/path/models
sudo chown -R kodachi:kodachi /custom/path/models

# Download to custom location
sudo ai-trainer download-model --output-dir /custom/path/models

# Verify files exist
ls -la /custom/path/models/


Quick Reference: All Commands

Based on ai-trainer -e output:

# MODEL TRAINING
sudo ai-trainer train --data commands.json
sudo ai-trainer train --data commands.json --database custom.db
sudo ai-trainer train --data commands.json --json

# INCREMENTAL TRAINING
sudo ai-trainer incremental --new-data updates.json
sudo ai-trainer incremental --new-data updates.json --database custom.db --json

# VALIDATION
sudo ai-trainer validate --test-data test_commands.json
sudo ai-trainer validate --threshold 0.90
sudo ai-trainer validate --database custom.db --json
sudo ai-trainer validate --test-data tests.json --threshold 0.90 --database custom.db --json

# MODEL EXPORT
sudo ai-trainer export -o model_export.json
sudo ai-trainer export --format compact
sudo ai-trainer export --format stats --json
sudo ai-trainer export -o model.json --format full --json

# SNAPSHOTS
sudo ai-trainer snapshot -v 1.0.0
sudo ai-trainer list-snapshots
sudo ai-trainer list-snapshots --json
sudo ai-trainer snapshot -v 1.0.0 --json

# MODEL DOWNLOAD
sudo ai-trainer download-model
sudo ai-trainer download-model --output-dir ./custom-models --force
sudo ai-trainer download-model --output-dir ./custom-models --json
sudo ai-trainer download-model --force --json

# STATUS
sudo ai-trainer status
sudo ai-trainer status --json


Troubleshooting

| Problem | Cause | Solution |
|---|---|---|
| Training fails with "no data" | Missing or empty training-data.json | Verify the file exists: jq . ./data/training-data.json |
| Low accuracy after training | Insufficient or imbalanced training data | Add more diverse examples (4-6 per command) and balance categories |
| Snapshot restore fails | Corrupt or missing snapshot | List available snapshots: sudo ai-trainer list-snapshots; try an older one |
| ONNX model download fails | Network issue or low disk space | Check connectivity and ensure 200 MB+ of free disk space |
| Incremental update degrades accuracy | Conflicting patterns | Roll back via snapshot restore, then review overlapping examples |
| Validation shows 0% accuracy | Wrong test data format | Verify the JSON format matches the schema with jq . |