ai-trainer

AI model training and validation for Kodachi OS command intelligence

Version: 9.0.1 | Size: 4.2MB | Author: Warith Al Maawali warith@digi77.com

License: LicenseRef-Kodachi-SAN-1.0 | Website: https://www.digi77.com


File Information

Property Value
Binary Name ai-trainer
Version 9.0.1
Build Date REDACTED-BUILD-TIME
Rust Version 1.82.0
File Size 4.2MB
Author Warith Al Maawali warith@digi77.com
License LicenseRef-Kodachi-SAN-1.0
Category Kodachi Binary
Description AI model training and validation for Kodachi OS command intelligence
Git Commit unknown
Metadata Generated 2026-06-08T16:22:03Z
Binary Timestamp Unknown

SHA256 Checksum

b141bedc320133e0826028f13c0c69875ad790d8011ee3f1d6319167005d45a9
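
The published checksum can be compared against a local copy of the binary before running it. The following is a minimal sketch, assuming `sha256sum` from GNU coreutils is installed; the helper name `verify_sha256` and the binary path in the usage comment are illustrative and not part of ai-trainer.

```shell
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# Returns 0 on match, 1 on mismatch or if the file cannot be read.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <filename>"; keep only the digest field.
    actual=$(sha256sum "$file" 2>/dev/null | cut -d ' ' -f 1)
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example (not run here; the path is a placeholder):
#   verify_sha256 /path/to/ai-trainer \
#     b141bedc320133e0826028f13c0c69875ad790d8011ee3f1d6319167005d45a9 \
#     && echo "checksum OK"
```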

Features

# Feature
1 TF-IDF based command embeddings
2 Incremental model updates
3 Model validation and accuracy testing

Security Features

Feature Description
Input Validation Argument parsing via clap; per-command validation is the consumer's responsibility
Rate Limiting Not provided by cli-core
Authentication Not provided by cli-core (see online-auth)
Encryption Not provided by cli-core

System Requirements

Requirement Value
OS Linux (Debian-based)
Privileges root/sudo for system operations
Dependencies OpenSSL, libcurl

Global Options

Flag Description
-h, --help Print help information
-v, --version Print version information
-n, --info Display detailed information
-e, --examples Show usage examples
--json Output in JSON format
-o, --output-format <FORMAT> Force output format (text
--json-pretty Pretty-print JSON output with indentation
--json-human Enhanced JSON output with improved formatting (like jq)
--fields <FIELD_LIST> Select specific fields to include in output (comma-separated)
--limit <NUMBER> Limit number of results returned
--offset <NUMBER> Skip first N results (for pagination)
-d, --work-dir <PATH> Working directory (defaults to auto-detected base directory)
--port <PORT> Set custom port number (1024-65535)
--log-level <LEVEL> Set log level (error
--verbose Enable verbose output
--quiet Suppress non-essential output
--no-color Disable colored output
--config <FILE> Use custom configuration file
--timeout <SECS> Set operation timeout in seconds (optional; no default applied)
--retry <COUNT> Retry attempts (optional; no default applied)
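
The `--limit` and `--offset` options together describe a page of results. The sketch below shows the arithmetic only; `page_args` is a hypothetical helper, and whether a particular command honors these global flags is an assumption that should be checked against that command's help output.

```shell
# Hypothetical helper: turn a 1-based page number and a page size into
# the --limit/--offset pair described in the Global Options table.
page_args() {
    page=$1
    size=$2
    echo "--limit $size --offset $(( (page - 1) * size ))"
}

# Example (not run here): third page of ten results.
#   ai-trainer list-snapshots --json $(page_args 3 10)
```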

Commands

Model Management

export

Export model embeddings and metadata to JSON file

Usage:

ai-trainer export --output <FILE> [--format <FORMAT>]

Examples:

ai-trainer export --output model_export.json

snapshot

Save current model as versioned snapshot

Usage:

ai-trainer snapshot --snapshot-version <VERSION>

Examples:

ai-trainer snapshot --snapshot-version 1.0.0
ai-trainer snapshot -s 1.1.0-beta

list-snapshots

List all saved model snapshots

Usage:

ai-trainer list-snapshots

Examples:

ai-trainer list-snapshots

status

Display current model status and statistics

Usage:

ai-trainer status

Examples:

ai-trainer status

download-model

Download ONNX model, tokenizer, or GGUF model for AI engine tiers

Usage:

ai-trainer download-model [--llm [default|small|large|xlarge|xlarge-hq]] [--show-models] [--all] [--output-dir <DIR>] [--force] [--allow-unverified-model]

Examples:

ai-trainer download-model --allow-unverified-model
ai-trainer download-model --llm --allow-unverified-model
ai-trainer download-model --llm small --allow-unverified-model
ai-trainer download-model --llm large --allow-unverified-model
ai-trainer download-model --llm xlarge --allow-unverified-model # Qwen3-8B Q4_K_M, SPEED tuned
ai-trainer download-model --llm xlarge-hq --allow-unverified-model # Qwen3-8B Q5_K_M, QUALITY tuned
ai-trainer download-model --all --allow-unverified-model
ai-trainer download-model --show-models
ai-trainer download-model --force --allow-unverified-model

Model Training

train

Train AI model from command metadata (full retraining)

Usage:

ai-trainer train --data <FILE> [--database <DB_PATH>]

Examples:

ai-trainer train --data commands.json
ai-trainer train --data commands.json --json

incremental

Update model incrementally with new command data

Usage:

ai-trainer incremental --new-data <FILE> [--database <DB_PATH>]

Examples:

ai-trainer incremental --new-data new_commands.json

Validation & Testing

validate

Validate model accuracy against test dataset

Usage:

ai-trainer validate --test-data <FILE> [--threshold <THRESHOLD>] [--database <DB_PATH>]

Examples:

ai-trainer validate --test-data test_cases.json

Operational Scenarios

Scenario-oriented workflows generated from the binary's built-in -e --json examples.

Scenario 1: Model Training

Full model training operations

Step 1: Train model with command data

sudo ai-trainer train --data commands.json
Expected Output: Training statistics and embeddings metrics

Note

Creates new model from scratch

Step 2: Train with custom database

sudo ai-trainer train --data commands.json --database custom.db
Expected Output: Training results with custom DB location

Note

Allows custom database path specification

Step 3: Train and output results as JSON

sudo ai-trainer train --data commands.json --json
Expected Output: JSON-formatted training metrics

Note

Structured output for automation

Scenario 2: Incremental Training

Update existing models with new data

Step 1: Incrementally train with new data

sudo ai-trainer incremental --new-data updates.json
Expected Output: New embeddings added to existing model

Note

Requires existing trained model

Step 2: Incremental training with custom DB and JSON output

sudo ai-trainer incremental --new-data updates.json --database custom.db --json
Expected Output: JSON-formatted incremental training results

Note

Combines custom DB path with structured output

Scenario 3: Validation

Model accuracy testing and validation

Step 1: Validate model with test data

sudo ai-trainer validate --test-data test_commands.json
Expected Output: Validation results with accuracy metrics

Note

Tests model against known test cases

Step 2: Validate with custom accuracy threshold

sudo ai-trainer validate --test-data test_commands.json --threshold 0.90
Expected Output: Pass/fail validation with 90% threshold

Note

Default threshold is 0.85

Step 3: Validate with custom DB and JSON output

sudo ai-trainer validate --test-data test_commands.json --database custom.db --json
Expected Output: JSON-formatted validation metrics

Note

Structured validation results

Step 4: Validate with all parameters combined

sudo ai-trainer validate --test-data tests.json --threshold 0.90 --database custom.db --json
Expected Output: JSON validation with custom test data, 90% threshold, and custom DB

Note

Full parameter example for CI/CD pipelines
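
For CI/CD use, the validation step can be wrapped so that a failed run stops the pipeline. This is a sketch under stated assumptions: it assumes a below-threshold result is reported through a non-zero exit code, which this page does not state explicitly. The names `ci_validate` and `AI_TRAINER_BIN` are hypothetical, and `sudo` is omitted; prepend it as in the steps above if your setup requires it.

```shell
# Assumption: ai-trainer is on PATH; set AI_TRAINER_BIN to point elsewhere.
AI_TRAINER_BIN=${AI_TRAINER_BIN:-ai-trainer}

# Hypothetical CI gate: run validation and propagate any non-zero exit code.
# Assumption: a below-threshold result yields a non-zero exit status.
ci_validate() {
    test_data=$1
    threshold=${2:-0.85}   # 0.85 is the documented default threshold
    if "$AI_TRAINER_BIN" validate --test-data "$test_data" \
            --threshold "$threshold" --json; then
        echo "validation passed (threshold $threshold)" >&2
    else
        status=$?
        echo "validation failed with exit code $status" >&2
        return "$status"
    fi
}

# Example (not run here):
#   ci_validate tests.json 0.90 > validation.json || exit 1
```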

Scenario 4: Model Export

Export trained models and statistics

Step 1: Export trained model

sudo ai-trainer export --output model_export.json
Expected Output: Complete model export with embeddings

Note

Default format includes all embeddings

Step 2: Export in compact format

sudo ai-trainer export --output model_compact.json --format compact
Expected Output: Compact model export without full embeddings

Note

Reduces export file size

Step 3: Export statistics as JSON

sudo ai-trainer export --output model_stats.json --format stats --json
Expected Output: Model statistics without embeddings

Note

Lightweight statistics export

Step 4: Full export with JSON envelope output

sudo ai-trainer export --output model.json --format full --json
Expected Output: Complete model export with JSON status envelope

Note

Combines full embeddings export with structured output
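
The three formats used in the steps above (`full`, `compact`, `stats`) can be exported in one pass. The following is a minimal sketch; `export_all_formats`, `AI_TRAINER_BIN`, and the output file naming scheme are hypothetical choices, not behavior of ai-trainer itself.

```shell
# Assumption: ai-trainer is on PATH; set AI_TRAINER_BIN to point elsewhere.
AI_TRAINER_BIN=${AI_TRAINER_BIN:-ai-trainer}

# Hypothetical helper: write one export file per format (full, compact, stats),
# named <prefix>_<format>.json. Stops at the first failing export.
export_all_formats() {
    prefix=$1
    for fmt in full compact stats; do
        "$AI_TRAINER_BIN" export --output "${prefix}_${fmt}.json" \
            --format "$fmt" || return $?
    done
}

# Example (not run here):
#   export_all_formats model
```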

Scenario 5: Snapshots

Model versioning and snapshot management

Step 1: Create model snapshot with version

sudo ai-trainer snapshot --snapshot-version 1.0.0
Expected Output: Versioned snapshot created successfully

Note

Preserves model state at specific version

Step 2: List all model snapshots

sudo ai-trainer list-snapshots
Expected Output: List of saved model versions

Note

Shows snapshot metadata and versions

Step 3: List snapshots as JSON

sudo ai-trainer list-snapshots --json
Expected Output: JSON-formatted snapshot listing

Note

Structured snapshot information

Step 4: Create snapshot with JSON output

sudo ai-trainer snapshot --snapshot-version 1.0.0 --json
Expected Output: JSON with snapshot name, version, and embedding count

Note

Structured output for automation
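
One way to combine snapshots with incremental training is to save the current model before updating it, so that a known state exists if the update goes wrong. This is an illustrative workflow rather than one prescribed by the tool; `snapshot_then_update` and `AI_TRAINER_BIN` are hypothetical names.

```shell
# Assumption: ai-trainer is on PATH; set AI_TRAINER_BIN to point elsewhere.
AI_TRAINER_BIN=${AI_TRAINER_BIN:-ai-trainer}

# Hypothetical wrapper: snapshot the current model, then run the
# incremental update only if the snapshot succeeded.
snapshot_then_update() {
    version=$1
    new_data=$2
    "$AI_TRAINER_BIN" snapshot --snapshot-version "$version" || return $?
    "$AI_TRAINER_BIN" incremental --new-data "$new_data"
}

# Example (not run here):
#   snapshot_then_update 1.0.0 updates.json
```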

Scenario 6: Model Download

Download ONNX and GGUF model files for AI engine tiers

Step 1: Download ONNX embeddings model to default models/ directory

sudo ai-trainer download-model
Expected Output: Model files downloaded successfully

Note

Downloads all-MiniLM-L6-v2 ONNX model and tokenizer

Step 2: Download default GGUF model (Qwen3-1.7B Q4_K_M, ~1.1GB)

sudo ai-trainer download-model --llm
Expected Output: GGUF model downloaded to models/ directory

Note

Best balance of quality, speed, and size for CPU inference

Step 3: Download small GGUF model (Qwen3-1.7B Q4_K_S, ~1.0GB)

sudo ai-trainer download-model --llm small
Expected Output: Small GGUF model downloaded

Note

For systems with <4GB available RAM

Step 4: Download large GGUF model (Phi-3.5-mini, ~2.3GB)

sudo ai-trainer download-model --llm large
Expected Output: Large GGUF model downloaded

Note

Better reasoning, 128K trained context

Step 5: Download 8B GGUF model tuned for SPEED (Qwen3-8B Q4_K_M, ~4.8GB)

sudo ai-trainer download-model --llm xlarge
Expected Output: Qwen3-8B Q4_K_M downloaded

Note

8-billion-parameter Qwen3 at 4-bit quantization. Use on systems with 8+ GB RAM for higher tokens-per-second throughput. Quality is lower than xlarge-hq but higher than default.

Step 6: Download 8B GGUF model tuned for QUALITY (Qwen3-8B Q5_K_M, ~5.6GB)

sudo ai-trainer download-model --llm xlarge-hq
Expected Output: Qwen3-8B Q5_K_M downloaded

Note

8-billion-parameter Qwen3 at 5-bit quantization. Recommended on 16+ GB RAM systems. Best local-LLM quality available in the catalog. About 15 percent slower than xlarge.

Step 7: Download both ONNX embeddings and default GGUF model

sudo ai-trainer download-model --all
Expected Output: All model files downloaded

Note

Complete setup for all AI tiers

Step 8: List downloaded and available models

sudo ai-trainer download-model --show-models
Expected Output: Model inventory with sizes and status

Note

Shows what's installed and what can be downloaded

Step 9: Model inventory as JSON

sudo ai-trainer download-model --show-models --json
Expected Output: JSON with downloaded and available model details

Step 10: Force re-download of ONNX model

sudo ai-trainer download-model --force
Expected Output: Model files re-downloaded

Note

Overwrites existing files
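
The RAM guidance in the notes for Steps 3, 5, and 6 can be turned into a simple tier selector. The sketch below is a heuristic built only from those notes; `pick_llm_tier` is a hypothetical helper, the 4 to 7 GB range falling back to `default` is an assumption, and the `large` tier is never selected because no RAM figure is given for it.

```shell
# Hypothetical heuristic: map RAM (whole GB) to a --llm tier, using only
# the RAM guidance stated in the step notes (<4 GB, 8+ GB, 16+ GB).
pick_llm_tier() {
    ram_gb=$1
    if [ "$ram_gb" -lt 4 ]; then
        echo small
    elif [ "$ram_gb" -lt 8 ]; then
        echo default      # assumption: no explicit guidance for 4-7 GB
    elif [ "$ram_gb" -lt 16 ]; then
        echo xlarge
    else
        echo xlarge-hq
    fi
}

# Example (not run here): MemAvailable in /proc/meminfo is reported in kB.
#   ram_gb=$(awk '/^MemAvailable:/ {print int($2 / 1048576)}' /proc/meminfo)
#   sudo ai-trainer download-model --llm "$(pick_llm_tier "$ram_gb")"
```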

Scenario 7: Status

Model status and health checks

Step 1: Show training status

sudo ai-trainer status
Expected Output: Current model status and statistics

Note

Displays model readiness and metrics

Step 2: Show training status as JSON

sudo ai-trainer status --json
Expected Output: JSON-formatted status information

Note

Structured status output for automation

Scenario 8: AI Tier Integration

Training operations related to the 6-tier AI engine (TF-IDF, ONNX, Mistral.rs, GenAI/Ollama, Legacy LLM, Claude)

Step 1: Validate model against all tier responses

sudo ai-trainer validate --test-data tests.json --json
Expected Output: Validation results covering all active AI tiers

Note

Tests model accuracy across available tiers

Step 2: Train model with feedback from all tiers

sudo ai-trainer train --data commands.json --json
Expected Output: Training metrics including multi-tier feedback data

Note

Includes feedback from Mistral.rs and GenAI tier executions

Scenario 9: ONNX Intent Classifier

Evaluate the ONNX intent classifier used for fast-path routing (12 categories, <5ms inference)

Step 1: Evaluate ONNX intent classifier accuracy

sudo ai-trainer validate --test-data intent_tests.json --json
Expected Output: JSON with per-intent precision, recall, and F1-score

Note

Target: 95%+ accuracy on held-out test set

Step 2: Check if intent classifier model is downloaded

sudo ai-trainer download-model --show-models --json
Expected Output: JSON showing classifier model status

Note

Model: kodachi-intent-classifier.onnx (~65MB)

Environment Variables

Variable Description Default Values
RUST_LOG Set logging level info error
NO_COLOR Disable all colored output when set unset 1
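
Both variables can be set for a single invocation without changing the calling shell's environment. The following is a minimal sketch; `run_plain` and `AI_TRAINER_BIN` are hypothetical names, and `error` is used for `RUST_LOG` because it is the value listed in the table.

```shell
# Assumption: ai-trainer is on PATH; set AI_TRAINER_BIN to point elsewhere.
AI_TRAINER_BIN=${AI_TRAINER_BIN:-ai-trainer}

# Hypothetical wrapper: run one command with colors disabled and
# error-level logging. The subshell keeps the exported variables from
# leaking into the caller's environment.
run_plain() {
    (
        export NO_COLOR=1
        export RUST_LOG=error
        "$AI_TRAINER_BIN" "$@"
    )
}

# Example (not run here):
#   run_plain status --json
```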

Exit Codes

Code Description
0 Success
1 General error
2 Invalid arguments
3 Permission denied
4 Network error
5 File not found
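
In scripts, the exit code can be translated back into the descriptions from the table above. This is a small sketch; `describe_exit_code` is a hypothetical helper and is not shipped with ai-trainer.

```shell
# Hypothetical helper: translate an ai-trainer exit code into the
# description listed in the Exit Codes table.
describe_exit_code() {
    case "$1" in
        0) echo "Success" ;;
        1) echo "General error" ;;
        2) echo "Invalid arguments" ;;
        3) echo "Permission denied" ;;
        4) echo "Network error" ;;
        5) echo "File not found" ;;
        *) echo "Unknown exit code: $1" ;;
    esac
}

# Example (not run here):
#   sudo ai-trainer status; describe_exit_code "$?"
```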