ai-trainer

AI model training and validation for Kodachi OS command intelligence

Version: 9.0.1 | Size: 2.9MB | Author: Warith Al Maawali <warith@digi77.com>

License: LicenseRef-Kodachi-SAN-1.0 | Website: https://www.digi77.com


File Information

Property      Value
Binary Name   ai-trainer
Version       9.0.1
Build Date    2026-04-02T13:42:17.631165216Z
Rust Version  1.82.0
File Size     2.9MB

SHA256 Checksum

556fc3511d4f76e4531a2f30d8c0724ec395ac83a62e2a61f028013e686e42cb
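
The checksum above can be compared against a local copy of the binary before it is run. The following is a minimal sketch, assuming the coreutils `sha256sum` utility is available; `verify_sha256` is a hypothetical helper name and is not part of ai-trainer.

```shell
# verify_sha256 <file> <expected-hex-digest>
# Hypothetical helper (not part of ai-trainer): returns 0 when the file's
# SHA-256 digest equals the expected value, 1 on mismatch, 2 if unreadable.
verify_sha256() {
    local file="$1" expected="$2" actual
    if [ ! -r "$file" ]; then
        echo "cannot read: $file" >&2
        return 2
    fi
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(sha256sum "$file" | awk '{ print $1 }')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}

# Usage (resolves whichever ai-trainer is on PATH):
# verify_sha256 "$(command -v ai-trainer)" \
#     556fc3511d4f76e4531a2f30d8c0724ec395ac83a62e2a61f028013e686e42cb
```

A mismatch only means the local file differs from the 9.0.1 build described on this page; a different version will have a different digest.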

Features

TF-IDF-based command embeddings
Incremental model updates
Model validation and accuracy testing

Security Features

Feature           Description
Input validation  All inputs are validated and sanitized
Rate limiting     Built-in rate limiting for network operations
Authentication    Secure authentication with certificate pinning
Encryption        TLS 1.3 for all network communications

System Requirements

Requirement Value
OS Linux (Debian-based)
Privileges root/sudo for system operations
Dependencies OpenSSL, libcurl

Global Options

Flag Description
-h, --help Print help information
-v, --version Print version information
-n, --info Display detailed information
-e, --examples Show usage examples
--json Output in JSON format
--json-pretty Pretty-print JSON output with indentation
--json-human Enhanced JSON output with improved formatting (like jq)
--verbose Enable verbose output
--quiet Suppress non-essential output
--no-color Disable colored output
--config <FILE> Use custom configuration file
--timeout <SECS> Set timeout (default: 30)
--retry <COUNT> Retry attempts (default: 3)

Commands

Model Management

export

Export model embeddings and metadata to JSON file

Usage:

ai-trainer export --output <FILE> [--format <FORMAT>]

Examples:

ai-trainer export --output model_export.json

snapshot

Save current model as versioned snapshot

Usage:

ai-trainer snapshot --snapshot-version <VERSION>

Examples:

ai-trainer snapshot --snapshot-version 1.0.0
ai-trainer snapshot -s 1.1.0-beta

list-snapshots

List all saved model snapshots

Usage:

ai-trainer list-snapshots

Examples:

ai-trainer list-snapshots

status

Display current model status and statistics

Usage:

ai-trainer status

Examples:

ai-trainer status

download-model

Download ONNX model, tokenizer, or GGUF model for AI engine tiers

Usage:

ai-trainer download-model [--llm [default|small|large]] [--show-models] [--all] [--output-dir <DIR>] [--force]

Examples:

ai-trainer download-model
ai-trainer download-model --llm
ai-trainer download-model --llm small
ai-trainer download-model --all
ai-trainer download-model --show-models
ai-trainer download-model --force

Model Training

train

Train AI model from command metadata (full retraining)

Usage:

ai-trainer train --data <FILE> [--database <DB_PATH>]

Examples:

ai-trainer train --data commands.json
ai-trainer train --data commands.json --json

incremental

Update model incrementally with new command data

Usage:

ai-trainer incremental --new-data <FILE> [--database <DB_PATH>]

Examples:

ai-trainer incremental --new-data new_commands.json

Validation & Testing

validate

Validate model accuracy against test dataset

Usage:

ai-trainer validate --test-data <FILE> [--threshold <THRESHOLD>] [--database <DB_PATH>]

Examples:

ai-trainer validate --test-data test_cases.json

Operational Scenarios

Scenario-oriented workflows generated from the binary's built-in examples (ai-trainer -e --json).

Scenario 1: Model Training

Full model training operations

Step 1: Train model with command data

sudo ai-trainer train --data commands.json
Expected Output: Training statistics and embeddings metrics

Note

Creates new model from scratch

Step 2: Train with custom database

sudo ai-trainer train --data commands.json --database custom.db
Expected Output: Training results with custom DB location

Note

Allows custom database path specification

Step 3: Train and output results as JSON

sudo ai-trainer train --data commands.json --json
Expected Output: JSON-formatted training metrics

Note

Structured output for automation

Scenario 2: Incremental Training

Update existing models with new data

Step 1: Incrementally train with new data

sudo ai-trainer incremental --new-data updates.json
Expected Output: New embeddings added to existing model

Note

Requires existing trained model

Step 2: Incremental training with custom DB and JSON output

sudo ai-trainer incremental --new-data updates.json --database custom.db --json
Expected Output: JSON-formatted incremental training results

Note

Combines custom DB path with structured output

Scenario 3: Validation

Model accuracy testing and validation

Step 1: Validate model with test data

sudo ai-trainer validate --test-data test_commands.json
Expected Output: Validation results with accuracy metrics

Note

Tests model against known test cases

Step 2: Validate with custom accuracy threshold

sudo ai-trainer validate --test-data test_commands.json --threshold 0.90
Expected Output: Pass/fail validation with 90% threshold

Note

Default threshold is 0.85

Step 3: Validate with custom DB and JSON output

sudo ai-trainer validate --test-data test_commands.json --database custom.db --json
Expected Output: JSON-formatted validation metrics

Note

Structured validation results

Step 4: Validate with all parameters combined

sudo ai-trainer validate --test-data tests.json --threshold 0.90 --database custom.db --json
Expected Output: JSON validation with custom test data, 90% threshold, and custom DB

Note

Full parameter example for CI/CD pipelines
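
For a CI/CD pipeline, the validation step can be wrapped in a small gate function. This is a sketch under one loud assumption: that a validation result below the threshold makes ai-trainer exit with a non-zero status. The page describes the result as pass/fail but does not state the exit code, so confirm this against your build. `validate_gate` and the `AI_TRAINER_BIN` variable are hypothetical names introduced here.

```shell
# Binary name is overridable so the function can be exercised with a stub.
AI_TRAINER_BIN="${AI_TRAINER_BIN:-ai-trainer}"

# validate_gate <test-data> [threshold]
# Runs validation with JSON output and passes the tool's exit status through.
# ASSUMPTION: a result below the threshold exits non-zero (not stated on
# this page).
validate_gate() {
    local test_data="$1" threshold="${2:-0.85}"   # 0.85 is the documented default
    local status=0
    "$AI_TRAINER_BIN" validate --test-data "$test_data" \
        --threshold "$threshold" --json || status=$?
    if [ "$status" -ne 0 ]; then
        echo "validation gate failed (exit $status)" >&2
    fi
    return "$status"
}

# Usage in a pipeline step:
# validate_gate tests.json 0.90 || exit 1
```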

Scenario 4: Model Export

Export trained models and statistics

Step 1: Export trained model

sudo ai-trainer export --output model_export.json
Expected Output: Complete model export with embeddings

Note

Default format includes all embeddings

Step 2: Export in compact format

sudo ai-trainer export --output model_compact.json --format compact
Expected Output: Compact model export without full embeddings

Note

Reduces export file size

Step 3: Export statistics as JSON

sudo ai-trainer export --output model_stats.json --format stats --json
Expected Output: Model statistics without embeddings

Note

Lightweight statistics export

Step 4: Full export with JSON envelope output

sudo ai-trainer export --output model.json --format full --json
Expected Output: Complete model export with JSON status envelope

Note

Combines full embeddings export with structured output
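
The three formats used in this scenario (full, compact, stats) can be produced in one pass. The sketch below is illustrative only: `export_all_formats`, the `AI_TRAINER_BIN` variable, and the `model_<format>.json` naming scheme are assumptions, not conventions of the tool.

```shell
# Binary name is overridable so the function can be exercised with a stub.
AI_TRAINER_BIN="${AI_TRAINER_BIN:-ai-trainer}"

# export_all_formats <output-dir>
# Writes one export per format shown in this scenario and stops at the
# first failure, returning that command's exit status.
export_all_formats() {
    local out_dir="$1" format
    mkdir -p "$out_dir" || return 1
    for format in full compact stats; do
        "$AI_TRAINER_BIN" export \
            --output "$out_dir/model_${format}.json" \
            --format "$format" || return $?
    done
}

# Usage:
# export_all_formats ./exports
```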

Scenario 5: Snapshots

Model versioning and snapshot management

Step 1: Create model snapshot with version

sudo ai-trainer snapshot --snapshot-version 1.0.0
Expected Output: Versioned snapshot created successfully

Note

Preserves model state at specific version

Step 2: List all model snapshots

sudo ai-trainer list-snapshots
Expected Output: List of saved model versions

Note

Shows snapshot metadata and versions

Step 3: List snapshots as JSON

sudo ai-trainer list-snapshots --json
Expected Output: JSON-formatted snapshot listing

Note

Structured snapshot information

Step 4: Create snapshot with JSON output

sudo ai-trainer snapshot --snapshot-version 1.0.0 --json
Expected Output: JSON with snapshot name, version, and embedding count

Note

Structured output for automation
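
When snapshots are created from a script, the next version string has to come from somewhere. One option, sketched here, is to increment the patch component of the previous version. `next_patch_version` is a hypothetical helper; it accepts only plain MAJOR.MINOR.PATCH strings and rejects suffixed versions such as 1.1.0-beta.

```shell
# next_patch_version <MAJOR.MINOR.PATCH>
# Prints the version with PATCH incremented by one.
# Returns 2 for anything that is not three dot-separated numbers.
next_patch_version() {
    local version="$1" major rest minor patch
    if ! printf '%s\n' "$version" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
        echo "not a MAJOR.MINOR.PATCH version: $version" >&2
        return 2
    fi
    major="${version%%.*}"   # text before the first dot
    rest="${version#*.}"     # text after the first dot
    minor="${rest%%.*}"
    patch="${rest#*.}"
    # expr treats its operands as decimal, so "08" is not read as octal.
    echo "${major}.${minor}.$(expr "$patch" + 1)"
}

# Usage:
# sudo ai-trainer snapshot --snapshot-version "$(next_patch_version 1.0.0)"
```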

Scenario 6: Model Download

Download ONNX and GGUF model files for AI engine tiers

Step 1: Download ONNX embeddings model to default models/ directory

sudo ai-trainer download-model
Expected Output: Model files downloaded successfully

Note

Downloads all-MiniLM-L6-v2 ONNX model and tokenizer

Step 2: Download default GGUF model (Qwen2.5-3B-Instruct Q4_K_M, ~1.8GB)

sudo ai-trainer download-model --llm
Expected Output: GGUF model downloaded to models/ directory

Note

Best balance of quality, speed, and size for CPU inference

Step 3: Download small GGUF model (Qwen2.5-1.5B, ~0.9GB)

sudo ai-trainer download-model --llm small
Expected Output: Small GGUF model downloaded

Note

For systems with <4GB available RAM

Step 4: Download large GGUF model (Phi-3.5-mini, ~2.3GB)

sudo ai-trainer download-model --llm large
Expected Output: Large GGUF model downloaded

Note

Better reasoning, 128K trained context

Step 5: Download both ONNX embeddings and default GGUF model

sudo ai-trainer download-model --all
Expected Output: All model files downloaded

Note

Complete setup for all AI tiers

Step 6: List downloaded and available models

sudo ai-trainer download-model --show-models
Expected Output: Model inventory with sizes and status

Note

Shows what's installed and what can be downloaded

Step 7: Model inventory as JSON

sudo ai-trainer download-model --show-models --json
Expected Output: JSON with downloaded and available model details

Step 8: Force re-download of ONNX model

sudo ai-trainer download-model --force
Expected Output: Model files re-downloaded

Note

Overwrites existing files
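
The tier notes above suggest the small model for systems with less than 4GB of available RAM. A minimal sketch of that choice follows, assuming MemAvailable in /proc/meminfo is an acceptable measure of "available RAM". Both function names are hypothetical, and the large tier is left as an explicit opt-in rather than selected automatically.

```shell
# pick_llm_tier <available-RAM-in-MiB>
# Prints "small" below 4096 MiB and "default" otherwise.
pick_llm_tier() {
    local avail_mib="$1"
    if [ "$avail_mib" -lt 4096 ]; then
        echo small
    else
        echo default
    fi
}

# available_ram_mib
# Reads MemAvailable (reported in kB) from /proc/meminfo and prints MiB.
available_ram_mib() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo
}

# Usage:
# sudo ai-trainer download-model --llm "$(pick_llm_tier "$(available_ram_mib)")"
```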

Scenario 7: Status

Model status and health checks

Step 1: Show training status

sudo ai-trainer status
Expected Output: Current model status and statistics

Note

Displays model readiness and metrics

Step 2: Show training status as JSON

sudo ai-trainer status --json
Expected Output: JSON-formatted status information

Note

Structured status output for automation

Scenario 8: AI Tier Integration

Training operations related to the 6-tier AI engine (TF-IDF, ONNX, Mistral.rs, GenAI/Ollama, Legacy LLM, Claude)

Step 1: Validate model against all tier responses

sudo ai-trainer validate --test-data tests.json --json
Expected Output: Validation results covering all active AI tiers

Note

Tests model accuracy across available tiers

Step 2: Train model with feedback from all tiers

sudo ai-trainer train --data commands.json --json
Expected Output: Training metrics including multi-tier feedback data

Note

Includes feedback from mistral.rs and GenAI tier executions

Scenario 9: ONNX Intent Classifier

Evaluate the ONNX intent classifier used for fast-path routing (12 categories, <5ms inference)

Step 1: Evaluate ONNX intent classifier accuracy

sudo ai-trainer validate --test-data intent_tests.json --json
Expected Output: JSON with per-intent precision, recall, and F1-score

Note

Target: 95%+ accuracy on held-out test set

Step 2: Check if intent classifier model is downloaded

sudo ai-trainer download-model --show-models --json
Expected Output: JSON showing classifier model status

Note

Model: kodachi-intent-classifier.onnx (~65MB)

Environment Variables

Variable  Description                          Default  Values
RUST_LOG  Set logging level                    info     error
NO_COLOR  Disable all colored output when set  unset    1
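
When output is captured by another program, both variables can be set for a single invocation. The wrapper below is a sketch; `run_plain` and `AI_TRAINER_BIN` are hypothetical names, and the values used are the ones listed in the table.

```shell
# Binary name is overridable so the function can be exercised with a stub.
AI_TRAINER_BIN="${AI_TRAINER_BIN:-ai-trainer}"

# run_plain <args...>
# Runs the tool with colored output disabled and logging limited to errors.
run_plain() {
    NO_COLOR=1 RUST_LOG=error "$AI_TRAINER_BIN" "$@"
}

# Usage:
# run_plain status --json
```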

Exit Codes

Code Description
0 Success
1 General error
2 Invalid arguments
3 Permission denied
4 Network error
5 File not found
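
Scripts that call ai-trainer can translate the numeric status back into the descriptions from the table above. This is a small sketch; `describe_exit_code` is a hypothetical helper, and any code outside the table is reported as unknown.

```shell
# describe_exit_code <code>
# Prints the description from the exit-code table for a numeric status.
describe_exit_code() {
    case "$1" in
        0) echo "Success" ;;
        1) echo "General error" ;;
        2) echo "Invalid arguments" ;;
        3) echo "Permission denied" ;;
        4) echo "Network error" ;;
        5) echo "File not found" ;;
        *) echo "Unknown exit code: $1" ;;
    esac
}

# Usage:
# sudo ai-trainer status --json
# describe_exit_code "$?"
```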