ai-trainer
AI model training and validation for Kodachi OS command intelligence
Version: 9.0.1 | Size: 2.9MB | Author: Warith Al Maawali warith@digi77.com
License: LicenseRef-Kodachi-SAN-1.0 | Website: https://www.digi77.com
File Information
| Property | Value |
|---|---|
| Binary Name | ai-trainer |
| Version | 9.0.1 |
| Build Date | 2026-04-02T13:42:17.631165216Z |
| Rust Version | 1.82.0 |
| File Size | 2.9MB |
SHA256 Checksum
Features
- TF-IDF based command embeddings
- Incremental model updates
- Model validation and accuracy testing
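To illustrate what "TF-IDF based command embeddings" means in practice, here is a minimal, self-contained sketch. This is not the ai-trainer implementation; the function names and the toy corpus are invented for illustration only.

```python
# Minimal TF-IDF embedding sketch (illustrative only, not ai-trainer's code).
import math
from collections import Counter

def tfidf_embed(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vec = {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        vectors.append(vec)
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = ["show network status", "show disk usage", "train model from metadata"]
vecs = tfidf_embed(docs)
```

Commands sharing vocabulary ("show network status" vs. "show disk usage") get a nonzero similarity, while unrelated commands score zero, which is the property a command-intelligence model relies on for matching.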
Security Features
| Feature | Description |
|---|---|
| Input validation | All inputs are validated and sanitized |
| Rate limiting | Built-in rate limiting for network operations |
| Authentication | Secure authentication with certificate pinning |
| Encryption | TLS 1.3 for all network communications |
System Requirements
| Requirement | Value |
|---|---|
| OS | Linux (Debian-based) |
| Privileges | root/sudo for system operations |
| Dependencies | OpenSSL, libcurl |
Global Options
| Flag | Description |
|---|---|
| -h, --help | Print help information |
| -v, --version | Print version information |
| -n, --info | Display detailed information |
| -e, --examples | Show usage examples |
| --json | Output in JSON format |
| --json-pretty | Pretty-print JSON output with indentation |
| --json-human | Enhanced JSON output with improved formatting (like jq) |
| --verbose | Enable verbose output |
| --quiet | Suppress non-essential output |
| --no-color | Disable colored output |
| --config <FILE> | Use custom configuration file |
| --timeout <SECS> | Set timeout (default: 30) |
| --retry <COUNT> | Retry attempts (default: 3) |
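Global flags combine with any subcommand. A sketch using only the flags and subcommands documented on this page:

```shell
# Pretty-printed JSON status with a longer timeout and extra retries
ai-trainer status --json-pretty --timeout 60 --retry 5

# Quiet, uncolored output suitable for cron jobs or log capture
ai-trainer status --quiet --no-color
```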
Commands
Model Management
export
Export model embeddings and metadata to JSON file
Usage:
Examples:
snapshot
Save current model as versioned snapshot
Usage:
Examples:
list-snapshots
List all saved model snapshots
Usage:
Examples:
status
Display current model status and statistics
Usage:
Examples:
download-model
Download ONNX model, tokenizer, or GGUF model for AI engine tiers
Usage:
ai-trainer download-model [--llm [default|small|large]] [--show-models] [--all] [--output-dir <DIR>] [--force]
Examples:
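Based on the usage line above, typical invocations look like the following (a sketch; exact output and defaults may differ):

```shell
# Download the ONNX embeddings model and tokenizer to the default models/ directory
ai-trainer download-model

# List downloaded and available models
ai-trainer download-model --show-models

# Download the small GGUF model tier
ai-trainer download-model --llm small
```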
Model Training
train
Train AI model from command metadata (full retraining)
Usage:
Examples:
incremental
Update model incrementally with new command data
Usage:
Examples:
Validation & Testing
validate
Validate model accuracy against test dataset
Usage:
Examples:
Operational Scenarios
Scenario-oriented workflows generated from the binary's built-in -e --json examples.
Scenario 1: Model Training
Full model training operations
Step 1: Train model with command data
Expected Output: Training statistics and embeddings metrics
Note: Creates new model from scratch
Step 2: Train with custom database
Expected Output: Training results with custom DB location
Note: Allows custom database path specification
Step 3: Train and output results as JSON
Expected Output: JSON-formatted training metrics
Note: Structured output for automation
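The steps above might look like this on the command line (a sketch: the train subcommand and the global --json flag are documented here, but the --db flag name is a hypothetical placeholder for the custom-database option):

```shell
# Step 1: full retraining from command metadata
ai-trainer train

# Step 2: custom database path (--db is a hypothetical flag name)
ai-trainer train --db /var/lib/kodachi/commands.db

# Step 3: JSON-formatted training metrics for automation
ai-trainer train --json
```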
Scenario 2: Incremental Training
Update existing models with new data
Step 1: Incrementally train with new data
Expected Output: New embeddings added to existing model
Note: Requires existing trained model
Step 2: Incremental training with custom DB and JSON output
Expected Output: JSON-formatted incremental training results
Note: Combines custom DB path with structured output
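A sketch of the incremental workflow (the incremental subcommand and --json are documented; --db is a hypothetical flag name for the custom-database option):

```shell
# Step 1: add new command data to an existing trained model
ai-trainer incremental

# Step 2: custom database path plus structured output (--db is hypothetical)
ai-trainer incremental --db /var/lib/kodachi/commands.db --json
```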
Scenario 3: Validation
Model accuracy testing and validation
Step 1: Validate model with test data
Expected Output: Validation results with accuracy metrics
Note: Tests model against known test cases
Step 2: Validate with custom accuracy threshold
Expected Output: Pass/fail validation with 90% threshold
Note: Default threshold is 0.85
Step 3: Validate with custom DB and JSON output
Expected Output: JSON-formatted validation metrics
Note: Structured validation results
Step 4: Validate with all parameters combined
Expected Output: JSON validation with custom test data, 90% threshold, and custom DB
Note: Full parameter example for CI/CD pipelines
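A sketch of these validation runs (only the validate subcommand and --json are documented; --threshold, --test-data, and --db are hypothetical flag names for the options the steps describe):

```shell
# Step 1: validate against the default test dataset (default threshold 0.85)
ai-trainer validate

# Step 2: require 90% accuracy to pass (--threshold is hypothetical)
ai-trainer validate --threshold 0.90

# Step 4: CI/CD-style run with all parameters (--test-data and --db are hypothetical)
ai-trainer validate --test-data tests.json --threshold 0.90 --db custom.db --json
```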
Scenario 4: Model Export
Export trained models and statistics
Step 1: Export trained model
Expected Output: Complete model export with embeddings
Note: Default format includes all embeddings
Step 2: Export in compact format
Expected Output: Compact model export without full embeddings
Note: Reduces export file size
Step 3: Export statistics as JSON
Expected Output: Model statistics without embeddings
Note: Lightweight statistics export
Step 4: Full export with JSON envelope output
Expected Output: Complete model export with JSON status envelope
Note: Combines full embeddings export with structured output
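A sketch of the export workflow (only the export subcommand and --json are documented; the output path argument and the --compact and --stats flags are hypothetical placeholders for the options the steps describe):

```shell
# Step 1: full export with all embeddings (output path is assumed)
ai-trainer export model.json

# Step 2: compact export without full embeddings (--compact is hypothetical)
ai-trainer export model.json --compact

# Step 3: lightweight statistics only, as JSON (--stats is hypothetical)
ai-trainer export --stats --json
```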
Scenario 5: Snapshots
Model versioning and snapshot management
Step 1: Create model snapshot with version
Expected Output: Versioned snapshot created successfully
Note: Preserves model state at specific version
Step 2: List all model snapshots
Expected Output: List of saved model versions
Note: Shows snapshot metadata and versions
Step 3: List snapshots as JSON
Expected Output: JSON-formatted snapshot listing
Note: Structured snapshot information
Step 4: Create snapshot with JSON output
Expected Output: JSON with snapshot name, version, and embedding count
Note: Structured output for automation
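A sketch of snapshot management (the snapshot and list-snapshots subcommands and --json are documented; --version is a hypothetical flag name for the version argument):

```shell
# Step 1: save the current model as a versioned snapshot (--version is hypothetical)
ai-trainer snapshot --version 9.0.1

# Steps 2-3: list saved snapshots, plain and as JSON
ai-trainer list-snapshots
ai-trainer list-snapshots --json
```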
Scenario 6: Model Download
Download ONNX and GGUF model files for AI engine tiers
Step 1: Download ONNX embeddings model to default models/ directory
Expected Output: Model files downloaded successfully
Note: Downloads all-MiniLM-L6-v2 ONNX model and tokenizer
Step 2: Download default GGUF model (Qwen2.5-3B-Instruct Q4_K_M, ~1.8GB)
Expected Output: GGUF model downloaded to models/ directory
Note: Best balance of quality, speed, and size for CPU inference
Step 3: Download small GGUF model (Qwen2.5-1.5B, ~0.9GB)
Expected Output: Small GGUF model downloaded
Note: For systems with <4GB available RAM
Step 4: Download large GGUF model (Phi-3.5-mini, ~2.3GB)
Expected Output: Large GGUF model downloaded
Note: Better reasoning, 128K trained context
Step 5: Download both ONNX embeddings and default GGUF model
Expected Output: All model files downloaded
Note: Complete setup for all AI tiers
Step 6: List downloaded and available models
Expected Output: Model inventory with sizes and status
Note: Shows what's installed and what can be downloaded
Step 7: Model inventory as JSON
Expected Output: JSON with downloaded and available model details
Step 8: Force re-download of ONNX model
Expected Output: Model files re-downloaded
Note: Overwrites existing files
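These steps map onto the documented download-model usage line as follows (a sketch; all flags shown appear in the usage line above):

```shell
# Step 5: complete setup for all AI tiers (ONNX embeddings plus default GGUF)
ai-trainer download-model --all

# Steps 3-4: alternative GGUF tiers for low-RAM or higher-quality setups
ai-trainer download-model --llm small
ai-trainer download-model --llm large

# Step 7: model inventory as JSON
ai-trainer download-model --show-models --json

# Step 8: force re-download, overwriting existing files
ai-trainer download-model --force
```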
Scenario 7: Status
Model status and health checks
Step 1: Show training status
Expected Output: Current model status and statistics
Note: Displays model readiness and metrics
Step 2: Show training status as JSON
Expected Output: JSON-formatted status information
Note: Structured status output for automation
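Both steps use only documented pieces (the status subcommand and the global --json flag):

```shell
# Step 1: human-readable model status
ai-trainer status

# Step 2: JSON status for automation
ai-trainer status --json
```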
Scenario 8: AI Tier Integration
Training operations related to the 6-tier AI engine (TF-IDF, ONNX, Mistral.rs, GenAI/Ollama, Legacy LLM, Claude)
Step 1: Validate model against all tier responses
Expected Output: Validation results covering all active AI tiers
Note: Tests model accuracy across available tiers
Step 2: Train model with feedback from all tiers
Expected Output: Training metrics including multi-tier feedback data
Note: Includes feedback from mistral.rs and GenAI tier executions
Scenario 9: ONNX Intent Classifier
Evaluate the ONNX intent classifier used for fast-path routing (12 categories, <5ms inference)
Step 1: Evaluate ONNX intent classifier accuracy
Expected Output: JSON with per-intent precision, recall, and F1-score
Note: Target: 95%+ accuracy on held-out test set
Step 2: Check if intent classifier model is downloaded
Expected Output: JSON showing classifier model status
Note: Model: kodachi-intent-classifier.onnx (~65MB)
Environment Variables
| Variable | Description | Default | Values |
|---|---|---|---|
| RUST_LOG | Set logging level | info | error |
| NO_COLOR | Disable all colored output when set | unset | 1 |
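Since ai-trainer is a Rust binary, RUST_LOG is presumably the standard Rust logging filter (levels such as error, warn, info, debug, trace); a sketch of both variables in use:

```shell
# Verbose logging for troubleshooting (assumes standard RUST_LOG semantics)
RUST_LOG=debug ai-trainer status

# Monochrome output, equivalent to passing --no-color
NO_COLOR=1 ai-trainer status
```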
Exit Codes
| Code | Description |
|---|---|
| 0 | Success |
| 1 | General error |
| 2 | Invalid arguments |
| 3 | Permission denied |
| 4 | Network error |
| 5 | File not found |
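In scripts, these exit codes can be mapped to readable messages. A minimal helper (the function name is illustrative; the mapping mirrors the table above):

```shell
#!/bin/sh
# Map an ai-trainer exit code to its documented meaning.
explain_exit() {
  case "$1" in
    0) echo "Success" ;;
    1) echo "General error" ;;
    2) echo "Invalid arguments" ;;
    3) echo "Permission denied" ;;
    4) echo "Network error" ;;
    5) echo "File not found" ;;
    *) echo "Unknown exit code: $1" ;;
  esac
}

# Example usage: run a command, then report its result
# ai-trainer validate; explain_exit $?
```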