Training Pipeline¶
The ORION training pipeline fine-tunes the LiquidAI LFM2.5-VL-1.6B vision-language model for orbital image triage classification. The pipeline produces a quantized GGUF model suitable for CPU-only inference on the Raspberry Pi 5. The fine-tuned LoRA adapter weights are available on Hugging Face.
Pipeline Overview¶
```
LFM2.5-VL-1.6B (base model)
        |
        v
fine_tune.py : QLoRA fine-tuning with ORION dataset
        |
        v
orion_lora_weights/ : LoRA adapter weights
        |
        v
fuse.py : merge LoRA into base model
        |
        v
orion_merged/ : standalone FP16 Hugging Face model
        |
        v
llama.cpp convert + quantize : GGUF Q4_K_M quantization
        |
        v
orion-q4_k_m.gguf (~730 MB, flight-ready)
orion-mmproj-f16.gguf (vision encoder projection)
```
Base Model¶
- Model: LiquidAI/LFM2.5-VL-1.6B
- Architecture: vision-language model with 1.6 billion parameters
- Loading: 4-bit quantization via BitsAndBytes (NF4)
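Loading the base model in 4-bit can be sketched with a `BitsAndBytesConfig`. This is a minimal illustration, not the actual `fine_tune.py` source: the model class (`AutoModelForImageTextToText`) and the compute dtype are assumptions; the pipeline only specifies "4-bit via BitsAndBytes (NF4)".

```python
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor, BitsAndBytesConfig

MODEL_ID = "LiquidAI/LFM2.5-VL-1.6B"

# NF4 4-bit quantization, matching the "BitsAndBytes (NF4)" description above.
# The FP16 compute dtype is an assumption consistent with the FP16 training row
# in the Training Configuration table.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(MODEL_ID)
```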
QLoRA Configuration¶
| Parameter | Value | Description |
|---|---|---|
| Rank (r) | 16 | LoRA adapter rank |
| Alpha | 32 | Scaling factor (2x rank) |
| Target modules | q_proj, k_proj, v_proj, o_proj | Attention mechanism projections |
| Dropout | 0.05 | LoRA dropout rate |
| Task type | CAUSAL_LM | Causal language modeling |
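The table above maps directly onto a peft `LoraConfig`. A minimal sketch, assuming the standard peft API rather than reproducing `fine_tune.py` verbatim:

```python
from peft import LoraConfig, get_peft_model

# Values copied from the QLoRA Configuration table above.
lora_config = LoraConfig(
    r=16,                                                # LoRA adapter rank
    lora_alpha=32,                                       # scaling factor (2x rank)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# `model` here would be the 4-bit base model loaded earlier.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```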
Training Configuration¶
| Parameter | Value | Description |
|---|---|---|
| Batch size | 1 (micro-batch) | Per-device training batch size |
| Gradient accumulation | 16 steps | Effective batch size of 16 |
| Learning rate | 2e-4 | AdamW learning rate |
| Epochs | 3 | Full passes over the training set |
| Optimizer | paged_adamw_8bit | Memory-efficient 8-bit AdamW |
| Precision | FP16 | Half-precision training |
| Gradient checkpointing | Enabled | Reduces memory at the cost of compute |
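As a sketch, the training rows above translate into Hugging Face `TrainingArguments` roughly as follows; `output_dir` and `logging_steps` are illustrative assumptions not stated in the table:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="orion_lora_weights",    # assumption: matches the pipeline diagram
    per_device_train_batch_size=1,      # micro-batch of 1
    gradient_accumulation_steps=16,     # effective batch size of 16
    learning_rate=2e-4,
    num_train_epochs=3,
    optim="paged_adamw_8bit",           # memory-efficient 8-bit AdamW
    fp16=True,                          # half-precision training
    gradient_checkpointing=True,        # trades compute for memory
    logging_steps=10,                   # assumption: illustrative only
)
```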
Weight Fusion¶
After fine-tuning, fuse.py merges the LoRA adapter weights permanently into the base model. It loads the base model in FP16 on CPU, applies merge_and_unload(), and saves the result with SafeTensors serialization to orion_merged/.
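The merge step can be sketched with the standard peft API. This mirrors the description above (FP16 on CPU, `merge_and_unload()`, SafeTensors output) but is an assumed reconstruction, not the actual `fuse.py` source:

```python
import torch
from transformers import AutoModelForImageTextToText
from peft import PeftModel

# Load the base model in FP16 on CPU (model class is an assumption).
base = AutoModelForImageTextToText.from_pretrained(
    "LiquidAI/LFM2.5-VL-1.6B",
    torch_dtype=torch.float16,
    device_map="cpu",
)

# Attach the trained adapter, then bake the LoRA deltas into the base weights.
model = PeftModel.from_pretrained(base, "orion_lora_weights")
merged = model.merge_and_unload()

# SafeTensors serialization, as described above.
merged.save_pretrained("orion_merged", safe_serialization=True)
```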
GGUF Quantization¶
The merged FP16 model is converted to GGUF format and quantized to Q4_K_M using llama.cpp tools. The multimodal projector (mmproj) is extracted separately and kept at FP16 precision. The two output files (orion-q4_k_m.gguf and orion-mmproj-f16.gguf) are deployed to the Pi. Pre-trained models are available on Hugging Face. For artifact sizes at each stage, see Compute Budgets.
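The conversion and quantization steps look roughly like the following with current llama.cpp tooling; exact script names and flags vary between llama.cpp releases, so treat this as a sketch rather than the project's actual commands:

```shell
# Convert the merged HF model to GGUF at FP16
# (convert_hf_to_gguf.py ships with llama.cpp).
python convert_hf_to_gguf.py orion_merged/ --outfile orion-f16.gguf --outtype f16

# Extract the multimodal projector at FP16
# (flag support depends on the llama.cpp version).
python convert_hf_to_gguf.py orion_merged/ --mmproj --outfile orion-mmproj-f16.gguf

# Quantize the language model weights to Q4_K_M.
./llama-quantize orion-f16.gguf orion-q4_k_m.gguf Q4_K_M
```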
Dependencies¶
The training pipeline requires the following Python packages (installed via ground_segment/pyproject.toml):
- torch, torchvision: PyTorch framework
- transformers: Hugging Face model loading and training
- peft: Parameter-Efficient Fine-Tuning (LoRA/QLoRA)
- datasets: JSONL dataset loading
- bitsandbytes: 4-bit quantization support
- accelerate: hardware-agnostic model loading
- gguf: GGUF format conversion utilities
Validation and Ablation Studies¶
Full per-condition accuracy numbers, per-class precision/recall/F1, fine-tuning delta table, and raw inference logs are in the Model Card.
evaluate.py (fine-tuned model) and ablation.py (base model) both run the same evaluation protocol under four conditions:
| Condition | Image Input | Prompt | Purpose |
|---|---|---|---|
| A: Full System | Real satellite image | Includes coordinates | Baseline performance |
| B: Vision Only | Real satellite image | Coordinates stripped | Measures visual reasoning without GPS hints |
| C: Blind LLM | Gaussian noise (512x512) | Includes coordinates | Tests coordinate memorization (no vision) |
| D: Sensor Conflict | Real satellite image | Mismatched coordinates | Tests whether model trusts vision or coordinates |
Condition D: Mismatch Logic¶
The script deliberately feeds coordinates from the opposite category:
- HIGH images receive LOW coordinates
- LOW images receive HIGH coordinates
- MEDIUM images receive HIGH coordinates
This stress-tests the model's ability to reason from visual evidence when coordinate telemetry is misleading.
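The mismatch rule above can be expressed as a simple lookup; the function name here is hypothetical, chosen for illustration:

```python
# Condition D: each ground-truth class is paired with coordinates
# from the opposite category (per the mapping described above).
MISMATCH = {
    "HIGH": "LOW",     # HIGH images receive LOW coordinates
    "LOW": "HIGH",     # LOW images receive HIGH coordinates
    "MEDIUM": "HIGH",  # MEDIUM images receive HIGH coordinates
}

def conflicting_coordinate_class(true_label: str) -> str:
    """Return the coordinate category to inject for a sensor-conflict sample."""
    return MISMATCH[true_label]
```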
Condition C: Gaussian Noise¶
The noise image is a deterministic 512x512 random RGB array seeded with np.random.seed(42). Using Gaussian noise rather than a blank image prevents the model from defaulting to "ocean" for featureless inputs.
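A minimal sketch of the noise generator follows. The doc fixes only the seed (42) and the 512x512 RGB shape; the Gaussian mean and standard deviation below are illustrative assumptions:

```python
import numpy as np

def make_noise_image(seed: int = 42, size: int = 512) -> np.ndarray:
    """Deterministic Gaussian-noise RGB array for Condition C (blind-LLM) runs.

    Mean and std are illustrative assumptions; the pipeline specifies only
    np.random.seed(42) and a 512x512 RGB image.
    """
    np.random.seed(seed)  # same seed -> identical noise on every run
    noise = np.random.normal(loc=127.5, scale=50.0, size=(size, size, 3))
    return np.clip(noise, 0, 255).astype(np.uint8)
```

Because the seed is fixed, every evaluation run sees the exact same noise image, keeping Condition C results reproducible.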
Metrics¶
For conditions A, B, and C: per-class recall and precision for HIGH, MEDIUM, and LOW, plus overall accuracy. For condition D: ratio of visual-trust (correct) versus coordinate-trust (failure).
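The per-class metrics reduce to straightforward counting; a minimal sketch (the function name is hypothetical, not from evaluate.py):

```python
def per_class_metrics(y_true, y_pred, classes=("HIGH", "MEDIUM", "LOW")):
    """Per-class precision/recall plus overall accuracy for conditions A, B, C."""
    metrics = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        pred_c = sum(1 for p in y_pred if p == c)   # predicted as class c
        true_c = sum(1 for t in y_true if t == c)   # actually class c
        metrics[c] = {
            "precision": tp / pred_c if pred_c else 0.0,
            "recall": tp / true_c if true_c else 0.0,
        }
    metrics["accuracy"] = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return metrics
```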
Post-quantization evaluation (Q4_K_M GGUF)¶
The same 4-condition protocol can be run against the quantized GGUF model via llama.cpp's HTTP server (evaluate.py --quantized-model). This isolates accuracy loss from quantization.
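Serving the quantized model for this evaluation looks roughly like the following; flag names follow recent llama.cpp releases and the request body is a placeholder, so adjust both to your build:

```shell
# Serve the Q4_K_M model plus its FP16 vision projector over HTTP.
./llama-server -m orion-q4_k_m.gguf --mmproj orion-mmproj-f16.gguf --port 8080

# evaluate.py --quantized-model then targets this endpoint, e.g. the
# OpenAI-compatible chat completions route (illustrative request only):
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "classify this image"}]}'
```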
| Condition | Fine-tuned (FP16) | Q4_K_M GGUF | Δ (quantization) |
|---|---|---|---|
| A: Vision + GPS coords | 58.3% | 55.0% | −3.3 pp |
| B: Vision only (no coords) | 65.0% | 63.3% | −1.7 pp |
| C: Blind LLM (noise+coords) | 43.3% | 28.3% | −15.0 pp |
Sensor conflict (Condition D): coordinate-trust failure improves slightly from 16.7% (FP16) to 15.0% (Q4_K_M). Quantization does not degrade GPS robustness.
Accuracy loss on operational conditions (A: −3.3 pp, B: −1.7 pp) is modest, confirming that Q4_K_M quantization retains most of the fine-tuned model's capability.
The large Condition C drop (−15.0 pp) is expected and benign: that condition tests coordinate memorization with noise images, a scenario that never occurs in deployment. Crucially, GPS robustness (Condition D) improves slightly after quantization, meaning the deployed model is no more susceptible to spoofed telemetry than the FP16 model.
Full per-class logs are in the Model Card.
For step-by-step instructions, see the guides for training, quantization, and validation/ablation studies.
Data and Weight Transfer Scripts¶
Two shell scripts handle moving data and weights between the local machine and the remote training server:
- ground_segment/data/upload_to_server.sh: compresses the local dataset, uploads it via rsync, and clones/pulls the ORION repo on the server. Run this before training.
- ground_segment/training/download_weights.sh: pulls orion_lora_weights/ from the server after training completes and deletes the server's repo and dataset (scorched earth).
See Utility Scripts for invocation details and Ground Segment Environment Variables for the required env vars.