# Offline LLM
This page documents the mobile on-device LLM path and its integration with backend services in a hybrid online/offline architecture.
## Why Offline Inference Exists
- Disaster scenarios can have unstable or no connectivity.
- Field users still need guidance and structured capture support.
- Local inference allows partial continuity until sync is restored.
## Mobile Runtime Architecture

```mermaid
graph TD
    subgraph Flutter
        UI["Chat/Help Screens"]
        CH[LlmPlatformChannel]
        QUEUE[Local pending-action queue]
    end
    subgraph Android
        ACT[MainActivity MethodChannel handler]
        INF[InferenceModel MediaPipe session]
        PREFS[SharedPreferences model path]
        FILES[App filesDir model artifact]
    end
    subgraph Cloud
        API[Backend API]
        DB[(Firestore)]
        BOT["/chatbot/ask"]
    end
    UI --> CH
    CH --> ACT
    ACT --> INF
    ACT --> PREFS
    ACT --> FILES
    UI -->|online mode| API
    API --> DB
    API --> BOT
    UI -->|offline mode| INF
    UI --> QUEUE
    QUEUE -->|reconnect sync| API
```
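The Android half of the diagram routes platform-channel calls by method name. A minimal sketch of that dispatch, with an illustrative `InferenceModel` interface and method names (`generateResponse`, `isModelReady`) that are assumptions rather than the shipped API:

```kotlin
// Illustrative stand-in for the MediaPipe-backed inference session.
interface InferenceModel {
    fun generate(prompt: String): String
}

class StubModel : InferenceModel {
    override fun generate(prompt: String) = "local-answer:$prompt"
}

// Mirrors what MainActivity's MethodChannel handler does: route by
// method name, pull arguments from the call, and return a result.
// Method and argument names here are hypothetical.
fun handleMethodCall(
    model: InferenceModel,
    method: String,
    args: Map<String, Any?>,
): Any? = when (method) {
    "generateResponse" -> model.generate(args["prompt"] as String)
    "isModelReady" -> true
    else -> null // the Flutter side would treat this as notImplemented
}
```

The real handler additionally touches `SharedPreferences` (model path) and `filesDir` (model artifact), which are omitted here to keep the sketch self-contained.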
## Online/Offline Routing Logic

```mermaid
flowchart TD
    A[User submits prompt or help action] --> B{Network available?}
    B -- Yes --> C[Call backend APIs]
    C --> D[Use cloud chatbot + central persistence]
    B -- No --> E[Run on-device inference]
    E --> F[Return local guidance immediately]
    F --> G[Store unsynced action locally]
    G --> H{Connectivity restored?}
    H -- Yes --> I[Replay queued actions to backend]
    I --> J[Firestore becomes source of truth]
```
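The branch above can be sketched as a single routing function. The `RouteResult` type and `routePrompt` name are assumptions for illustration; the `/chatbot/ask` endpoint is the one shown in the architecture diagram:

```kotlin
// Outcome of routing one user prompt: either a cloud call or an
// immediate local answer that also lands in the pending-action queue.
sealed class RouteResult {
    data class Cloud(val endpoint: String) : RouteResult()
    data class Local(val reply: String, val queued: Boolean) : RouteResult()
}

fun routePrompt(
    online: Boolean,
    prompt: String,
    runLocal: (String) -> String, // on-device inference callback
): RouteResult =
    if (online) {
        // Online: cloud chatbot + central persistence.
        RouteResult.Cloud(endpoint = "/chatbot/ask")
    } else {
        // Offline: answer immediately, queue the action for later sync.
        RouteResult.Local(reply = runLocal(prompt), queued = true)
    }
```

Keeping the decision in one pure function makes both branches easy to unit-test without network or model dependencies.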
## Sync Semantics
- Local actions are timestamped and queued while offline.
- On reconnect, queue replay posts actions to the backend in order.
- The backend validates and persists canonical records.
- The mobile client marks queue items as synced only after API confirmation.
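The semantics above can be sketched as a replay loop. `PendingAction` and `postToBackend` are hypothetical names; the key properties are from the list: oldest-first order, and a `synced` flag flipped only after the API confirms:

```kotlin
// Hypothetical queued action; synced stays false until the backend
// confirms persistence.
data class PendingAction(
    val timestampMs: Long,
    val payload: String,
    var synced: Boolean = false,
)

// Replays unsynced actions oldest-first. Stops at the first failure
// so later actions are never applied out of order; returns how many
// were confirmed.
fun replayQueue(
    queue: List<PendingAction>,
    postToBackend: (PendingAction) -> Boolean,
): Int {
    var confirmed = 0
    for (action in queue.filter { !it.synced }.sortedBy { it.timestampMs }) {
        if (!postToBackend(action)) break // preserve ordering guarantee
        action.synced = true              // mark only after confirmation
        confirmed++
    }
    return confirmed
}
```

Stopping on the first failed post is what keeps the backend's canonical record consistent: a later action is never persisted ahead of an earlier one that has not been acknowledged.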
## Operational Notes
- The current mobile docs already cover the model download flow and inference internals exposed over the MethodChannel.
- This overview models the target production behavior, where the mobile app also submits requests/chats to the backend when online.
- Offline output is advisory; dispatch authority remains with backend-controlled operational workflows.