AI Assistant¶
ai_assistant¶
AI assistant backend for vision-capable LLM interaction.
Supported backends¶
- ollama – Local Ollama server (recommended; supports llava and other
  vision models out of the box). Uses the `/api/chat` endpoint so
  conversation history is passed natively.
- openai – Local OpenAI-compatible REST API (LM Studio, llama.cpp, etc.).
  External cloud endpoints are blocked by default.
- npu – AMD Ryzen AI ONNX model running on the NPU / iGPU.
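As a sketch of how a request to the Ollama backend might be assembled: the payload layout below follows Ollama's `/api/chat` API (prior messages passed natively in `messages`, JPEG attachments base64-encoded into the final user message's `images` field), but the helper name and its exact arguments are illustrative, not the module's actual code.

```python
import base64

def build_ollama_chat_payload(model, prompt, history=None, image_jpegs=None):
    """Build a request body for Ollama's /api/chat endpoint.

    History messages become prior entries in "messages", so multi-turn
    context is handled by the server rather than stitched into the prompt.
    """
    messages = list(history or [])
    user_msg = {"role": "user", "content": prompt}
    if image_jpegs:
        # The Ollama chat API expects base64-encoded image bytes.
        user_msg["images"] = [base64.b64encode(b).decode("ascii") for b in image_jpegs]
    messages.append(user_msg)
    return {"model": model, "messages": messages, "stream": True}

payload = build_ollama_chat_payload(
    "llava",
    "What is on screen?",
    history=[{"role": "assistant", "content": "Hi!"}],
    image_jpegs=[b"\xff\xd8fake-jpeg"],
)
```

With `"stream": True` the server replies with newline-delimited JSON chunks, which is what allows the token-by-token streaming described below.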
Privacy & security¶
By default (`network.allow_external: false`) every backend URL is validated
before each request. Only localhost, 127.x.x.x, ::1, and RFC-1918
private-network addresses are accepted. Any attempt to reach an external
endpoint raises `ExternalNetworkBlockedError` at request time, so the check
cannot be bypassed by a bad config file without explicitly opting in.
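A minimal sketch of such a per-request check using the standard-library `ipaddress` module; the function name and error class mirror the description above but are illustrative, not the module's actual source.

```python
import ipaddress
from urllib.parse import urlparse

class ExternalNetworkBlockedError(RuntimeError):
    """Raised when a backend URL points outside the local network."""

def validate_backend_url(url, allow_external=False):
    """Reject URLs that are not loopback or RFC-1918 private addresses."""
    if allow_external:
        return
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A non-numeric, non-localhost hostname could resolve anywhere,
        # so it is treated as external.
        raise ExternalNetworkBlockedError(url)
    # Loopback covers 127.x.x.x and ::1; is_private covers RFC-1918 ranges.
    if addr.is_loopback or addr.is_private:
        return
    raise ExternalNetworkBlockedError(url)
```

Running the check on every request, rather than once at configuration time, is what prevents a stale or tampered config file from silently redirecting traffic off-host.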
Backend resource efficiency¶
- `requests` is imported lazily; no persistent `Session` is kept between
  calls (`Connection: close` is sent with every request so the socket is
  released immediately after the response).
- Responses are streamed token-by-token and yielded to the caller so the UI
  can update incrementally without buffering the full reply in RAM.
- Screenshot / image bytes are passed in and can be deleted by the caller as
  soon as `AIAssistant.ask` returns; they are not retained here.
- NPU sessions are unloaded right after inference (see `npu_manager`).
AIAssistant¶
Facade for talking to a vision-capable LLM backend.
Parameters¶
config:
The application `src.config.Config` object.
npu_manager:
An optional `src.npu_manager.NPUManager`. Only used when
backend == "npu".
Source code in src/ai_assistant.py
ask¶
ask(prompt, *, history=None, screenshot_jpeg=None, attachment_image_jpegs=None, attachment_texts=None, max_context_messages=40)
Send a prompt (with optional images/text/history) and stream the reply.
This is a generator: iterate over it to receive response tokens as
they arrive from the model. The caller should delete
screenshot_jpeg and any attachment bytes once this function returns
to free memory.
Parameters¶
prompt:
The user's natural-language question or instruction.
history:
A `src.conversation.ConversationHistory` whose past messages
are passed to the model for multi-turn context.
screenshot_jpeg:
JPEG bytes of the current screen (optional).
attachment_image_jpegs:
List of JPEG bytes for user-uploaded images (optional).
attachment_texts:
List of text file contents to include in the context (optional).
max_context_messages:
How many of the most recent past messages to include in the
request. None includes all of them.
Yields¶
str:
Incremental response tokens as they arrive.
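The max_context_messages trimming can be illustrated with a small helper; the function name is hypothetical, but the semantics match the parameter description above (keep only the most recent N messages, with None meaning no limit).

```python
def select_context(messages, max_context_messages=40):
    """Pick the most recent messages for the request context.

    None mirrors ask()'s behaviour of including the entire history;
    otherwise only the last max_context_messages entries are kept.
    """
    if max_context_messages is None:
        return list(messages)
    if max_context_messages <= 0:
        # Guard the zero case: messages[-0:] would return everything.
        return []
    return list(messages)[-max_context_messages:]
```

Capping the context this way bounds both request size and model latency on long-running conversations, at the cost of the model forgetting the oldest turns.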