The connect flow
What you can do once connected
The main brain you talk to in DOSIA now has working tools for:

| You say | DOSIA does |
|---|---|
| “Draw a red apple in flat style” | image-tools.generate_image(model=gpt-image-2, prompt=...) |
| “Make this image blue-background” + attach | image-tools.generate_image with source_image |
| “How many cats are in this photo?” + attach | image-tools.analyze_image(model=gpt-5-4, ...) |
| “Make a 5-second product video” | video-tools.generate_video → poll_video |
| “Have GPT-5.5 write me a summary of this thread” | text-writing-tools.chat_with(model=gpt-5-5, ...) |
| “Get Gemini to translate this into English” | text-writing-tools.chat_with(model=gemini-3.1-pro, ...) |
The plugin / tool surface
Three plugins, three MCP servers, six tools total.

| Plugin | Tools | What it solves |
|---|---|---|
| image-tools | generate_image(model, prompt, source_image?); analyze_image(model, image_url, question) | Text-to-image, image-to-image, vision-on-image |
| video-tools | generate_video(model, prompt, source_image?) → task_id; poll_video(task_id); analyze_video(model, video_url, question) ⚠️ | Text-to-video, image-to-video, vision-on-video (analyze endpoint behind a feature flag) |
| text-writing-tools | chat_with(model, prompt, system?) | Use a non-primary LLM (GPT / Gemini / DeepSeek / Doubao) as a writing co-processor |
analyze_video is reserved; the corresponding endpoint is not live in the public gateway as of this writing. The tool definition is in place so the main brain can plan around it; calls will surface a clear “not yet available” error until the endpoint ships.
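The generate_video → task_id → poll_video pairing in the table above implies an asynchronous task pattern. The following is a minimal sketch of such a polling loop, not the plugin's actual contract: the `FakeVideoTools` stand-in, the `status` and `video_url` field names, the status values, and the task id format are all assumptions made for illustration.

```python
import time


class FakeVideoTools:
    """Hypothetical stand-in for the video-tools plugin (illustration only)."""

    def __init__(self, polls_until_done: int = 3) -> None:
        self._remaining = polls_until_done

    def generate_video(self, model: str, prompt: str) -> str:
        # The docs say generate_video yields a task_id; its format is assumed.
        return "task-123"

    def poll_video(self, task_id: str) -> dict:
        # Assumed response shape: {"status": ..., "video_url": ...}.
        self._remaining -= 1
        if self._remaining > 0:
            return {"status": "pending"}
        return {
            "status": "done",
            "video_url": f"https://example.invalid/{task_id}.mp4",
        }


def wait_for_video(tools, task_id, interval_s=0.0, max_polls=10, sleep=time.sleep):
    """Poll until the task reports done, or give up after max_polls attempts."""
    for _ in range(max_polls):
        result = tools.poll_video(task_id)
        if result["status"] == "done":
            return result
        sleep(interval_s)  # back off between polls
    raise TimeoutError(f"{task_id} not finished after {max_polls} polls")
```

Bounding the loop with `max_polls` keeps a stuck task from blocking the chat indefinitely; a real client would likely also handle a failure status, which this sketch omits.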
known-models registry — the four buckets
When DOSIA fetches /v1/account/capabilities, ByteSpike returns two model lists:
- anthropicModels[]: the set of “main brains” you can chat with (claude-*, plus any anthropic-compat aliases). Drives the model picker.
- otherModels[]: every other model your account has permission to call (gpt-*, gemini-*, deepseek-*, gpt-image-2, sora-*, veo-*, …).

DOSIA then partitions otherModels[] against a known-models registry that maps each model id to one of four capability buckets:
| Bucket | Members feed into | User-facing meaning |
|---|---|---|
| image_generate | generate_image.model.enum | “I can make pictures” |
| video_generate | generate_video.model.enum | “I can make videos” |
| vision | analyze_image.model.enum, analyze_video.model.enum, and chat_with.model.enum | “I can look at images / use a vision-capable model to write” |
| external_chat | chat_with.model.enum | “I can use a non-Claude LLM to write text” |
A vision-capable model such as gpt-5-4 can legitimately appear in three tool enums: the SDK allows a single model id in multiple enum lists, and the registry treats vision as a cross-cutting capability rather than a single bucket.
The registry lives in DOSIA, not in your account. Adding a new model to ByteSpike doesn’t break old DOSIA builds; they’ll just ignore the unknown id until the next DOSIA release teaches them which bucket it belongs to.
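The partitioning step can be sketched as a lookup followed by a fan-out into tool enums. This is an illustrative sketch only: the registry contents below are assumed (the bucket assigned to each id is inferred from the examples on this page, and `example-video-model` is a placeholder id, not a real model), and the function names are hypothetical.

```python
from collections import defaultdict

# Assumed registry excerpt: model id -> capability bucket.
KNOWN_MODELS = {
    "gpt-image-2": "image_generate",
    "example-video-model": "video_generate",  # placeholder id, not a real model
    "gpt-5-4": "vision",
    "gpt-5-5": "external_chat",
    "gemini-3.1-pro": "external_chat",
}

# Which tool enums each bucket feeds, following the bucket table.
BUCKET_TO_TOOLS = {
    "image_generate": ["generate_image"],
    "video_generate": ["generate_video"],
    "vision": ["analyze_image", "analyze_video", "chat_with"],
    "external_chat": ["chat_with"],
}


def partition(other_models):
    """Group permitted model ids by bucket, skipping ids the registry lacks."""
    buckets = defaultdict(list)
    for model_id in other_models:
        bucket = KNOWN_MODELS.get(model_id)
        if bucket is None:
            continue  # unknown id: ignored until a later release maps it
        buckets[bucket].append(model_id)
    return dict(buckets)


def build_tool_enums(buckets):
    """Fan each bucket's members out to every tool enum that bucket feeds."""
    enums = defaultdict(list)
    for bucket, models in buckets.items():
        for tool in BUCKET_TO_TOOLS[bucket]:
            enums[tool].extend(models)
    return dict(enums)
```

For example, `partition(["gpt-image-2", "gpt-5-4", "brand-new-model"])` drops `brand-new-model` silently, which mirrors how an older build ignores an id it has not been taught yet.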
Data flow end to end
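If a bucket is empty, for example when your account has no image_generate models, the main brain never sees a generate_image tool: not a greyed-out one, not a “permission denied” call. The tool simply isn’t loaded.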
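The “not loaded” behaviour can be pictured as a filter over per-tool model enums. This is a minimal sketch under the assumption that each model-taking tool carries a list of permitted model ids; the function name and input shape are hypothetical, and tools without a model parameter (such as poll_video) are outside its scope.

```python
def visible_tools(tool_enums: dict[str, list[str]]) -> list[str]:
    """Return the tools that would be loaded: those with a non-empty model enum."""
    return sorted(name for name, models in tool_enums.items() if models)


# Assumed example input: no image-generation models are permitted.
example = {
    "generate_image": [],
    "analyze_image": ["gpt-5-4"],
    "chat_with": ["gpt-5-4", "gpt-5-5"],
}
loaded = visible_tools(example)
```

Here `generate_image` is absent from `loaded` entirely, so there is nothing for the main brain to call and nothing to reject at call time.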
Permission refresh
Permissions can shift mid-session (admin adds you to a model, a quota lifts, a trial expires):

| Trigger | What happens |
|---|---|
| User clicks “Refresh permissions” in Settings → AI Models | Re-fetch capabilities → re-partition → persist → reloadPlugins() |
| DOSIA app launch | Silent fetch + reload at startup |
| ByteSpike webhook (post-P7 stretch) | Server-pushed reload — no user action needed |
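
The manual refresh row describes a four-step pipeline: re-fetch, re-partition, persist, reload. A minimal sketch of that ordering follows, with every step injected as a callable so it can be exercised without a network or a real plugin host. The function name and parameter names are hypothetical; only the `otherModels` field name and the step order come from this page.

```python
def refresh_permissions(fetch_capabilities, partition, persist, reload_plugins):
    """Run the refresh steps in order and return the new bucket mapping."""
    capabilities = fetch_capabilities()  # e.g. GET /v1/account/capabilities
    buckets = partition(capabilities.get("otherModels", []))
    persist(buckets)   # store the result before tools are rebuilt
    reload_plugins()   # stands in for reloadPlugins()
    return buckets
```

Persisting before reloading means a reload that reads stored state always sees the freshly partitioned buckets; whether DOSIA reloads from stored state is an assumption of this sketch.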
Where this fits
If you’re already familiar with DOSIA Agent mode, the MCP integration described here is the other half of the DOSIA-ByteSpike story: Agent mode is about the Anthropic Messages protocol passing through tool_use / cache_control blocks; MCP integration is about which tools the main brain has available in the first place.
Multimodal endpoints — see Multimodal — are the underlying HTTP surface that image-tools / video-tools call into. The plugin layer is what turns those endpoints into something the chat-driven user never has to think about.
Setup checklist for new users
- Install DOSIA (latest signed build for your platform)
- Open Settings → Account → Connect ByteSpike account
- Approve in browser → see the connect toast
- Open a fresh chat → ask the main brain to draw something, write with GPT, or generate a clip