Discovery flow
A conformant Saturn client performs four steps. The four are mechanical — every reference implementation does them the same way.
1. Browse
Send a PTR query for _saturn._tcp.local. to multicast address 224.0.0.251 (IPv4) or ff02::fb (IPv6) on UDP/5353. Every Saturn responder on the broadcast domain answers with its instance name.
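As a rough illustration of the browse step, the sketch below builds the PTR question by hand with the Python standard library and sends it over IPv4 multicast. The function names (`encode_name`, `build_ptr_query`, `browse`) are hypothetical and not part of any Saturn client. Because the query leaves from an ephemeral port rather than 5353, responders treat it as a one-shot query and reply by unicast; a full mDNS stack would bind 5353 and join the multicast group instead.

```python
import socket
import struct
import time

MDNS_GROUP_V4 = "224.0.0.251"
MDNS_PORT = 5353
QTYPE_PTR = 12   # DNS record type PTR
QCLASS_IN = 1    # DNS class IN


def encode_name(name: str) -> bytes:
    """Encode a dotted DNS name as length-prefixed labels plus a zero terminator."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_ptr_query(service: str = "_saturn._tcp.local.") -> bytes:
    """Build a DNS message holding a single PTR question for `service`."""
    # Header: ID=0, flags=0 (standard query), QDCOUNT=1, remaining counts 0.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    question = encode_name(service) + struct.pack("!2H", QTYPE_PTR, QCLASS_IN)
    return header + question


def browse(timeout: float = 2.0) -> list[bytes]:
    """Send the query over IPv4 multicast and collect raw replies until the deadline."""
    replies = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_ptr_query(), (MDNS_GROUP_V4, MDNS_PORT))
        deadline = time.monotonic() + timeout
        while (remaining := deadline - time.monotonic()) > 0:
            sock.settimeout(remaining)
            try:
                data, _addr = sock.recvfrom(9000)
            except socket.timeout:
                break
            replies.append(data)
    return replies
```

The replies are returned undecoded; parsing the PTR answers into instance names is left to a DNS library or the client's own decoder.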
2. Resolve
For each instance, send SRV + TXT queries. SRV gives host:port; TXT gives the Saturn metadata. Most resolvers issue both as a single mDNS message and aggregate the answer.
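The TXT half of the resolve step comes down to decoding a run of length-prefixed `key=value` strings. A minimal sketch follows, assuming the raw TXT RDATA has already been extracted from the response; `parse_txt` is a hypothetical helper name, and the sample keys are the ones this page mentions.

```python
def parse_txt(rdata: bytes) -> dict[str, str]:
    """Decode TXT RDATA (a run of length-prefixed strings) into a key/value dict."""
    entries: dict[str, str] = {}
    pos = 0
    while pos < len(rdata):
        length = rdata[pos]
        chunk = rdata[pos + 1 : pos + 1 + length]
        pos += 1 + length
        if not chunk:
            continue  # empty strings carry no attribute
        key, sep, value = chunk.partition(b"=")
        name = key.decode("ascii", "replace").lower()
        # DNS-SD convention: keys are case-insensitive and the first occurrence wins.
        if name and name not in entries:
            entries[name] = value.decode("utf-8", "replace") if sep else ""
    return entries
```

All values come back as strings, so numeric fields such as priority still need an explicit `int()` before sorting.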
3. Select
Sort discovered services by the TXT priority field, ascending — lower is preferred. Optionally filter by features (e.g. only route a vision request to instances advertising vision) or deployment (e.g. require deployment=local for privacy-sensitive prompts). Pick the first healthy match.
ranked = sorted(services, key=lambda s: int(s.txt["priority"]))
chosen = None  # stays None if no instance passes the health probe
for s in ranked:
    if probe(s):  # GET /v1/health, see step 4
        chosen = s
        break
The SRV record carries its own priority and weight fields per RFC 2782; Saturn does not use them. Use the TXT priority only.
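The optional filters can be folded into the same selection loop. The sketch below is one possible shape, not the reference implementation: `Service`, `features_of`, and `select` are hypothetical names, and it assumes the TXT features value is a comma-separated list, which this page does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional


@dataclass
class Service:
    name: str
    txt: dict = field(default_factory=dict)


def features_of(service: Service) -> set:
    # Assumption: `features` is a comma-separated list in the TXT record.
    raw = service.txt.get("features", "")
    return {f.strip() for f in raw.split(",") if f.strip()}


def select(
    services: Iterable[Service],
    probe: Callable[[Service], bool],
    need_features: Iterable[str] = (),
    deployment: Optional[str] = None,
) -> Optional[Service]:
    """Return the lowest-priority-number service that passes the filters and the probe."""
    wanted = set(need_features)
    candidates = [
        s for s in services
        if wanted <= features_of(s)
        and (deployment is None or s.txt.get("deployment") == deployment)
    ]
    # Sort on the TXT priority only; SRV priority and weight are ignored.
    for s in sorted(candidates, key=lambda s: int(s.txt["priority"])):
        if probe(s):
            return s
    return None
```

Filtering before probing avoids health-checking instances that could never be chosen, for example `select(found, probe, need_features={"vision"}, deployment="local")`.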
4. Connect
Connection method depends on deployment:
- deployment=local or deployment=network: construct the URL from the SRV record as http://<host>:<port>.
- deployment=cloud: use the TXT api_base directly. Send the TXT ephemeral_key as Authorization: Bearer <key>.
# local / network
$ curl http://macbook.local:11434/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model":"llama3","messages":[{"role":"user","content":"hi"}]}'

# cloud (beacon-supplied)
$ curl https://openrouter.ai/api/v1/chat/completions \
    -H "Authorization: Bearer $EPHEMERAL_KEY" \
    -H 'Content-Type: application/json' \
    -d '{"model":"anthropic/claude-3.5-sonnet","messages":[...]}'
If you care about failover, run a health check before sending requests: GET /v1/health should return 200 OK with the body {"status":"ok"}.
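The two connection rules can be captured in one small function. This is a sketch under assumptions: `endpoint` is a hypothetical name, the host and port are taken to come from the SRV record, and `txt` is the parsed TXT dictionary with the keys named above.

```python
def endpoint(host: str, port: int, txt: dict) -> tuple[str, dict]:
    """Return (base_url, extra_headers) for a resolved Saturn service."""
    deployment = txt.get("deployment")
    if deployment == "cloud":
        # Cloud beacons supply the base URL and a short-lived bearer key via TXT.
        headers = {"Authorization": f"Bearer {txt['ephemeral_key']}"}
        return txt["api_base"], headers
    if deployment in ("local", "network"):
        # SRV targets are often fully qualified; drop the trailing dot for the URL.
        return f"http://{host.rstrip('.')}:{port}", {}
    raise ValueError(f"unknown deployment: {deployment!r}")
```

Raising on an unrecognized deployment value is a design choice of this sketch; a client could equally skip such instances during selection.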
Failover
When GET /v1/health returns non-200 or times out, fall through to the next instance in the priority-sorted list. The reference TypeScript provider (ai-sdk-provider-saturn) uses a circuit-breaker per instance to suppress flapping. The reference Python client retries every 20 s.
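To make the per-instance circuit breaker idea concrete, here is a minimal Python sketch. It is not the code of ai-sdk-provider-saturn; the class, the `pick` helper, and the threshold and cooldown values are all illustrative assumptions.

```python
import time


class CircuitBreaker:
    """Stop probing an instance after repeated failures until a cooldown passes."""

    def __init__(self, max_failures=3, cooldown=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # After the cooldown, let a trial probe through (half-open state).
        return self.clock() - self.opened_at >= self.cooldown

    def record(self, ok: bool) -> None:
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()


def pick(ranked, breakers, probe):
    """Walk the priority-sorted list, skipping instances whose breaker is open."""
    for name in ranked:
        breaker = breakers.setdefault(name, CircuitBreaker())
        if not breaker.allow():
            continue
        ok = probe(name)
        breaker.record(ok)
        if ok:
            return name
    return None
```

Keeping one breaker per instance name means a flapping high-priority instance is skipped for the cooldown period instead of being re-probed on every request.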
End-to-end timing
On a quiet LAN, browse + resolve completes in tens of milliseconds. The reference Python client's discover(timeout=2.0) waits 2 s for late responders; lower the timeout if you only need the first answer.
→ Wire format · HTTP API · TXT keys