Justy: Okay, so Apple just released this Foundation Models thing — and now Claude works inside it as a drop-in. Same API, same session, just swap the model.
Cody: Yeah.
Justy: Which is… actually kind of elegant?
Cody: It is, yeah. From a product angle it's clean — you're not asking developers to learn two different APIs depending on which model they pick.
Justy: Exactly. So the claim is basically: use the on-device model for fast, private, offline stuff. When you need bigger context or frontier reasoning, escalate to Claude. Same code path either way.
Cody: Right, and requests go straight from the app to Anthropic's API. Apple doesn't see the prompts or responses at all.
Justy: That's the part I was actually surprised by. I thought Apple might want visibility into what's happening, but no, it goes direct to Anthropic, billed to your account. What?
Cody: Nothing, just — that's very Apple. They're not interested in being in the middle of your data. They just want the developer experience to be seamless.
Justy: Fair. So technically, how does this actually work?
Cody: The package is basically a wrapper that conforms Claude to Apple's LanguageModel protocol. You pass a ClaudeLanguageModel to LanguageModelSession, and then you call respond(to:), streamResponse, tool calling, structured output — all the same methods you'd use for the on-device model. The session doesn't care which model it got; it just sends the request.
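A rough sketch of the shape Cody describes here. Every type below is a stand-in written for illustration, not Apple's or Anthropic's actual declaration: `ChatModel` plays the role of the LanguageModel protocol, `Session` plays the role of LanguageModelSession, and both model structs return canned strings instead of running inference or calling a network API.

```swift
// Stand-in for the protocol role described above (hypothetical, not Apple's API).
protocol ChatModel {
    var name: String { get }
    func complete(_ prompt: String) -> String
}

// Stand-in for the on-device model: returns a canned reply.
struct OnDeviceModel: ChatModel {
    let name = "on-device"
    func complete(_ prompt: String) -> String { "[\(name)] \(prompt)" }
}

// Stand-in for the Claude-backed model; a real provider would call the network here.
struct RemoteClaudeModel: ChatModel {
    let name = "claude"
    func complete(_ prompt: String) -> String { "[\(name)] \(prompt)" }
}

// The session holds any conforming model and never branches on its concrete type.
struct Session {
    let model: any ChatModel
    func respond(to prompt: String) -> String { model.complete(prompt) }
}

let local = Session(model: OnDeviceModel())
let remote = Session(model: RemoteClaudeModel())
print(local.respond(to: "Summarize this note"))
print(remote.respond(to: "Summarize this note"))
```

The calling code is identical for both sessions; only the value passed as `model` changes, which is the "just swap the model" point made above.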
Justy: Mm-hm.
Cody: But here's the wrinkle: each model declares what it can do. Sampling parameters, effort levels, adaptive thinking, structured output, vision. The package only sends fields that the model actually accepts, because sending a field it doesn't support is a hard error.
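One way to picture the capability gating Cody mentions, as a minimal sketch. The `Capabilities` and `RequestOptions` types and the field names `temperature`, `effort`, and `schema` are assumptions made up for this example; the package's real declarations and wire format may look different.

```swift
// Hypothetical capability flags; the real package's declaration may differ.
struct Capabilities {
    var samplingParameters: Bool
    var effort: Bool
    var structuredOutput: Bool
}

// Options the caller would like to send, if the model allows them.
struct RequestOptions {
    var temperature: Double?
    var effort: String?
    var schemaName: String?
}

// Build the request fields, dropping anything the model has not declared support for.
func requestFields(for options: RequestOptions, capabilities: Capabilities) -> [String: String] {
    var fields: [String: String] = [:]
    if capabilities.samplingParameters, let temperature = options.temperature {
        fields["temperature"] = String(temperature)
    }
    if capabilities.effort, let effort = options.effort {
        fields["effort"] = effort
    }
    if capabilities.structuredOutput, let schemaName = options.schemaName {
        fields["schema"] = schemaName
    }
    return fields
}

let options = RequestOptions(temperature: 0.2, effort: "high", schemaName: nil)
let noSampling = Capabilities(samplingParameters: false, effort: true, structuredOutput: true)
print(requestFields(for: options, capabilities: noSampling).keys.sorted()) // prints ["effort"]
```

Filtering on the client side means an unsupported field is dropped before the request leaves the app, rather than coming back as an error.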
Justy: So you can't just… assume all models are the same.
Cody: Exactly. If you're using a new model ID that isn't baked into the package yet, you have to declare the capabilities explicitly. No guessing. And effort levels are interesting. Claude supports five: low, medium, high, xhigh, max. But Apple's framework only goes up to high for reasoning hints. So fixedEffort lets you pin xhigh or max on the model, and it overrides whatever the framework suggests.
Justy: So you're saying if I want to run expensive reasoning, I have to opt in explicitly.
Cody: Yeah. The default is high. If you want xhigh or max, you set fixedEffort on the model init.
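The precedence rule described in this exchange could be sketched as follows. The `Effort` enum and `resolveEffort` function are hypothetical names for illustration only; the five level names, the default of high, and the idea that a pinned value overrides the framework's hint come from the conversation.

```swift
// The five effort levels named in the conversation.
enum Effort {
    case low, medium, high, xhigh, max
}

// Hypothetical resolution rule: a pinned effort wins, otherwise the framework's
// hint is used, otherwise fall back to the default of high.
func resolveEffort(frameworkHint: Effort?, fixedEffort: Effort?) -> Effort {
    if let fixed = fixedEffort { return fixed }
    if let hint = frameworkHint { return hint }
    return .high
}

print(resolveEffort(frameworkHint: .medium, fixedEffort: .max)) // prints "max"
```

Because the framework's hints stop at high, in this sketch xhigh and max are reachable only through the pinned value, which matches the explicit opt-in Justy describes.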
Justy: Got it. What about tools?
Cody: Two flavors. Client-side tools run on the device: you pass them to the session, Claude calls them, the framework invokes them locally. Server-side tools, meaning web search, web fetch, and code execution, run on Anthropic's infrastructure in a single round trip. You configure those on the model with serverTools, not on the session, because the session type is Apple's. And if you want different tool sets for different conversations, you use different model instances.
Justy: Interesting. So if I'm building an app and I want Claude to search the web for one turn but not another, I'd just create two different ClaudeLanguageModel instances.
Cody: Exactly.
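A minimal sketch of the two-instances idea. `ServerTool` and `ModelConfig` are stand-ins invented for this example, and the model ID is a placeholder; the real ClaudeLanguageModel initializer and its serverTools parameter may be shaped differently.

```swift
// Server-side tools named in the conversation; the enum itself is a hypothetical stand-in.
enum ServerTool: Hashable {
    case webSearch, webFetch, codeExecution
}

// Stand-in for a model configuration; the real initializer may differ.
struct ModelConfig {
    let modelID: String
    let serverTools: Set<ServerTool>
}

// Two configurations of the same model: one may search the web, one may not.
let modelID = "example-model-id" // placeholder, not a real model identifier
let withSearch = ModelConfig(modelID: modelID, serverTools: [.webSearch])
let withoutSearch = ModelConfig(modelID: modelID, serverTools: [])

// Pick a configuration depending on whether web access is wanted.
func config(allowingWeb: Bool) -> ModelConfig {
    allowingWeb ? withSearch : withoutSearch
}

print(config(allowingWeb: true).serverTools.contains(.webSearch)) // prints "true"
print(config(allowingWeb: false).serverTools.isEmpty)             // prints "true"
```

Keeping the tool set on the model value means the choice is made where the model is constructed, and the session code stays the same either way.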
Justy: Okay, Cody — where does this break down?
Cody: The package only surfaces what Apple's protocol can express. Prompt caching applies automatically, but you can't control cache TTL or where the breakpoints sit. Stop sequences, batch processing, Files API, token counting — none of that exists through this interface. If you need those features, you're going back to the raw Messages API client.
Justy: So it's not a general Messages client.
Cody: Right. It's specifically a Foundation Models provider. If you want the full Claude API surface, you use one of the language SDKs.
Justy: That actually makes sense, though. If you're building inside the Foundation Models framework, you probably don't need batch processing or the Files API. You're doing single-turn or streaming chat.
Cody: Yeah, and the on-device model doesn't have those either, so it would be weird to add them only to Claude.
Justy: Right. So who actually cares about this? Native iOS developers who want smarter reasoning without shipping their own backend, I guess.
Cody: Yeah. If you're already in the Foundation Models framework, which is new in iOS 27, this is a clean way to escalate to Claude. And because it's the same API surface, you're not training developers on something new. Just swap the model.
Justy: It's beta, though. iOS 27, macOS 27, all in beta.
Cody: Right. And the docs are pretty clear about it — APIs may change before general availability. But if you're exploring it now, the example in the repo is a good starting point. It's a command-line chat that streams responses and has a flag for server-side web search.
Justy: Alright. So the thing is: this is not revolutionary. It's just… clean product design. You have an API, you make Claude fit into it the same way an on-device model does, and you get out of the way.
Cody: That's exactly it. It's not trying to be fancy. It's just a good integration.
Justy: Okay. Cool.