Skills
Skills are bundled, agent-facing guides that load when relevant to your task. They teach the agent how to work in a specific domain — clicking and typing on screen, editing documents, reading PDFs, taking screenshots, generating slides, automating browsers, producing media.
What a skill is
A skill is a short, focused guide written for the agent, not for you. When you ask Interpreter to do something in a domain it has a skill for, the agent loads that skill before it starts working. The skill gives it the right vocabulary, the right tools, and the patterns that tend to work for that kind of task.
A general-purpose model is okay at most things. A model with the right skill loaded for the task at hand is much better.
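To make the idea concrete, here is a minimal sketch of what "loading a skill before starting work" could look like. It is an illustration only, not Interpreter's implementation: the `Skill` class, the keyword matching, and the `select_skills` and `build_context` helpers are all hypothetical names invented for this example, and the real selection logic may work quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    keywords: tuple[str, ...]  # hypothetical relevance triggers (an assumption)
    guide: str                 # the agent-facing guide text


# A toy registry; the guide texts are placeholders, not the real bundled skills.
SKILLS = [
    Skill("pdf", ("pdf",), "How to open, extract from, and reason about PDFs."),
    Skill("slides", ("slide", "deck"), "How to generate slide decks."),
    Skill("playwright", ("browser", "website"), "How to drive a real browser."),
]


def select_skills(task: str, skills=SKILLS) -> list[Skill]:
    """Return the skills whose keywords appear in the task (naive matching)."""
    text = task.lower()
    return [s for s in skills if any(k in text for k in s.keywords)]


def build_context(task: str) -> str:
    """Place the selected guides ahead of the task, so they are read first."""
    guides = [f"[skill: {s.name}]\n{s.guide}" for s in select_skills(task)]
    return "\n\n".join(guides + [f"[task]\n{task}"])


print(build_context("Summarize this PDF as a slide deck"))
```

In this sketch a task that matches no skill passes through with no guide attached, which corresponds to the general-purpose behavior described above.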
What ships by default
Interpreter ships with skills for the things people actually do on a desktop:
computer-use: clicks, keyboard, on-screen interactions
doc: read and edit Word and rich text documents
pdf: open, extract from, and reason about PDFs
screenshot: capture and reference regions of the screen
slides: generate slide decks
playwright: drive a real browser end to end
media-creation: produce images and video assets
You do not have to install any of these. They are already there.
When you can tell skills are working
Usually you should not have to think about them. The agent picks what it needs and gets on with the task. If something is going sideways and the approach looks wrong for the domain, ask the agent which skills it loaded for this turn. That answer will often tell you whether it framed the problem the way you expected.
Custom skills
Right now, skills are bundled with the app. Authoring your own skills will come later.
Skills vs MCP
Skills are bundled prompts that change how the agent thinks about a domain. MCP servers are external tools the agent can call. They are complementary — see MCP servers for the difference.