Computer Use is a Qoder capability extension that lets the agent perceive your screen the way a person does and click, type, and scroll on your computer. When a task involves a graphical interface and can’t be completed through the command line or an API, the agent can drive desktop apps and browsers directly, while you keep working in the foreground on your own tasks.
Computer Use is currently in Beta — the experience and the underlying capabilities are still being improved.
Core Capabilities
Screen perception
- Reads the visible content of the target app window and understands the layout, button text, form state, and other visual cues.
- Takes screenshots throughout the run to confirm the page has loaded and the previous action took effect before deciding the next step.
Mouse and keyboard control
- Supports the full range of human input: clicks, double-clicks, drags, text entry, and keyboard shortcuts.
- Operates at pixel-level precision so it can target small UI elements accurately.
Background autonomous execution
- Sends mouse and keyboard input and takes screenshots in the background without stealing your foreground focus.
- You can continue using the computer for other work while the agent runs.
Cross-app workflows
- Switches between desktop apps and chains multi-step operations into a complete flow.
- Adjusts the next step based on what just happened, instead of replaying a fixed script.
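The observe, decide, act cycle described above (take a screenshot, choose the next step from what is actually on screen, act, then look again) can be sketched as a small loop. This is only an illustrative sketch under stated assumptions, not Qoder's implementation: `Action`, `run_task`, `capture`, `decide`, and `perform` are hypothetical names, and the "screen" is a plain string standing in for a real screenshot.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """Hypothetical description of one input step."""
    kind: str    # e.g. "click", "type", "scroll"
    target: str  # human-readable description of the UI element


def run_task(
    capture: Callable[[], str],
    decide: Callable[[str], Optional[Action]],
    perform: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Observe the screen, pick one action, perform it, then observe again."""
    history: List[Action] = []
    for _ in range(max_steps):
        screen = capture()       # fresh observation before every decision
        action = decide(screen)  # next step depends on what is on screen now
        if action is None:       # decide() signals that the task is finished
            break
        perform(action)
        history.append(action)
    return history


# Toy stand-ins: the "screen" is just a string naming the current state.
state = {"screen": "login form"}


def capture() -> str:
    return state["screen"]


def decide(screen: str) -> Optional[Action]:
    if screen == "login form":
        return Action("click", "Sign in button")
    return None  # anything else counts as done in this toy example


def perform(action: Action) -> None:
    state["screen"] = "dashboard"  # pretend the click loaded the next page


history = run_task(capture, decide, perform)
```

Because `decide` is called on a new observation at every step, the loop reacts to the result of the previous action rather than replaying a fixed list of steps. The `max_steps` cap is an assumption added here so the sketch always terminates.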
Usage Scenarios
- Drive desktop apps that lack an API: when the target app has no CLI or plugin, the agent works the GUI directly — adjusting parameters in a design tool, batch-updating settings in an admin console, and so on.
- Automate cross-app flows: when a task spans several apps, the agent switches windows, copies data, and fills forms to complete the workflow end to end.
- GUI verification and testing: confirm that a UI change behaves as intended, reproduce a bug that only surfaces in the GUI, or check how the app responds to a specific sequence of actions.
- Collect and organize information: pull data out of apps with no export feature, or consolidate information that’s scattered across several screens.
For web apps, try the Browser Agent first.
System Requirements
- macOS 14 (Sonoma) or later.
How to Use
Use the /computer-use slash command to invoke the capability, then describe the task in natural language. The session shows the agent’s screenshots and progress in real time, and you can interrupt the task or steer it with follow-up messages at any time.
Every mode of the Editor Window supports Computer Use; the Quest Window supports Computer Use only in Experts mode.
Permissions and Approvals
Computer Use relies on two macOS system permissions:
- Accessibility: lets Qoder read the UI element tree and perform clicks, typing, and other accessibility actions.
- Screen Recording: lets Qoder capture screenshots of the active window so the agent can perceive the interface state.
An approval policy controls how much the agent may do without asking:
- Ask every time: the agent asks for your confirmation each time it needs to drive the desktop.
- Auto-run: the agent runs desktop actions on its own, without per-action confirmation.
- Disabled: turn Computer Use off entirely.
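The three approval options can be pictured as a gate that every desktop action passes through. The sketch below is a minimal illustration, not Qoder's actual configuration or API: `ApprovalPolicy`, `may_run`, and the `confirm` callback are hypothetical names introduced for this example.

```python
from enum import Enum
from typing import Callable


class ApprovalPolicy(Enum):
    """Hypothetical model of the three approval options."""
    ASK_EVERY_TIME = "ask every time"
    AUTO_RUN = "auto-run"
    DISABLED = "disabled"


def may_run(
    policy: ApprovalPolicy,
    confirm: Callable[[str], bool],
    description: str,
) -> bool:
    """Return True if a desktop action is allowed under the given policy."""
    if policy is ApprovalPolicy.DISABLED:
        return False  # Computer Use is off: nothing runs
    if policy is ApprovalPolicy.AUTO_RUN:
        return True   # no per-action confirmation
    # ASK_EVERY_TIME: defer to the user for each action
    return confirm(description)


# Toy confirmation callback that records prompts and always declines.
prompts = []


def decline(description: str) -> bool:
    prompts.append(description)
    return False


asked = may_run(ApprovalPolicy.ASK_EVERY_TIME, decline, "click Send")
auto = may_run(ApprovalPolicy.AUTO_RUN, decline, "click Send")
off = may_run(ApprovalPolicy.DISABLED, decline, "click Send")
```

In this model only the Ask every time policy ever calls `confirm`, which is why it is the safer choice for actions that cannot be undone.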
Cautions
- Granting access means granting control: once enabled, the agent can drive other apps on your computer with the same effect as if you took the action yourself. Disable it in settings when you don’t need it.
- Some actions can’t be undone: the agent’s actions inside desktop apps (sending messages, deleting files) may be irreversible. For high-risk scenarios, prefer the Ask every time policy.
- Screen contents are screenshotted: the agent perceives the interface through screenshots, so anything visible on screen — including sensitive information — may be captured. Close windows that contain passwords or private data before running automation.