Getting Started with Browser Automation using Playwright MCP: A Beginner's Guide
Must-Read for Beginners! Starting Browser Automation with Playwright MCP
Introduction
When performing testing or automation of web applications, efficient browser operation is essential. While many automation tools have emerged over the years, "Playwright MCP," announced by Microsoft in 2025, is garnering attention as a next-generation browser automation tool designed with AI integration in mind.
The "MCP" in Playwright MCP stands for "Model Context Protocol," an open protocol that lets AI models (LLMs: Large Language Models) call external tools. Playwright MCP exposes browser operations as such tools, so you can operate a browser using natural language instructions and integrate it easily with AI agents.
In this article, we will explain everything from the basic concepts of Playwright MCP to its installation and practical usage in a way that is easy for beginners to understand. If you are interested in browser automation or test automation, or if you want to explore new possibilities through AI integration, please read on to the end.
What is Playwright MCP?
Playwright Basics
Before understanding Playwright MCP, let's briefly explain Playwright itself.
Playwright is an open-source browser automation tool developed by Microsoft. It supports multiple browsers such as Chromium, Firefox, and WebKit, allowing for cross-browser testing with a single API. It can automate browser operations from JavaScript or TypeScript, and official bindings also exist for Python, Java, and .NET.
What is MCP?
MCP stands for "Model Context Protocol," a standard protocol for Large Language Models (LLMs) to access tools and data. Simply put, it is like a common language for AI to interact with external tools and data safely and efficiently.
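To make the "common language" idea a little more concrete: MCP messages are JSON-RPC 2.0 requests, and a tool is invoked with the `tools/call` method. The following is a minimal sketch of the request a client would send. The helper name `buildToolCall` and the id counter are our own illustrative assumptions; in practice a client such as Claude Desktop builds these messages for you.

```javascript
// Minimal sketch: build an MCP "tools/call" request (JSON-RPC 2.0).
// buildToolCall is a hypothetical helper, not part of any SDK.
let nextId = 1;

function buildToolCall(toolName, args = {}) {
  return {
    jsonrpc: "2.0",
    id: nextId++, // each request gets a unique id so replies can be matched
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCall("browser_navigate", { url: "https://example.com" });
console.log(JSON.stringify(request, null, 2));
```

The tool name and its arguments travel as plain data, which is why any MCP-capable client can drive the same server.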
Features of Playwright MCP
Playwright MCP is an MCP server that exposes Playwright's browser automation to AI models, and it has the following features:
- Utilization of the Accessibility Tree: Uses the structural information of web pages instead of pixel-based image recognition.
- LLM-friendly: Designed for easy integration with AI models.
- Fast and Lightweight: Processing is faster than visual recognition via screenshots.
- Deterministic Operations: Reliable element identification and operation.
Differences from Traditional Browser Automation
Compared to traditional browser automation tools, Playwright MCP has the following differences:
| Traditional Approach | Playwright MCP |
|---|---|
| Selector-based access to the HTML/DOM | Utilizes the accessibility tree |
| AI-driven tools often rely on screenshots and coordinate-based operation | Structure-based operation is the default |
| Additional implementation required for AI integration | Designed assuming AI integration |
| Requires specification of complex selectors | Enables instructions close to natural language |
What is the Accessibility Tree?
The core technology of Playwright MCP is the "Accessibility Tree." This represents the semantic structure of a web page, rather than its visual representation.
The accessibility tree is the same as what assistive technologies like screen readers use, containing information such as the roles of elements on the page (buttons, links, text inputs, etc.), names, states, and relationships.

Conceptual diagram of the accessibility tree
By using this accessibility tree, Playwright MCP identifies and operates on elements based on their "meaning" rather than the look of the web page. This makes automation more stable and resistant to visual UI changes.
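As a rough illustration of operating by "meaning," the sketch below models a tiny accessibility tree and looks up an element by its role and name. The node shape (`role`, `name`, `children`) is a simplification we assume for this example; real accessibility trees carry more state, and Playwright MCP's actual output format may differ.

```javascript
// Hypothetical, simplified accessibility tree for a sign-in page.
const tree = {
  role: "document",
  name: "Sign in",
  children: [
    { role: "heading", name: "Sign in", children: [] },
    {
      role: "form",
      name: "",
      children: [
        { role: "textbox", name: "Email", children: [] },
        { role: "textbox", name: "Password", children: [] },
        { role: "button", name: "Log in", children: [] },
      ],
    },
  ],
};

// Depth-first search for the first node matching both role and name.
function findByRole(node, role, name) {
  if (node.role === role && node.name === name) return node;
  for (const child of node.children ?? []) {
    const match = findByRole(child, role, name);
    if (match) return match;
  }
  return null;
}

const button = findByRole(tree, "button", "Log in");
console.log(button);
```

Note that nothing here depends on CSS classes, position, or color, so a visual redesign that keeps the same roles and names would not break the lookup.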
Two Operating Modes of Playwright MCP
Playwright MCP has two operating modes: "Snapshot Mode" and "Vision Mode." Let's look at the characteristics and use cases for each.
Snapshot Mode (Default)
Snapshot mode is the default operating mode of Playwright MCP. In this mode, it uses the browser's accessibility tree to identify and operate on elements based on the web page's structural information.
Features and Benefits:
- Fast: Faster processing because it doesn't require screenshot generation or image processing
- Lightweight: Consumes fewer resources
- Stability: Resistant to changes in UI appearance
- Structural Understanding: Enables operations based on the semantic structure of the page
- Accuracy: Precise element identification
Snapshot mode is suitable for many common web operations (navigation, form input, data extraction, etc.).
# Example of Snapshot mode (default)
npx @playwright/mcp
Vision Mode
Vision mode operates similarly to traditional visual automation tools. In this mode, it takes screenshots of the web page and performs coordinate-based operations based on them.
Features and Use Cases:
- Visual Element Manipulation: Manipulating visual elements that are difficult to represent structurally
- Coordinate-based Operation: Allows operation based on X-Y coordinates
- Compatibility with Visual AI Models: Suitable for integration with computer-vision-capable AI models
- When Accessibility Information is Insufficient: Operation on sites where accessibility is not considered
Vision mode is helpful for operations on special UI components or websites with insufficient accessibility information.
# Example of enabling Vision mode
npx @playwright/mcp --vision
Criteria for Selecting a Mode
You should decide which mode to choose based on the following criteria:
- Snapshot Mode (Recommended):
  - General website operations
  - When performance is important
  - When prioritizing stability
  - Normal form input or data extraction
- Vision Mode:
  - Sites lacking accessibility information
  - When visual confirmation is necessary
  - Operations on canvas or graphical elements
  - Operations requiring image recognition
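The criteria above can be sketched as a small decision function. The option names (`hasAccessibilityInfo`, `needsCanvas`, and so on) are illustrative assumptions of ours, not Playwright MCP settings; the point is only that Snapshot mode is the default and any one of the Vision conditions tips the choice.

```javascript
// Sketch of the selection criteria. All option names are hypothetical.
function chooseMode({
  hasAccessibilityInfo = true,
  needsVisualCheck = false,
  needsCanvas = false,
  needsImageRecognition = false,
} = {}) {
  const needsVision =
    !hasAccessibilityInfo || needsVisualCheck || needsCanvas || needsImageRecognition;
  return needsVision ? "vision" : "snapshot";
}

console.log(chooseMode());                      // ordinary form input
console.log(chooseMode({ needsCanvas: true })); // drawing on a canvas
```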

The two operating modes of Playwright MCP
Installation and Setup of Playwright MCP
From here, let's prepare to actually use Playwright MCP. I will explain it step-by-step so that even beginners won't get lost.
Required Environment
To use Playwright MCP, the following environment is required:
- Node.js: Playwright MCP operates in a Node.js environment
- Claude Desktop (Optional): If you want to integrate with AI
Installing Node.js
First, let's install Node.js. Download and install the LTS (Long Term Support) version from the official website.
- Access the Node.js Official Site
- Download the LTS version installer
- Run the installer and follow the instructions to install
Once the installation is complete, open a terminal (Command Prompt or PowerShell for Windows) and check the versions of Node.js and npm with the following commands:
node -v
npm -v
If the version numbers are displayed, the Node.js installation was successful.
Installing and Running Playwright MCP
Playwright MCP can be executed directly using the npx command. Permanent installation is not required.
Basic execution method:
npx @playwright/mcp
Running this command starts the Playwright MCP server. By default, it operates in Snapshot mode.
When running in Vision mode:
npx @playwright/mcp --vision
Other Options
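Playwright MCP accepts the following options, among others. Flag names can change between releases, so run npx @playwright/mcp --help to confirm what your installed version supports: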
- --port=<Port Number>: Start the server on a specific port
- --headless: Headless mode (do not display the browser window)
- --vision: Enable Vision mode
For example, to start the server in headless mode on port 8000:
npx @playwright/mcp --port=8000 --headless
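If you start the server from a script, it can help to assemble the arguments in one place. The following sketch builds the argument list from the three options listed above. `buildMcpArgs` is a hypothetical helper of ours, and it only knows the flags named in this article.

```javascript
// Build the argument list for `npx` from the options described above.
// buildMcpArgs is a hypothetical helper, not part of Playwright MCP.
function buildMcpArgs({ port, headless = false, vision = false } = {}) {
  const args = ["@playwright/mcp"];
  if (port !== undefined) {
    // Reject anything that is not a valid TCP port number.
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
      throw new RangeError(`Invalid port: ${port}`);
    }
    args.push(`--port=${port}`);
  }
  if (headless) args.push("--headless");
  if (vision) args.push("--vision");
  return args;
}

console.log(["npx", ...buildMcpArgs({ port: 8000, headless: true })].join(" "));
// prints "npx @playwright/mcp --port=8000 --headless"
```

The returned array has the same shape as the "args" entry used in the Claude Desktop configuration file.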
How to Integrate Claude Desktop with Playwright MCP
One of the attractions of Playwright MCP is the ease of integration with LLMs like Claude AI. Here, we will explain how to integrate it with Claude Desktop.
Installing and Configuring Claude Desktop
- Download and install the desktop app from the Claude Official Site.
- Register an account and log in.
- Enable Developer Mode.
  - Top-left menu → Help → Enable Developer Mode.

Enabling Developer Mode in Claude Desktop
Editing the Configuration File
Once you have enabled Developer Mode in Claude Desktop, the next step is to edit the configuration file:
- Open the "Developer" tab in Claude Desktop.
- Click "Edit Config" or "Get Started with Config."
- Open and edit claude_desktop_config.json.
Add the following content:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp"
      ]
    }
  }
}
If you use Vision mode, change the args as follows:
"args": [
  "@playwright/mcp",
  "--vision"
]
- Save the file and restart Claude Desktop.
Now, you are ready to use Playwright MCP from Claude Desktop.
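If your configuration file already lists other MCP servers, be careful not to overwrite them when adding the entry above. As a minimal sketch, the function below returns a copy of an existing config object with the "playwright" entry merged in. The function name `withPlaywrightServer` and the "filesystem" / "some-other-server" entries in the usage example are placeholders we made up for illustration.

```javascript
// Return a copy of a Claude Desktop config with a "playwright" MCP server
// entry added, preserving any servers that are already configured.
function withPlaywrightServer(config = {}, { vision = false } = {}) {
  const args = ["@playwright/mcp"];
  if (vision) args.push("--vision");
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      playwright: { command: "npx", args },
    },
  };
}

// "filesystem" and "some-other-server" are placeholder values.
const existing = {
  mcpServers: { filesystem: { command: "npx", args: ["some-other-server"] } },
};
const updated = withPlaywrightServer(existing, { vision: true });
console.log(JSON.stringify(updated, null, 2));
```

Because the function builds a new object instead of mutating its input, the original config stays untouched until you decide to write the result back to claude_desktop_config.json.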

Configuration for Claude Desktop and Playwright MCP integration
Basic Usage of Playwright MCP
Once the Playwright MCP setup is complete, let's actually try using it. Here, we introduce the basic operation methods.
Key Tools and Commands
Playwright MCP offers different tools depending on the mode.
Key Tools in Snapshot Mode
In Snapshot mode (default), the following main tools are available:
- browser_navigate: Navigation to a URL
- browser_snapshot: Retrieval of accessibility snapshots
- browser_click: Clicking an element
- browser_type: Text input
- browser_select_option: Selection from a dropdown menu
- browser_go_back: Go back to the previous page in the browser
- browser_go_forward: Go forward to the next page in the browser
- browser_take_screenshot: Retrieval of a screenshot
- browser_choose_file: File selection
- browser_save_as_pdf: Save as PDF
Key Tools in Vision Mode
In Vision mode, additional tools such as the following are available:
- browser_screenshot: Retrieval of a screenshot
- browser_move_mouse: Mouse movement by specifying coordinates
- browser_click: Clicking by specifying coordinates (behavior differs from Snapshot mode)
Basic Operation Examples
Navigation to a Website
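// Illustrative pseudocode: these are MCP tool calls issued by the client (for example, Claude), not JavaScript functions you import
// Example of accessing Google
await browser_navigate({ url: "https://www.google.com" })
// Retrieve an accessibility snapshot of the loaded page
const snapshot = await browser_snapshot()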
Clicking Elements and Input
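// Find and click the search box from the snapshot
// (snapshot.searchBox is a hypothetical accessor; use the ref reported in the snapshot)
await browser_click({
  element: "Search box",
  ref: snapshot.searchBox.ref
})
// Input text and submit it (submit: true)
await browser_type({
  element: "Search box",
  ref: snapshot.searchBox.ref,
  text: "Playwright MCP",
  submit: true
})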
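Where does the `ref` value come from? The snapshot lists each element together with a reference, and the client passes that reference back when it clicks or types. The sketch below parses a snapshot-like text to pull out those references. The text format shown (`- role "name" [ref=...]`) is an assumption made for this example, and the real output may differ between versions; `parseRefs` is a hypothetical helper.

```javascript
// Assumed snapshot excerpt; the real output format may differ by version.
const snapshotText = [
  '- document "Google"',
  '  - combobox "Search" [ref=e12]',
  '  - button "Google Search" [ref=e15]',
].join("\n");

// Extract { role, name, ref } for every line that carries a ref.
function parseRefs(text) {
  const pattern = /^\s*-\s+(\w+)\s+"([^"]*)".*\[ref=([^\]]+)\]/;
  const elements = [];
  for (const line of text.split("\n")) {
    const m = pattern.exec(line);
    if (m) elements.push({ role: m[1], name: m[2], ref: m[3] });
  }
  return elements;
}

const searchBox = parseRefs(snapshotText).find((e) => e.name === "Search");
console.log(searchBox);
```

When an LLM drives the browser, it reads the snapshot and picks the ref itself, so you rarely need to parse anything by hand.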
Example of Form Input
// Example of a login form (pseudocode; the snapshot.* accessors are hypothetical)
await browser_navigate({ url: "https://example.com/login" })
const snapshot = await browser_snapshot()
// Input username
await browser_click({
  element: "Username field",
  ref: snapshot.usernameField.ref
})
await browser_type({
  element: "Username field",
  ref: snapshot.usernameField.ref,
  text: "testuser",
  submit: false
})
// Input password
await browser_click({
  element: "Password field",
  ref: snapshot.passwordField.ref
})
await browser_type({
  element: "Password field",
  ref: snapshot.passwordField.ref,
  text: "password123",
  submit: false
})
// Click login button
await browser_click({
  element: "Login button",
  ref: snapshot.loginButton.ref
})
Example of Integration with Claude AI
When you integrate Claude Desktop with Playwright MCP, browser operations via natural language instructions become possible. For example:
Access Google, search for "Playwright MCP", and click on the first search result.
When you give an instruction like this, Claude AI actually operates the browser through Playwright MCP.

Image of browser operation by Claude AI
Practical Examples
Here, we will introduce more practical examples of using Playwright MCP.
Form Input Automation
For example, let's consider a case where you automate the task of entering the same data into multiple forms.
// Function for automatic form input
// (findFieldByLabel and findSubmitButton are hypothetical helpers you would write yourself)
async function fillForm(url, formData) {
  await browser_navigate({ url: url });
  const snapshot = await browser_snapshot();
  // Input each field in the form
  for (const [fieldName, value] of Object.entries(formData)) {
    const field = findFieldByLabel(snapshot, fieldName);
    if (field) {
      await browser_click({ element: fieldName, ref: field.ref });
      await browser_type({
        element: fieldName,
        ref: field.ref,
        text: value,
        submit: false
      });
    }
  }
  // Find and click the submit button
  const submitButton = findSubmitButton(snapshot);
  if (submitButton) {
    await browser_click({
      element: "Submit button",
      ref: submitButton.ref
    });
  }
}
// Usage example
const userData = {
  "Name": "Taro Yamada",
  "Email Address": "yamada@example.com",
  "Phone Number": "03-1234-5678",
  "Comment": "This is a test of Playwright MCP."
};
// Report failures instead of leaving the promise unhandled
fillForm("https://example.com/contact", userData).catch(console.error);
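The function above relies on `findFieldByLabel` and `findSubmitButton`, which are not defined in this article. One possible implementation is sketched below. It assumes the snapshot has been turned into a tree of `{ role, name, ref, children }` nodes, which is a simplification of ours; the role list and the "submit or send" pattern are also assumptions you would adapt to your own pages.

```javascript
// Walk a hypothetical snapshot tree ({ role, name, ref, children }) depth-first.
function* walk(node) {
  yield node;
  for (const child of node.children ?? []) yield* walk(child);
}

// Roles treated as text-entry fields in this sketch.
const INPUT_ROLES = new Set(["textbox", "combobox", "searchbox"]);

// Find an input whose accessible name equals the label (case-insensitive).
function findFieldByLabel(snapshot, label) {
  const wanted = label.trim().toLowerCase();
  for (const node of walk(snapshot)) {
    const name = (node.name ?? "").trim().toLowerCase();
    if (INPUT_ROLES.has(node.role) && name === wanted) return node;
  }
  return null;
}

// Find the first button whose name looks like a submit action.
function findSubmitButton(snapshot) {
  for (const node of walk(snapshot)) {
    if (node.role === "button" && /submit|send/i.test(node.name ?? "")) return node;
  }
  return null;
}

const sample = {
  role: "form", name: "Contact", ref: "e1",
  children: [
    { role: "textbox", name: "Name", ref: "e2" },
    { role: "textbox", name: "Email Address", ref: "e3" },
    { role: "button", name: "Submit", ref: "e4" },
  ],
};
console.log(findFieldByLabel(sample, "email address").ref); // prints "e3"
console.log(findSubmitButton(sample).ref);                  // prints "e4"
```

Both helpers return null when nothing matches, which is why the calling code checks the result before clicking.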
Data Extraction and Scraping
This is an example of extracting specific data from a web page.
// Example of scraping product information
// (the find* helpers are hypothetical and are assumed to return nodes with an innerText property)
async function scrapeProductInfo(url) {
  await browser_navigate({ url: url });
  const snapshot = await browser_snapshot();
  // Array to store product information
  const products = [];
  // Find the product list
  const productList = findProductList(snapshot);
  if (productList && productList.children) {
    // Extract information for each product
    for (const product of productList.children) {
      const name = findProductName(product);
      const price = findProductPrice(product);
      const rating = findProductRating(product);
      products.push({
        name: name ? name.innerText : "Unknown",
        price: price ? price.innerText : "Unknown",
        rating: rating ? rating.innerText : "Unknown"
      });
    }
  }
  return products;
}
// Usage example (top-level await requires an ES module)
const products = await scrapeProductInfo("https://example.com/products");
console.log(products);
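To show how the extraction step might look against concrete data, here is a small self-contained sketch. The tree shape, the sample products, and the text patterns used to recognize a price ("$" followed by a digit) and a rating ("out of 5") are all assumptions made for this example; a real site would need its own rules.

```javascript
// Hypothetical snapshot fragment: a list whose items each hold a heading
// (product name) and text nodes (price, rating).
const productList = {
  role: "list",
  name: "Products",
  children: [
    { role: "listitem", name: "", children: [
      { role: "heading", name: "Desk Lamp" },
      { role: "text", name: "$24.99" },
      { role: "text", name: "4.5 out of 5" },
    ] },
    { role: "listitem", name: "", children: [
      { role: "heading", name: "Notebook" },
      { role: "text", name: "$3.50" },
    ] },
  ],
};

// Turn each list item into a { name, price, rating } record,
// falling back to "Unknown" when a part is missing.
function extractProducts(list) {
  return (list.children ?? [])
    .filter((item) => item.role === "listitem")
    .map((item) => {
      const parts = item.children ?? [];
      const heading = parts.find((n) => n.role === "heading");
      const price = parts.find((n) => /^\$\d/.test(n.name ?? ""));
      const rating = parts.find((n) => /out of 5/.test(n.name ?? ""));
      return {
        name: heading ? heading.name : "Unknown",
        price: price ? price.name : "Unknown",
        rating: rating ? rating.name : "Unknown",
      };
    });
}

console.log(extractProducts(productList));
```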
Integration with AI Agents
Here is an example of integrating an LLM like Claude AI with Playwright MCP.
// Example of task processing by an AI agent
// (claudeAgent and findElementByDescription are hypothetical placeholders)
async function aiAssistedTask(taskDescription) {
  // Explain the task to the AI agent
  const agentResponse = await claudeAgent.process({
    task: taskDescription,
    tools: ["playwright-mcp"]
  });
  // Execute the operation steps generated by the AI
  for (const step of agentResponse.steps) {
    switch (step.action) {
      case "navigate":
        await browser_navigate({ url: step.url });
        break;
      case "click": {
        // Braces keep these declarations scoped to this case
        const snapshot = await browser_snapshot();
        const element = findElementByDescription(snapshot, step.elementDescription);
        if (element) {
          await browser_click({
            element: step.elementDescription,
            ref: element.ref
          });
        }
        break;
      }
      case "type":
        await browser_type({
          element: step.elementDescription,
          ref: step.elementRef,
          text: step.text,
          submit: step.submit || false
        });
        break;
      // Other operations...
    }
  }
  return await browser_snapshot();
}
// Usage example
const result = await aiAssistedTask(
  "Access Amazon, search for the latest iPhone, and retrieve the price of the first search result."
);
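The dispatch logic in the loop above can be separated from the browser so that it is easy to test. As a minimal sketch, the function below translates one agent step into a tool name plus arguments. The step fields (`action`, `url`, `elementDescription`, `elementRef`, `text`, `submit`) follow the example above and are assumptions about what an agent might produce, not a defined format.

```javascript
// Translate one hypothetical agent step into an MCP tool name plus arguments.
function stepToToolCall(step) {
  switch (step.action) {
    case "navigate":
      return { name: "browser_navigate", arguments: { url: step.url } };
    case "click":
      return {
        name: "browser_click",
        arguments: { element: step.elementDescription, ref: step.elementRef },
      };
    case "type":
      return {
        name: "browser_type",
        arguments: {
          element: step.elementDescription,
          ref: step.elementRef,
          text: step.text,
          submit: Boolean(step.submit),
        },
      };
    default:
      // Fail loudly instead of silently skipping an unknown step.
      throw new Error(`Unsupported action: ${step.action}`);
  }
}

const steps = [
  { action: "navigate", url: "https://example.com" },
  { action: "type", elementDescription: "Search box", elementRef: "e12", text: "iPhone", submit: true },
];
console.log(steps.map(stepToToolCall));
```

Throwing on unknown actions is a deliberate choice here: an agent that invents a step you did not plan for is something you want to notice early.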
Common Problems and Solutions
Here are some common problems you might encounter when using Playwright MCP, along with their solutions.
1. When Elements Cannot Be Found
Problem: There are cases where elements cannot be identified in Snapshot mode.
Solutions:
- Check if the element is correctly represented in the accessibility tree.
- Use a more specific description of the element.
- Switch to Vision mode to identify the element visually.
// Specifying elements in more detail
const snapshot = await browser_snapshot();
console.log(JSON.stringify(snapshot, null, 2)); // Check the structure of the snapshot
// Identify elements by combining multiple attributes
// (findElementByMultipleAttributes is a hypothetical helper)
const element = findElementByMultipleAttributes(snapshot, {
  role: "button",
  name: "Submit",
  // Other attributes...
});
2. Operations on Dynamically Changing Web Pages
Problem: On dynamically changing web pages, such as SPAs, elements may not be found.
Solutions:
- Re-acquire snapshots at the appropriate timing.
- Add logic to wait until the element is displayed.
// Function to wait until an element is displayed
async function waitForElement(description, maxAttempts = 10, interval = 500) {
  for (let i = 0; i < maxAttempts; i++) {
    const snapshot = await browser_snapshot();
    const element = findElementByDescription(snapshot, description);
    if (element) {
      return element;
    }
    // Wait for a certain period of time
    await new Promise(resolve => setTimeout(resolve, interval));
  }
  throw new Error(`Element "${description}" not found after ${maxAttempts} attempts`);
}
// Usage example
const loginButton = await waitForElement("Login button");
await browser_click({
  element: "Login button",
  ref: loginButton.ref
});
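The same polling idea can be written so that it runs without a browser, which makes it easier to try out. In the sketch below, the function that looks for the element (`probe`) and the wait function (`sleep`) are passed in as parameters. The names `pollUntil`, `probe`, and `fakeProbe` are our own, and the fake probe simply pretends the element appears on the third attempt.

```javascript
// Resolve after the given number of milliseconds.
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Poll `probe` until it returns a non-null value or attempts run out.
// `probe` and `sleep` are injected so the helper works without a browser.
async function pollUntil(probe, { maxAttempts = 10, interval = 500, sleep = defaultSleep } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const value = await probe(attempt);
    if (value != null) return value;
    // No need to wait after the final failed attempt.
    if (attempt < maxAttempts) await sleep(interval);
  }
  throw new Error(`Condition not met after ${maxAttempts} attempts`);
}

// Fake probe: the "element" shows up on the third check.
let calls = 0;
const fakeProbe = async () => (++calls < 3 ? null : { ref: "e7" });

pollUntil(fakeProbe, { interval: 10 }).then((element) => console.log(element.ref));
```

In real use, `probe` would take a fresh snapshot and search it, exactly as `waitForElement` does above.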
3. Operations in iframes or Shadow DOM
Problem: You may be unable to access elements inside iframes or the Shadow DOM.
Solutions:
- Adjust the snapshot acquisition options.
- Use special selectors.
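// The option names below are illustrative; confirm them against the browser_snapshot tool schema of your version
// Retrieve snapshot including iframes
const snapshotWithIframes = await browser_snapshot({ includeIframes: true });
// Retrieve snapshot including Shadow DOM
const snapshotWithShadowDom = await browser_snapshot({ includeShadowDOM: true });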
4. Performance Issues
Problem: Performance may degrade on large pages or with complex operations.
Solutions:
- Minimize the number of times snapshots are acquired.
- Perform only the necessary operations.
- Use headless mode.
# Execution in headless mode (shell command)
npx @playwright/mcp --headless
// Reuse of snapshots (pseudocode)
const snapshot = await browser_snapshot();
// Use the same snapshot for multiple operations while the page is unchanged
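One way to put "reuse snapshots" into practice is a small cache that fetches a new snapshot only after an operation that may have changed the page. This is a minimal sketch under our own assumptions: `createSnapshotCache` is a hypothetical helper, and the fetch function is a stand-in for whatever actually retrieves the snapshot.

```javascript
// Cache the latest snapshot and refetch only after invalidate() is called.
function createSnapshotCache(fetchSnapshot) {
  let cached = null;
  return {
    async get() {
      if (cached === null) cached = await fetchSnapshot();
      return cached;
    },
    // Call this after a click, navigation, or anything that alters the page.
    invalidate() {
      cached = null;
    },
  };
}

// Stand-in fetch function that counts how often it is called.
let fetchCount = 0;
const cache = createSnapshotCache(async () => ({ id: ++fetchCount }));

async function demo() {
  const first = await cache.get();
  const second = await cache.get(); // served from the cache
  cache.invalidate();               // e.g. after a click that changes the page
  const third = await cache.get();  // fetched again
  return { sameObject: first === second, thirdId: third.id, fetchCount };
}

demo().then(console.log);
```

The trade-off is freshness: a cached snapshot is only safe while the page is unchanged, so invalidate generously on dynamic pages.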
Summary - Possibilities and Future Prospects of Playwright MCP
In this article, we have covered everything from the basic concepts of Playwright MCP to its practical usage. Let's summarize the main points of Playwright MCP:
- Innovative Approach: Structure-based operation utilizing the accessibility tree.
- Two Operating Modes: Snapshot mode (default) and Vision mode.
- Easy Introduction: Can be started easily with just Node.js and the npx command.
- AI Integration: Advanced automation integrated with LLMs like Claude AI.
Playwright MCP is particularly effective in the following scenarios:
- Test Automation: E2E testing of web applications.
- Data Collection: Web scraping and information gathering.
- Business Automation: Automation of routine web operations.
- AI Agents: Autonomous task execution utilizing LLMs.
Future Prospects
While Playwright MCP is a relatively new tool, further development is expected:
- More Advanced AI Integration: AI understanding and executing more complex tasks.
- Improved Visual Understanding: Better accuracy in Vision mode.
- Addition of New Features: More diverse browser operations and control functions.
- Expansion of the Ecosystem: Integration with various LLMs and tools.
Final Thoughts
Playwright MCP is a tool that opens up a new era of browser automation. Its design, based on the premise of AI integration, hints at the future direction of web and test automation.
I encourage beginners to try out Playwright MCP while referring to this article. By adopting new technologies early, you will be able to build a more efficient and creative web development and testing environment.
I hope this article serves as a helpful reference for anyone interested in Playwright MCP. If you have any questions or feedback, please feel free to leave a comment!