Cursor Playwright: Is This the Best AI IDE for Test Automation?
In today’s article, I’ll shed light on an integration that’s becoming increasingly popular in the world of test automation: Playwright with Cursor. As you probably know, I have been using Playwright almost daily for many years, and many of my previous articles focus on the tool. I’ve also written about integrations with it, such as Visual Regression Tracker – a useful open-source tool that extends Playwright’s default functionality – and How to Integrate Lighthouse with Playwright for Web Performance Testing (2025 Guide). If you haven’t used Playwright yet, I recommend checking those articles.
Playwright is an exceptional tool, but it’s also important to consider how we use it. The default IDE for working with Playwright and TypeScript is Visual Studio Code.
However, with the growing popularity of large language models (LLMs) and the rise of AI-powered tools — often surrounded by marketing hype and buzzwords — new IDEs have appeared on the market. One of these tools is Cursor.
What is Cursor?
Cursor IDE has been gaining popularity for some time now. Personally, I’ve been using the tool for a year and have seen real benefits from it. It’s built on the foundation of Visual Studio Code, so any extensions that work in Visual Studio Code will also work in Cursor. An interesting fact is that Cursor is developed by a team of just 30 people – yet they reportedly generate around $300 million annually from subscriptions. In this article, I’m going to show how Cursor works with Playwright.
What does the IDE let us do?
Cursor allows us to integrate with large language models (LLMs) and use them natively within the tool. Thanks to this integration, we can use an LLM to solve a problem or add code directly – without leaving the IDE.
How to install Cursor
Cursor is available for Windows, Mac, and Linux. In my current setup I mainly use macOS, so I will focus on that configuration.
https://www.cursor.com/downloads
What can we achieve by leveraging Cursor?
From my observations, I’ve identified several interesting ways to use Cursor and LLMs. They work particularly well with code-related tasks like parsing between formats (e.g., JSON and XML) or generating page objects. LLMs perform best when trained on, or provided with, large amounts of validated data. Another example is migrating a test automation solution, for instance converting tests written in Selenium with C# to Cypress. Most LLMs handle this task well, and features available in Cursor – such as the ability to read all project files – can significantly simplify the work.
What else can we get by leveraging an IDE like Cursor?
Another example is creating a parser to load settings from a YAML or JSON file, enabling a convenient way to manage configuration from within the code. In this case, LLMs perform well. It’s tricky to provide exact numbers without proper research, but I estimate that real time savings can range from 10% to 30%. Of course, results may vary – sometimes better, sometimes worse – especially when we need to spend time fine-tuning prompts or dealing with hallucinations.
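To make this concrete, here is a minimal sketch of the kind of settings parser an LLM can generate in seconds. The shape of the configuration (baseUrl, timeoutMs, retries) is a hypothetical example, not from any specific project:

```typescript
// Hypothetical configuration shape used for illustration.
interface AppConfig {
  baseUrl: string;
  timeoutMs: number;
  retries: number;
}

// Parse a JSON settings string, applying defaults for missing fields.
function loadConfig(json: string): AppConfig {
  const raw = JSON.parse(json);
  return {
    baseUrl: raw.baseUrl ?? 'http://localhost:3000',
    timeoutMs: raw.timeoutMs ?? 30_000,
    retries: raw.retries ?? 0,
  };
}

const config = loadConfig('{"baseUrl": "https://example.com", "retries": 2}');
console.log(config.baseUrl);   // https://example.com
console.log(config.timeoutMs); // 30000 – the default was applied
```

In a real project the JSON string would come from reading a file, and the defaults would follow your own conventions – the point is that boilerplate like this is exactly where an LLM saves typing.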
Are IDEs like Cursor, or LLMs themselves, a real threat to our future?
Honestly, I don’t know what the future holds – none of us knows the limits of this technology (or its next iterations, and what might be possible with these tools in the future). From my perspective, these tools will support us, but we still need to build a solid foundation of knowledge. From my experience as a technical assessor, I haven’t seen a major shift – even though we’ve had IntelliSense, Stack Overflow, and Google for a long time, we still need to work on our core technical knowledge. It’s still rare to find projects with properly set up test automation across all layers.
Additionally, we still need a solid foundation of knowledge to use these tools effectively and save time. So yes, these tools will likely help us become more productive – but we still don’t know by how much.
How to use Cursor for code reviews?
One of the most interesting use cases for LLMs during coding is performing code reviews — for example, using the Claude model.
In the prompt, we can instruct the LLM to act like a developer with ten years of experience or ask it to evaluate whether our code follows SOLID principles.
Outside of Cursor, using AI-based tools for code reviews can also be convenient when working directly in the web interface of a repository provider – for example with Copilot, Qodo, or similar tools. Personally, I’ve used both Copilot and Qodo, and I must admit these tools are quite helpful – at least as an additional way to validate whether everything looks correct.
For this task, I used code I had created for a previous article. It’s a file with two automated tests for the Trello app.
Defining a prompt
I’ve defined a prompt, shown below.
It’s a simple prompt that performs a code review of the file.
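For illustration, a review prompt of this kind might look like the following – the wording here is my example, not an exact template:

```
Act as a senior developer with ten years of experience.
Review the attached Playwright test file: check whether it follows
SOLID principles, point out duplicated test data, and suggest
improvements to structure and formatting. Do not rewrite the whole
file – list concrete suggestions instead.
```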
Cursor suggested a few useful improvements to my code. Obviously, we don’t always need to accept every suggested comment – only those that fit the code. In this case, I accepted suggestions to use const values for my test data to improve structure, and to add spacing for better code formatting. Note that we still need to stay focused and carefully consider whether each change is necessary.
I also had different code, which I prepared for a presentation in Tallinn about using Playwright & AI, where I showed a real demo of Copilot and Qodo. Here is the video of that talk:
Cursor – Generating page objects based on provided HTML
A smart application of LLMs in your daily work is generating page object classes based on HTML code.
In this example, I’m using a demo page that I often show during my Playwright workshops. I’ll copy the full HTML code for a login form, but for more complex pages, it’s worth limiting what we provide to the model with the prompt:
Generate a page object based on this HTML code using Playwright
There are a few reasons to do this:
First of all, context size refers to the maximum length of the message we send as a prompt. This can be challenging, especially with more complex subpages or advanced UI elements like dynamic data tables and other complex UI components. In such cases, instead of loading the full HTML code, it’s better to provide only the HTML of the specific section or control needed for the test.
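For a simple login form, the generated result usually looks something like the sketch below. The selectors (`#username`, `#password`, the submit button) are assumptions based on a typical login form – yours will depend on the HTML you paste. In a real project `Page` and `Locator` would be imported from `@playwright/test`; here minimal structural stubs keep the example self-contained:

```typescript
// Minimal stand-ins for Playwright's Locator and Page types,
// so the sketch compiles without the real dependency.
interface Locator {
  fill(value: string): Promise<void>;
  click(): Promise<void>;
}
interface Page {
  locator(selector: string): Locator;
}

// A page object of the kind an LLM typically generates from login-form HTML.
class LoginPage {
  readonly usernameInput: Locator;
  readonly passwordInput: Locator;
  readonly submitButton: Locator;

  constructor(page: Page) {
    // Selectors are example assumptions, not taken from a real page.
    this.usernameInput = page.locator('#username');
    this.passwordInput = page.locator('#password');
    this.submitButton = page.locator('button[type="submit"]');
  }

  async login(username: string, password: string): Promise<void> {
    await this.usernameInput.fill(username);
    await this.passwordInput.fill(password);
    await this.submitButton.click();
  }
}
```

Even when the LLM gets the structure right, it’s worth double-checking the selectors against the real DOM before trusting the class in a test.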
How to use Cursor for Rest API Testing?
Another idea for using Cursor in automated tests is API testing, which can be done in Playwright (though the technology may vary). One way to use LLMs here is with a Swagger JSON file, which – if available for a particular API – contains the full definition of all API methods. Based on the swagger.json file, we can easily generate methods to use in automated tests. Here too there may be issues related to context size. To mitigate this, we can include in the prompt only the method we’re interested in, which helps speed up the process and reduce potential hallucinations.
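As a hypothetical sketch, here is the kind of helper an LLM can emit for a single operation taken from swagger.json (a GET /users/{id} endpoint). The base URL, endpoint, and parameter names are invented for illustration:

```typescript
// Simple description of an HTTP call, built from one swagger.json operation.
interface RequestSpec {
  method: 'GET' | 'POST';
  url: string;
}

// Example base URL – an assumption, not a real API.
const BASE_URL = 'https://api.example.com';

// One generated helper per operation; in a Playwright test the spec
// would be passed to request.get(spec.url) or similar.
function getUserById(id: number): RequestSpec {
  return { method: 'GET', url: `${BASE_URL}/users/${id}` };
}

const spec = getUserById(42);
console.log(spec.url); // https://api.example.com/users/42
```

Generating one such helper per method you actually need – rather than the whole API surface at once – is exactly the context-limiting trick described above.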
Playwright CodeGen & Cursor
One of the most interesting ways to use Cursor with Playwright is during test prototyping, where we can combine Playwright CodeGen and Cursor. Playwright CodeGen lets us create simple tests using a recorder that captures all our actions and then generates a test. The generated code is rather weak quality, but after passing it through Cursor and an LLM with a few prompts, it can become much better – almost production quality. I want to emphasize that this integration is mainly useful for simple test cases.
What do we need to do?
First, we open the Playwright extension, which we should have installed in Cursor. After clicking the “record” button, a browser appears and all the actions we perform are recorded. At the end, we can even add an assertion to the test.
Next, we enter the address of the page into the browser’s address bar. In my case, it’s a page containing a login form. We then perform all the actions one by one, and afterwards we can see the generated test code.
After generating code, I used a prompt to improve the code. Now the code looks like this:
The code is significantly better than the default generated code. Obviously, this tool has a lot of limitations, but it’s worth checking out – and when we need to create a simple demo, Playwright CodeGen can be useful.
Cursor & MCP
MCP is a new protocol introduced at the end of 2024. It allows us to extend the standard use of LLMs through integration with external tools. Many tools have already started integrating with it:
I recommend visiting this GitHub repository, which contains a list of currently supported tools:
https://github.com/punkpeye/awesome-mcp-servers
I’ll publish an article about using MCP in the coming days. It’s a very interesting topic, and I’m preparing a few examples for you.
How to limit context for LLM in Cursor?
This functionality is similar to .gitignore: it lets us ignore files that we don’t want to include while working in Cursor. There are two main reasons why it’s worth using.
First, security – locally stored keys or settings files should be especially well protected. In the Cursor documentation we can find an important caveat:
“Cursor makes its best effort to block access to ignored files, but due to unpredictable LLM behavior, we cannot guarantee these files will never be exposed.”
https://docs.cursor.com/context/ignore-files
So, if you work with very sensitive data, it’s worth considering how you use it. In this case we can add an additional layer of security and use Cursor only for less restricted repositories and, if necessary, copy changes manually to the main project.
The second reason is performance. In the case of large and complex repositories, it’s worth excluding unnecessary folders or projects that aren’t relevant to the current task. Thanks to that, the model’s context will be more precise, and searching or analyzing the code will be faster and less resource-intensive.
Additionally, by default Cursor ignores not only the files listed in .cursorignore but also those in .gitignore.
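As an illustration, a .cursorignore for a Playwright project might look like this – the entries are examples to adapt to your own repository:

```
# Secrets and local settings
.env
*.pem
config/local.settings.json

# Large folders that only slow indexing down
node_modules/
test-results/
playwright-report/
```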
Cons of using Cursor?
Cursor can sometimes crash and show unspecified errors. It happens rather rarely, but there have been weeks when it happened a lot. Apart from that, there is an issue that isn’t always Cursor’s fault but rather a limitation of LLMs: we can come across hallucinations – solutions that sound reasonable but don’t work. This can lead to wasted time debugging issues that, in reality, don’t make sense.
Hallucinations and context saturation
One of the most significant potential risks is hallucination. LLMs, as well as tools like Cursor (which use them under the hood), can sometimes suggest methods or solutions that simply… don’t exist. For example, we might get a suggestion to use a method from a framework even though that method doesn’t actually exist. This can lead to a situation where it’s easier to write the solution from scratch.
A note for huge fans of AI-based tools
One of the disadvantages I’ve noticed, especially for less experienced people, is that this type of tool can dull our own skills. There is a risk that we start learning bad habits or produce code of average quality.
Why?
- Instead of thinking on our own, we often get hints as a ready-made solution without trying to understand it. Of course, if we make the effort to understand, these tools can help us learn as well.
- We can forget about verifying the quality of the output and its compliance with our project.
- It’s easy to grow accustomed to „autopilot” mode and stop developing our technical skills.
Alternatives?
There are other IDEs that offer similar capabilities to Cursor. For example, Visual Studio Code has supported integration with GitHub Copilot for quite some time. Another example is Windsurf, which was reportedly acquired by OpenAI for $3 billion. Personally, I think Cursor is currently the best IDE for TypeScript, JavaScript, and Playwright-related projects, but it’s worth following the market, as things can change in the future. Another interesting IDE is Aqua, which also includes integration with LLMs.
Pricing?
Currently, Cursor offers three pricing plans. Cursor can be used for free, but this version has limitations, such as a cap on how many requests can be made to the LLM. The Pro version includes a higher request limit, while the Business plan is dedicated to enterprise customers.
Which working modes are available in Cursor?
Currently, Cursor supports a few modes:
1. Agent mode – full-autonomy
The default chat mode. Cursor’s agent can read your whole repo, plan multi-step changes, edit files, run terminal commands, and iterate until tests pass. Great for “add a dark-mode toggle to my React app”-type tasks.
Switch: open the chat and pick Agent in the mode picker (or use the keyboard shortcut you assigned).
2. Ask mode – read-only exploration
A “search / Q&A” mode: the AI can browse your code, docs, and the web but will never write to files. Perfect for “what does this function do?” questions; we can also use it on a downloaded repository to learn how something works.
3. Manual mode – surgical edits
When you already know what to change and where, Manual obeys exact instructions and only touches the files you mention. No autonomous searching, no terminal. Think of it as an AI pair-programmer that does the keystrokes for you.
4. Cursor Tab – inline autocomplete
Separate from the chat: press Tab to accept multi-line, context-aware code completions (it can even delete or rearrange code, not just insert). Cursor’s Tab is its answer to Copilot.
Auto Import TypeScript & Python
This setting enables automatic imports of TypeScript modules, and the same exists for Python. Personally, I mainly use Cursor with TypeScript, so this setting is very useful for me. The corresponding setting for Python is available but still in beta.
AI Commit message
When we use this function, Cursor generates a commit message based on the files staged for the commit. We need to remember that our project and team may have rules defining what a commit message should contain. Unfortunately, according to the documentation, we currently have no direct influence on how the commit message should look; Cursor relies on the commit history.
“Currently, there isn’t a way to customize or provide specific instructions for how commit messages should be generated. Cursor will automatically adapt to your existing commit message style.”
https://docs.cursor.com/more/ai-commit-message
.cursorrules
Rules are written in MDC format (Markdown with front matter). Here we can define that Cursor should modify our tests using a specific structure, or that a page object must follow a particular implementation pattern. Rules can also be useful for determining how many lines a file should have, or how a commit message should begin according to our project conventions. One more idea is to use them as an additional safeguard to prevent committing tokens to any sensitive resource.
How? We can define a rule that automatically replaces any real token with a default fake value. Overall, rules seem like an interesting concept – at the very least, they’re worth exploring.
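To give a feel for the format, a rule file might look like the sketch below. The front-matter keys follow the MDC convention; the glob pattern and the conventions themselves are example assumptions, not taken from a real project:

```
---
description: Page object conventions for Playwright tests
globs: ["tests/pages/**/*.ts"]
alwaysApply: false
---

- Every page object exposes locators as readonly fields initialized in the constructor.
- Test data lives in named constants, never in inline string literals.
- Never commit real tokens; replace any token value with the placeholder FAKE_TOKEN.
```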
Size of context?
One important thing to note is the size of the context – that is, how many tokens the tool can use. According to the documentation, Cursor supports 10,000 tokens by default. With a bigger context, the model can propose better responses for the code in our project, because there is less chance of a hallucination. On the other hand, if we have a big project, or when we do a lot of follow-ups within the current context window, we can get more hallucinations as well.
Is it possible to use the same prompt across different models?
Yes, it’s possible – we can use the same prompts with different models. We still need to check each model’s response to be confident it’s a proper solution. The Cursor documentation offers a few hints:
From my personal perspective, I admit that Claude copes best with code. Keep in mind that for certain problems it’s useful to look for a dedicated model trained on the necessary data.
https://docs.cursor.com/guides/selecting-models
Security Rules
It’s worth remembering that if we want to use Cursor in a real project, we must have approval from management to do so. We also have mechanisms to exclude files from analysis (even though giving the model more context can help with its correctness).
Summary
Today, I showed you how the Cursor IDE can be leveraged with Playwright. It’s a useful IDE for people who create automated tests, but also for anyone who wants to use LLMs in their solutions. Please note that both LLMs and Cursor have their limitations. They don’t always add value, so we still need to keep improving our test automation and programming skills.