Web testing is evolving with new tools that enhance reliability, user interactions, and accessibility. Let’s discuss a possible modern testing strategy for web interfaces that effectively prevents UI regressions.
About a year ago, Chromatic introduced a new feature for Storybook, allowing developers to write tests for interactions and verify how components behave in their stories (you can read more about it in their official blog post).
As very satisfied users of Storybook ourselves, we’ve been monitoring their updates, eager to refine our own product testing strategy. So today, I’d like to share a few insights from our experience testing web applications. Let’s break this down step by step.
No discussion on testing would be complete without mentioning the well-known Testing Pyramid.
In case you’re not familiar with it, the Testing Pyramid is a model that suggests how you should organize your automated tests: at the base, you have a lot of unit tests that check the atomic functionalities of your codebase, like functions or components. The middle layer has fewer integration tests, ensuring your components work well together. And finally, at the top, you’ve got even fewer, slower end-to-end (E2E) tests that ensure the whole system works from start to finish.
But does this model still work in today’s web development world?
While the general idea still holds up (prioritizing fast, high-coverage unit tests and using a smaller number of slower, more complex integration and E2E tests), traditional unit tests usually fall short in verifying a component’s visual aspect when rendered in a browser.
So, what should unit tests look like in the modern web development landscape?
With the increasing shift of business logic to the server for efficiency and security, frontend applications are becoming "dumber," meaning traditional unit tests may offer limited value. While it makes sense to extract complex UI logic (like intricate state machines or React hooks) and test it in the usual way by verifying input-output relationships, how do we ensure that these computations result in the correct visual output?
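For the first half of that story — extracting UI logic and testing it as plain input/output — a minimal sketch (the `paginationReducer` below is a hypothetical example, not from any real codebase; in a real project the assertions would live in a Jest or Vitest test file):

```javascript
// Hypothetical UI logic extracted from a component so it can be
// unit-tested as a pure input/output function, with no rendering involved.
function paginationReducer(state, action) {
  switch (action.type) {
    case 'next':
      return { ...state, page: Math.min(state.page + 1, state.totalPages) };
    case 'prev':
      return { ...state, page: Math.max(state.page - 1, 1) };
    default:
      return state;
  }
}

// Plain input/output checks: no DOM, no browser, just data in and data out.
console.log(paginationReducer({ page: 1, totalPages: 3 }, { type: 'next' }).page); // 2
console.log(paginationReducer({ page: 1, totalPages: 3 }, { type: 'prev' }).page); // 1
```

Testing logic this way stays fast and reliable, but it says nothing about what the user actually sees — which is exactly the gap the rest of this post addresses.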
Historically, the trend has been to increase frontend code coverage by introducing testing tools (such as Jest + React Testing Library) that generate and assert the HTML output of components, culminating in the introduction of so-called snapshot testing, which verifies the produced HTML matches a previously approved snapshot of it.
// A snapshot test implemented using Jest and react-test-renderer
import renderer from 'react-test-renderer';
import Link from '../Link';

it('renders correctly', () => {
  const tree = renderer
    .create(<Link page="http://www.facebook.com">Facebook</Link>)
    .toJSON();
  expect(tree).toMatchSnapshot();
});
However, these tools present at least two relevant drawbacks: snapshots are brittle, failing on every markup change (intended or not) and tempting developers to approve updated snapshots without careful review; and asserting on serialized HTML still doesn’t tell us how a component actually looks once rendered with its styles in a real browser.
So, what can we do?
A fast and practical approach to identifying visual regressions is Visual Testing.
This involves rendering a component (or piece of UI) in a real or simulated browser, capturing a screenshot, and visually comparing it against a previously approved baseline. If differences exceed a set threshold, the test fails, allowing developers to inspect changes visually.
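The diff-threshold idea at the heart of this approach can be sketched in a few lines of plain JavaScript (real tools like Chromatic or pixelmatch are far more sophisticated — they handle anti-aliasing, perceptual color distance, and so on — but the principle is the same):

```javascript
// Simplified model of screenshot diffing: compare two equally sized RGBA
// pixel buffers and report whether the fraction of differing pixels
// exceeds a configurable threshold.
function visualDiffExceeds(baseline, candidate, threshold = 0.01) {
  let differing = 0;
  const pixels = baseline.length / 4;
  for (let i = 0; i < baseline.length; i += 4) {
    // A pixel counts as different if any of its RGBA channels differs.
    for (let c = 0; c < 4; c++) {
      if (baseline[i + c] !== candidate[i + c]) {
        differing++;
        break;
      }
    }
  }
  return differing / pixels > threshold;
}

// Two 2x1 "images": identical except for one pixel, i.e. 50% difference.
const a = Uint8Array.from([0, 0, 0, 255, 255, 255, 255, 255]);
const b = Uint8Array.from([0, 0, 0, 255, 0, 0, 0, 255]);
console.log(visualDiffExceeds(a, b, 0.01)); // true  -> test fails
console.log(visualDiffExceeds(a, a, 0.01)); // false -> test passes
```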
If you’re already using Storybook for showcasing components, for example, you can easily integrate Chromatic, an online service that automatically deploys Storybook, captures screenshots of all stories, and flags visual differences for review.
With this method, you can easily catch unintended UI changes in components with minimal setup.
Combined with a well-structured component architecture that separates UI from data fetching, this method can extend beyond atomic components to entire application sections in different states.
We found this approach to be incredibly simple yet highly effective. Unlike traditional testing methods that require writing complex test scenarios, Visual Testing can be implemented almost immediately, making it a practical solution for teams looking to enhance UI reliability without significant overhead.
By relying only on this type of test, however, we’re not verifying how components behave in response to user interactions or events. This is where the new Storybook Test comes in.
As mentioned earlier, Storybook has recently introduced built-in testing utilities, complementing Chromatic’s visual testing.
With this extension, developers can enhance their Storybook stories with interactions and assertions, increasing test coverage by validating not only static UI elements but also the effects of user interactions.
Example Story with Testing Assertions:
// Example of a Storybook test-integrated story (CSF 3)
import { expect, fn, userEvent, within } from '@storybook/test';
import { Button } from './Button';

export default { component: Button };

export const Default = {
  args: { onClick: fn(), children: 'Click me' }, // fn() creates a spy
  play: async ({ args, canvasElement }) => {
    const canvas = within(canvasElement);
    await userEvent.click(canvas.getByText('Click me'));
    await expect(args.onClick).toHaveBeenCalled();
  },
};
These so-called Component Tests are a good compromise between traditional unit tests and E2E tests. They still test a single component in isolation (like traditional unit tests), but they do so by rendering it in a real browser environment, with real interactions, instead of relying on simulated Node renderers.
In this way, they ensure we’re testing not only the visual aspect but also that the behavior of the components aligns with the specifications.
With the new European accessibility regulations, ensuring your web app meets accessibility standards has become more critical than ever.
While no automated tool can fully replace manual accessibility testing, several tools can provide a solid initial assessment and are a great and easy addition to any testing pipeline.
Storybook’s testing add-on now includes automatic accessibility validation, flagging issues within rendered components. But remember, accessibility also depends on how components are assembled into full pages—things like focus order, navigation flow, and screen reader usability.
For full accessibility compliance, additional tools such as Axe, Lighthouse, or manual keyboard navigation tests should be considered. In a previous blog post, we explored various solutions that can help ensure your project meets accessibility standards. You can find detailed insights and practical implementations here.
In web development, E2E testing builds upon unit and integration testing by simulating real user interactions in an actual browser (headless or not), ensuring that a web application behaves as expected from start to finish.
E2E tests, especially when written with best practices like Page Objects or custom assertions, can also be a valuable documentation resource for expected behaviors, thanks to the ease with which even less technical people can read and understand the tested interaction and the expected behavior. In a recent project, we experimented with combining E2E testing with Behavior-Driven Development (BDD), to better describe user interactions and expected results in a way that was both structured and accessible, further enhancing test clarity and maintainability.
// Example of an E2E test in Playwright to test an invalid user login.
// loginForm is a Page Object instance (created e.g. in a fixture or
// beforeEach) that wraps the locators and actions of the login page.
test(`Given the user enters incorrect credentials,
      When they submit the form
      Then an error message should be displayed`, async () => {
  await loginForm.enterUsername('wrongUser');
  await loginForm.enterPassword('wrongPassword');
  await loginForm.submit();
  await loginForm.hasErrorMessage('Invalid username or password');
});
Unlike older tools like Selenium or Puppeteer, modern frameworks like Playwright and Cypress improve the reliability of E2E tests by including built-in retry mechanisms in every assertion. For example, testing UI elements that appear after user interactions often leads to flaky tests due to delays in backend responses or slow rendering times, requiring tweaking timeouts to wait for the elements to appear. Playwright and Cypress address this by automatically retrying assertions until elements are present and interactive (or a timeout expires), significantly improving test reliability without impacting readability.
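The auto-retry idea can be sketched as a small helper in plain JavaScript — a simplified model of what these frameworks do internally (the names and timings below are illustrative, not the actual Playwright or Cypress implementation):

```javascript
// Keep re-evaluating an assertion until it passes or a timeout expires,
// instead of failing on the first attempt or sleeping a fixed amount.
async function expectEventually(check, { timeout = 2000, interval = 50 } = {}) {
  const deadline = Date.now() + timeout;
  let lastError;
  for (;;) {
    try {
      return check(); // success: return the checked value
    } catch (err) {
      lastError = err; // remember the failure and retry
    }
    if (Date.now() > deadline) throw lastError;
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
}

// Simulate an element that "appears" 200 ms after a user interaction.
let element = null;
setTimeout(() => { element = 'error-message'; }, 200);

expectEventually(() => {
  if (element === null) throw new Error('element not visible yet');
  return element;
}).then((el) => console.log(el)); // prints "error-message"
```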
Another game changer is their ability to inspect the application’s visual state before and after each test step, making debugging easier. This pairs with the ability to record the browser’s screen during test execution, letting developers observe the application’s behavior in case of visual failures.
Both Playwright and Cypress also support screenshot-based visual testing, which allows the implementation of visual diffing tests on the whole page.
Finally, thanks to their ability to interact with the browser, they can intercept every HTTP request the application makes (the ones we typically debug from the Network tab in Chrome DevTools, for instance) and provide a mocked response to them. This is very useful if we want to complement slower E2E tests with faster “UI tests” that only test the behavior of the UI, eliminating the latency of underlying requests.
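In Playwright, this kind of interception is done with `page.route` and `route.fulfill`. A sketch (the `/orders` page and the `/api/orders` endpoint are hypothetical, and the test assumes an app is being served):

```javascript
// Sketch of a "UI test": the real backend call is intercepted and
// fulfilled with a canned response, so the test only exercises the UI.
const { test, expect } = require('@playwright/test');

test('shows orders from a mocked API', async ({ page }) => {
  await page.route('**/api/orders', (route) =>
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify([{ id: 1, title: 'First order' }]),
    })
  );
  await page.goto('/orders');
  await expect(page.getByText('First order')).toBeVisible();
});
```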
A common issue in E2E testing is handling data fetched from third-party APIs and databases.
If tests rely on a pre-deployed test environment, it may be impossible to control API responses for specific scenarios: the environment is typically configured to point to real or mocked services with predefined seeded data. All test cases must then rely on preset responses, making it difficult for individual tests to modify service behavior dynamically.
One possible solution we found compelling is shifting the control of the environment under test to the tests themselves.
If the test runner is responsible for spinning up the tested environment (either directly or using tools like Testcontainers), tests can also launch additional mock servers (e.g., an Express.js server) before starting the test environment. The environment can then be configured through environment variables to point to these mock servers instead of real external APIs. This approach allows tests to control the mock servers directly, dynamically modifying endpoints to return the responses needed for each scenario. As a result, test execution becomes more flexible, independent from external dependencies, and highly reliable.
The future of web development is increasingly AI-driven, which, of course, is also impacting the world of testing.
Cypress, for example, is developing several AI features to integrate into its product, Cypress Studio. These features will assist with writing tests by identifying untested user interactions and suggesting assertions based on observing the page’s behavior after users interact with it.
Stagehand presents itself as Playwright’s evolution, allowing testers to express user interactions with the browser through natural-language commands (e.g., “click on the order button”).
Other tools, like Mabl and Testim, use machine learning to automatically adapt tests to application changes, reducing flakiness caused by these shifts in the product’s behavior. Applitools is enhancing visual diffing using AI to determine if the differences detected in screenshots are truly relevant.
AI can also optimize test execution times by prioritizing the most critical tests based on, for example, the latest changes in the codebase. Some CI/CD platforms are already experimenting with these approaches to speed up pipelines without sacrificing coverage.
We will closely follow developments in this field to understand how AI will change how we verify our applications.
The world of web interface testing is evolving rapidly. New tools and methodologies are making it easier, faster, and more reliable to test web applications.
However, dedicating sufficient time to properly verifying implemented software remains a challenge. From our experience, teams too often either skip testing altogether or invest too little time in validating their interfaces.
Defining a lean and effective testing strategy from the early stages of a project is essential to avoid neglecting this critical aspect during delivery.
In this context, new tools such as Visual Testing, modern E2E testing frameworks, and AI-powered automation can be incredibly helpful in streamlining this phase.
What do you think about the future of web testing? Do you already have a defined strategy, or have you started using these new tools and methods? Let’s chat!
Vincenzo joined Buildo as a Fullstack engineer in 2016, becoming the Frontend tech lead of the company. Passionate about frontend and backend technologies and music.