Top Tips for Selenium Experts
Once you've been using Selenium for a while and are comfortable writing test cases, you can focus on techniques and design principles to get your UI test automation to the next level. Check out these techniques and practices prepared for Selenium users.
Before We Begin
This article assumes you have been using Selenium and are comfortable writing test cases. You’ve already gone through the trial of inspecting DOMs to create your XPaths. Maybe you are using the Page Object Model. By now you are probably pretty good at looking up solutions on the internet. If you want some help in that department, I highly recommend this article from my colleague with some great Selenium hacks.
What follows is broken down into techniques and design patterns. You may already know some of these. That’s fine; just skip to the sections that interest you. I’ll be using Java here, but the techniques and practices should be generally applicable.
7 Selenium Tips for Expert Testers
1. Better Web Element Locators
Bad locators cause spurious test failures. They waste time and distract from the intended function of the test. Bad locators typically rely on some unstable quality of the web page, whether that’s a dynamic value or the element’s position on the page. When those qualities inevitably change, the locator breaks. Good locators just work. They correctly identify their intended element and allow the test to do its job.
The key here is to identify the qualities of the element that are stable, then choose the minimum subset of these qualities that uniquely identifies that element to reduce your exposure to change. We all know absolute XPaths are bad, but it’s good to understand exactly why.
/html/body/div[25]/div[2]/div/span/span[2]/div/h2/p
Take one subsection of this XPath: div[25]/div[2]. This represents the following qualities:
- The node is somewhere inside a div
- That div is the second div child of its parent
- Its parent is also a div
- That parent div is the 25th div child of its own parent
In this small subsection we have at least 4 qualities used to identify an element. If any of them change, our locator breaks. We have no reason to believe they will not change because none of them actually describe the element. Looking at this XPath, we couldn’t even guess the purpose of the element.
Locators that reflect the purpose of the element are less likely to break. While the position or appearance of the element might change, its function should not. Ideally, you can guess what the element does by looking at its locator.
//p[contains(@class, "content")][contains(., "Guidance for Success")]
With the above, it’s clear the element represents content that describes “Guidance for Success.” Notice the class attribute, while usually used to control the appearance, also describes the element’s function.
Often the unique, stable qualities of an element are not found in the element itself. Instead, you will need to go to some relative of the element whose function is well described in the DOM. Take the “Guidance for Success” element from the example above. While that locator is good, copy text often changes, and matching on it may not be an option if the site supports multiple languages. Searching the DOM, we might find that the parent div has a descriptive id. In that case, we might prefer a locator like:
//div[@id="success_guide"]//p
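To see the difference in practice, here is a small sketch that runs a positional XPath and the id-based XPath against two versions of the same markup. Everything in it is an assumption made for illustration: the markup is invented, the positional XPath is shortened to fit it, and the JDK's built-in XPath engine stands in for the browser so the example runs without Selenium.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class LocatorStabilityDemo {
    // Hypothetical page markup before and after a layout change: a banner div
    // is added and the guide is wrapped in a section.
    static final String BEFORE =
        "<html><body><div><div id='success_guide'><p>Guidance for Success</p></div></div></body></html>";
    static final String AFTER =
        "<html><body><div>Banner</div><div><section><div id='success_guide'>"
        + "<p>Guidance for Success</p></div></section></div></body></html>";

    static final String POSITIONAL = "/html/body/div[1]/div[1]/p";
    static final String DESCRIPTIVE = "//div[@id='success_guide']//p";

    // Returns the text of the first matching node, or "" when nothing matches.
    static String locate(String xpath, String markup) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, new InputSource(new StringReader(markup)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("positional,  before: [" + locate(POSITIONAL, BEFORE) + "]");
        System.out.println("positional,  after:  [" + locate(POSITIONAL, AFTER) + "]");
        System.out.println("descriptive, before: [" + locate(DESCRIPTIVE, BEFORE) + "]");
        System.out.println("descriptive, after:  [" + locate(DESCRIPTIVE, AFTER) + "]");
    }
}
```

The positional locator finds the paragraph only in the first layout and comes back empty after the change, while the id-based locator finds it in both.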
2. Explicit Waits
The temptation here is to set an implicit wait and hope that will handle most issues. The problem is this broad approach treats all situations the same without even addressing most problems associated with waiting. What if the element becomes present after a few seconds, but is not yet ready to be clicked? What if the element is present, but is obscured by an overlay? The solution is to use well-crafted explicit waits.
Explicit wait conditions take the form of:
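// Selenium 4 takes a java.time.Duration; Selenium 3 accepted the timeout in seconds as a long.
WebDriverWait wait = new WebDriverWait(webdriver, Duration.ofSeconds(timeOutInSeconds));
wait.until(/* some condition is true */);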
Explicit waits are powerful because they are descriptive. They allow you to state the necessary conditions in order to proceed. A common example of this is when the test needs to click an element. It is insufficient for the element to just be present; it needs to be visible and enabled. Explicit waits allow you to describe these requirements directly in the test. We can be confident the conditions have been met before the test attempts to continue, making your tests more stable:
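// Duration-based constructor (Selenium 4).
WebDriverWait wait = new WebDriverWait(webdriver, Duration.ofSeconds(timeOutInSeconds));
// Proceed only once the element is visible and enabled.
wait.until(ExpectedConditions.elementToBeClickable(element));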
Descriptive explicit waits also let you write tests that focus on the behavioral aspects of the application. Because the application behavior does not change often, this allows you to create more stable tests. Given the example above, it’s necessary for an element to be visible and enabled to be clicked, but it also needs to not be obscured by another element. If there is a large “loading” overlay covering your element, the click will fail. We can create an explicit wait that describes this loading behavior and waits for the necessary conditions to be met:
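// Duration-based constructor (Selenium 4).
WebDriverWait wait = new WebDriverWait(webdriver, Duration.ofSeconds(timeOutInSeconds));
// Proceed only once the loading overlay no longer covers the page.
wait.until(ExpectedConditions.invisibilityOf(loadingOverlay));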
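If it helps to picture what an explicit wait does, the rough idea is: check a named condition repeatedly until it holds or a timeout runs out. The sketch below shows that idea with only the JDK. PollingWait is a hypothetical class written for this illustration, not Selenium's implementation, and the "overlay" is simulated with a counter. In real tests you would keep using WebDriverWait.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for an explicit wait; real tests would use WebDriverWait.
class PollingWait {
    private final Duration timeout;
    private final Duration pollInterval;

    PollingWait(Duration timeout, Duration pollInterval) {
        this.timeout = timeout;
        this.pollInterval = pollInterval;
    }

    // Re-checks the condition until it is true; fails once the timeout has elapsed.
    void until(String description, BooleanSupplier condition) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return;
            }
            if (System.nanoTime() >= deadline) {
                throw new IllegalStateException("Timed out waiting for: " + description);
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for: " + description, e);
            }
        }
    }
}

class PollingWaitDemo {
    public static void main(String[] args) {
        // Simulated loading overlay that disappears on the third check.
        AtomicInteger checks = new AtomicInteger();
        BooleanSupplier overlayGone = () -> checks.incrementAndGet() >= 3;

        PollingWait wait = new PollingWait(Duration.ofSeconds(2), Duration.ofMillis(50));
        wait.until("loading overlay to disappear", overlayGone);
        System.out.println("Overlay gone after " + checks.get() + " checks");
    }
}
```

Note how the description passed to until shows up in the timeout message. Naming the condition is what makes a failed wait easy to diagnose.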
3. ExpectedConditions
You may have noticed the ExpectedConditions utility methods being used in the above examples. This class contains a large number of helpful conditions to be used when writing your Selenium tests. If you haven’t already, it’s worth taking a moment to go over the full API. You will commonly use ExpectedConditions.elementToBeClickable(..) or ExpectedConditions.invisibilityOf(..), but you may also find uses for alertIsPresent(), jsReturnsValue(..), or titleContains(..). You can even chain the conditions together using ExpectedConditions.and(..) or ExpectedConditions.or(..).
4. Executing JavaScript
WebDrivers provide the ability to execute JavaScript within the context of the browser. This is a simple feature with incredible versatility. This can be used for common tasks, such as forcing a page to scroll to an element, as with:
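// executeScript is declared on JavascriptExecutor, so a plain WebDriver reference needs a cast.
((JavascriptExecutor) driver).executeScript("arguments[0].scrollIntoView(false)", element);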
It can also be used to leverage the JavaScript libraries used in an application, such as jQuery or React. For example, you could check whether the text in a rich editor has been changed by calling:
((JavascriptExecutor) driver).executeScript("return EDITOR.instances.editor.checkDirty()");
The executeScript feature opens up the entire library API to your test. These APIs often provide useful insight into the state of the application that would otherwise be impossible or unstable to query using WebElements.
Using library APIs does couple your test to a library implementation. Libraries can often be swapped during the development of an application, so caution is required when using executeScript this way. Consider hiding these library-specific calls behind an interface of your own to reduce your tests’ exposure to instability, such as with the Bot pattern (see below).
5. Mastering the Bot Pattern
The bot pattern abstracts Selenium API calls into actions. The actions can then be used throughout your tests to make them more readable and concise.
We have already seen a few examples in this article where this would be useful. Because an element must be clickable before we try to click it, we may always want to wait for the element to be clickable before each click:
void test() {
    /* test code */
    // Selenium 4 constructor; Selenium 3 took the timeout in seconds as a long.
    WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(5));
    wait.until(ExpectedConditions.elementToBeClickable(element));
    element.click();
    wait.until(ExpectedConditions.elementToBeClickable(element2));
    element2.click();
}
Rather than write the wait condition every time the test clicks an element, the code can be abstracted into its own method:
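public class Bot {
    private final WebDriver driver;

    public Bot(WebDriver driver) {
        this.driver = driver;
    }

    // Waits up to timeoutInSeconds for the element to be clickable, then clicks it.
    public void waitAndClick(WebElement element, long timeoutInSeconds) {
        WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(timeoutInSeconds));
        wait.until(ExpectedConditions.elementToBeClickable(element));
        element.click();
    }
}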
Then our code becomes:
void test() {
    /* test code */
    bot.waitAndClick(element, 5);
    bot.waitAndClick(element2, 5);
}
The bot can also be extended to create library-specific implementations. If the application ever begins using a different library, all of the test code can remain the same and only the Bot needs to be updated:
public class Bot {
    private WebDriver driver;
    private RichEditorBot richEditor;

    public Bot(WebDriver driver, RichEditorBot richEditor) {
        this.driver = driver;
        this.richEditor = richEditor;
    }

    public boolean isEditorDirty() {
        // Delegate to the library-specific bot.
        return richEditor.isEditorDirty();
    }
}

public class RichEditorBot {
    private final WebDriver driver;

    public RichEditorBot(WebDriver driver) {
        this.driver = driver;
    }

    public boolean isEditorDirty() {
        // executeScript returns Object, so the result is cast to Boolean.
        return (Boolean) ((JavascriptExecutor) driver)
                .executeScript("return EDITOR.instances.editor.checkDirty()");
    }
}

void test() {
    /* test code */
    bot.isEditorDirty();
}
An example Bot is available as part of the WebDriverExtensions library, as well as a library-specific implementation:
- https://github.com/webdriverextensions/webdriverextensions/blob/master/src/main/java/com/github/webdriverextensions/Bot.java
- https://github.com/webdriverextensions/webdriverextensions/blob/master/src/main/java/com/github/webdriverextensions/vaadin/VaadinBot.java
The Bot pattern and the Page Object model can be used together. In your tests, the top-level abstraction is the Page Object representing the functional elements of each component. The Page Objects then contain a Bot to use in the Page Object functions, making their implementations simpler and easier to understand:
public class LoginComponent {
    private Bot bot;

    @FindBy(id = "login")
    private WebElement loginButton;

    public LoginComponent(Bot bot) {
        PageFactory.initElements(bot.getDriver(), this);
        this.bot = bot;
    }

    public void clickLogin() {
        bot.waitAndClick(loginButton, 5);
    }
}
6. Simplifying WebDriver Management
Instantiating and configuring a WebDriver instance such as ChromeDriver or FirefoxDriver directly in test code means the test now has two concerns:
- Building a specific WebDriver.
- Testing an application.
A WebDriver Factory separates these concerns by moving all WebDriver instantiation and configuration out of the test. This can be accomplished in many ways, but the concept is simple: create a factory that provides a fully configured WebDriver.
In your test code, get the WebDriver from the factory rather than constructing it directly. Now any concerns regarding the WebDriver can be handled in a single place. Any changes can happen in that one place and every test will get the updated WebDriver.
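As a rough illustration of the idea, here is a minimal factory sketch. To keep it runnable without Selenium or a browser, Driver and StubDriver are hypothetical stand-ins for WebDriver and its implementations, and the "browser" system property is an assumed name. A real factory would construct and configure ChromeDriver, FirefoxDriver, and so on in the same spot.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for org.openqa.selenium.WebDriver so the sketch runs without Selenium.
interface Driver {
    String describe();
}

// Hypothetical driver that only records how it was configured.
class StubDriver implements Driver {
    private final String browser;
    private final boolean headless;

    StubDriver(String browser, boolean headless) {
        this.browser = browser;
        this.headless = headless;
    }

    @Override
    public String describe() {
        return browser + " (headless=" + headless + ")";
    }
}

class DriverFactory {
    private static final Map<String, Supplier<Driver>> BUILDERS = new HashMap<>();

    static {
        // All browser-specific setup lives here, in one place.
        BUILDERS.put("chrome", () -> new StubDriver("chrome", true));
        BUILDERS.put("firefox", () -> new StubDriver("firefox", false));
    }

    // Picks the browser from the "browser" system property (an assumed name), defaulting to chrome.
    static Driver create() {
        return create(System.getProperty("browser", "chrome"));
    }

    static Driver create(String browser) {
        Supplier<Driver> builder = BUILDERS.get(browser.toLowerCase(Locale.ROOT));
        if (builder == null) {
            throw new IllegalArgumentException("Unsupported browser: " + browser);
        }
        return builder.get();
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        // The test asks for a driver and never mentions a concrete driver class.
        Driver driver = DriverFactory.create("firefox");
        System.out.println("Testing with " + driver.describe());
    }
}
```

Because tests only call DriverFactory.create, switching browsers or changing an option means touching the factory and nothing else.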
The Web Driver Factory project uses this concept to manage the lifespan of WebDrivers across multiple tests. This complex task is abstracted to the factory, allowing tests to just request a WebDriver with the provided options.
The WebDriver factory makes it easy to reuse a single test across multiple browsers. All configurations can be handled through an external file. The test only asks the WebDriver factory for an instance of the WebDriver and the factory handles the details. An example of this is used to support parallel grid testing in the TestNG framework.
7. Extending the Page Object Model
The Page Object model provides a layer of abstraction that presents the functions of an application’s components while hiding the details of how Selenium interacts with these components. This is a powerful design pattern that makes code reusable and easier to understand. However, there can be a lot of overhead in creating a class for each page and component. There’s the boilerplate for each class, then shared code between classes, such as initializing the instance and passing around the WebDriver or Bot object. This overhead can be reduced by extending the Page Object model.
If you are using the Bot pattern along with your Page Object model, then each Page Object will need an instance of the Bot. This might look like:
public class LoginComponent {
    private Bot bot;

    @FindBy(id = "login")
    private WebElement loginButton;

    public LoginComponent(Bot bot) {
        PageFactory.initElements(bot.getDriver(), this);
        this.bot = bot;
    }

    public void clickLogin() {
        bot.waitAndClick(loginButton, 5);
    }
}
Instead of including the Bot code in every constructor, this code could be moved to another class that each component extends. This allows the individual component code to focus on details of the component and not on initialization code or passing the Bot around:
public class Component {
    private Bot bot;

    public Component(Bot bot) {
        PageFactory.initElements(bot.getDriver(), this);
        this.bot = bot;
    }

    public Bot getBot() {
        return bot;
    }
}

public class LoginComponent extends Component {
    @FindBy(id = "login")
    private WebElement loginButton;

    public LoginComponent(Bot bot) {
        super(bot);
    }

    public void clickLogin() {
        getBot().waitAndClick(loginButton, 5);
    }
}
Similarly, it is common for a component to verify that it is being instantiated at the right moment, to make it easier to debug when components are used incorrectly. We might want to check that the title is correct:
public class LoginPage extends Component {
    public LoginPage(Bot bot) {
        super(bot);
        bot.waitForTitleContains("Please login");
    }
}
Rather than include this call to the Bot in every class, we can move this checkup into a specialized version of Component that other pages extend. This provides a small benefit that adds up when creating many Page Objects:
public class TitlePage extends Component {
    // The constructor must carry the class name, TitlePage.
    public TitlePage(Bot bot, String title) {
        super(bot);
        bot.waitForTitleContains(title);
    }
}

public class LoginPage extends TitlePage {
    public LoginPage(Bot bot) {
        super(bot, "Please login");
    }
}
Other libraries provide helper classes for exactly this purpose. The Selenium Java library includes the LoadableComponent class, which abstracts the functionality and checks around loading a page.
The WebDriverExtensions library goes even further by abstracting much of the code around Page Objects into annotations, creating simpler, easier-to-read components.
Best Practices for Selenium Testing
Best practices ensure you get the most out of your Selenium test scenarios, both immediately and in the long term. The following best practices leverage the tips listed above to create robust, cross-platform, cross-browser scenarios. Keeping these practices in mind will help simplify the transition from creating less-optimal scripts to scenarios that are fully optimized and resilient to change.
Maintainable Test Code
Improve efficiency and productivity by writing tests that are easy to maintain. You will spend less time updating tests and more time focusing on real issues when web applications change. Many of the techniques in this blog are useful in part because they improve maintainability.
Self-documenting test scenarios, those you can understand just by reading them, are easier to maintain. Web element locators that describe the element, explicit wait conditions that specify necessary preconditions, and page objects that match actual pages all make the code easier to read and understand. Even the bot pattern helps make test scripts self-documenting by mapping real user actions to function names defined in the bot.
Descriptive web element locators and explicit waits are also less likely to break over time because they focus on the features of the web application rather than implementation details. For example, locators like the following:
//p[contains(@class, "content")][contains(., "Guidance for Success")]
do a much better job of describing the element than those that rely on unrelated, structural details such as:
/html/body/div[25]/div[2]/div/span/span[2]/div/h2/p
A web application’s features are less likely to change than the underlying implementation of the features. Consequently, test scripts written against the features are less likely to require updates.
Data-Driven Testing
Providing a variety of data to your test scenarios better exercises your application and provides wider test coverage.
Test frameworks like JUnit and TestNG provide built-in feature sets to support data-driven testing. Any test scenario that inputs data, such as filling out a form or logging in as a user, can benefit from data-driven testing.
When enhancing your scenarios through data-driven testing, avoid the tendency to create different logical flows through your scenario for different data. Instead, separate your scenarios according to the kind of data they expect. This allows for clearly defined test scenarios with straightforward, linear steps.
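Here is a minimal sketch of that separation, using a login form as the example. It is plain Java so it runs on its own. The login method is a hypothetical stand-in for driving the real form, and the credentials are invented. In a real suite the two data sets would typically feed a JUnit 5 @ParameterizedTest or a TestNG @DataProvider instead of the loops shown here.

```java
import java.util.Arrays;
import java.util.List;

class LoginDataDemo {
    // Hypothetical stand-in for driving the login form of the application under test.
    static boolean login(String user, String password) {
        return ("alice".equals(user) && "s3cret".equals(password))
            || ("bob".equals(user) && "hunter2".equals(password));
    }

    // One data set per kind of data, instead of one mixed set with an "expected result" flag.
    static final List<String[]> VALID = Arrays.asList(
        new String[] {"alice", "s3cret"},
        new String[] {"bob", "hunter2"});
    static final List<String[]> INVALID = Arrays.asList(
        new String[] {"alice", "wrong"},
        new String[] {"", "s3cret"},
        new String[] {"bob", "s3cret"});

    // Scenario 1: linear steps, every row is expected to log in.
    static void validCredentialsLogIn(String user, String password) {
        if (!login(user, password)) {
            throw new AssertionError("Expected login to succeed for '" + user + "'");
        }
    }

    // Scenario 2: linear steps, every row is expected to be rejected.
    static void invalidCredentialsAreRejected(String user, String password) {
        if (login(user, password)) {
            throw new AssertionError("Expected login to fail for '" + user + "'");
        }
    }

    // Runs each scenario against its own data set and returns the number of rows executed.
    static int runAll() {
        for (String[] row : VALID) {
            validCredentialsLogIn(row[0], row[1]);
        }
        for (String[] row : INVALID) {
            invalidCredentialsAreRejected(row[0], row[1]);
        }
        return VALID.size() + INVALID.size();
    }

    public static void main(String[] args) {
        System.out.println("Rows executed: " + runAll());
    }
}
```

Neither scenario contains an if that switches on the data. Adding a new case means adding a row to the right data set, not a new branch.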
Cross-Browser Testing
While details may be different across browsers, web application core functionality typically remains consistent, so tests that are centered on this core functionality through descriptive web element locators and explicit waits are less likely to break when run in different browsers.
The bot pattern is a place where low-level differences between browsers can be handled. For example, if different browsers require different JavaScript to accomplish the same action, the bot can serve as a place to abstract this difference. Both JavaScript implementations can be included in a single function that describes the action and handles branching between the two JavaScript implementations, depending on the browser. Test script code can then call the function on the bot without having to worry about implementation details, making the test scenario code easier to read and keeping out browser-specific details.
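A sketch of what that could look like follows. ScriptRunner is a hypothetical stand-in for Selenium's JavascriptExecutor so the example runs without a browser, and the Safari branch is purely illustrative, not a claim about how Safari actually behaves. With a real driver, the browser name could come from the driver's capabilities.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for Selenium's JavascriptExecutor so the sketch runs without a browser.
interface ScriptRunner {
    Object run(String script);
}

class BrowserAwareBot {
    private final String browserName;
    private final ScriptRunner scripts;

    BrowserAwareBot(String browserName, ScriptRunner scripts) {
        this.browserName = browserName.toLowerCase(Locale.ROOT);
        this.scripts = scripts;
    }

    // One descriptive action for tests; the per-browser JavaScript stays hidden in here.
    // Which browser needs which script is an assumption made up for this example.
    void scrollToTop() {
        if ("safari".equals(browserName)) {
            scripts.run("document.documentElement.scrollTop = 0;");
        } else {
            scripts.run("window.scrollTo(0, 0);");
        }
    }
}

class BrowserAwareBotDemo {
    public static void main(String[] args) {
        List<String> executed = new ArrayList<>();
        // Record each script instead of sending it to a real browser.
        ScriptRunner recorder = script -> {
            executed.add(script);
            return null;
        };

        new BrowserAwareBot("Safari", recorder).scrollToTop();
        new BrowserAwareBot("chrome", recorder).scrollToTop();
        System.out.println(executed);
    }
}
```

Test code only ever calls scrollToTop, so a new browser quirk becomes a change to one method in the bot.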
Similarly, the page object code can be used to abstract differences between browsers in the web application interface. For example, if logging in requires extra steps in some browsers, that difference can be handled in a high-level function such as “doLogin.” Test scenarios only need to know to log in and none of the details required to log in using different browsers.
Parallel Test Execution
To run your scenarios across multiple browsers and environments, you will want to run tests in parallel to save time. This is typically a final step after you have ensured your test scenarios run seamlessly across browsers and they are data-driven where applicable.
Selenium Grid, combined with features in frameworks like JUnit and TestNG, provides built-in support for parallel testing. Other services like BrowserStack provide parallel testing, pre-built environments, and diagnostic features to maintain your parallel test runs more easily.
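To show the shape of a parallel run without any of those tools, here is a bare-bones sketch using only the JDK's ExecutorService. The runScenario method is a hypothetical placeholder: a real scenario would request its own WebDriver for the given browser and drive the application. In practice you would usually let TestNG, JUnit, or a grid service do this scheduling for you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelRunDemo {
    // Hypothetical scenario; a real one would obtain its own driver for the browser,
    // since sharing one driver between threads is asking for trouble.
    static String runScenario(String browser) {
        return browser + ": passed";
    }

    // Runs the same scenario once per browser in parallel; results keep the input order.
    static List<String> runAll(List<String> browsers) {
        ExecutorService pool = Executors.newFixedThreadPool(browsers.size());
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String browser : browsers) {
                tasks.add(() -> runScenario(browser));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> result : pool.invokeAll(tasks)) {
                results.add(result.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running scenarios", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A scenario failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (String line : runAll(Arrays.asList("chrome", "firefox", "edge"))) {
            System.out.println(line);
        }
    }
}
```

The scenario itself knows nothing about threads. That only works when each scenario is independent, which is why getting tests stable and data-driven comes first.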
Master Selenium for Efficient Testing
This article has only touched on a few useful techniques and design patterns. Books have been written on the subject. Writing good tests means writing good software, and writing good software is a complex task.
I’ve covered some information I’ve learned over the years to create better Selenium tests. Parasoft Selenic leverages our expertise to further simplify the task. With the proper tools and knowledge, we can improve the process and create stable, readable, and maintainable tests.