How To Find Broken Links with Selenium

Automatically find broken links using Selenium

Links are used to navigate between web pages. Users are directed to a web page when they click a link or type its URL into a web browser. A broken link is one that does not work: it does not take the user to the requested web page. This can happen for several reasons, such as server-side errors, missing web pages, or typing errors in the URL.

When a user visits a broken link, they are shown an error message. Valid URLs return 2XX status codes, while broken URLs return status codes in the 4XX or 5XX series. 4XX status codes indicate client-side errors, and 5XX status codes indicate server-side errors.
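As a rough illustration of these status code ranges, the sketch below maps a code to its class. The names StatusCodes, classify, and isBroken are made up for this example and are not part of the test code in this article. Treating 3XX as a separate "redirect" class is also an assumption of the sketch.

```java
// Hypothetical helper for illustration only; not part of the article's test class.
class StatusCodes {

    // Maps an HTTP status code to a coarse category.
    static String classify(int code) {
        if (code >= 200 && code < 300) return "valid";
        if (code >= 300 && code < 400) return "redirect";
        if (code >= 400 && code < 500) return "client error";
        if (code >= 500 && code < 600) return "server error";
        return "other";
    }

    // Same rule the article applies later: 400 and above counts as broken.
    static boolean isBroken(int code) {
        return code >= 400;
    }

    public static void main(String[] args) {
        for (int code : new int[] {200, 301, 404, 503}) {
            System.out.println(code + " -> " + classify(code)
                    + (isBroken(code) ? " (broken)" : ""));
        }
    }
}
```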

Below are some common causes of broken links.

  • 400 Bad Request error: The server cannot process the request, typically because the URL is malformed, so the requested web page is not returned.
  • 404 Page Not Found error: The web page does not exist or has been removed by the owner.
  • A system firewall can sometimes block access to certain websites.
  • The link may have been entered incorrectly.

Broken links on your website create a bad experience for your users and can seriously affect the site's reputation. A website usually contains a large number of links, and testing each of them manually is time-consuming. Automating the check with Selenium WebDriver is therefore a practical solution.

Testing for broken links can be done as shown in the steps below. The following sample code runs the test against https://www.google.co.uk, and the relevant parts are discussed afterwards.

package automationproject;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.HttpURLConnection;
import java.util.Iterator;
import java.net.URL;
import java.util.List;
import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;
public class MyBrokenLinks {

  public static void main(String[] args) {

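    // ChromeDriver reads its executable path from "webdriver.chrome.driver", not "webdriver.ie.driver"
    System.setProperty("webdriver.chrome.driver","C:\\Users\\tushar\\eclipse-workspace\\first test\\chromedriver.exe");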
    WebDriver mydriver = new ChromeDriver();
    String myhomePage = "https://www.google.co.uk";
    String myurl = "";
    HttpURLConnection myhuc = null;
    int responseCode = 200;
    mydriver.manage().window().maximize();
    mydriver.get(myhomePage);
    List<WebElement> mylinks = mydriver.findElements(By.tagName("a"));
    Iterator<WebElement> myit = mylinks.iterator();
    while (myit.hasNext()) {

      myurl = myit.next().getAttribute("href");
      System.out.println(myurl);
      if (myurl == null || myurl.isEmpty()) {
        System.out.println("Empty URL or an Unconfigured URL");
        continue;
      }

      if (!myurl.startsWith(myhomePage)) {
        System.out.println("This URL is from another domain");
        continue;
      }

      try {
        myhuc = (HttpURLConnection)(new URL(myurl).openConnection());
        myhuc.setRequestMethod("HEAD");
        myhuc.connect();
        responseCode = myhuc.getResponseCode();
        if (responseCode >= 400) {
          System.out.println(myurl + " This link is broken");
        }
        else {
          System.out.println(myurl + " This link is valid");
        }

      } catch(MalformedURLException ex) {
        ex.printStackTrace();
      } catch(IOException ex) {
        ex.printStackTrace();
      }
    }

    mydriver.quit();
  }
}

Below are my test results.

[Image: test results]

Each link used on the web page can be found through its anchor tag, <a>. The identified links are collected in a list.

List<WebElement> mylinks = mydriver.findElements(By.tagName("a"));

Then an iterator is created to move through the list of links.

Iterator<WebElement> myit = mylinks.iterator();

Identification and Validation of URLs

This step checks whether each URL belongs to a third-party domain and whether it is empty or null. The href attribute of the anchor tag is stored in a variable called “myurl,” which is then checked as shown below.

myurl = myit.next().getAttribute("href");

The code below handles empty URLs.

if (myurl == null || myurl.isEmpty()) {
    System.out.println("Empty URL or an Unconfigured URL");
    continue;
}
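The same null-or-empty check can also be applied to a whole list of href values at once. The sketch below is one possible way to do that, and it additionally drops duplicate links so that each URL is requested only once. The class HrefFilter and the method usableHrefs are hypothetical names, and the deduplication step is an addition that the original test does not perform.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the article's test class.
class HrefFilter {

    // Returns the distinct, non-empty href values in their original order.
    static List<String> usableHrefs(List<String> rawHrefs) {
        Set<String> seen = new LinkedHashSet<>();
        for (String href : rawHrefs) {
            if (href == null || href.trim().isEmpty()) {
                continue; // same rule as the empty-URL check above
            }
            seen.add(href.trim());
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Example values standing in for what getAttribute("href") might return.
        List<String> raw = Arrays.asList(
                "https://example.com/a", null, "", "https://example.com/a", "https://example.com/b");
        System.out.println(usableHrefs(raw));
    }
}
```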
    

The following code determines whether the URL belongs to the site's own domain or to a third party.

if (!myurl.startsWith(myhomePage)) {
    System.out.println("This URL is from another domain");
    continue;
}
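A prefix comparison like this is simple, but it also accepts a URL such as https://www.google.co.uk.example.com, which starts with the home page string yet points to a different host. As a possible alternative, the sketch below compares host names using java.net.URI. The class DomainCheck and the method sameHost are hypothetical names, and this is a variation on the article's approach rather than the check used in the test above.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration only; not part of the article's test class.
class DomainCheck {

    // Returns true only when both URLs parse and have the same host name.
    static boolean sameHost(String homePage, String link) {
        try {
            String homeHost = new URI(homePage).getHost();
            String linkHost = new URI(link).getHost();
            // getHost() is null for URLs without a host, such as mailto: links.
            return homeHost != null && homeHost.equalsIgnoreCase(linkHost);
        } catch (URISyntaxException ex) {
            return false; // treat unparsable links as not belonging to the site
        }
    }

    public static void main(String[] args) {
        String home = "https://www.google.co.uk";
        System.out.println(sameHost(home, "https://www.google.co.uk/imghp"));
        System.out.println(sameHost(home, "https://www.google.co.uk.example.com/"));
        System.out.println(sameHost(home, "mailto:someone@example.com"));
    }
}
```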

HTTP Request Sending

The methods of the “HttpURLConnection” class imported above allow you to send HTTP requests and capture the HTTP response codes.

myhuc = (HttpURLConnection)(new URL(myurl).openConnection());

Here the request method is set to “HEAD” rather than “GET,” so that only the headers are returned instead of the body of the document.

myhuc.setRequestMethod("HEAD");

When the connect method is invoked, the actual connection to the URL is established.

myhuc.connect();

The HTTP response code is obtained with the getResponseCode() method.

responseCode = myhuc.getResponseCode();
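The request steps above can be gathered into one small method. The sketch below is a minimal version under a few assumptions: the class HeadCheck, the method headStatus, and the 5000 ms timeout are all made up for this example, and the timeouts are an addition that the original test does not set. So that it runs without touching a real website, the demo starts a throwaway local server using the JDK's built-in com.sun.net.httpserver.HttpServer.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

// Hypothetical helper for illustration only; names and timeout values are assumptions.
class HeadCheck {

    // Sends a HEAD request to the URL and returns the HTTP status code.
    static int headStatus(String url, int timeoutMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(timeoutMillis); // avoid hanging on unreachable hosts
            conn.setReadTimeout(timeoutMillis);
            return conn.getResponseCode();         // connects implicitly if not yet connected
        } finally {
            conn.disconnect();
        }
    }

    // Checks two paths on a temporary local server: one that exists and one that does not.
    static int[] demo() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/ok", exchange -> {
                exchange.sendResponseHeaders(200, -1); // -1 means no response body
                exchange.close();
            });
            server.createContext("/missing", exchange -> {
                exchange.sendResponseHeaders(404, -1);
                exchange.close();
            });
            server.start();
            try {
                String base = "http://127.0.0.1:" + server.getAddress().getPort();
                return new int[] {
                    headStatus(base + "/ok", 5000),
                    headStatus(base + "/missing", 5000)
                };
            } finally {
                server.stop(0);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        int[] codes = demo();
        System.out.println("/ok -> " + codes[0] + ", /missing -> " + codes[1]);
    }
}
```

In the test above, a call such as headStatus(myurl, 5000) could stand in for the openConnection, setRequestMethod, connect, and getResponseCode lines.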

Broken links are determined by the response code, as mentioned above. Any link whose response code is greater than or equal to 400 is treated as broken.

if (responseCode >= 400) {
    System.out.println(myurl + " This link is broken");
}
else {
    System.out.println(myurl + " This link is valid");
}
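Printing one line per link works for a small page, but on a page with many links it can help to collect the results and print a summary at the end. The sketch below shows one way to do that with the same 400-and-above rule. The class LinkReport and its methods are hypothetical names, and the URLs in the example are placeholders rather than results from the test above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the article's test class.
class LinkReport {

    // URL -> response code, kept in the order the links were checked.
    private final Map<String, Integer> results = new LinkedHashMap<>();

    void record(String url, int responseCode) {
        results.put(url, responseCode);
    }

    // Returns the URLs whose response code is 400 or higher.
    List<String> brokenLinks() {
        List<String> broken = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : results.entrySet()) {
            if (entry.getValue() >= 400) {
                broken.add(entry.getKey());
            }
        }
        return broken;
    }

    String summary() {
        return brokenLinks().size() + " of " + results.size() + " links broken";
    }

    public static void main(String[] args) {
        LinkReport report = new LinkReport();
        report.record("https://example.com/", 200);        // placeholder data
        report.record("https://example.com/missing", 404); // placeholder data
        System.out.println(report.summary());
        System.out.println(report.brokenLinks());
    }
}
```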

Testing for broken links is crucial for building a good website with an excellent user experience. With Selenium WebDriver, testers can quickly identify malfunctioning links, which makes it a tester-friendly way to maintain a better website.

Tushar Sharma
https://www.automationdojos.com
