Ninja QA

Visual QA with AForge

A while back, I was tasked with assessing the viability of doing Automated Visual QA within the organization I worked in. The idea was that we would create automated Selenium tests that could perform run-time visual checks on pages and elements as needed, returning a pass or fail to indicate whether the content had changed enough to warrant raising a bug.

I assessed a few options; one of the commercial tools considered was Applitools, which integrates seamlessly into Selenium. In fact, their Applitools driver is merely a custom-built IWebDriver class, so it can be wrapped around a Chrome WebDriver just as easily as an Internet Explorer driver.

All in all, Applitools was a good candidate. However, I also wanted to see if it was feasible to perform the validations ourselves internally, as our company had some concerns about sending screenshots of our test or UAT environments to a third-party provider such as Applitools. After all, while we are agreeing to Applitools' Terms and Conditions, they are not necessarily signing a Non-Disclosure Agreement to ensure that our screenshots remain private and are deleted when no longer needed.

So, to begin developing our own solution, I looked at some open source libraries, and AForge looked very promising.
Its imaging libraries are designed to give a percentage ranking of the similarity between two images.

All in all, it looked like AForge would be a simple solution to use, but then I encountered other issues.

For any visual comparison, you need two things.

  1. The image you are comparing
  2. The exemplar image you are comparing against

For number 2, this was going to be a static image taken of the element or page as it should appear. To ensure that we captured the exemplar image in such a state that it has a chance of matching the image taken at test execution time, we used our test itself to capture the image.
This was essentially accomplished by calling a custom screenshot method that I created that would use X,Y coordinates to capture the bounding box of the element we are interested in.
It should be noted that if you perform your validations at a lower level (individual elements rather than whole pages), you stand a better chance of getting accurate results.
Capturing the whole page may be quick, but it can also be dirty: you can end up with false positives being raised due to text differences, offsets, dates and times, etc.
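As an illustration, element-level capture can be done by taking a full-page screenshot and cropping it down to the element's bounding box. The sketch below uses Selenium's ITakesScreenshot together with System.Drawing; the class and method names are mine, and real code may need to account for scrolling and DPI scaling:

```csharp
using System.Drawing;
using System.IO;
using OpenQA.Selenium;

public static class ScreenshotHelper
{
    // Captures only the bounding box of the given element
    // from a screenshot of the current viewport.
    public static Bitmap CaptureElement(IWebDriver driver, IWebElement element)
    {
        Screenshot screenshot = ((ITakesScreenshot)driver).GetScreenshot();
        using (var stream = new MemoryStream(screenshot.AsByteArray))
        using (var fullPage = new Bitmap(stream))
        {
            // X,Y coordinates and dimensions of the element of interest
            var bounds = new Rectangle(
                element.Location.X, element.Location.Y,
                element.Size.Width, element.Size.Height);
            return fullPage.Clone(bounds, fullPage.PixelFormat);
        }
    }
}
```

The same method can be used both at exemplar-capture time and at test-execution time, so both images are produced by an identical code path.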


When using AForge, I discovered that a 'DifferenceMask' was a really good way to perform the comparison. If the comparison fails, it can output a difference image to show the actual differences as they appear on screen. This also allowed me to debug and tinker with the tool to make it more robust and reliable. 
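For reference, a difference mask can be produced with AForge's Difference filter, which subtracts an overlay image from a source image pixel by pixel. This is a minimal sketch, assuming both bitmaps have the same dimensions and pixel format:

```csharp
using System.Drawing;
using AForge.Imaging.Filters;

public static Bitmap BuildDifferenceMask(Bitmap actual, Bitmap exemplar)
{
    // Subtract the exemplar from the captured image.
    // Identical regions come out black; differences remain visible.
    var difference = new Difference(exemplar);
    return difference.Apply(actual);
}
```

If the two images match perfectly, the resulting mask is a solid black image, which is what the later comparison step relies on.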

I found that when the tool ran on other browsers, or even when it ran remotely on a Selenium Grid node, this would sometimes result in 1-2 pixel offsets in the images.
1-2 pixels doesn't sound like a whole lot, but it can be enough to make the entire image register as 'different'.

How do we solve that?
The way I solved it was to create an algorithm that would eliminate the false positives while extruding and growing the real issues we care about. This was accomplished through a sequence of brightening the image while blurring it at the same time.
Bear in mind that the image at this stage is a difference-mask image, so it has typically been inverted before it gets to this stage:

public static Bitmap EnlargeDifferences(Bitmap btm)
{
    FiltersSequence filterSequence = new FiltersSequence();
    GaussianBlur filterBlur = new GaussianBlur(3.4D, 800);
    HSLLinear filterBrighten = new HSLLinear();
    // configure the brightening filter
    filterBrighten.InLuminance = new Range(0.00f, 1.00f);
    filterBrighten.OutLuminance = new Range(0.00f, 10.00f);
    // blur first, then brighten whatever survives the blur
    filterSequence.Add(filterBlur);
    filterSequence.Add(filterBrighten);

    // Do 5 passes - to try and expand the changes
    FilterIterator fi = new FilterIterator(filterSequence, 5);

    Bitmap bReturn = fi.Apply(btm).To24bppRgbFormat();
    return bReturn;
}

What we are aiming to accomplish with this code is that 1-2 pixel differences will be blurred out of existence, but anything that remains will be brightened so that the next pass of the blur cannot remove it.

An example can be seen below.
Let's imagine our exemplar image is this:

However, at run time, we capture this:

Selenium will have no easy way to determine that the image has not loaded due to a 404 issue.
It will see that a div or img is in the DOM, and assume that it must be fine.

With AForge, you can however build a difference map.

It then shows something like what you see above.

The 1-2 pixel false positives I mentioned earlier look like this - see below.

To eliminate these false positives, but retain the real difference, namely that the car image has not loaded, we use the process of Blur and Enhance.


Another example of how it might look would be:

In the above difference map, the car is not meant to be hidden by a popup, but it is.

The important thing with Visual QA is not that the tool can understand what the differences are, but that it can spot them and distinguish them from false positives. After all, we only want a boolean result: does it match, or does it not?

For the final comparison, I recommend using an ExhaustiveTemplateMatching from AForge.
Compare your blurred image against a black image of the same dimensions. (Difference mask images are always set against a black background - assuming the images match) 

var tm = new ExhaustiveTemplateMatching(similarityThreshold);
var results = tm.ProcessImage(blurred, black.To24bppRgbFormat());

The results will then contain a similarity percentage, which you can use to pass or fail your test.
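Putting those pieces together, a pass/fail check might look something like the sketch below. It assumes the To24bppRgbFormat extension used earlier; since ProcessImage only returns matches whose similarity meets the threshold, an empty result array means the blurred mask was not 'black enough' and real differences remain:

```csharp
using System.Drawing;
using AForge.Imaging;

public static bool ImagesMatch(Bitmap blurredMask, float similarityThreshold)
{
    // An all-black image of the same size - what the difference mask
    // should look like when the images match
    using (var black = new Bitmap(blurredMask.Width, blurredMask.Height))
    {
        using (Graphics g = Graphics.FromImage(black))
        {
            g.Clear(Color.Black);
        }

        var tm = new ExhaustiveTemplateMatching(similarityThreshold);
        TemplateMatch[] results = tm.ProcessImage(blurredMask, black.To24bppRgbFormat());

        // Any surviving match means the mask is close enough to pure black
        return results.Length > 0;
    }
}
```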

If you really want to identify the causes of the image differences, then AForge provides the ability for you to identify blobs in your blurred image, and then draw blob boundaries around the same coordinates.
I would recommend drawing these boxes on the un-blurred image, so the image makes more sense to the tester reviewing the results.
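A sketch of that approach using AForge's BlobCounter is shown below; the minimum-size filters and the helper name are illustrative values of my own:

```csharp
using System.Drawing;
using AForge.Imaging;

public static Bitmap HighlightDifferences(Bitmap blurredMask, Bitmap original)
{
    // Find the bright regions (the surviving differences) in the mask,
    // ignoring anything smaller than 5x5 pixels
    var blobCounter = new BlobCounter
    {
        FilterBlobs = true,
        MinWidth = 5,
        MinHeight = 5
    };
    blobCounter.ProcessImage(blurredMask);

    // Draw the blob bounding boxes on the un-blurred image so the
    // tester reviewing the result sees the differences in context
    Bitmap annotated = new Bitmap(original);
    using (Graphics g = Graphics.FromImage(annotated))
    using (var pen = new Pen(Color.Red, 2))
    {
        foreach (Rectangle rect in blobCounter.GetObjectsRectangles())
        {
            g.DrawRectangle(pen, rect);
        }
    }
    return annotated;
}
```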


Collaborative Development - using Jira, Bitbucket and Git to work effectively

Like many Quality Analysts, the first form of source control I was exposed to was something like SVN or CVS. They were simple tools to work with: you check code out, you check code in. Many of these tools, however, were problematic when it came to merging and collaborating in a busy development environment.

The company I work for has been using Bitbucket for a while now for the development aspect of our product. To briefly describe this process...

Our base product is in the master branch in Git (this would be equivalent to 'trunk' in other source control systems).
When we have a new feature that needs to be developed in Jira, we are able to assign that task to a developer, who then creates a 'feature branch' from master.
This effectively gives the developer an identical copy of the base product, dedicated to the development of the one feature they are assigned to. As time goes on and the feature is completed, it is ready to be merged back into the master branch. This process should have unit tests or some other quality check in place to ensure that nothing dodgy is merged into master; we use a mix of unit tests and peer reviews. The merge process is called a 'Pull Request': we are effectively requesting that 'master' pull our changes into itself.
In a tool such as BitBucket, the pull request system is very configurable. You can require one or more reviewers and approvals, and you can also require that the code builds successfully.



Generally speaking, when two or more developers are working in the same system, they will try to stay away from class files that the other is interacting with. This is just a courtesy to avoid messing with code that the other developer might need to work on.

Short of telepathy or the invention of a hive mind for developers, there is no way to completely avoid situations where two developers do end up working on the same code.
This is where 'merge' issues may arise. Tools such as BitBucket and Git provide features to mitigate the pain caused by merges; Git will give you options to perform intelligent merging. That being said, Git is just a tool and will never be able to tell if the merged code is up to standard. The typical way of handling merges is for the developer who is about to raise a pull request to initiate a Git 'rebase'. This replays their branch's changes on top of the latest master, so any new changes in master are included in their own modified code. The onus is on the developer doing the pull request to ensure that the code is properly merged and no conflicts remain.

So what has this got to do with Quality Assurance or Automation?

As I said above, this is the process that many development organizations are working to. We are now trialing it in Automation and Quality Assurance in general internally.
I had mentioned previously that I have a framework of my own design that I use for Automation within the company I work for.
A dependency diagram of it is shown below...

You're probably thinking - Holy Crap!!

That's a lot of modules.
You're right... it is.
The framework was developed to be modular, decoupled and highly cohesive. However, as it grew larger, issues emerged with maintenance and how to manage the future development of the framework. This is where the collaborative development process above comes into play, along with coding standards and other processes for managing code quality.
Full disclosure: when I first developed this framework, I was its only author. It grew organically, and it was all done in the Git master branch. No branches for features....

Now that other people within the organization are contributing to the framework, we need a quality control mechanism to ensure that bad code cannot be committed to master.
Borrowing from the process our developers use, we introduced the feature branching strategy along with a peer review approval process for pull requests.
In addition, we are in the process of adding unit test projects for each feature. These then get built and executed by TeamCity which will either pass or fail. Failed builds / executions will not get deployed to our Artifactory server and will not get consumed by the users of the automation framework via NuGet. (Each feature is a separate NuGet package).

Using examples for this process, it might be like this...

The framework exists, but we want to add Appium functionality, we also want to add functionality to allow for database communication with a Redis database.

Sally is free to work on one of the features, so she clones the whole framework to an Appium branch - she will build the Appium integration feature.
James is going to work on the Redis functionality, he clones to a Redis branch.

They both work on their respective features: getting the functionality working, ensuring that everything builds, running the existing unit tests to check for regressions, and creating new tests to ensure their own feature never breaks either. James finishes first; he performs a rebase to pull down any changes that made it into master while he was working on his branch. No changes are detected, so he commits to his branch.
He goes into BitBucket and raises a Pull Request from his branch, into master - Sally and John are listed as reviewers.

Sally is busy finishing her Appium functionality, but she has enough time to perform a quick review of the pull request. She examines the code and approves it, and John performs a review also. The pull request is approved and makes it into the master branch.
Sally now wants to commit her work to master. She performs a rebase and finds that James modified one of the classes she was working with. She reviews it and merges it manually to ensure there are no compatibility issues, then runs the unit tests to ensure everything passes. It seems fine. She commits to her branch and starts a pull request.
John and James now review her work, approve it and merge it into master.

TeamCity would then build their respective features and deploy them to Artifactory as NuGet packages. These are then available to anyone in the organization from the private NuGet repository URL.

This way features can be developed collaboratively, tested and deployed and consumed by people in the organization.

Once the code is merged into master, you should be able to close out the Jira task that the feature belonged to, and the feature branch can be deleted and cleaned up, as it is no longer relevant - its code now exists in master.

Page object model - with added Synchronization

Perhaps one of the most common approaches to Web UI Automation is the concept of 'Page Object' model. The idea that you can represent the page before you via code or some other interface to facilitate interactions with its various elements. For most, the page object approach is a simple case of cramming your objects into a class and using the objects directly, not caring about the page they are in or any of the logic that may need to be considered beforehand.

For me however, I have a somewhat different approach..

It is true that Selenium provides an implicit synchronization feature which allows you to 'not care' about waiting for objects to display. Selenium will naturally have to wait for the object to appear before it can interact with it. While this feature works well, sometimes it is not enough. It can be augmented with Page Object Synchronization.
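For context, the implicit wait is configured once on the driver. A typical setup in the C# bindings looks like the sketch below; the 15-second value is just an example, and newer Selenium versions expose this as an ImplicitWait property instead:

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

// Every FindElement call will now poll the DOM for up to 15 seconds
// before throwing NoSuchElementException
IWebDriver driver = new ChromeDriver();
driver.Manage().Timeouts().ImplicitlyWait(TimeSpan.FromSeconds(15));
```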


 Firstly, the interface I am going to be using for this system is really just added for best practice. It is designed to encourage users to provide a page name and description for the page in question.

//Use of interface helps to prevent user error when creating new page object models
interface IPage
{
    void GetPage();
    string GetPageName();
    string GetPageDescription();
}


When I use this interface to create the base page class we end up with a class like following. Yes, I am also implementing IDisposable, I will explain why soon.

public class Page : IPage, IDisposable
{
    /// <summary>
    /// Use this constructor if you are not concerned about synchronization
    /// </summary>
    public Page()
    {
    }

    /// <summary>
    /// Use this constructor if you wish to automatically synchronize on an object on screen.
    /// </summary>
    /// <param name="locator"></param>
    public Page(By locator)
    {
        this.LocatorForPage = locator;
    }

    public void GetPage()
    {
        // expanded later in this post
    }

    public string GetPageName()
    {
        return null;
    }

    public string GetPageDescription()
    {
        return null;
    }
}
The methods that return 'null' above are placeholders; it is recommended you populate them with useful information. They become useful when you want logging and the like to tell the test framework what page it failed on.

Similar to how we use our BaseElement / Specialized object classes, we can also feed in a 'locator' or 'By' class to the constructor for this class. The purpose for this is so we can tell the framework what objects we should 'synchronize' on. There is no point in looking for the Username field on a login form, if the login button has not appeared yet for instance. As a rule of thumb, I recommend your synchronization object should be the last object you anticipate appearing on screen.


IDisposable will require us to implement a dispose method.

public By LocatorForPage { get; set; }

public void Dispose()
{
    LocatorForPage = null;
}

When we use the constructor, we store the By information in this property; when we dispose of the page class, we want to release it.
The .NET garbage collector should take care of this cleanup for us, but it is generally good practice to dispose of / clean up resources whenever you can.

Once again, similar to the BaseElement class and its 'FindObject' method, the 'GetPage' method is going to do much of the heavy lifting. Because we have already implemented most of the object functionality in our BaseElement class, we can reuse that functionality here.

public void GetPage()
{
    // Give any registered error page detection a chance to run first
    if (DetectErrorPageEventHandler != null)
        DetectErrorPageEventHandler();

    //If a locator is provided, sync on it, else dont
    if (LocatorForPage != null)
    {
        try
        {
            //Syncs on the object if found
            //Else raises an exception
            BaseElement be = new BaseElement(LocatorForPage, true);
        }
        catch (Exception)
        {
            if (GetPageName() == null)
                this.PageName = this.GetType().Name;
            throw new Exception("Page not loaded:(" + GetPageName() + "), The object identified by " + LocatorForPage + " was not located within the timeout period. This may indicate that the page has not loaded.");
        }
    }
}

You can see that this method does not concern itself with any timeouts or WebDriverWaits - instead, it just tries to instantiate the BaseElement class on the locator you have specified for the page. In the event the object is not found within the implicit wait timespan, an exception is thrown to say that the page did not load.

You may be wondering what the DetectErrorPageEventHandler is...
In most company / product websites - the product will typically have an 'error' page. A catch all page that is displayed in the event that something went wrong.
This could be a maintenance page or perhaps just the bland IIS error page. 
When these pages appear, you know instantly your page has not loaded and will not load from that point on. However, Selenium would continue to wait for the timeout to occur. It has no idea or concept of what an error page looks like.

With event handlers, you can tell it how to recognize these error pages.

Add the following region to your page class.

#region Error Detection
/// <summary>
/// This delegate is used to define an Error Page detection method
/// </summary>
public delegate void DetectErrorPageDelegate();

/// <summary>
/// This event is fired to check whether an error page is displayed.
/// </summary>
public static event DetectErrorPageDelegate DetectErrorPageEventHandler;
#endregion

The first thing to be aware of is that these handlers are static, so they apply globally within your test project. You do not set error page detection for a single page class; you set it for all page classes.

How do you inject your detection code into the class?

At the start of your test run, you need to call something like this.
This is written in the syntax of using Selenium along with SpecFlow. If you are just using NUnit, then you might use a 'TestFixtureSetUp' method instead.

public static void SetupErrorDetection()
{
    Page.DetectErrorPageEventHandler += ErrorDetection;
}

public static void ErrorDetection()
{
    BaseElement errorMessage = new BaseElement(By.Id("error_warn"), false);
    //Exists() stands in for however your BaseElement reports existence
    if (errorMessage.Exists())
        throw new Exception("The test environments error page was detected - please investigate...");
}


This basically means that when you instantiate a Page class, it will automatically check for the existence of the 'error_warn' element.
Note the false being provided to the second argument in the BaseElement. This is to prevent it from performing the default implicit wait.
We do not want to make our page object classes wait for 15 seconds for an object that we do not want to appear.
This will allow the code to instantiate the errorMessage object, and then perform a straight assertion on whether the errorMessage exists.
Note - in Selenium there is a difference between an element existing and an element being displayed, so you may need to adjust your code to the way your DOM works. More recent technologies such as Bootstrap and jQuery often result in objects that are always present in the DOM, but not visible.
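To illustrate the distinction in raw Selenium terms (this fragment assumes an IWebDriver named driver is in scope):

```csharp
// Existence: the element is present in the DOM at all.
// FindElements (plural) returns an empty list instead of throwing.
bool exists = driver.FindElements(By.Id("error_warn")).Count > 0;

// Displayed: the element is in the DOM *and* rendered visibly.
bool displayed = exists && driver.FindElement(By.Id("error_warn")).Displayed;
```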

Putting this all together, how does this look in a real live example...

public class LoginPage : Page
{
    public LoginPage()
        : base(By.Id("signin_page"))
    {
    }

    public Textbox AccountNumberTextbox()
    {
        return new Textbox(By.Id("AccountNumber"));
    }

    public Element AccountNumberValidationElement()
    {
        return new Element(By.Id("AccountNumberValidation"));
    }
}


If our login page class looks like the above, then our usage of it could look like...

using (LoginPage page = new LoginPage())
{
    //Interact with the login page's elements here, e.g. page.AccountNumberTextbox()
}

It is my opinion that the 'using' statement helps keep your test code clean and concise - it prevents you from interacting with objects that exist on 'other' page classes and allows you to synchronize on pages more accurately than simple implicit waits.

C# also allows you to do more complex page object model approaches using inheritance.

Lets imagine you have a menu bar that is accessible on ALL pages within your application.
You could define this as one page object class, and then have it inherited - this would allow you to merge the contents of two page object classes.

public class MenuBar : Page
{
    public MenuBar() : base(By.Id("menu_bar_authed")) { }

    //Lets inheriting pages supply their own synchronization locator
    public MenuBar(By locator) : base(locator) { }

    public Link AccountInformation()
    {
        return new Link(By.Id("acc_info"));
    }
}

If this is your menu bar page, it is synchronizing on an element 'menu_bar_authed' - this is just something I made up...

public class AccountSummary : MenuBar
{
    public AccountSummary() : base(By.Id("acc_summary_index")) { }

    public Element AccountNumber()
    {
        return new Element(By.Id("acc_number"));
    }
}
You can see in the example above, we are inheriting from MenuBar instead of the 'Page' class.

The side effect of us doing this and then feeding in a locator for something other than the MenuBar locator, is that the 'using' statement that we use will only synchronize on the locator specified by the inheriting class.
Eg: Instead of synchronizing on the MenuBar, we will Synchronize on something called 'acc_summary_index'.

This is fair enough because, in keeping with object-oriented principles, the inheriting class is the more specific one. We do not care about the menu bar at this stage; we want to synchronize on something relevant to the AccountSummary page.

using (AccountSummary page = new AccountSummary())
{
    page.AccountInformation(); //inherited from MenuBar
    page.AccountNumber();      //defined on AccountSummary
}

Using the approach above, you can see that the account information link and the account number element are defined in separate classes, but both are accessible within the same using statement thanks to the inheritance model.