Ninja QA

Google Maps + Drawing API - Selenium Automation

So a client I have been working with uses Google Maps in conjunction with the Google Maps API & Drawing tools.
I won't go into details about who the client is or what they do - but in short, they need to draw shapes on a map to mark an area that is affected by an event that may happen in the future - very vague, I know, but intentionally so.

The problem is that Selenium has no native support for drawing shapes or interacting with Google Maps.
Sure, you can click buttons on the map GUI - but it becomes problematic when you want to draw shapes over specific locations. The difficulty is mapping x and y on your screen to longitude and latitude on a geolocated map. Calculating this yourself would be too much hassle, as you would have to worry about page offsets, which area of the world is centred in the map, and so on.

Surely there must be an easier way.

There is indeed...

In my case, I needed to examine the page source to see two things:
how the shape information was saved, and how the shape information was loaded.
When I say loaded, I mean: when you arrive at the page and there is stored data, how does it get loaded into the map?

What I discovered in my examination of the page source was that the Google API used a DOM listener to push the shape data into an array called 'shapes'. This variable was, however, declared within a scope that I could not interact with. I didn't fancy asking the developers to make it global just so our automation could manipulate it and, besides, I don't think it would have helped anyway...
So what I did was follow the 'shapes' variable through the JavaScript code to see where it eventually ended up. I also discovered a function called 'initialize' which seemed to be called when the page loaded.
Possibly the load function?

 

function initialize() {
        var goo = google.maps,
            map_in = new goo.Map(document.getElementById('maps'),
                                          {
                                              zoom: 5,
                                              streetViewControl: false,
                                              center: new goo.LatLng(54.8667, -4.45)
                                          }),          
            shapes = [],
            //selected_shape = null,
            drawingManager = new goo.drawing.DrawingManager({
...


Above you can see the shapes array.

Eventually, as I followed the code, I found another DOM listener being put into action - this one was listening for the user clicking the 'Save' button:

goo.event.addDomListener(byId('save_btn'), 'click', function () {
            //var data1 = IO.IN(shapes, false);
            var data = shapes;
            var updatedData = MergeArrays(data);
            Save(JSON.stringify(updatedData));
        });


I can now see that the data from 'shapes' is being fed into the 'Save' function - let's find the Save function now.

function Save(result) {
        $("#map_json_data").val(result);
        }

It looks like it is grabbing an element on the page with an ID of 'map_json_data' and setting its value to the JSON representation of the shape data.
Let's look for other references to #map_json_data.


Right enough - I found a function called 'reload':

function reload() {
            var mapdata = $("#map_json_data").val();
            if (mapdata != null) {
                shapes = IO.OUT(JSON.parse(mapdata), maps);                
            }
        }

And this function was being called from the initialize function.
So I had come full circle.
It looks like when the page loads, the server back end populates the #map_json_data hidden element with JSON data describing the shapes that need to be displayed - then the initialize function is executed, which reads from this hidden field and renders the data.

In order for us to modify the shapes displayed on the map, we simply need to call the 'Save' function to store JSON data to the hidden field, and then call initialize again - to fool the map into thinking it has just loaded.

My code ended up looking like this:

string saveShapeJS = "Save('[{\"type\":\"CIRCLE\",\"id\":null,\"radius\":139204.9220945524,\"CircleLatLng\":[42.288119312630066,11.962269477805648],\"ID\":0}]');initialize();";

It is important to note that the " symbols need to be escaped - or alternatively, you can convert your JavaScript string to base64 and execute it via atob(), which frees you from having to escape the quotes.
When executed, this code saves that JSON string to the hidden field and then calls the initialize function, which makes the map load the shape data specified.
Furthermore, if you then click any relevant submit button, it should store the data to your server / database back end.
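For reference, here is a rough sketch of how that string might be executed from the C# side via Selenium's IJavaScriptExecutor. The `driver` variable is an assumption (whatever IWebDriver instance your test already holds); the payload is the same circle JSON as above.

// Assumes 'driver' is your existing IWebDriver instance (e.g. a ChromeDriver).
IJavaScriptExecutor js = (IJavaScriptExecutor)driver;

// The escaped JSON payload from above - a single circle shape - followed by a re-initialise.
string saveShapeJS =
    "Save('[{\"type\":\"CIRCLE\",\"id\":null,\"radius\":139204.9220945524," +
    "\"CircleLatLng\":[42.288119312630066,11.962269477805648],\"ID\":0}]');" +
    "initialize();";

js.ExecuteScript(saveShapeJS);

If you prefer the base64 route mentioned above, you can pass the encoded string as an argument instead and have the page decode it with atob() before evaluating it.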

See below the before and after images...

 


 

Many automation engineers I have worked with in the past have often said 'That won't be achievable with automation'...
In my own experience, almost anything is possible with automation if you have the willingness to investigate and maybe bend the rules in your approach to the problem. While the above code may not be clicking and dragging to generate a circle on a map, it does test the functionality of our website - we are able to store a circle over a geographic area and it persists to the database.

If we want to get into real semantics - do you think Selenium is actually clicking on buttons in Chrome the way a human does? This may come as a shock, but Selenium does that by sending commands to ChromeDriver.exe, which synthesises those interactions for you.
So since we are already bending the rules - and not testing 'exactly' the same way a user of the site would - why not go a step further, if it gives you that extra bit of coverage in your testing.

 

Visual QA with AForge

A while back, I was tasked with assessing the viability of doing Automated Visual QA within the organization I worked in. The idea was that we would create Automated Selenium tests that would be able to perform run time visual checks on the pages and elements as we needed and return a pass or fail to say if the content had changed enough to warrant a bug being raised.

I assessed a few options; one of the commercial tools considered was Applitools, which integrates seamlessly with Selenium. In fact, their own Applitools driver is merely a custom-built IWebDriver implementation, so it can wrap a Chrome WebDriver just as easily as an Internet Explorer driver.

All in all, Applitools was a good candidate. However, I also wanted to see if it was feasible to perform the validations ourselves internally, as our company had some concerns around sending screenshots of our test or UAT environments to a third party provider such as Applitools. After all, while we are agreeing to Applitools' Terms and Conditions, they are not necessarily signing a Non-Disclosure Agreement to ensure that our screenshots remain private and are deleted when no longer needed.

So to begin developing our own solution, I looked at some open source libraries, and AForge looked very promising.
It had imaging libraries designed to give a percentage ranking of the similarity between two images.

All in all, it looked like AForge would be a simple solution to use, but then I encountered other issues.

For any visual comparison, you need two things.

  1. The image you are comparing
  2. The exemplar image you are comparing against

For number 2, this was going to be a static image of the element or page as it should appear. To ensure that we captured the exemplar image in a state that had a chance of matching the image taken at test execution time, we used the test itself to capture the image.
This was essentially accomplished by calling a custom screenshot method I created that uses the element's X,Y coordinates to capture its bounding box (a rough sketch of the idea follows below).
It should be noted that if you perform your validations at a lower level, you stand a higher chance of getting more accurate results.
Capturing the whole page may be quick, but it can also be dirty. You can end up with false positives being raised due to text differences, offsets, date times etc.
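As a rough illustration of that element-level capture - not the exact method from my framework, just a sketch assuming standard Selenium and System.Drawing - the idea is to take a viewport screenshot and crop it to the element's bounding box:

using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using OpenQA.Selenium;

public static Bitmap CaptureElement(IWebDriver driver, By locator)
{
    IWebElement element = driver.FindElement(locator);

    // Take a screenshot of the whole viewport first.
    Screenshot screenshot = ((ITakesScreenshot)driver).GetScreenshot();
    using (var stream = new MemoryStream(screenshot.AsByteArray))
    using (var fullPage = new Bitmap(stream))
    {
        // Crop down to the element's bounding box.
        var bounds = new Rectangle(element.Location, element.Size);
        return fullPage.Clone(bounds, fullPage.PixelFormat);
    }
}

Because the exemplar and the runtime capture are produced by the same code path, the two images stay comparable.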

 

When using AForge, I discovered that a 'DifferenceMask' was a really good way to perform the comparison. If the comparison fails, it can output a difference image to show the actual differences as they appear on screen. This also allowed me to debug and tinker with the tool to make it more robust and reliable. 

I found that when the tool ran on other browsers, or even just ran remotely on a Selenium Grid node, this would sometimes result in 1-2 pixel offsets in the images.
1-2 pixels doesn't sound like a whole lot, but it can be enough to make the entire image register as 'different'.

How do we solve that?
The way I solved it was to create an algorithm that would eliminate the false positives, but also extrude and grow the real issues we care about. This was accomplished through a sequence of brightening the image while blurring it at the same time.
Bear in mind that the image at this stage is a difference mask image - so it is typically inverted before it gets to this stage:

public static Bitmap EnlargeDifferences(Bitmap btm)
{
    // Blur away the tiny 1-2 pixel differences, then brighten whatever survives
    // so the next blur pass cannot remove it.
    FiltersSequence filterSequence = new FiltersSequence();
    GaussianBlur filterBlur = new GaussianBlur(3.4D, 800);
    HSLLinear filterBrighten = new HSLLinear();

    // configure the brightening filter
    filterBrighten.InLuminance = new Range(0.00f, 1.00f);
    filterBrighten.OutLuminance = new Range(0.00f, 10.00f);

    filterSequence.Add(filterBlur);
    filterSequence.Add(filterBrighten);

    // Do 5 passes - to try and expand the changes
    FilterIterator fi = new FilterIterator(filterSequence, 5);

    // To24bppRgbFormat() is a small helper extension that clones the bitmap into 24bpp RGB.
    Bitmap bReturn = fi.Apply(btm).To24bppRgbFormat();

    return bReturn;
}


What we are aiming to accomplish with this code is that 1-2 pixel differences will be blurred out of existence, but anything that remains will be brightened so the next pass of the blur will not remove it.

An example can be seen below:
Let's imagine our exemplar image is this:


However, at run time, we capture this:


Selenium will have no easy way to determine that the image has not loaded due to a 404 issue.
It will see that a div or img is in the DOM, and assume that it must be fine.

With AForge, you can however build a difference map.


It then shows something like what you see above.

The 1-2 pixel false positives I told you about look like this - see below.



To eliminate these false positives, but retain the real difference, namely that the car image has not loaded, we use the process of Blur and Enhance.


 

Another example of how it might look would be:


In the above difference map, the car is not meant to be hidden by a popup, but it is.


The important thing with Visual QA is not that the tool can understand what the differences are, but that it can spot them and distinguish them from false positives. After all, we only want a boolean result. Does it match, or does it not?

For the final comparison, I recommend using an ExhaustiveTemplateMatching from AForge.
Compare your blurred image against a black image of the same dimensions. (Difference mask images are always set against a black background - assuming the images match) 

var tm = new ExhaustiveTemplateMatching(similarityThreshold);
var results = tm.ProcessImage(blurred, black.To24bppRgbFormat());

The results will then contain a similarity percentage, which you can pass or fail your test on.
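Putting those two lines in context, a minimal sketch of the final pass/fail decision might look like the following - the 0.99 threshold and the NUnit assertion are illustrative choices, not fixed values:

// similarityThreshold is the minimum similarity the matcher will even report on (e.g. 0.90f).
var tm = new ExhaustiveTemplateMatching(similarityThreshold);

// Compare the blurred difference image against a plain black image of the same dimensions.
TemplateMatch[] results = tm.ProcessImage(blurred, black.To24bppRgbFormat());

// If the blurred difference image is effectively all black, the best match will be ~100%.
bool imagesMatch = results.Length > 0 && results[0].Similarity >= 0.99f;
Assert.IsTrue(imagesMatch, "Visual difference detected between exemplar and runtime capture.");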

If you really want to identify the causes of the image differences, then AForge provides the ability for you to identify blobs in your blurred image, and then draw blob boundaries around the same coordinates.
I would recommend drawing these boxes on the un-blurred image, so the image makes more sense to the tester reviewing the results.
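A hedged sketch of that blob approach with AForge's BlobCounter, drawing the boundaries onto a copy of the un-blurred capture (the method name and the minimum blob size are my own illustrative choices):

using System.Drawing;
using AForge.Imaging;

public static Bitmap MarkDifferences(Bitmap blurredDifferences, Bitmap originalCapture)
{
    // Find the bright 'difference' regions left behind after the blur/brighten passes.
    var blobCounter = new BlobCounter { FilterBlobs = true, MinWidth = 5, MinHeight = 5 };
    blobCounter.ProcessImage(blurredDifferences);
    Rectangle[] regions = blobCounter.GetObjectsRectangles();

    // Draw the boundaries onto the original capture so the tester can see what changed.
    Bitmap marked = new Bitmap(originalCapture);
    using (Graphics g = Graphics.FromImage(marked))
    using (Pen pen = new Pen(Color.Red, 2))
    {
        foreach (Rectangle region in regions)
        {
            g.DrawRectangle(pen, region);
        }
    }
    return marked;
}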

 

Inheriting Pages - C# Inheritance but with pages

In the previous article we covered the page classes. We showed that we could create a page class that contains one or more objects.
A question was asked recently about how we share common objects across two different web applications.

 

The answer: Inheritance 

Imagine we have a 'Footer' that sits at the bottom of our page.
Some of the elements on the footer are identical on all of our web applications, except for a few, where some elements are specific to the application in question.

public class Footer : Page
    {
        public Footer() : base(By.Id("footerId"))
        {
            Console.WriteLine("The Footer has loaded on the page");
        }

        // Allows the more specific footers below to synchronize on their own locator.
        protected Footer(By locator) : base(locator)
        {
            Console.WriteLine("The Footer has loaded on the page");
        }

        public Link TermsAndConditionsLink()
        {
            return new Link(By.Id("terms_conditions"));
        }
    }

 

Instead of having 3 different page classes that have the Terms & Conditions link duplicated, we can use inheritance to share the Terms & Conditions link with the other two more specific types of footer.
This is nothing new, as we did it with our BaseElement and Specialized Objects.
Eg: BaseElement is shared across Link and Element.  This allows Link and Element to have access to the Verify Text and Get Text methods etc.

Let's imagine our specialized classes are called
SalesFooter and CallCenterFooter

public class SalesFooter : Footer
    {
        public SalesFooter() : base(By.Id("salesFooter"))
        {
            
        }

        public Element SalesDemoLink()
        {
            return new Element(By.Id("salesDemo"));
        }

    }

    public class CallCenterFooter : Footer
    {
        public CallCenterFooter()
            : base(By.Id("callCenterFooterId"))
        {

        }

        public Link CallTheCallCenter()
        {
            return new Link(By.Id("salesDemo"));
        }

    }



Both of these classes inherit from the 'Footer' class; as such, they can share the TermsAndConditionsLink.
However, the SalesDemoLink cannot be shared with the CallCenterFooter class, and vice versa.
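A quick usage sketch - the link text asserted here is purely hypothetical:

// Both footers expose the inherited Terms & Conditions link from the base Footer class...
new SalesFooter().TermsAndConditionsLink().Click();
new CallCenterFooter().TermsAndConditionsLink().Click();

// ...but only the SalesFooter knows about the sales demo element.
new SalesFooter().SalesDemoLink().VerifyText("Book a demo");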


Collaborative Development - using Jira, Bitbucket and Git to work effectively

Like many Quality Analysts, the first form of source control I was exposed to was something like SVN or CVS. They were simple tools to work with: you check code out, you check code in. Many of these tools, however, were problematic when it came to merging and collaborating in a busy development environment.

The company I work for has been using Bitbucket for a while now for the development aspect of our product. To briefly describe this process...

Our base product lives in the master repository in Git (this would be equivalent to 'trunk' in other source control systems).
When we have a new feature that needs to be developed in Jira, we assign that task to a developer and he creates a 'feature branch' of the master repository.
This effectively clones the base product into an identical branch, but one dedicated to the development of the single feature he is assigned to. As time goes on and he completes his feature, he is then ready to try and merge it back into the master branch. Now, this process should have unit tests or some other quality check in place to ensure that nothing dodgy is merged into master; we use a mix of unit tests and peer reviews. This merge process is the 'Pull Request' - we are effectively requesting that 'master' pull our changes into itself.
In a tool such as Bitbucket, the pull request system is very configurable. You can require one or more reviewers and approvals, and you can also require that the code builds successfully.

 

 

Generally speaking, when two or more developers are working in a system, they try to stay away from the class files the other is changing. This is just a courtesy to avoid messing with code that the other developer might need to work on.

Short of telepathy or the invention of a hive mind for developers, there is no way to avoid situations where two developers do end up working on the same code.
This is where 'merge' issues may arise. Tools such as Bitbucket and Git provide features to mitigate the pain caused by merges. Git will give you options to perform intelligent merging and so on. That being said, Git is just a tool and will never be able to tell whether the merged code is up to standard. The typical way of handling merges is for the developer who is about to raise his pull request to perform a Git 'rebase'. This replays his changes on top of the latest master branch, so any new changes in master are included in his own modified code. The onus is on the developer doing the pull request to ensure that the code is properly merged and no conflicts remain.

So what has this got to do with Quality Assurance or Automation?

As I said above, this is the process that many development organizations work to. We are now trialling it internally, in Automation and in Quality Assurance more generally.
I have mentioned previously that I have a framework of my own design that I use for automation within the company I work for.
A dependency diagram of it is shown below...

You're probably thinking - Holy Crap!!

That's a lot of modules.
You're right... it is.
The framework was developed to be modular, decoupled and highly cohesive. However, the larger it grew, the more issues there were with maintenance and with managing its future development. This is where the collaborative development process above comes into play, along with coding standards and other processes for managing code quality.
Full disclosure: when I first developed this framework, I was the only author, it grew organically, and it was all done in the Git master branch. No branches for features...

Now that other people within the organization are contributing to the framework, we need a quality control mechanism to ensure that bad code cannot be committed to master.
Borrowing from the process our developers use, we introduced the feature branching strategy along with a peer review approval process for pull requests.
In addition, we are in the process of adding unit test projects for each feature. These get built and executed by TeamCity, which will either pass or fail. Failed builds / executions will not get deployed to our Artifactory server and will not get consumed by the users of the automation framework via NuGet. (Each feature is a separate NuGet package.)

To give an example of this process, it might go something like this...

The framework exists, but we want to add Appium functionality, we also want to add functionality to allow for database communication with a Redis database.

Sally is free to work on one of the features, so she clones the whole framework to an Appium branch - she will build the Appium integration feature.
James is going to work on the Redis functionality, he clones to a Redis branch.

They both work on their respective features, getting the functionality working, ensuring that everything builds, existing unit tests are run to ensure no regression and they create new tests to ensure their feature never breaks either. James is finished, he performs a rebase to pull down any changes that have made it into master while he was working on his branch. No changes are detected, so he commits to his branch.
He goes into BitBucket and raises a Pull Request from his branch, into master - Sally and John are listed as reviewers.

Sally is busy finishing her Appium functionality, but she has enough time to perform a quick review of the pull request, she examines the code and approves it, John performs a review also. The pull request is approved and makes it into the master branch.
Sally now wants to commit her work to master, she performs a rebase and finds that James modified one of the classes she was working with. She reviews it and merges it manually to ensure there are no compatibility issues. She runs the unit tests to ensure everything passes. It seems fine. She commits to her branch and starts a pull request.
John and James now review her work, approve it and merge it into master.

TeamCity would then build their respective features and deploy them to Artifactory as NuGet packages. These are then available to anyone in the organization from the private NuGet repository URL.

This way features can be developed collaboratively, tested and deployed and consumed by people in the organization.

Once the code is merged into master, the theory is that you can close out the Jira task the feature belonged to, and the feature branch can be deleted and cleaned up - it is no longer relevant, since its code now exists in master.

Page object model - with added Synchronization

Perhaps one of the most common approaches to Web UI Automation is the concept of 'Page Object' model. The idea that you can represent the page before you via code or some other interface to facilitate interactions with its various elements. For most, the page object approach is a simple case of cramming your objects into a class and using the objects directly, not caring about the page they are in or any of the logic that may need to be considered beforehand.

For me however, I have a somewhat different approach..

It is true that Selenium provides an implicit synchronization feature which allows you to 'not care' about waiting for objects to display. Selenium will naturally have to wait for the object to appear before it can interact with it. While this feature works well, sometimes it is not enough. It can be augmented with Page Object Synchronization.

 

 Firstly, the interface I am going to be using for this system is really just added for best practice. It is designed to encourage users to provide a page name and description for the page in question.

//Use of interface helps to prevent user error when creating new page object models
    interface IPage
    {
        void GetPage();
        string GetPageName();
        string GetPageDescription();
    }

 

When I use this interface to create the base page class we end up with a class like following. Yes, I am also implementing IDisposable, I will explain why soon.

public class Page : IPage, IDisposable
    {
        /// <summary>
        /// Use this constructor if you are not concerned about synchronization
        /// </summary>
        public Page()
        {

        }

        /// <summary>
        /// Use this constructor if you wish to automatically synchronize on an object on screen.
        /// </summary>
        /// <param name="locator"></param>
        public Page(By locator)
        {
            this.LocatorForPage = locator;
            this.GetPage();
        }

        public void GetPage()
        {

        }

        public string GetPageName()
        {
            return null;
        }

        public string GetPageDescription()
        {
            return null;
        }
    }

The methods that return 'null' above are placeholders - it is recommended you populate them with useful information. They can be handy when you want logging and the like to tell the test framework what page it failed on.

Similar to how we use our BaseElement / Specialized object classes, we can also feed in a 'locator' or 'By' class to the constructor for this class. The purpose for this is so we can tell the framework what objects we should 'synchronize' on. There is no point in looking for the Username field on a login form, if the login button has not appeared yet for instance. As a rule of thumb, I recommend your synchronization object should be the last object you anticipate appearing on screen.

 

IDisposable will require us to implement a dispose method.

public void Dispose()
        {
            LocatorForPage = null;
        }
public By LocatorForPage { get; set; }


When we use the constructor, we store the By information in this property; when we dispose of the page class, we want to release it.
.NET should take care of this cleanup for us, but it is generally a good principle to try and dispose of / clean up resources whenever you can.

Once again, similar to the BaseElement class and its 'FindObject' method, the 'GetPage' method is going to do much of the heavy lifting - but because we have already implemented most of the object functionality in our BaseElement class, we can reuse that functionality here.

public void GetPage()
        {
            if (DetectErrorPageEventHandler != null)
            {
                DetectErrorPageEventHandler();
            }

            //If a locator is provided, sync on it, else dont
            if (LocatorForPage != null)
            {
                try
                {
                    //Syncs on the object if found
                    //Else raises an exception
                    BaseElement be = new BaseElement(LocatorForPage, true);
                }
                catch (Exception)
                {
                    //Fall back to the class name if the page does not provide a friendly name
                    string pageName = GetPageName() ?? this.GetType().Name;
                    throw new Exception("Page not loaded:(" + pageName + "), The object identified by " + LocatorForPage + " was not located within the timeout period. This may indicate that the page has not loaded.");
                }
            }
        }

You can see that this method does not concern itself with any timeouts or WebDriverWaits - instead it just tries to instantiate the BaseElement class with the locator you have specified for the page. In the event the object is not found within the implicit wait timespan, an exception will be thrown to say that the page did not load.

You may be wondering what the DetectErrorPageEventHandler is...
In most company / product websites, the product will typically have an 'error' page - a catch-all page that is displayed in the event that something went wrong.
This could be a maintenance page or perhaps just the bland IIS error page.
When these pages appear, you know instantly that your page has not loaded and will not load from that point on. However, Selenium would continue to wait for the timeout to occur. It has no idea or concept of what an error page looks like.

With event handlers, you can tell it how to recognize these error pages.

Add the following region to your page class.

#region Error Detection
        /// <summary>
        /// This delegate is used to define an Error Page detection method
        /// </summary>
        public delegate void DetectErrorPageDelegate();

        /// <summary>
        /// This event is fired when an error page is detected.
        /// </summary>
        public static event DetectErrorPageDelegate DetectErrorPageEventHandler;
#endregion

The first thing to be aware of is that these handlers are static, so they will be applied globally within your test project. You do not set error page detection for a single page class; you set it for all page classes.

How do you inject your detection code into the class?

At the start of your test, you need to call something like this.
This is still written in the syntax of Selenium along with SpecFlow. If you are just using NUnit, then you might use a fixture-level setup attribute such as 'TestFixtureSetUp' instead.

[BeforeFeature]
public static void SetupErrorDetection()
{
    Page.DetectErrorPageEventHandler += ErrorDetection;
}

public static void ErrorDetection()
{
    BaseElement errorMessage = new BaseElement(By.Id("error_warn"), false);
    if (errorMessage.Exists())
    {
        throw new Exception("The test environment's error page was detected - please investigate...");
    }
}

This basically means that when you instantiate a Page class, it will automatically check for the existence of the 'error_warn' element.
Note the false being provided as the second argument to the BaseElement. This prevents it from performing the default implicit wait.
We do not want our page object classes to wait 15 seconds for an object that we do not want to appear.
This allows the code to instantiate the errorMessage object and then perform a straight assertion on whether the errorMessage exists.
Note - in Selenium there is a difference between an element existing and it being displayed, so you may need to adjust your code to the way your DOM works. More recent front-end technologies such as Bootstrap and jQuery plugins often leave elements always present in the DOM, but not visible.
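As a rough illustration of that distinction, using plain Selenium calls rather than the framework's BaseElement (the `driver` variable is assumed):

// Present in the DOM at all? FindElements never throws - it just returns an empty collection.
bool existsInDom = driver.FindElements(By.Id("error_warn")).Count > 0;

// Present AND actually rendered/visible to the user?
bool isDisplayed = existsInDom && driver.FindElement(By.Id("error_warn")).Displayed;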

Putting this all together, how does this look in a real live example...

public class LoginPage : Page
    {
        public LoginPage()
            : base(By.Id("signin_page"))
        {

        }

        public Textbox AccountNumberTextbox()
        {
            return new Textbox(By.Id("AccountNumber"));
        }

        public Element AccountNumberValidationElement()
        {
            return new Element(By.Id("AccountNumberValidation"));
        }
    }

 

If our login page class looks like the above, then our usage of it could look like...

	using(LoginPage page = new LoginPage()){
		page.AccountNumberTextbox().SetText("1214124234");
	}

It is my opinion that the 'using' statement helps keep your test code clean and concise - it prevents you from interacting with objects that exist on 'other' page classes and allows you to synchronize on pages more accurately than simple implicit waits.

C# also allows you to build more complex page object models using inheritance.

Let's imagine you have a menu bar that is accessible on ALL pages within your application.
You could define this as one page object class and then inherit from it - this effectively merges the contents of two page object classes.

public class MenuBar : Page
    {
          public MenuBar() : base(By.Id("menu_bar_authed")){

          }

          // Allows inheriting pages to synchronize on their own locator instead.
          protected MenuBar(By locator) : base(locator){

          }

          public Link AccountInformation(){
                     return new Link(By.Id("acc_info"));
          }
    }

 

If this is your menu bar page, it is synchronizing on an element 'menu_bar_authed' - this is just something I made up...

public class AccountSummary : MenuBar
    {
          public AccountSummary() : base(By.Id("acc_summary_index")){

          }

          public Element AccountNumber(){
                     return new Element(By.Id("acc_number"));
          }
    }


You can see in the example above that we are inheriting from MenuBar instead of the 'Page' class.

The side effect of doing this, and then feeding in a locator for something other than the MenuBar locator, is that the 'using' statement will only synchronize on the locator specified by the inheriting class.
Eg: instead of synchronizing on the MenuBar, we will synchronize on something called 'acc_summary_index'.

This is fair enough because, in keeping with object-oriented principles, the inheriting class is the more specialized one. We do not care about the menu bar at this stage; we want to synchronize on something relevant to the AccountSummary page.

	using(AccountSummary page = new AccountSummary()){
		page.AccountNumber().ShouldEqual("1212132131");
		page.AccountInformation().ShouldExist();
	}

Using the approach above, you can see that the account information link and the account number element are defined in separate classes, but accessible within the same using statement thanks to the inheritance model.

 

Selenium Automation - Strongly typed classes

A while back on Stack Overflow, I was asked for examples of my work where I used custom wrapper classes to manage and enhance the functionality of Selenium.
I am going to briefly describe what I did, and how I did it.

Selenium, as a tool, considers everything to be an IWebElement.
While not inherently wrong, it is somewhat deceptive. Buttons and textboxes are both DOM elements, but they are not alike - executing SendKeys to set text on a button, or clicking on a textbox, may not be what we want.
Why would you want to click on a non-hyperlink type object, etc.?

From a black box perspective, you could look at IWebElement as a 'God Class' - that term used in code quality analysis for when a developer has crammed all the functionality into a single class or interface.

Within my framework, I worked in a different fashion.

First, I recognize that yes, all objects on a website are indeed DOM objects, or as I like to call them 'BaseElement' type classes.
So in my selenium library, I created the following class structure

 

BaseElement is essentially my 'God Class' - where I define as much functionality as I could possibly want.
Before we get as far as showing the functionality, first we want to streamline the grabbing or acquiring of objects from on-screen.

Normally, you would have to do

IWebElement textbox = Driver.FindElement(By.Id("textbox1"));
textbox.SendKeys("testing 123123");

The issue is that you repeatedly have to reference the Driver object and tell it to 'FindElement'.
It can be a little annoying to repeat that step over and over. It also looks more like a factory / script design approach, as opposed to object oriented.

What I do instead is have the constructor of my BaseElement class look like this:

        protected IWebElement InnerObject;
        protected By OriginalByData;


        /// <summary>
        ///  Constructor for the base element 
        /// </summary>
        /// <param name="locator"></param>
        /// <param name="checkExists"></param>
        public BaseElement(By locator, bool checkExists = false)
        {
            OriginalByData = locator;
            if (checkExists)
            {
                FindObject();
            }
        }

The first thing you can see is that I am storing the By information for the object, while also having a field to hold the actual IWebElement object itself.
When I instantiate a BaseElement, I only need to provide the By information and, optionally, a boolean to say whether or not to check for the existence of the object when the class is instantiated. There may be instances where you do not want to check for existence at instantiation time - perhaps the object is not meant to be on screen at all.

        /// <summary>
        /// Private method that attempts to find the object
        /// Can be used for lazy / late loading of the object
        /// Eg: Define the By data, but then search for the object at a later time.
        /// </summary>
        private void FindObject()
        {
            if (this.InnerObject == null)
            {
                WebDriverWait wait = new WebDriverWait(GetWebDriver(), TimeSpan.FromSeconds(Constants.WaitTime));
                IWebElement obj = wait.Until(ExpectedConditions.ElementExists((OriginalByData)));
                InnerObject = obj;
                Highlight();
            }
        }

The FindObject method is the one that does the heavy lifting. For such a small method in my framework, it is responsible for acquiring the objects on screen and facilitating all the functionality we are about to go into.

Essentially - once the FindObject code executes, it will store a reference to the object in the IWebElement field on the BaseElement.
The Highlight method can be ignored - it is a method I have setup to optionally highlight the object on screen when it locates it.
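For the curious, a Highlight helper along these lines is easy to sketch - assuming GetWebDriver() returns something castable to IJavaScriptExecutor. This is an illustration, not the framework's exact code:

private void Highlight()
{
    // Briefly outline the located element so you can see what the test is touching.
    var js = (IJavaScriptExecutor)GetWebDriver();
    js.ExecuteScript("arguments[0].style.outline = '2px solid red';", this.InnerObject);
}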

I am about to describe to you how my click method works. Before we start, you are probably laughing, thinking....
Surely... it is just 
webObject.click();

Well......er... umm...

        /// <summary>
        /// Used for links, buttons, images etc
        /// </summary>
        protected void Click()
        {
            //Check to see if the object is still on screen
            FindObject();

            //Scroll to object - this prevents the 'Object is not within view, or not clickable' issue.
            GetWebDriver().ExecuteScript("arguments[0].scrollIntoView();", this.InnerObject);
            WebDriverWait wait = new WebDriverWait(GetWebDriver(),
                TimeSpan.FromSeconds(Constants.MediumWait));
            if (OriginalByData != null)
            {
                IWebElement obj = wait.Until(ExpectedConditions.ElementToBeClickable(OriginalByData));
            }
            if (InnerObject != null)
            {
                IWebElement obj = wait.Until(ExpectedConditions.ElementToBeClickable(InnerObject));
            }

            try
            {
                this.InnerObject.Click();
            }
            catch (InvalidOperationException e)
            {
                if (e.Message.Contains("Other element would receive the click"))
                {
                    //Try javascript click instead?
                    Console.WriteLine("===============================================");
                    Console.WriteLine("Warning: ATTEMPTING JAVASCRIPT CLICK INSTEAD");
                    Console.WriteLine("A soft exception warning that another element would receive the click was detected.");
                    Console.WriteLine("Javascript may be able to bypass this exception and progress the test.");
                    Console.WriteLine("===============================================");
                  GetWebDriver().ExecuteScript("arguments[0].click();", this.InnerObject);
                }
                else
                {
                    throw;
                }
            }
        }

 

A lot of code for something that only needs to 'click' an object.
Firstly, this method rechecks whether the object is still on screen with FindObject. With a dynamic page, objects can vanish as quickly as they appear. We do not simply want to acquire our object and then trust that it will exist for the duration of our test script. If we keep a single reference and never refresh it via 'FindObject', we are opening ourselves up to StaleElementReferenceExceptions. Refresh your objects frequently, if not every time you use them.

After the element is confirmed as being on screen and refreshed, it performs a WebDriverWait of up to X seconds for the object to become clickable.
You can see that I am performing this check on both the stored IWebElement and on the By data. This is intentional.
Remember, instantiating the BaseElement with false for the existence check will not acquire an IWebElement at instantiation time - so the IWebElement may not be available, and in that case we can perform the check on the By data instead, assuming it is stored.

Now comes the actual click.
You can see I am doing it in a try/catch.
I could catch all exceptions, but I don't want to - I only want a specific type of exception here.
InvalidOperationExceptions are thrown in situations where DOM elements and hidden objects overlay the object you want to click on. Selenium tries to be smart and warns you that the object is obscured and therefore should not receive the click.
I say - so what...
If I say 'Click on that damn object', I mean... click on that damn object!!
So what I do is check the message on the exception; if it is the expected exception message, I perform the 'click' using JavaScript instead of the traditional Selenium functionality.

JavaScript is a powerful tool that can augment your automation capabilities, but you need to be aware that using it can be risky.
If your intent is to simulate human interactions, then JavaScript is the wrong direction for you.
If you use JavaScript to set a textbox value, it will NOT trigger the event handlers on that textbox.
This can be important for things like registration pages which have onchange event handlers.

We have all seen those registration pages where you type in a username and it magically tells us if the username is available. It typically does this on keyup or when the text value of the box changes. When we type via the keyboard, the JavaScript event is triggered, but if JavaScript itself changes the value, it generally does not trigger the event. This is possibly to avoid circular execution calls. Eg: JavaScript changes the box value, which triggers the event, which changes the box value, which further triggers the event... and so on.
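If you do end up setting a value via JavaScript and still need those handlers to fire, one workaround - my own sketch, not something from the framework above - is to dispatch the event yourself after setting the value:

// Set the value, then manually raise the 'change' event so any listeners still run.
// The textbox value here is purely illustrative.
var js = (IJavaScriptExecutor)GetWebDriver();
js.ExecuteScript(
    "arguments[0].value = arguments[1];" +
    "arguments[0].dispatchEvent(new Event('change', { bubbles: true }));",
    this.InnerObject, "desired_username");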

In the case of our click method, I think it is an acceptable risk. It does mean that you will have a more reliable click method, but it just won't behave 100% like a human being.

You should try to write as many methods in your BaseElement class as possible, ranging from SetText and SendKeys to drop-down interactions etc. I won't walk you through all of those; I am sure you can expand on what I have done above.

I should have mentioned before... why this method is protected. Well... that's because we are going to be inheriting this class.

Create a new class, call it Link.cs or something similar.

About the constructor here: because we are inheriting the BaseElement class, we need to emulate its constructor. Instead of providing false as the default for bCheckExist, I thought it should be true.
There is also another constructor available for feeding an IWebElement into the class directly. Ignore this, as it is for some advanced functionality I have that casts from one class type to another (Link to Image etc).

public class Link : BaseElement
    {

        public Link(By locator, bool bCheckExist = true)
            : base(locator, bCheckExist)
        {
            
        }

        public Link(IWebElement wb)
            : base(wb)
        {
            
        }

        public new void VerifyText(string arg, bool ignoreCase = false)
        {
            base.VerifyText(arg, ignoreCase);
        }
        public new void Click()
        {
            base.Click();
        }

    }

The only two methods declared here are VerifyText and Click.

Because our 'Click' method in the BaseElement was declared as protected, it cannot be accessed by the tester directly on a derived type. In order to provide access, we declare public new void Click() and have it call base.Click().

What is the value of this?
In this particular case, not much. However, let's imagine your company has developed a special new type of control. Your click function no longer works on it and you need to do something special, but you don't want to change the behaviour for all other buttons / links. You could then write a class that inherits from Link.cs and provides its own special Click functionality, as sketched below.
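As a sketch of that idea (the class name and the extra handling are hypothetical):

public class FancyWidgetLink : Link
{
    public FancyWidgetLink(By locator, bool bCheckExist = true)
        : base(locator, bCheckExist)
    {
    }

    // Hide the inherited Click and do whatever this special control needs,
    // without touching the behaviour of every other Link in the framework.
    public new void Click()
    {
        // e.g. hover first, wait for an animation to finish, then fall back to the base behaviour.
        base.Click();
    }
}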

How does this look at test design time?

Link btn = new Link(By.XPath(p0));
btn.Click();

or

new Link(By.Id("myButton")).Click();

You can see that I am no longer having to worry about interacting with the WebDriver - it is handled in the background as a static variable that only gets used in the BaseElement class.

When we use this in a Specflow test project, it could end up looking like this: (note - this is just a dummy step)

        [When(@"the user clicks on button '(.*)'")]
        public void WhenTheUserClicksOnButtonI(string p0)
        {
            ButtonLink btn = new ButtonLink(By.XPath(p0));
            btn.Click();
        }

Of course, the approach I would usually try to sell is a page object model, where each object on screen is handled by either a method that instantiates and returns the object, or a field that does the same.
It might look like:

public class LoginPage : Page
{
     public TextBox UsernameTextBox(){
                return new TextBox(By.Id("username"));

     }
}

You can see I am using another special class that I inherit from for my page objects. This is to help synchronize on pages to improve reliability of tests.
I might cover that in a future blog post.

 

 

Framework Design - Statement of Intent

Before you start constructing your automation framework, you need to figure out what you are looking to achieve first, so you can then assess whether you have succeeded in the task later on. In the AGILE world, this is called defining your 'Definition of Done'.
Frameworks have a habit of growing organically and while this can be a great thing when it follows a design pattern, it can be horrible if it occurs randomly and without planning.
It essentially becomes the difference between an oak tree growing upwards towards the sky and something that is cancerous and growing out of control. One follows a pattern of behavior while the other is random and serves no benefit. 

 

Let's write our 'statement of intent' - what we want to achieve with our framework.

'I want an automation framework that can be installed into a test project and allows the user to start writing automation code almost instantly with minimal setup involved. The framework should be modular and decoupled to facilitate maintainability. The framework will be built using .Net C# so our developers can use the same framework to write their Unit Tests. BA's currently write their specifications using the Gherkin syntax, so the framework will have Specflow at its heart to allow mapping of tests to requirements. The framework should promote a solution / project layout to help keep its users consistent in their approach.'

From the above statement we get:

  1. Must be a quick installation process
  2. Modular and decoupled
  3. Using .Net C#
  4. Using Specflow
  5. Framework promotes a pattern/layout

While these are by no means the only requirements we care about, these are the ones that will be required for our definition of done to be achieved. This is how we determine if our framework is ready for consumption.

Now that we have our needs or 'requirements' - we can start planning how we want to achieve these at a high level.

Requirement #1 could be achieved using Nuget packages for instance. Nuget packages allow you to attach example files, class files and documentation that will get installed into a project when the Nuget package is consumed. We could have a Nuget Package called 'Automation.Framework.Base' for example.

Requirement #2 could also be achieved using Nuget packages, but instead of packaging the whole framework, we would separate the individual pieces of functionality into individual Nuget packages. This means that updating one piece of functionality should not necessarily impact another area of functionality.

Requirement #3 is achieved simply by using C# as our language of choice. Training and documentation will be important if your QA staff are inexperienced in .Net C#.

Requirement #4 requires that we use SpecFlow as our BDD language interpreter. By using C# we have effectively ruled out Cucumber and other Gherkin BDD tools that do not target C#; SpecFlow remains the logical choice for C# based test frameworks. Installing it is twofold - we need to install the NuGet package for Specflow.NUnit.Runners as well as the Visual Studio SpecFlow extension. Without the extension, Visual Studio will interpret .feature files as plain text files.

Requirement #5 can also be facilitated through use of Nuget. A properly configured .nuspec file can create folder structures, create example files and setup app.configs seamlessly without requiring the user to provide input. 

Our above statement has given us 5 requirements which can be solved through various tools and strategies.
The next post will delve into Visual Studio and show how to get started with our decoupled framework.

Automation Frameworks - design and planning

During my career, I've had the pleasure, and sometimes displeasure, of building and then maintaining automation frameworks for numerous clients. All of them ended up being used and evolved over time - in some cases into something better than envisioned, in others devolving into a monstrosity of high maintenance and aggravation.

It's inevitable that requirements will change when dealing with software development and testing. All we can do is plan ahead and plan for change when it eventually strikes. Anticipate the eventuality that your manager will one day tell you he wants to change the credentials used by 2000 tests; if you plan for this, you can make those extreme cases easier to resolve if and when they arise.

Decoupling & Cohesion

Anyone who is familiar with object-oriented development principles will know that decoupled code is generally good, and that cohesion is often the measure of whether something is meaningful.
These principles are often left out of consideration when it comes to QA automation frameworks. After all, we aren't real developers, are we? Wrong...

If you write a script, a program or a piece of code that ends up automating another application, I would put forward that you have taken a step or two towards being a developer. You may not have trained or become qualified to wear the job title of Software Developer, but at the end of the day, you have produced software, as simple as it is...

Why is Decoupling good for an Automation Framework? Or any piece of software for that matter.
Coupling basically means you are interlinking modules or segments of an application in such a way that if one piece changes, lots of the application code has to change to accommodate it.
This turns the task of fixing a bug or changing a small piece of code into a headache that takes more man-hours and eventually costs the company more money.

Decoupling software or making the software modular helps alleviate this issue by making each piece of software self-contained and highly cohesive. 
If you have all the code pertaining to a specific piece of functionality in one place, or in one module, then you will have fewer changes to make in other areas of your application. It also means that you won't have random pieces of code from one area of functionality sticking out like a sore thumb in the wrong module or library.

This is often achieved using interfaces or reflection.
Developers all over the world are reading the word Reflection above and gasping in horror. Reflection is often considered a dirty word when you work in software development and have full control of your code base. I mean... why would you need to use reflection if you can just use an interface? How you achieve decoupling is not as important as achieving it; everyone has their personal preferences.
Reflection within object-oriented programming is a tool like any other and should not be something to fear. Like XPath, it will only produce good results when used by someone who knows what they are doing.
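To make that concrete, here is a small hedged sketch of the two approaches working together - an interface that modules implement, and reflection used to load an implementation without a compile-time reference. The names and the assembly path are made up for illustration:

using System;
using System.Linq;
using System.Reflection;

// The contract lives in a shared, stable assembly - consumers only depend on this.
public interface IReportingModule
{
    void PublishResult(string testName, bool passed);
}

public static class ModuleLoader
{
    // Load whichever implementation has been dropped alongside the framework at run time.
    public static IReportingModule LoadReportingModule(string assemblyPath)
    {
        Assembly assembly = Assembly.LoadFrom(assemblyPath);
        Type moduleType = assembly.GetTypes()
            .First(t => typeof(IReportingModule).IsAssignableFrom(t) && !t.IsAbstract);
        return (IReportingModule)Activator.CreateInstance(moduleType);
    }
}

The test project only ever talks to IReportingModule; swapping the implementation becomes a deployment decision rather than a code change.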

In the next article, I will explain the issues posed by poorly designed automation frameworks and the necessity for objective planning.

 

 

 

 

A little bit about me

Having worked as a consultant previously in another life (another career line), this is the bit that is commonly referred to as 'showing one's credentials', or explaining why my opinion is worth anything. Don't get me wrong - at the end of the day everything on this blog is my opinion, but it's opinion based on experience and, in most cases, fact ;-)

I am currently a .Net Developer in Test, although I have experience in the following languages, oldest to most recent.

VB Script
LUA
PHP
Javascript
C#
Java
C++
Solidity (Ethereum)

I first started working in the QA industry around 2005, for the technology division of a certain insurance company **cough cough Allstate**.
Hardly a feather in my cap considering the pittance they paid me, but it was my first step on the road into the QA industry. Immediately I was working with HP Quality Center and HP QuickTest Professional, and this is where the VB Script comes in.

During and after the above role, I was messing around with games and programming, getting familiar with Lua, which is a sort of crossover between programming and scripting. I was also managing my own game servers at this time and building websites for those servers, which is where the JavaScript and PHP come in - and of course HTML as well.

After the above role, I went to work for another firm, this time one that specializes in security and surveillance software / hardware.
At this stage Microsoft was creating its first automation tool - Visual Studio Coded UI. Our company was among the beta testers for the Coded UI test functionality, and this is how I got my foot in the door with C#. I have since gone through that door and then some.

After the C# role I moved into Consultancy and Contracting and got experience with a wide variety of languages and tools.
Business Process Testing via HP QTP / ALM, Silk4J (Java), Selenium + Cucumber, Selenium + Specflow. 

Those are just the QA related ones. I eventually moved into development quality and began participating in audits and evaluations of client code, escrow engagements where I had to act as a broker between a buyer and seller of software rights etc.

After a long and arguably 'varied' journey through development quality, I have come back to my roots to perfect my craft in the automation industry.

So that is my career from 2005 to present - 12 years in the software industry and now I am heading up automation strategy for a Fin-Tech firm in the UK.

Blog created...

Well, it's the first day of the blog. Why not just create a Tumblr blog? After all, they will host it, manage it, theme it etc...
What can I say - I like to re-invent the wheel. There is something to be said for being able to see how an application works under the hood, and thus being able to expand and enhance it.

I plan on using this blog to post tips, tricks and general views on Automation in the QA/Development industry.
Some of the ideas may be controversial, I have never been one to go with the flow in the QA / Dev industry and some of my posts will elaborate on that further.