Handling flaky tests with Jenkins & JIRA


Nov 08 2017

Flaky tests are a plague: they undermine the credibility of your CI strategy by sending false positive notification emails.

In a previous blog post, I detailed a solution we use on the XWiki project to handle false positives caused by the environment on which the CI build is running. However, that solution didn't handle flaky tests. This blog post is about fixing that!

So the strategy I'm proposing for Flaky tests is the following:

  • When a Flaky test is discovered, create a JIRA issue to remember to work on it and fix it (we currently have the following open issues related to Flaky tests)
  • The JIRA issue is marked as containing a flaky test by filling a custom field called "Flickering Test", using the following format: <package name of test class>.<test class name>#<test method name>. There can be several entries, separated by commas (a sample value is also given right after this list).

    Example:

    jiraexample.png

  • In our Pipeline script, after the tests have executed, review the failing ones and check if they are in the list of known flaky tests in JIRA. If so, indicate it in the Jenkins test report. If all failing tests are flickers, don't send a notification email.

    Indication in the job history:

    joblist.png

    Indication on the job result page:

    jobpage.png

    Information on the test page itself:

    testpage.png
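For reference, here's what a "Flickering Test" field value could look like (the test names below are made up, purely for illustration; real values can be seen in the JIRA screenshot above):

org.xwiki.test.ui.AttachmentTest#testUploadMultipleFiles, org.xwiki.test.ui.CommentTest#testAddComment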

Note that there's an alternate solution that can also work:

  • When a Flaky test is discovered, create a JIRA issue to remember to work on it and fix it
  • Add an @Ignore annotation to the test with a message pointing to the JIRA issue (something like @Ignore("WebDriver doesn't support uploading multiple files in one input, see http://code.google.com/p/selenium/issues/detail?id=2239")). This will prevent the build from executing this flaky test (see the example just below).
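For illustration, this is roughly what it would look like on a JUnit 4 test (the class and method names are made up; only the @Ignore message is taken from the example above):

import org.junit.Ignore;
import org.junit.Test;

public class AttachmentTest
{
    // Hypothetical flaky test, disabled until the corresponding JIRA issue is fixed
    @Ignore("WebDriver doesn't support uploading multiple files in one input, see http://code.google.com/p/selenium/issues/detail?id=2239")
    @Test
    public void testUploadMultipleFiles()
    {
        // Test code exercising the flaky behavior...
    }
}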

This second solution is certainly lower-tech than the first one. I prefer the first one though, for the following reasons:

  • It allows flaky tests to keep executing on the CI and thus serve as a constant reminder that something needs to be fixed. Adding the @Ignore annotation feels like sweeping the dust under the carpet, and there's little chance you'll come back to it in the future...
  • Since our script runs as a post-build script on the CI agent, it's possible to add logic to auto-discover flaky tests that have not yet been marked as flaky (a rough sketch of such a heuristic is shown right after this list).
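To give an idea of what such auto-discovery could look like, here's a rough sketch of one possible heuristic (this is not part of our current script; the method name, the threshold and the number of builds inspected are all arbitrary): walk back the recent history of a failing test and flag it if its result has been alternating between pass and fail.

import hudson.tasks.junit.CaseResult

/**
 * Sketch of a flicker auto-discovery heuristic: a failing test whose recent history
 * alternates between pass and fail is a candidate flicker.
 */
def boolean looksLikeNewFlicker(CaseResult caseResult, int buildsToInspect = 10)
{
    def flips = 0
    def lastPassed = caseResult.isPassed()
    def previous = caseResult.getPreviousResult()
    for (int i = 0; i < buildsToInspect && previous != null; i++) {
        if (previous.isPassed() != lastPassed) {
            flips++
        }
        lastPassed = previous.isPassed()
        previous = previous.getPreviousResult()
    }
    // More than one pass/fail transition in the inspected window suggests a flicker
    return flips > 1
}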

Also note that there's a Jenkins plugin for flaky tests, but I don't like its strategy, which is to re-run failing tests a number of times to see if they pass. In theory this can work. In practice it means CI jobs that take a lot longer to execute, making it impractical for functional UI tests (which is where we have flaky tests in XWiki). In addition, flakiness sometimes only happens when the full test suite is executed (i.e. it depends on what executes before), and a flaky test can sometimes require a large number of runs before passing.

So without further ado, here's the Jenkins Pipeline script to implement the strategy we defined above (you can check the full pipeline script):

import hudson.tasks.test.AbstractTestResultAction

/**
 * Check for test flickers, and modify test result descriptions for tests that are identified as flickers. A test is
 * a flicker if there's a JIRA issue having the "Flickering Test" custom field containing the FQN of the test in the
 * format {@code <package name>.<test class name>#<test method name>}.
 *
 * @return true if the failing tests only contain flickering tests
 */
def boolean checkForFlickers()
{
    boolean containsOnlyFlickers = false
    AbstractTestResultAction testResultAction = currentBuild.rawBuild.getAction(AbstractTestResultAction.class)
    if (testResultAction != null) {
        // Find all failed tests
        def failedTests = testResultAction.getResult().getFailedTests()
        if (failedTests.size() > 0) {
            // Get all known flickers from JIRA (unresolved issues with a non-empty "Flickering Test" field)
            def url = "https://jira.xwiki.org/sr/jira.issueviews:searchrequest-xml/temp/SearchRequest.xml?".concat(
                "jqlQuery=%22Flickering%20Test%22%20is%20not%20empty%20and%20resolution%20=%20Unresolved")
            def root = new XmlSlurper().parseText(url.toURL().text)
            def knownFlickers = []
            root.channel.item.customfields.customfield.each() { customfield ->
                if (customfield.customfieldname == 'Flickering Test') {
                    customfield.customfieldvalues.customfieldvalue.text().split(',').each() {
                        knownFlickers.add(it)
                    }
                }
            }
            echoXWiki "Known flickering tests: ${knownFlickers}"

            // For each failed test, check if it's in the known flickers list.
            // If all failed tests are flickers then don't send a notification email.
            def containsAtLeastOneFlicker = false
            containsOnlyFlickers = true
            failedTests.each() { testResult ->
                // Format of a Test Result id is "junit/<package name>/<test class name>/<test method name>"
                def parts = testResult.getId().split('/')
                def testName = "${parts[1]}.${parts[2]}#${parts[3]}"
                if (knownFlickers.contains(testName)) {
                    // Add the information that the test is a flicker to the test's description
                    testResult.setDescription(
                        "<h1 style='color:red'>This is a flickering test</h1>${testResult.getDescription() ?: ''}")
                    echoXWiki "Found flickering test: [${testName}]"
                    containsAtLeastOneFlicker = true
                } else {
                    // This is a real failing test, thus we'll need to send the notification email...
                    containsOnlyFlickers = false
                }
            }

            if (containsAtLeastOneFlicker) {
                manager.addWarningBadge("Contains some flickering tests")
                manager.createSummary("warning.gif").appendText("<h1>Contains some flickering tests</h1>", false,
                    false, false, "red")
            }
        }
    }

    return containsOnlyFlickers
}
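
For completeness, here's a simplified sketch of how checkForFlickers() is meant to be wired into the rest of the pipeline (the real wiring is in the full pipeline script linked above; the test report pattern, email address and message below are just placeholders):

// After the tests have run, archive the results so that Jenkins computes the test report
junit testResults: '**/target/surefire-reports/TEST-*.xml'

if (currentBuild.result == 'UNSTABLE') {
    // Only notify if at least one failing test is not a known flicker
    if (!checkForFlickers()) {
        mail to: 'notifications@example.org',
            subject: "Build failed: ${env.JOB_NAME} #${env.BUILD_NUMBER}",
            body: "Some tests are genuinely failing, see ${env.BUILD_URL}"
    }
}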

Hope you like it! Let me know in the comments how you're handling flaky tests in your project so that we can compare and discuss.
