On software quality, collaborative development, wikis and Open Source...

Last modified by Vincent Massol on 2015/11/23 11:59

May 09 2018

Automatic Test Generation with DSpot

DSpot is a mutation testing tool that automatically generates new tests from existing test suites. It's being developed as part of the STAMP European research project (in which XWiki SAS is participating, represented by me).

Very quickly, DSpot works as follows:


  • Step 1: Find an existing test and remove some API calls. Also remove assertions (but keep the calls on the code being tested). Add logs in the source to capture object states.
  • Step 2: Execute the test and add assertions that validate the captured states.
  • Step 3: Run a selector to decide which tests to keep and which ones to discard. By default PITest/Descartes is used, meaning that only tests killing mutants that the original test didn't kill are kept. It's also possible to use other selectors. For example a Clover selector exists that will keep the tests which generate more coverage than the original test.
  • Step 4: Repeat (with different API calls removed) or stop if good enough.
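
To make the selection rule of Step 3 more concrete, here's a minimal Java sketch of a "keep only tests that kill new mutants" selector. It is not DSpot's actual code: the class name, the method name and the representation of mutants as plain strings are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the Step 3 selection rule; not DSpot's actual code.
class MutantKillSelector {

    // Keeps the names of the amplified tests that kill at least one mutant
    // that the original test did not kill. Mutants are identified by a string id.
    static List<String> select(Set<String> killedByOriginal,
                               Map<String, Set<String>> killedByAmplified) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : killedByAmplified.entrySet()) {
            boolean killsNewMutant = false;
            for (String mutant : entry.getValue()) {
                if (!killedByOriginal.contains(mutant)) {
                    killsNewMutant = true;
                    break;
                }
            }
            if (killsNewMutant) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The original test kills mutants m1 and m2.
        Set<String> original = Set.of("m1", "m2");
        Map<String, Set<String>> amplified = new LinkedHashMap<>();
        amplified.put("test_amplified1", Set.of("m1"));        // nothing new: discarded
        amplified.put("test_amplified2", Set.of("m2", "m3"));  // kills m3: kept
        System.out.println(select(original, amplified)); // prints [test_amplified2]
    }
}
```

A coverage-based selector such as the Clover one would follow the same shape, comparing covered elements instead of killed mutants.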

For full details, see this presentation by Benjamin Danglot (main contributor of DSpot).

Today I tested the latest version of DSpot (I built it from its sources to have the latest code) and tried it on several modules of xwiki-common.

FTR here's what I did to test it:

  • Cloned DSpot and built it with Maven by running mvn clean package -DskipTests. This generated a dspot/target/dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar JAR.
  • For each module on which I tested it, I created a dspot.properties file. For example for xwiki-commons-core/xwiki-commons-component/xwiki-commons-component-api, I created the following file:

    Note that project is pointing to the root of the project.

  • Then I executed: java -jar /some/path/dspot/dspot/target/dspot-1.1.1-SNAPSHOT-jar-with-dependencies.jar path-to-properties dspot.properties
  • Then checked results in output/* to see if new tests have been generated

I had to test DSpot on 6 modules before getting any result, as follows:

  • xwiki-commons-core/xwiki-commons-cache/xwiki-commons-cache-infinispan/: No new test generated by DSpot. One reason is that DSpot modifies the test sources, and the tests in this module extend abstract test classes located in other modules. DSpot didn't touch those classes and thus was not able to modify them to generate new tests.
  • xwiki-commons-core/xwiki-commons-component/xwiki-commons-component-api/: No new test generated by DSpot.
  • xwiki-commons-core/xwiki-commons-component/xwiki-commons-component-default/: No new test generated by DSpot.
  • xwiki-commons-core/xwiki-commons-component/xwiki-commons-component-observation/: No new test generated by DSpot.
  • xwiki-commons-core/xwiki-commons-context/: No new test generated by DSpot.
  • xwiki-commons-core/xwiki-commons-crypto/xwiki-commons-crypto-cipher/: Eureka! One test was generated by DSpot :)

Here's the original test:

public void testRSAEncryptionDecryptionProgressive() throws Exception
{
    Cipher cipher = factory.getInstance(true, publicKey);
    cipher.update(input, 0, 17);
    cipher.update(input, 17, 1);
    cipher.update(input, 18, input.length - 18);
    byte[] encrypted = cipher.doFinal();
    cipher = factory.getInstance(false, privateKey);
    cipher.update(encrypted, 0, 65);
    cipher.update(encrypted, 65, 1);
    cipher.update(encrypted, 66, encrypted.length - 66);
    assertThat(cipher.doFinal(), equalTo(input));

    cipher = factory.getInstance(true, privateKey);
    cipher.update(input, 0, 15);
    cipher.update(input, 15, 1);
    encrypted = cipher.doFinal(input, 16, input.length - 16);
    cipher = factory.getInstance(false, publicKey);
    assertThat(cipher.doFinal(), equalTo(input));
}

And here's the new test generated by DSpot, based on this test:

public void testRSAEncryptionDecryptionProgressive_failAssert2() throws Exception {
--> try {
        Cipher cipher = factory.getInstance(true, publicKey);
        cipher.update(input, 0, 17);
        cipher.update(input, 17, 1);
        cipher.update(input, 18, ((input.length) - 18));
        byte[] encrypted = cipher.doFinal();
        cipher = factory.getInstance(false, privateKey);
        cipher.update(encrypted, 0, 65);
        cipher.update(encrypted, 65, 1);
        cipher.update(encrypted, 66, ((encrypted.length) - 66));
        cipher = factory.getInstance(true, privateKey);
        cipher.update(input, 0, 15);
        cipher.update(input, 15, 1);
        encrypted = cipher.doFinal(input, 16, ((input.length) - 16));
        cipher = factory.getInstance(false, publicKey);
-->     cipher.doFinal();
-->     CoreMatchers.equalTo(input);
-->     org.junit.Assert.fail("testRSAEncryptionDecryptionProgressive should have thrown GeneralSecurityException");
--> } catch (GeneralSecurityException eee) {
--> }
}

I've highlighted the parts that were added with the --> prefix. In short DSpot found that by calling cipher.doFinal() twice, it generates a GeneralSecurityException and that's killing some mutants that were not killed by the original test. Note that calling doFinal() resets the cipher, which explains why the second call generates an exception.

Looking at the source code, we can see:

public byte[] doFinal(byte[] input, int inputOffset, int inputLen) throws GeneralSecurityException
{
    if (input != null) {
        this.cipher.processBytes(input, inputOffset, inputLen);
    }
    try {
        return this.cipher.doFinal();
    } catch (InvalidCipherTextException e) {
        throw new GeneralSecurityException("Cipher failed to process data.", e);
    }
}

Haha... DSpot was able to automatically generate a new test that was able to create a state that makes the code go in the catch.

Note that it would have been even nicer if DSpot had put an assert on the exception message.
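
As an illustration of what such an assertion on the exception message could look like, here's a small self-contained Java sketch. FakeCipher is a hypothetical stand-in written for this example (it is not the XWiki crypto API); only the exception type and the message come from the source code shown above.

```java
import java.security.GeneralSecurityException;

// Sketch only: FakeCipher is a hypothetical stand-in, not the XWiki crypto API.
class ExceptionMessageAssertion {

    // Minimal stand-in that fails when doFinal() is called with nothing buffered.
    static class FakeCipher {
        private int buffered;

        void update(byte[] input) {
            buffered += input.length;
        }

        byte[] doFinal() throws GeneralSecurityException {
            if (buffered == 0) {
                throw new GeneralSecurityException("Cipher failed to process data.");
            }
            byte[] out = new byte[buffered];
            buffered = 0; // doFinal() resets the cipher state
            return out;
        }
    }

    // Returns the message of the exception thrown by doFinal(),
    // or fails if no exception is thrown.
    static String messageOfFailure(FakeCipher cipher) {
        try {
            cipher.doFinal();
        } catch (GeneralSecurityException expected) {
            return expected.getMessage();
        }
        throw new AssertionError("doFinal() should have thrown GeneralSecurityException");
    }

    public static void main(String[] args) {
        // Checking the message pins down which failure path was taken,
        // which an empty catch block cannot do.
        String message = messageOfFailure(new FakeCipher());
        if (!"Cipher failed to process data.".equals(message)) {
            throw new AssertionError("Unexpected message: " + message);
        }
        System.out.println("Exception message verified: " + message);
    }
}
```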

Then I wanted to verify if the test coverage had increased so I ran Jacoco before and after for this module:

  • Before: 70.5%
  • After: 71.2%



  • DSpot was able to improve the quality of our test suite automatically and as a side effect it also increased our test coverage (it's not always the case that new tests will increase the test coverage. DSpot's main intent, when executed with PIT/Descartes, is to increase the test quality - i.e. its ability to kill more mutants).
  • It takes quite a long time to execute: globally, on those 6 modules, it took about 15 minutes to build them with DSpot/PIT/Descartes (when it normally takes about 1-2 minutes).
  • DSpot doesn't generate a lot of tests: one test generated out of hundreds of tests mutated (in this example session).
  • IMO one good strategy to use DSpot is the following:
    • Create a Jenkins pipeline job which executes DSpot on your code
    • Since it's time consuming, run it only every month or so
    • Have the pipeline automatically commit the generated tests to your SCM in a different test tree (e.g. src/test-dspot/)
    • Modify your Maven build to use the Build Helper Maven plugin to add a new test source tree so that your tests run on both your manually-written tests and the ones generated by DSpot
    • I find this an interesting strategy because it's automated and unattended. If you have to manually execute DSpot, look for generated tests and then manually incorporate them (with rewriting) into your existing test suite, then it's very tedious and time-consuming and IMO the added value is too low compared to the time spent to be interesting.


EDIT: If you want to know more, check the presentation I gave at Devoxx France 2018 about New Generation of Tests.

Apr 23 2018

Devoxx France 2018

Another Devoxx France and another huge success :)

This time I had 2 talks.

New Generation of Tests

Title: Advanced testing in action on a Java project
Subtitle: Demonstrating testing practices using Java, Maven, Docker and Jenkins.
Abstract: This talk will demonstrate advanced testing practices used by the XWiki open source project, and using Java, Maven, Docker and Jenkins:

This is a talk I also gave at FOSDEM 2018 (in English). The part about mutation testing and configuration testing concerns results from the STAMP research project I'm participating in.

I got lucky to get a lot of votes for the talk and thus I was in the big amphitheater (capacity of 800-850 people) and it was almost full!


LCC Podcast Recording

As usual we did a live recording of the LCC Podcast.



What I enjoy most about technical conferences is the ability to meet my friends and get some news from them about the fun stuff they're doing.

I also went to the speaker dinner organized by Devoxx and we got a nice sightseeing bus tour of Paris.


I also got some ideas to try out on the XWiki project:

Mar 21 2018

Dagstuhl 2018

I had the honor of being invited to a seminar on "Automatic Quality Assurance and Release" at Dagstuhl by Benoit Baudry (we collaborate together on the STAMP research project).

Dagstuhl is in the middle of nowhere but it was worth it ;) The venue is very nice. It's basically a seminar factory. Every year there are calls for seminars and you can propose research seminars.

It's subsidized by the German government and as such, the price is very low: you pay 50 euros per day and that includes everything: room, 3 meals, seminar rooms, etc. I don't know if something similar exists in France. That seems like a great idea to attract researchers from all over the world.

The seminars are generally organized as unconferences. In our case we had some talks planned every half day and that kicked off various discussion groups. Each group had a leader in charge of making sure that each session got an output in the form of a blog post (which you'll be able to find on the seminar page once they're published). I led a session on Onboarding developers.

At Dagstuhl they also have a huge and amazing book library (physical books, tens of thousands!). And they do a fun thing which I found was a nice touch: when a seminar is organized they gather all the books they have that were written by the participants and display them on a shelf so that the authors can sign them. I got to sign the JUnit in Action book ;)


It was a very nice experience and the first time I participated in a research seminar organized by academia. I'm looking forward to my next seminar at Dagstuhl!

Mar 20 2018

QDashboard & SonarQube

Here's a story from the past... :) (it happened 10 years ago).

Arnaud Heritier just dug up some old page on the Maven wiki that I had created back in 2005/2006.

I had written the Maven1 Dashboard plugin and when Maven2 came out I thought about rewriting it with a new more performant architecture and with more features.

At the time, I wanted to start working on this full time and I proposed the idea to several companies to see if they would sponsor its development (Atlassian, Cenqua, Octo Technology). They were all interested but for various reasons, I ended up joining the XWiki SAS company to work on the XWiki open source project.

So once I knew I wouldn't be working on this, I shared my idea publicly on the Maven wiki to see if anyone else would be interested in implementing it.

Back then, I was happily surprised to see that Freddy Mallet actually implemented the idea:

 In September 2006, I've discovered this page written by Vincent which has directly inspired the launch of an Open Source project. One year later we are pleased to announce that Sonar 1.0 release is now available. The missions of Sonar are to :
 * Centralize and share quality information for all projects under continuous quality control
 * Show you which ones are in pain
 * Tell you what are the diseases

 To do that, Sonar aggregates metrics from Checkstyle, PMD, Surefire, Cobertura / Clover and JavaNCSS. You can take a look to the screenshots gallery to get a quick insight.

 Have fun.


To give you the full picture, I'm now publishing something I never made public: the slides that I wrote when I wanted to develop the idea:

QDashboard and SonarQube have several differences. An important one is that in the idea of QDashboard, there were supposed to be several input sources such as mailing lists, issue tracker, etc. At the moment SonarQube derives metrics mostly from the SCM. But I'm sure that the SonarQube guys have a lot of ideas in store for the future ;)

In 2013 I got a very nice present from SonarSource: a T-shirt recognizing me as #0 "employee" in the company, as the "Inceptor". That meant a lot to me.


Several years after, SonarQube has come a long way and I'm in awe of the great successful product it has become. Congrats guys! 

Now, on to the following 10 years!

Onboarding Brainstorming

I had the honor of being invited to a seminar on "Automatic Quality Assurance and Release" at Dagstuhl by Benoit Baudry (we collaborate together on the STAMP research project). Our seminar was organized as an unconference and one session I proposed and led was the "Onboarding" one described below. The following persons participated in the discussion: V. Massol, D. Gagliardi, B. Danglot, H. Wright, B. Baudry.

Onboarding Discussions

When you're developing a project (be it some internal project or some open source project) one key element is how easy it is to onboard new users to your project. For open source projects this is essential to attract more contributors and have a lively community. For internal projects, it's useful to be able to have new employees or newcomers in general be able to get up to speed rapidly on your project.

This brainstorming session was about ideas of tools and practices to use to ease onboarding.

Here's the list of ideas we had (in no specific order):

  • 1 - Tag issues in your issue tracker as onboarding issues to make it easy for newcomers to get started on something easy and succeed quickly. This also validates that they're able to use your software.
  • 2 - Have a complete package of your software that can be installed and used as easily as possible. It should just work out of the box without having to perform any configuration or additional steps. A good strategy for applications is to provide a Docker image (or a Virtual Machine) with everything set up.
  • 3 - Similarly, provide a packaged development environment. For example you can provide a VM with some preinstalled and configured IDE (with plugins installed and configured using the project's rules). One downside of such an approach is the time it takes to download the VM (which could be several GB in size).
  • 4 - A similar and possibly better approach would be to use an online IDE (e.g. Eclipse Che) to provide a complete prebuilt dev environment that wouldn't even require any downloading. This provides the fastest dev experience you can get. The downside is that if you need to onboard a potentially large number of developers, you'll need significant infra space/CPU on the server(s) hosting the online IDE, for hosting all the dev workspaces. This makes this option difficult to implement for open source projects for example. But it's viable and interesting in a company environment.
  • 5 - Obviously having good documentation is a given. However too many projects still don't provide this or only provide good user documentation but not good developer documentation with project practices not being well documented or only a small portion being documented. Specific ideas:
    • Document the code structure
    • Document the practices for development
    • Develop a tool that supports newcomers by letting them know when they follow / don't follow the rules
    • Good documentation should make its assumptions explicit (e.g. when you read this piece of documentation, I assume that you know X and Y)
    • Have a good system to contribute to the documentation of the project (e.g. a wiki)
    • Different documentation for users and for developers
  • 6 - Have homogeneous practices and tools inside a project. This is especially true in a company environment where you may have various projects, each using its own tools and practices, making it harder to move between projects.
  • 7 - Use standard tools that are well known (e.g. Maven or Docker). That increases the likelihood that a newcomer would already know the tool and be able to develop for your project.
  • 8 - It's good to have documentation about best practices but it's even better if the important "must" rules are enforced automatically by a checking tool (can be part of the build for example, or part of your IDE setup). For example instead of saying "this @Unstable annotation should be removed after one development cycle", you could write a Maven Enforcer rule (or a Checkstyle rule, or a Spoon rule) to break the build if it happens, with a message explaining the reason and what is to be done. People usually prefer having a tool tell them this rather than a person telling them that they haven't followed the best practices documented at such and such a location...
  • 9 - Have a bot to help you discover documentation pages about a topic. For example by having a chat bot located in the project's chat that, when asked about a topic, gives you the link to the relevant page.
  • 10 - Projects must have a medium to ask questions and get fast answers (such as a chat tool). Forum or mailing lists are good but less interesting when onboarding when the newcomer has a lot of questions in the initial phase and requires a conversation.
  • 11 - Have an answer strategy so that when someone asks a question, the doc is updated (new FAQ entry for example) so that the next person who comes can find the answer or be given the link to the doc.
  • 12 - Mentoring (human aspect of onboarding): have a dedicated colleague to whom you're not afraid to ask questions and who is a referent to you.
  • 13 - Supporting a variety of platforms for your software will make it simpler for newcomers to contribute to your project.
  • 14 - Split your projects into smaller parts. While it's hard and a daunting experience to contribute to the core code of a project, if this project has a core as small as possible and the rest is made of plugins/extensions then it becomes simpler to start contributing to those extensions first.
  • 15 - Have some interactive tutorial to learn about your software or about its development. A good example of nice tutorial can be found at www.katacoda.com (for example for Docker, https://www.katacoda.com/courses/docker).
  • 16 - Human aspect: have an environment that makes you feel welcome. Work and discuss how to best answer Pull Requests, how to communicate when someone joins the project, etc. Think of the newcomer as you would a child: somebody who will occasionally stumble and need encouragement. Try to have as much empathy as possible.
  • 17 - Make sure that people asking questions always get an answer quickly, perhaps by establishing a role on the team to ensure answers are provided.
  • 18 - Last but not least, an interesting thought experiment to verify that you have some good onboarding processes: imagine that 1000 developers join your project / company on the same day. How do you handle this?
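
To illustrate idea 8, here's a minimal Java sketch of an automated check for the "@Unstable should be removed after one development cycle" rule. It is deliberately naive and is not a real Maven Enforcer, Checkstyle or Spoon rule: it assumes (hypothetically) that each @Unstable element has a Javadoc with a "@since X.Y" tag right before it, and that one development cycle corresponds to one major version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, text-based sketch of an automated rule; a real implementation
// would more likely work on the AST (e.g. with Checkstyle or Spoon).
class UnstableAnnotationCheck {

    // Captures the major version of a Javadoc "@since X.Y" tag.
    private static final Pattern SINCE = Pattern.compile("@since\\s+(\\d+)\\.");

    // Returns one message per @Unstable annotation that has been present
    // for more than one major version (assumed to be one development cycle).
    static List<String> findViolations(String source, int currentMajorVersion) {
        List<String> violations = new ArrayList<>();
        int lastSinceMajor = -1;
        int lineNumber = 0;
        for (String line : source.split("\n")) {
            lineNumber++;
            String trimmed = line.trim();
            if (trimmed.startsWith("/**")) {
                lastSinceMajor = -1; // a new Javadoc block starts: forget the previous @since
            }
            Matcher matcher = SINCE.matcher(line);
            if (matcher.find()) {
                lastSinceMajor = Integer.parseInt(matcher.group(1));
            }
            if (trimmed.startsWith("@Unstable")
                && lastSinceMajor >= 0
                && currentMajorVersion - lastSinceMajor > 1) {
                violations.add("line " + lineNumber + ": @Unstable has been present since version "
                    + lastSinceMajor + ".x; remove it or document why the API is still unstable");
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        String source = "/**\n * @since 8.1\n */\n@Unstable\npublic void oldApi() {}\n"
            + "/**\n * @since 10.2\n */\n@Unstable\npublic void newApi() {}\n";
        // With a current major version of 10, only oldApi() is reported.
        findViolations(source, 10).forEach(System.out::println);
    }
}
```

The important part is the message: it tells the newcomer what is wrong and what to do about it, without anybody having to point at a documentation page.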

Onboarding on XWiki

I was also curious to see how those ideas apply to the XWiki open source project and what part we implement.

Idea | Implemented on XWiki? | Details
1 - Tag simple issues | Yes | Onboarding issues
2 - Complete install package | Yes | Debian apt-get, Docker images.
3 - Dev packaged environment | Yes | We have a Developer VM
4 - Online IDE onboarding | No | Hard to provide for an OSS project in terms of infra resources but would love to provide this
5 - Good documentation | Yes | User guide, Admin guide, Dev guide + there's a wiki dedicated to development practices and tools.
6 - Have homogeneous practices and tools inside a project | Yes | See http://dev.xwiki.org
7 - Use standard tools that are well known | Yes | Maven, Jenkins, Java, JUnit, Mockito, Selenium
8 - Automatically enforced important rules | Yes | See Automatic checks in build
9 - Have a bot to help you discover documentation pages about a topic | Yes | IRC bot (used here - been broken since 2017-05-09).
10 - Projects must have a medium to ask questions and get fast answers | Yes | XWiki Chat
11 - Have an answer strategy so that when someone asks a question | Yes | XWiki answer strategy and FAQ
12 - Mentoring (human aspect of onboarding) | Partially | Done to some extent by employees of XWiki SAS who are committers on the open source project but not a generic open source project practice.
13 - Supporting a variety of platforms for your software | Yes | Windows, Linux, Mac, multiple DBs, multiple browsers, multiple Servlet containers.
14 - Split your projects into smaller parts | Yes | Core getting smaller and more and more Extensions.
15 - Have some interactive tutorial to learn about your software | No | Would be nice to have
16 - Human aspect: have an environment that makes you feel welcome. | Yes | This is subjective. Sometimes we may be a bit abrupt when answering (especially me! Sorry guys if I've been abrupt, it's more a consequence of doing too many things. I need to improve.) I think we're globally a welcoming community, WDYT?
17 - Make sure that people asking questions always get an answer quickly | Yes | I think we're very good at answering fast. See the Forum for example. We also answer fast on Matrix/IRC (we try).
18 - 1000 devs joining at once experiment | Yes | Actually we participated in Google Code-in 2017 and this is exactly what we experienced: 756 students interacting with us.

So globally I'd say XWiki is pretty good at onboarding. I'd love to hear about things that we could improve on for onboarding. Any ideas?

If you own a project, we would be interested to hear about your ideas and how you perform onboarding. You could also use the list above as a way to measure your level of onboarding for your project and find out how you could improve it further.

Feb 05 2018


Once more I was happy to go to FOSDEM. This year XWiki SAS, the company I work for, had 12 employees going there and we had about 8 talks accepted + we had a stand for the XWiki open source project that we shared with our friends @ Tiki and Foswiki.

Here were the highlights for me:

  • I talked about what's next on Java testing and covered test coverage, backward compatibility enforcement, mutation testing and environment testing. My experience with the last two types of tests comes directly from my participation in the STAMP research project where we develop and use tools to amplify existing tests.
  • I did another talk about "Addressing the Long Tail of (web) applications", explaining how an advanced structured wiki such as XWiki can be used to quickly create ad-hoc applications in wiki pages.
  • Since we had a stand shared between 3 wiki communities (Tiki, Foswiki and XWiki), I was also interested in comparing our features, and how our communities work.
    • I met the nice folks of Tiki at their TikiHouse and had long discussions about how similarly and differently we do things. Maybe the topic for a future blog post? :)
    • Then I had Michael Daum do a demo to me of the nice features of Foswiki. I was quite impressed and found a lot of similarities in features. 
    • Funnily our 3 wiki solutions are written in 3 different technologies: Tiki in PHP, Foswiki in Perl and XWiki in Java. Nice diversity!
  • I met a lot of people of course (FOSDEM is really THE place to be to meet people from the FOSS communities) but I'd like to thank especially Damien Duportal who took the time to sit with me and go over several questions I had about Jenkins pipelines and Docker. I'll most likely blog about some of those solutions in the near future.

All in all, an excellent FOSDEM again, with lots of talks and waffles ;)

Dec 15 2017

POSS 2017

My company (XWiki SAS) had a booth at the Paris Open Source Summit 2017 (POSS) and we also organized a track about "One job, one solution!", promoting open source solutions.

I was asked to talk about using XWiki as an alternative to Confluence or SharePoint.

Here are the slides of the talk:

There were about 30 persons in the room. I focused on showing what I believe are the major differences, especially with Confluence that I know better than SharePoint.

If you're interested in more details you can find a comparison between XWiki and Confluence on xwiki.org.

Nov 17 2017

Controlling Test Quality

We already know how to control code quality by writing automated tests. We also know how to ensure that the code quality doesn't go down by using a tool to measure code covered by tests and fail the build automatically when it goes under a given threshold (and it seems to be working).

Wouldn't it be nice to be also able to verify the quality of the tests themselves? :)

I'm proposing the following strategy for this:

  • Integrate PIT/Descartes in your Maven build
  • PIT/Descartes generates a Mutation Score metric. So the idea is to monitor this metric and ensure that it keeps going in the right direction and doesn't go down. This is similar to watching the Clover TPC metric and ensuring it always goes up.
  • Thus the idea would be, for each Maven module, to set up a Mutation Score threshold (you'd run it once to get the current value and set that value as the initial threshold) and have the PIT/Descartes Maven plugin fail the build if the computed mutation score is below this threshold. In effect this would tell you that the last changes have introduced tests that are of lower quality than existing tests (on average) and that the new tests need to be improved to the level of the others.
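
The threshold idea can be sketched in a few lines of Java. This is only an illustration of the logic, not the PIT/Descartes plugin: the class and method names are hypothetical, and it assumes the mutation score is the percentage of generated mutants that the test suite kills.

```java
// Hypothetical sketch of a per-module mutation score gate; not the PIT/Descartes plugin.
class MutationScoreGate {

    // Mutation score as a percentage of killed mutants.
    // Assumption: a module without any mutant is considered fully covered.
    static double mutationScore(int killedMutants, int totalMutants) {
        if (totalMutants == 0) {
            return 100.0;
        }
        return 100.0 * killedMutants / totalMutants;
    }

    // True when the build should pass, i.e. the score did not go below the threshold.
    static boolean passes(double score, double threshold) {
        return score >= threshold;
    }

    public static void main(String[] args) {
        double threshold = 75.0; // value recorded from a previous run of the module
        double score = mutationScore(44, 60); // new tests were added but kill fewer mutants
        if (!passes(score, threshold)) {
            System.out.println("Build failure: mutation score " + score
                + " is below the threshold of " + threshold);
        }
    }
}
```

When the score goes up, the threshold would be raised to the new value so that the metric can only move in the right direction.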

In order for this strategy to be implementable we need PIT/Descartes to implement the following enhancement requests first:

I'm eagerly waiting for these issues to be fixed in order to try this strategy on the XWiki project and verify it can work in practice. There are some reasons why it might not work, such as being too painful or it not being easy enough to identify test problems and fix them.

WDYT? Do you see this as possibly working?

Nov 14 2017

Comparing Clover Reports

On the XWiki project, we use Clover to compute our global test coverage. We do this over several Git repositories and include functional tests (and more generally the coverage brought by some modules into other modules).

Now I wanted to see the difference between 2 reports that were generated:

I was surprised to see a drop in the global TPC, from 73.2% down to 71.3%. So I took the time to understand the issue.

It appears that Clover classifies your code classes as Application Code and Test Code (I have no idea what strategy it uses to differentiate them) and even though we've used the same version of Clover (4.1.2) for both reports, the test classes were not categorized similarly. It also seems that the TPC value given in the HTML report is from Application Code.

Luckily we asked the Clover Maven plugin to generate not only HTML reports but also XML reports. Thus I was able to write the following Groovy script that I executed in a wiki page in XWiki. I aggregated Application Code and Test code together in order to be able to compare the reports and the global TPC value.


def saveMetrics(def packageName, def metricsElement, def map) {
  def coveredconditionals = metricsElement.@coveredconditionals.toDouble()
  def coveredstatements = metricsElement.@coveredstatements.toDouble()
  def coveredmethods = metricsElement.@coveredmethods.toDouble()
  def conditionals = metricsElement.@conditionals.toDouble()
  def statements = metricsElement.@statements.toDouble()
  def methods = metricsElement.@methods.toDouble()
  def mapEntry = map.get(packageName)
  if (mapEntry) {
    coveredconditionals = coveredconditionals + mapEntry.get('coveredconditionals')
    coveredstatements = coveredstatements + mapEntry.get('coveredstatements')
    coveredmethods = coveredmethods + mapEntry.get('coveredmethods')
    conditionals = conditionals + mapEntry.get('conditionals')
    statements = statements + mapEntry.get('statements')
    methods = methods + mapEntry.get('methods')
  }
  def metrics = [:]
  metrics.put('coveredconditionals', coveredconditionals)
  metrics.put('coveredstatements', coveredstatements)
  metrics.put('coveredmethods', coveredmethods)
  metrics.put('conditionals', conditionals)
  metrics.put('statements', statements)
  metrics.put('methods', methods)
  map.put(packageName, metrics)
}

def scrapeData(url) {
  def root = new XmlSlurper().parseText(url.toURL().text)
  def map = [:]
  root.project.package.each() { packageElement ->
    def packageName = packageElement.@name
    saveMetrics(packageName.text(), packageElement.metrics, map)
  }
  root.testproject.package.each() { packageElement ->
    def packageName = packageElement.@name
    saveMetrics(packageName.text(), packageElement.metrics, map)
  }
  return map
}

def computeTPC(def map) {
  def tpcMap = [:]
  def totalcoveredconditionals = 0
  def totalcoveredstatements = 0
  def totalcoveredmethods = 0
  def totalconditionals = 0
  def totalstatements = 0
  def totalmethods = 0
  map.each() { packageName, metrics ->
    def coveredconditionals = metrics.get('coveredconditionals')
    totalcoveredconditionals += coveredconditionals
    def coveredstatements = metrics.get('coveredstatements')
    totalcoveredstatements += coveredstatements
    def coveredmethods = metrics.get('coveredmethods')
    totalcoveredmethods += coveredmethods
    def conditionals = metrics.get('conditionals')
    totalconditionals += conditionals
    def statements = metrics.get('statements')
    totalstatements += statements
    def methods = metrics.get('methods')
    totalmethods += methods
    def elementsCount = conditionals + statements + methods
    def tpc
    if (elementsCount == 0) {
      tpc = 0
    } else {
      tpc = ((coveredconditionals + coveredstatements + coveredmethods)/(conditionals + statements + methods)).trunc(4) * 100
    }
    tpcMap.put(packageName, tpc)
  }
  tpcMap.put("ALL", ((totalcoveredconditionals + totalcoveredstatements + totalcoveredmethods)/
    (totalconditionals + totalstatements + totalmethods)).trunc(4) * 100)
  return tpcMap
}

// map1 = old
def map1 = computeTPC(scrapeData('http://maven.xwiki.org/site/clover/20161220/clover-commons+rendering+platform+enterprise-20161220-2134/clover.xml')).sort()

// map2 = new
def map2 = computeTPC(scrapeData('http://maven.xwiki.org/site/clover/20171109/clover-commons+rendering+platform-20171109-1920/clover.xml')).sort()

println "= Added Packages"
println "|=Package|=TPC New"
map2.each() { packageName, tpc ->
  if (!map1.containsKey(packageName)) {
    println "|${packageName}|${tpc}"
  }
}
println "= Differences"
println "|=Package|=TPC Old|=TPC New"
map2.each() { packageName, tpc ->
  def oldtpc = map1.get(packageName)
  if (oldtpc && tpc != oldtpc) {
    def css = oldtpc > tpc ? '(% style="color:red;" %)' : '(% style="color:green;" %)'
    println "|${packageName}|${oldtpc}|${css}${tpc}"
  }
}
println "= Removed Packages"
println "|=Package|=TPC Old"
map1.each() { packageName, tpc ->
  if (!map2.containsKey(packageName)) {
    println "|${packageName}|${tpc}"
  }
}
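
For readers who prefer to see the formula in isolation, the computation performed by the script above boils down to the following Java sketch. The class and method names are hypothetical, the counts in main are made up for the example, and, unlike the Groovy script, this version doesn't truncate the ratio to 4 decimals.

```java
// Sketch of the TPC computation used above; names and sample values are illustrative.
class TpcCalculator {

    // TPC (Total Percentage Coverage) = covered elements / all elements, as a percentage,
    // where elements are conditionals, statements and methods.
    static double tpc(long coveredConditionals, long coveredStatements, long coveredMethods,
                      long conditionals, long statements, long methods) {
        long total = conditionals + statements + methods;
        if (total == 0) {
            return 0.0; // same convention as the script: an empty package has a TPC of 0
        }
        long covered = coveredConditionals + coveredStatements + coveredMethods;
        return 100.0 * covered / total;
    }

    public static void main(String[] args) {
        // Aggregating "Application Code" and "Test Code" means summing their raw counts
        // before dividing, not averaging the two percentages.
        long[] app = {5, 60, 10, 10, 80, 10};  // made-up counts
        long[] test = {0, 15, 5, 0, 15, 5};    // made-up counts
        double global = tpc(app[0] + test[0], app[1] + test[1], app[2] + test[2],
                            app[3] + test[3], app[4] + test[4], app[5] + test[5]);
        System.out.println("Global TPC: " + global);
    }
}
```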

And the result was quite different from what the HTML report was giving us!

We went from 74.07% on 2016-12-20 to 76.28% on 2017-11-09 (so quite different from the 73.2% to 71.3% figure given by the HTML report). Much nicer! :)

Note that one reason I wanted to compare the TPC values was to see if our strategy of failing the build if a module's TPC is below the current threshold was working or not (I had tried to assess it before but it wasn't very conclusive).

Now I know that we gained about 2.2 points of TPC (from 74.07% to 76.28%) in a bit less than a year and that looks good :)

EDIT: I'm aware of the Historical feature of Clover but:

  • We haven't set it up so it's too late to compare old reports
  • I don't think it would help with the issue we faced with test code being counted as Application Code, and that being done differently depending on the generated reports.

Nov 08 2017

Flaky tests handling with Jenkins & JIRA

Flaky tests are a plague because they lower the credibility of your CI strategy by sending false-positive notification emails.

In a previous blog post, I detailed a solution we use on the XWiki project to handle false positives caused by the environment on which the CI build is running. However, that solution didn't handle flaky tests. This blog post is about fixing this!

So the strategy I'm proposing for Flaky tests is the following:

  • When a Flaky test is discovered, create a JIRA issue to remember to work on it and fix it (we currently have the following open issues related to Flaky tests)
  • The JIRA issue is marked as containing a flaky test by filling a custom field called "Flickering Test", using the following format: <package name of test class>.<test class name>#<test method name>. There can be several entries separated by commas.



  • In our Pipeline script, after the tests have executed, review the failing ones and check if they are in the list of known flaky tests in JIRA. If so, indicate it in the Jenkins test report. If all failing tests are flickers, don't send a notification email.

    Indication in the job history:


    Indication on the job result page:


    Information on the test page itself:


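To make the "Flickering Test" field format concrete, here's a small Java sketch that splits a field value into individual entries and checks that each one looks like <package name of test class>.<test class name>#<test method name>. It's only an illustration: the class name, method names and the test names used in the usage note are hypothetical, and the well-formedness check is my own addition, not something JIRA or our pipeline enforces.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper (not part of the pipeline) for the "Flickering Test" custom field. */
final class FlickeringTestField {
    private FlickeringTestField() {
    }

    /** Splits a comma-separated field value into trimmed, non-empty test identifiers. */
    static Set<String> parse(String fieldValue) {
        Set<String> tests = new LinkedHashSet<>();
        if (fieldValue == null) {
            return tests;
        }
        for (String entry : fieldValue.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                tests.add(trimmed);
            }
        }
        return tests;
    }

    /** Returns true if the entry has the shape "package.ClassName#methodName". */
    static boolean isWellFormed(String entry) {
        int hash = entry.indexOf('#');
        // Exactly one '#', followed by a non-empty method name
        if (hash < 0 || hash != entry.lastIndexOf('#') || hash == entry.length() - 1) {
            return false;
        }
        // A '.' must separate a non-empty package from a non-empty class name
        int dot = entry.lastIndexOf('.', hash);
        return dot > 0 && dot < hash - 1;
    }
}
```

With this, `FlickeringTestField.parse("org.example.FooTest#testA, org.example.BarTest#testB")` returns two entries, and `isWellFormed("FooTest#testA")` is false because the package is missing.
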
Note that there's an alternate solution that can also work:

  • When a Flaky test is discovered, create a JIRA issue to remember to work on it and fix it
  • Add an @Ignore annotation in the test with a detail pointing to the JIRA issue (something like @Ignore("WebDriver doesn't support uploading multiple files in one input, see http://code.google.com/p/selenium/issues/detail?id=2239")). This will prevent the build from executing this flaky test.

This last solution is certainly low-tech compared to the first one. I prefer the first one though for the following reasons:

  • It allows flaky tests to continue executing on the CI and thus serve as a constant reminder that something needs to be fixed. Adding the @Ignore annotation feels like sweeping the dust under the carpet, and there's little chance you're going to come back to it in the future...
  • Since our script acts as a post-build script on the CI agent, it's possible to add some logic to auto-discover flaky tests that have not yet been marked as flaky.

Also note that there's a Jenkins plugin for flaky tests, but I don't like its strategy, which is to re-run failing tests a number of times to see if they pass. In theory it can work. In practice it means CI jobs that take a lot longer to execute, making it impractical for functional UI tests (which is where we have flaky tests in XWiki). In addition, flakiness sometimes only shows up when the full test suite is executed (i.e. it depends on what executes before), and a flaky test sometimes requires a large number of runs before passing.

So without further ado, here's the Jenkins Pipeline script to implement the strategy we defined above (you can check the full pipeline script):

/**
 * Check for test flickers, and modify test result descriptions for tests that are identified as flickers. A test is
 * a flicker if there's a JIRA issue having the "Flickering Test" custom field containing the FQN of the test in the
 * format {@code <java package name>.<test class name>#<test method name>}.
 *
 * @return true if the failing tests only contain flickering tests
 */
// Requires "import hudson.tasks.test.AbstractTestResultAction" at the top of the pipeline script.
boolean checkForFlickers()
{
    boolean containsOnlyFlickers = false
    AbstractTestResultAction testResultAction = currentBuild.rawBuild.getAction(AbstractTestResultAction.class)
    if (testResultAction != null) {
        // Find all failed tests
        def failedTests = testResultAction.getResult().getFailedTests()
        if (failedTests.size() > 0) {
            // Get all false positives from JIRA.
            // The query-string argument of concat() is missing from this copy of the script and is left elided.
            def url = "https://jira.xwiki.org/sr/jira.issueviews:searchrequest-xml/temp/SearchRequest.xml?".concat(
                "...")
            def root = new XmlSlurper().parseText(url.toURL().text)
            def knownFlickers = []
            root.channel.item.customfields.customfield.each() { customfield ->
                if (customfield.customfieldname == 'Flickering Test') {
                    // The field can hold several comma-separated entries
                    customfield.customfieldvalues.customfieldvalue.text().split(',').each() {
                        knownFlickers.add(it.trim())
                    }
                }
            }
            echoXWiki "Known flickering tests: ${knownFlickers}"

            // For each failed test, check if it's in the known flicker list.
            // If all failed tests are flickers then don't send notification email
            def containsAtLeastOneFlicker = false
            containsOnlyFlickers = true
            failedTests.each() { testResult ->
                // Format of a Test Result id is "junit/<package name>/<test class name>/<test method name>"
                def parts = testResult.getId().split('/')
                def testName = "${parts[1]}.${parts[2]}#${parts[3]}"
                if (knownFlickers.contains(testName)) {
                    // Add the information that the test is a flicker to the test's description
                    testResult.setDescription(
                        "<h1 style='color:red'>This is a flickering test</h1>${testResult.getDescription() ?: ''}")
                    echoXWiki "Found flickering test: [${testName}]"
                    containsAtLeastOneFlicker = true
                } else {
                    // This is a real failing test, thus we'll need to send the notification email...
                    containsOnlyFlickers = false
                }
            }

            if (containsAtLeastOneFlicker) {
                manager.addWarningBadge("Contains some flickering tests")
                manager.createSummary("warning.gif").appendText("<h1>Contains some flickering tests</h1>", false,
                    false, false, "red")
            }
        }
    }

    return containsOnlyFlickers
}
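
If you want to try the decision logic outside of Jenkins, here's a minimal Java sketch of just the matching part: converting a test result id of the form "junit/<package name>/<test class name>/<test method name>" to the JIRA field format, and deciding whether all failures are known flickers. It's a simplified stand-in under assumptions, not the pipeline code: it takes plain strings instead of Jenkins test result objects, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical, Jenkins-free version of the flicker matching logic. */
final class FlickerDecision {
    private FlickerDecision() {
    }

    /** Converts "junit/<package>/<class>/<method>" into "<package>.<class>#<method>". */
    static String toFlickerName(String testResultId) {
        String[] parts = testResultId.split("/");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Unexpected test result id: " + testResultId);
        }
        return parts[1] + "." + parts[2] + "#" + parts[3];
    }

    /**
     * Returns true only when there is at least one failed test and every failed test is a known flicker,
     * i.e. when no notification email needs to be sent.
     */
    static boolean containsOnlyFlickers(List<String> failedTestIds, Set<String> knownFlickers) {
        if (failedTestIds.isEmpty()) {
            return false;
        }
        for (String id : failedTestIds) {
            if (!knownFlickers.contains(toFlickerName(id))) {
                // A real failure: the notification email must be sent
                return false;
            }
        }
        return true;
    }
}
```

As in the pipeline script, a build without any failed test returns false here, and a single real failure among the flickers is enough to send the email.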

Hope you like it! Let me know in comments how you're handling Flaky tests in your project so that we can compare/discuss.

Created by Admin on 2013/10/22 14:34
This wiki is licensed under a Creative Commons 2.0 license