On software quality, collaborative development, wikis and Open Source...
Mar 20 2018
Onboarding Brainstorming
I had the honor of being invited to a seminar on "Automatic Quality Assurance and Release" at Dagstuhl by Benoit Baudry (we collaborate on the STAMP research project). Our seminar was organized as an unconference and one session I proposed and led was the "Onboarding" one described below. The following people participated in the discussion: V. Massol, D. Gagliardi, B. Danglot, H. Wright, B. Baudry.
Onboarding Discussions
When you're developing a project (be it some internal project or some open source project) one key element is how easy it is to onboard new users to your project. For open source projects this is essential to attract more contributors and have a lively community. For internal projects, it's useful to be able to have new employees or newcomers in general be able to get up to speed rapidly on your project.
This brainstorming session was about ideas of tools and practices to use to ease onboarding.
Here's the list of ideas we had (in no specific order):
- 1 - Tag issues in your issue tracker as onboarding issues to make it easy for newcomers to get started on something simple and succeed quickly. This also validates that they're able to use your software.
- 2 - Have a complete package of your software that can be installed and used as easily as possible. It should just work out of the box without having to perform any configuration or additional steps. A good strategy for applications is to provide a Docker image (or a Virtual Machine) with everything setup.
- 3 - Similarly, provide a packaged development environment. For example you can provide a VM with some preinstalled and configured IDE (with plugins installed and configured using the project's rules). One downside of such an approach is the time it takes to download the VM (which could be several GB in size).
- 4 - A similar and possibly better approach would be to use an online IDE (e.g. Eclipse Che) to provide a complete prebuilt dev environment that wouldn't even require any downloading. This provides the fastest dev experience you can get. The downside is that if you need to onboard a potentially large number of developers, you'll need significant infrastructure resources (disk space/CPU) on the server(s) hosting the online IDE, for hosting all the dev workspaces. This makes this option difficult to implement for open source projects for example. But it's viable and interesting in a company environment.
- 5 - Obviously having good documentation is a given. However too many projects still don't provide this or only provide good user documentation but not good developer documentation with project practices not being well documented or only a small portion being documented. Specific ideas:
- Document the code structure
- Document the practices for development
- Develop a tool that supports newcomers by letting them know when they follow / don't follow the rules
- Good documentation should make its assumptions explicit (e.g. when you read this piece of documentation, I assume that you know X and Y)
- Have a good system to contribute to the documentation of the project (e.g. a wiki)
- Different documentation for users and for developers
- 6 - Have homogeneous practices and tools inside a project. This is especially true in a company environment where you may have various projects, each using its own tools and practices, making it harder to move between projects.
- 7 - Use standard tools that are well known (e.g. Maven or Docker). That increases the likelihood that a newcomer would already know the tool and be able to develop for your project.
- 8 - It's good to have documentation about best practices but it's even better if the important "must" rules are enforced automatically by a checking tool (which can be part of the build, for example, or part of your IDE setup). For example, instead of saying "this @Unstable annotation should be removed after one development cycle", you could write a Maven Enforcer rule (or a Checkstyle rule, or a Spoon rule) to break the build if it happens, with a message explaining the reason and what is to be done. People usually prefer to have a tool tell them this rather than a person telling them that they haven't been following the best practices documented at such and such a location...
- 9 - Have a bot to help you discover documentation pages about a topic. For example, have a chat bot located in the project's chat that, when asked about a topic, gives you the link to the relevant page.
- 10 - Projects must have a medium to ask questions and get fast answers (such as a chat tool). Forums or mailing lists are good but less suited to onboarding, since the newcomer has a lot of questions in the initial phase and needs a conversation.
- 11 - Have an answer strategy so that when someone asks a question, the doc is updated (new FAQ entry for example) so that the next person who comes can find the answer or be given the link to the doc.
- 12 - Mentoring (human aspect of onboarding): have a dedicated colleague to whom you're not afraid to ask questions and who is a referent to you.
- 13 - Supporting a variety of platforms for your software will make it simpler for newcomers to contribute to your project.
- 14 - Split your projects into smaller parts. While it's hard and a daunting experience to contribute to the core code of a project, if this project has a core as small as possible and the rest is made of plugins/extensions then it becomes simpler to start contributing to those extensions first.
- 15 - Have some interactive tutorial to learn about your software or about its development. A good example of nice tutorial can be found at www.katacoda.com (for example for Docker, https://www.katacoda.com/courses/docker).
- 16 - Human aspect: have an environment that makes you feel welcome. Work and discuss how to best answer Pull Requests, how to communicate when someone joins the project, etc. Think of the newcomer as you would a child: somebody who will occasionally stumble and need encouragement. Try to have as much empathy as possible.
- 17 - Make sure that people asking questions always get an answer quickly, perhaps by establishing a role on the team to ensure answers are provided.
- 18 - Last but not least, an interesting thought experiment to verify that you have some good onboarding processes: imagine that 1000 developers join your project / company on the same day. How do you handle this?
Onboarding on XWiki
I was also curious to see how those ideas apply to the XWiki open source project and what part we implement.
Ideas | Implemented on XWiki? |
1 - Tag simple issues | ![]() |
2 - Complete install package | ![]() |
3 - Dev packaged environment | ![]() |
4 - Online IDE onboarding | ![]() |
5 - Good documentation | ![]() |
6 - Have homogeneous practices and tools inside a project | ![]() |
7 - Use standard tools that are well known | ![]() |
8 - Automatically enforced important rules | ![]() |
9 - Have a bot to help you discover documentation pages about a topic | ![]() |
10 - Projects must have a medium to ask questions and get fast answers | ![]() |
11 - Have an answer strategy so that when someone asks a question | ![]() |
12 - Mentoring (human aspect of onboarding) | ![]() |
13 - Supporting a variety of platforms for your software | ![]() |
14 - Split your projects into smaller parts | ![]() |
15 - Have some interactive tutorial to learn about your software | ![]() |
16 - Human aspect: have an environment that makes you feel welcome. | ![]() |
17 - Make sure that people asking questions always get an answer quickly | ![]() |
18 - 1000 devs joining at once experiment | ![]() |
So globally I'd say XWiki is pretty good at onboarding. I'd love to hear about things that we could improve on for onboarding. Any ideas?
If you own a project, we would be interested to hear about your ideas and how you perform onboarding. You could also use the list above as a way to measure your level of onboarding for your project and find out how you could improve it further.
Feb 05 2018
Once more I was happy to go to FOSDEM. This year XWiki SAS, the company I work for, had 12 employees going there and we had about 8 talks accepted + we had a stand for the XWiki open source project that we shared with our friends @ Tiki and Foswiki.
Here were the highlights for me:
- I talked about what's next on Java testing and covered test coverage, backward compatibility enforcement, mutation testing and environment testing. My experience with the last two types of tests comes directly from my participation in the STAMP research project, where we develop and use tools to amplify existing tests.
- I did another talk about "Addressing the Long Tail of (web) applications", explaining how an advanced structured wiki such as XWiki can be used to quickly create ad-hoc applications in wiki pages.
- Since we had a stand shared between 3 wiki communities (Tiki, Foswiki and XWiki), I was also interested in comparing our features, and how our communities work.
- I met the nice folks of Tiki at their TikiHouse and had long discussions about how similarly and differently we do things. Maybe the topic for a future blog post?
- Then I had Michael Daum do a demo to me of the nice features of Foswiki. I was quite impressed and found a lot of similarities in features.
- Funnily our 3 wiki solutions are written in 3 different technologies: Tiki in PHP, Foswiki in Perl and XWiki in Java. Nice diversity!
- I met a lot of people of course (Fosdem is really THE place to be to meet people from the FOSS communities) but I'd like to thank especially Damien Duportal who took the time to sit with me and go over several questions I had about Jenkins pipelines and Docker. I'll most likely blog about some of those solutions in the near future.
All in all, an excellent FOSDEM again, with lots of talks and waffles
Dec 15 2017
POSS 2017
My company (XWiki SAS) had a booth at the Paris Open Source Summit 2017 (POSS) and we also organized a track about "One job, one solution!", promoting open source solutions.
I was asked to talk about using XWiki as an alternative to Confluence or SharePoint.
Here are the slides of the talk:
There were about 30 people in the room. I focused on showing what I believe are the major differences, especially with Confluence, which I know better than SharePoint.
If you're interested in more details you can find a comparison between XWiki and Confluence on xwiki.org.
Nov 17 2017
Controlling Test Quality
We already know how to control code quality by writing automated tests. We also know how to ensure that the code quality doesn't go down by using a tool to measure code covered by tests and fail the build automatically when it goes under a given threshold (and it seems to be working).
Wouldn't it be nice to be also able to verify the quality of the tests themselves?
I'm proposing the following strategy for this:
- Integrate PIT/Descartes in your Maven build
- PIT/Descartes generates a Mutation Score metric. So the idea is to monitor this metric and ensure that it keeps going in the right direction and doesn't go down. This is similar to watching the Clover TPC metric and ensuring it always goes up.
- Thus the idea would be, for each Maven module, to set up a Mutation Score threshold (you'd run it once to get the current value and set that value as the initial threshold) and have the PIT/Descartes Maven plugin fail the build if the computed mutation score is below this threshold. In effect this would tell you that the last changes have introduced tests that are of lower quality than the existing tests (on average) and that the new tests need to be improved to the level of the others.
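To make the threshold idea more concrete, here's a minimal sketch of the check in Java. It is not an existing PIT/Descartes feature: the class and method names (MutationScoreGate, mutationScore, check) and the figures are made up for illustration, and it assumes that the mutation score is simply the percentage of mutants killed by the tests.

```java
// Hypothetical sketch: break a module's build when its mutation score drops below the recorded threshold.
final class MutationScoreGate {

    /** Mutation score in percent, assumed here to be killed mutants divided by generated mutants. */
    static double mutationScore(int killedMutants, int totalMutants) {
        if (totalMutants == 0) {
            // Nothing was mutated, so there is nothing to hold against the tests.
            return 100.0;
        }
        return 100.0 * killedMutants / totalMutants;
    }

    /** Throws if the score is below the threshold, which is what would fail the build. */
    static void check(String module, double score, double threshold) {
        if (score < threshold) {
            throw new IllegalStateException(String.format(
                "Mutation score for [%s] is %.2f%%, below the threshold of %.2f%%", module, score, threshold));
        }
    }

    public static void main(String[] args) {
        // Assumed threshold, obtained by running the tool once on the module.
        double threshold = 60.0;
        // 13 killed mutants out of 20 is 65%, which is above the threshold, so nothing is thrown.
        check("some-module", mutationScore(13, 20), threshold);
        System.out.println("Mutation score is not below the threshold");
    }
}
```

When the score goes up, the threshold would be raised to the new value, so that the metric can only move in the right direction.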
In order for this strategy to be implementable we need PIT/Descartes to implement the following enhancement requests first:
- Threshold check to prevent build in Maven
- Handle multimodule projects
- Improve efficiency. This one is very important so that developers can run PIT/Descartes as part of their local build before pushing their commits. Failing that, the PIT/Descartes Maven plugin could be executed only on the CI, but even for that to be possible I believe that the execution speed needs to be improved substantially.
I'm eagerly waiting for these issues to be fixed in order to try this strategy on the XWiki project and verify that it works in practice. There are some reasons why it might not work, such as the process being too painful or test problems not being easy enough to identify and fix.
WDYT? Do you see this as possibly working?
Nov 14 2017
Comparing Clover Reports
On the XWiki project, we use Clover to compute our global test coverage. We do this over several Git repositories and include functional tests (and more generally the coverage brought by some modules into other modules).
Now I wanted to see the difference between 2 reports that were generated:
- Report from 2016-12-20
- Report from 2017-11-09
I was surprised to see a drop in the global TPC, from 73.2% down to 71.3%. So I took the time to understand the issue.
It appears that Clover classifies your code classes as Application Code and Test Code (I have no idea what strategy it uses to differentiate them) and even though we've used the same version of Clover (4.1.2) for both reports, the test classes were not categorized similarly. It also seems that the TPC value given in the HTML report is from Application Code.
Luckily we asked the Clover Maven plugin to generate not only HTML reports but also XML reports. Thus I was able to write the following Groovy script that I executed in a wiki page in XWiki. I aggregated Application Code and Test code together in order to be able to compare the reports and the global TPC value.
// Store the metrics of a package in the map, adding them to any metrics already saved for that package
def saveMetrics(def packageName, def metricsElement, def map) {
  def coveredconditionals = metricsElement.@coveredconditionals.toDouble()
  def coveredstatements = metricsElement.@coveredstatements.toDouble()
  def coveredmethods = metricsElement.@coveredmethods.toDouble()
  def conditionals = metricsElement.@conditionals.toDouble()
  def statements = metricsElement.@statements.toDouble()
  def methods = metricsElement.@methods.toDouble()
  def mapEntry = map.get(packageName)
  if (mapEntry) {
    coveredconditionals = coveredconditionals + mapEntry.get('coveredconditionals')
    coveredstatements = coveredstatements + mapEntry.get('coveredstatements')
    coveredmethods = coveredmethods + mapEntry.get('coveredmethods')
    conditionals = conditionals + mapEntry.get('conditionals')
    statements = statements + mapEntry.get('statements')
    methods = methods + mapEntry.get('methods')
  }
  def metrics = [:]
  metrics.put('coveredconditionals', coveredconditionals)
  metrics.put('coveredstatements', coveredstatements)
  metrics.put('coveredmethods', coveredmethods)
  metrics.put('conditionals', conditionals)
  metrics.put('statements', statements)
  metrics.put('methods', methods)
  map.put(packageName, metrics)
}

// Read a Clover XML report and aggregate the Application Code and Test Code metrics per package
def scrapeData(url) {
  def root = new XmlSlurper().parseText(url.toURL().text)
  def map = [:]
  root.project.package.each() { packageElement ->
    def packageName = packageElement.@name
    saveMetrics(packageName.text(), packageElement.metrics, map)
  }
  root.testproject.package.each() { packageElement ->
    def packageName = packageElement.@name
    saveMetrics(packageName.text(), packageElement.metrics, map)
  }
  return map
}

// Compute the TPC of each package, plus the global TPC stored under the "ALL" key
def computeTPC(def map) {
  def tpcMap = [:]
  def totalcoveredconditionals = 0
  def totalcoveredstatements = 0
  def totalcoveredmethods = 0
  def totalconditionals = 0
  def totalstatements = 0
  def totalmethods = 0
  map.each() { packageName, metrics ->
    def coveredconditionals = metrics.get('coveredconditionals')
    totalcoveredconditionals += coveredconditionals
    def coveredstatements = metrics.get('coveredstatements')
    totalcoveredstatements += coveredstatements
    def coveredmethods = metrics.get('coveredmethods')
    totalcoveredmethods += coveredmethods
    def conditionals = metrics.get('conditionals')
    totalconditionals += conditionals
    def statements = metrics.get('statements')
    totalstatements += statements
    def methods = metrics.get('methods')
    totalmethods += methods
    def elementsCount = conditionals + statements + methods
    def tpc
    if (elementsCount == 0) {
      tpc = 0
    } else {
      tpc = ((coveredconditionals + coveredstatements + coveredmethods)/(conditionals + statements + methods)).trunc(4) * 100
    }
    tpcMap.put(packageName, tpc)
  }
  tpcMap.put("ALL", ((totalcoveredconditionals + totalcoveredstatements + totalcoveredmethods)/(totalconditionals + totalstatements + totalmethods)).trunc(4) * 100)
  return tpcMap
}

// map1 = old
def map1 = computeTPC(scrapeData('http://maven.xwiki.org/site/clover/20161220/clover-commons+rendering+platform+enterprise-20161220-2134/clover.xml')).sort()
// map2 = new
def map2 = computeTPC(scrapeData('http://maven.xwiki.org/site/clover/20171109/clover-commons+rendering+platform-20171109-1920/clover.xml')).sort()

println "= Added Packages"
println "|=Package|=TPC New"
map2.each() { packageName, tpc ->
  if (!map1.containsKey(packageName)) {
    println "|${packageName}|${tpc}"
  }
}

println "= Differences"
println "|=Package|=TPC Old|=TPC New"
map2.each() { packageName, tpc ->
  def oldtpc = map1.get(packageName)
  // Compare against null so that a package whose old TPC was 0 is still reported
  if (oldtpc != null && tpc != oldtpc) {
    def css = oldtpc > tpc ? '(% style="color:red;" %)' : '(% style="color:green;" %)'
    println "|${packageName}|${oldtpc}|${css}${tpc}"
  }
}

println "= Removed Packages"
println "|=Package|=TPC Old"
map1.each() { packageName, tpc ->
  if (!map2.containsKey(packageName)) {
    println "|${packageName}|${tpc}"
  }
}
And the result was quite different from what the HTML report was giving us!
We went from 74.07% on 2016-12-20 to 76.28% on 2017-11-09 (so quite different from the 73.2% to 71.3% figure given by the HTML report). Much nicer!
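For readers who prefer Java to Groovy, here's a small sketch of the computation the script performs for each package: sum the Application Code and Test Code counters, then divide the covered elements by all elements. The class names (TpcCalculator, Metrics) and the numbers are made up for illustration; only the formula comes from the script.

```java
// Sketch of the TPC (Total Percentage Coverage) computation done by the script.
final class TpcCalculator {

    /** The six counters read from a Clover metrics element. */
    static final class Metrics {
        final double coveredConditionals;
        final double coveredStatements;
        final double coveredMethods;
        final double conditionals;
        final double statements;
        final double methods;

        Metrics(double coveredConditionals, double coveredStatements, double coveredMethods,
            double conditionals, double statements, double methods) {
            this.coveredConditionals = coveredConditionals;
            this.coveredStatements = coveredStatements;
            this.coveredMethods = coveredMethods;
            this.conditionals = conditionals;
            this.statements = statements;
            this.methods = methods;
        }

        /** Adds two sets of counters, e.g. the Application Code and Test Code entries of a package. */
        Metrics plus(Metrics other) {
            return new Metrics(
                coveredConditionals + other.coveredConditionals,
                coveredStatements + other.coveredStatements,
                coveredMethods + other.coveredMethods,
                conditionals + other.conditionals,
                statements + other.statements,
                methods + other.methods);
        }

        /** Covered elements divided by all elements, as a percentage (0 when there is nothing to cover). */
        double tpc() {
            double total = conditionals + statements + methods;
            if (total == 0) {
                return 0;
            }
            return (coveredConditionals + coveredStatements + coveredMethods) / total * 100;
        }
    }

    public static void main(String[] args) {
        // Made-up counters for one package
        Metrics applicationCode = new Metrics(5, 40, 10, 10, 50, 20);
        Metrics testCode = new Metrics(5, 50, 10, 5, 60, 15);
        System.out.println("Application Code only: " + applicationCode.tpc());
        System.out.println("Application + Test Code: " + applicationCode.plus(testCode).tpc());
    }
}
```

The point of the plus() step is that the result no longer depends on how Clover decided to classify a class, which is what made the two HTML reports hard to compare.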
Note that one reason I wanted to compare the TPC values was to see if our strategy of failing the build if a module's TPC is below the current threshold was working or not (I had tried to assess it before but it wasn't very conclusive).
Now I know that we gained 2.21 percentage points of TPC in a bit less than a year and that looks good.
EDIT: I'm aware of the Historical feature of Clover but:
- We haven't set it up so it's too late to compare old reports
- I don't think it would help with the issue we faced with test code being counted as Application Code, and that being done differently depending on the generated reports.
Nov 08 2017
Flaky tests handling with Jenkins & JIRA
Flaky tests are a plague because they lower the credibility of your CI strategy by sending false positive notification emails.
In a previous blog post, I detailed a solution we use on the XWiki project to handle false positives caused by the environment on which the CI build is running. However this solution wasn't handling flaky tests. This blog post is about fixing this!
So the strategy I'm proposing for Flaky tests is the following:
- When a Flaky test is discovered, create a JIRA issue to remember to work on it and fix it (we currently have the following open issues related to Flaky tests)
- The JIRA issue is marked as containing a flaky test by filling a custom field called "Flickering Test", using the following format: <package name of test class>.<test class name>#<test method name>. There can be several entries separated by commas.
- In our Pipeline script, after the tests have executed, review the failing ones and check if they are in the list of known flaky tests in JIRA. If so, indicate it in the Jenkins test report. If all failing tests are flickers, don't send a notification email.
Indication in the job history:
Indication on the job result page:
Information on the test page itself:
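Independently of Jenkins and JIRA, the decision taken in the last step of the strategy can be sketched in a few lines of Java. This is only an illustration: the class and method names (FlickerCheck, parseKnownFlickers, shouldNotify) and the test names are hypothetical, and the field content is assumed to be already retrieved from JIRA.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: decide whether a notification email is needed for a set of failing tests.
final class FlickerCheck {

    /** Parses the content of the "Flickering Test" field: test names separated by commas. */
    static Set<String> parseKnownFlickers(String fieldValue) {
        Set<String> result = new HashSet<>();
        for (String entry : fieldValue.split(",")) {
            String name = entry.trim();
            if (!name.isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    /** An email is only needed when at least one failing test is not a known flicker. */
    static boolean shouldNotify(List<String> failingTests, Set<String> knownFlickers) {
        for (String test : failingTests) {
            if (!knownFlickers.contains(test)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Format: <package name of test class>.<test class name>#<test method name>
        Set<String> known = parseKnownFlickers("org.example.LoginTest#login, org.example.UploadTest#upload");
        List<String> failing = Arrays.asList("org.example.LoginTest#login", "org.example.OtherTest#other");
        System.out.println("Send email: " + shouldNotify(failing, known));
    }
}
```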
Note that there's an alternate solution that can also work:
- When a Flaky test is discovered, create a JIRA issue to remember to work on it and fix it
- Add an @Ignore annotation in the test with a detail pointing to the JIRA issue (something like @Ignore("WebDriver doesn't support uploading multiple files in one input, see http://code.google.com/p/selenium/issues/detail?id=2239")). This will prevent the build from executing this flaky test.
This last solution is certainly low-tech compared to the first one. I prefer the first one though for the following reasons:
- It allows flaky tests to continue executing on the CI and thus serve as a constant reminder that something needs to be fixed. Adding the @Ignore annotation feels like sweeping the dust under the carpet and there's little chance you're going to come back to it in the future...
- Since our script acts as postbuild script on the CI agent, there's the possibility to add some logic to auto-discover flaky tests that have not yet been marked as flaky.
Also note that there's a Jenkins plugin for flaky tests but I don't like the strategy involved, which is to re-run failing tests a number of times to see if they pass. In theory it can work. In practice this means CI jobs that take a lot longer to execute, making it impractical for functional UI tests (which is where we have flaky tests in XWiki). In addition, flakiness sometimes only happens when the full test suite is executed (i.e. it depends on what executes before) and sometimes requires a large number of runs before passing.
So without further ado, here's the Jenkins Pipeline script to implement the strategy we defined above (you can check the full pipeline script):
import hudson.tasks.test.AbstractTestResultAction

/**
 * Check for test flickers, and modify test result descriptions for tests that are identified as flicker. A test is
 * a flicker if there's a JIRA issue having the "Flickering Test" custom field containing the FQN of the test in the
 * format {@code <java package name>.<test class name>#<test method name>}.
 *
 * @return true if the failing tests only contain flickering tests
 */
def boolean checkForFlickers()
{
  boolean containsOnlyFlickers = false
  AbstractTestResultAction testResultAction = currentBuild.rawBuild.getAction(AbstractTestResultAction.class)
  if (testResultAction != null) {
    // Find all failed tests
    def failedTests = testResultAction.getResult().getFailedTests()
    if (failedTests.size() > 0) {
      // Get all false positives from JIRA.
      // The query string passed to concat() is cut off in this listing; "jqlQuery" is only a placeholder for it.
      def url = "https://jira.xwiki.org/sr/jira.issueviews:searchrequest-xml/temp/SearchRequest.xml?".concat(
        jqlQuery)
      def root = new XmlSlurper().parseText(url.toURL().text)
      def knownFlickers = []
      root.channel.item.customfields.customfield.each() { customfield ->
        if (customfield.customfieldname == 'Flickering Test') {
          customfield.customfieldvalues.customfieldvalue.text().split(',').each() {
            // Entries are separated by commas, possibly followed by a space
            knownFlickers.add(it.trim())
          }
        }
      }
      // echoXWiki is a logging helper that is not part of this excerpt (see the full pipeline script)
      echoXWiki "Known flickering tests: ${knownFlickers}"

      // For each failed test, check if it's in the known flicker list.
      // If all failed tests are flickers then don't send notification email
      def containsAtLeastOneFlicker = false
      containsOnlyFlickers = true
      failedTests.each() { testResult ->
        // Format of a Test Result id is "junit/<package name>/<test class name>/<test method name>"
        def parts = testResult.getId().split('/')
        // Convert the GString to a String so that the list lookup below compares Strings with Strings
        def testName = "${parts[1]}.${parts[2]}#${parts[3]}".toString()
        if (knownFlickers.contains(testName)) {
          // Add the information that the test is a flicker to the test's description
          testResult.setDescription(
            "<h1 style='color:red'>This is a flickering test</h1>${testResult.getDescription() ?: ''}")
          echoXWiki "Found flickering test: [${testName}]"
          containsAtLeastOneFlicker = true
        } else {
          // This is a real failing test, thus we'll need to send the notification email...
          containsOnlyFlickers = false
        }
      }

      if (containsAtLeastOneFlicker) {
        // "manager" is the object exposed by the Jenkins Groovy Postbuild plugin
        manager.addWarningBadge("Contains some flickering tests")
        manager.createSummary("warning.gif").appendText("<h1>Contains some flickering tests</h1>", false,
          false, false, "red")
      }
    }
  }
  return containsOnlyFlickers
}
Hope you like it! Let me know in comments how you're handling Flaky tests in your project so that we can compare/discuss.
Oct 29 2017
Softshake 2017
I had the chance to participate in Softshake (2017 edition) for the first time. It's a small but very nice conference held in Geneva, Switzerland.
From what I gathered, this year there were fewer attendees than in previous years (about 150 vs 300 before). However, the organization was very nice:
- Located inside the Hepia school with plenty of rooms available
- 6 tracks in parallel, which is incredible for a small conference
- Breakfast, lunch and snacks organized with good food
- Speaker dinner with Fondue and all
I got to present 2 talks:
- XWiki: The Web's Swiss Army Knife. This is my usual "XWiki: A web development platform" talk that I've given a few times already but with a more Swiss-related title
- Creating your own project's Quality Dashboard. This one was brand new and was a big live demo of how to use XWiki to create a custom Quality Dashboard by aggregating metrics from other sites (Jenkins, SonarQube, JIRA and GitHub), saving them locally to draw history graphs and sending emails when combined metric thresholds are crossed. A lot more people attended this one and I like this new angle of defining a real-life use case and using XWiki just as a tool to achieve it. I'll continue exploring this new way of presenting XWiki since people liked it a lot more and it feels more natural.
I was also very happy to see my friend and ex-OCTO Technology colleague Philippe Kernevez, and to meet new OCTO consultants. Reminded me of the good times at OCTO
Oct 28 2017
Google Summer of Code Summit 2017
This year the XWiki project had 4 GSOC students (we lost a fifth one to FOSSASIA, who clicked faster than us!). This is way cool and we're glad that Google is organizing this every year. XWiki has been participating since the beginning (2005 AFAIR).
Every year the XWiki project sends 2 mentors to participate in the GSOC summit. This year it was Thomas Mortagne and me who had the honor of going.
Quiz: find us on the group picture:
This conference is organized as an unconference. These are the key highlights I got out of it:
- Google will continue organizing GSOC in the future. Yeah!
- Lots of discussions about Google Code-In (for students aged 12-17) and how organizations can best handle it. This convinced us to register the XWiki project to participate and... we just got selected 2 days ago. If you're interested in participating, see the XWiki Code-In page. I'm really curious to see how it'll go.
- I proposed a session on "Wikis: what's next", which turned into a "What is XWiki and why is it different from other wikis"
Incidentally this got me thinking about how to best express what XWiki is in one sentence. So far I've found the following:
- View 1: Bring the concepts of wiki (collaboration on same content, edit+save, history+rollback, links) to application development.
- View 2: A runtime web development platform. The default wiki you get is just one example (similar to the Eclipse IDE vs the Eclipse platform).
- View 3: Provide the ability to add semantics to content: from free form data to structured data. Pages can contain free form or structured data / metadata, and control how it's displayed.
- View 4: A web application server for content-related applications, i.e. it provides all the components / building blocks to easily create web applications.
- Extremely well organized by Google, with good food and lots of choice (I'm vegetarian). I also love chocolate and this was heaven (all attendees brought chocolate!):
Thanks Google. Till next time!
Sep 28 2017
Mutation testing with PIT and Descartes
XWiki SAS is part of a European research project named STAMP. As part of this project I've been able to experiment a bit with Descartes, a mutation engine for PIT.
What PIT does is mutate the code under test and check if the existing test suite is able to detect those mutations. In other words, it checks the quality of your test suite.
Descartes plugs into PIT by providing a set of specific mutators. For example one mutator will replace the output of methods by some fixed value (for example a method returning a boolean will always return true). Another will remove the content of void methods. It then generates a report.
Here's an example of running Descartes on a module of XWiki:
You can see both the test coverage score (computed automatically by PIT using Jacoco) and the Mutation score.
If we drill down to one class (MacroId.java) we can see for example the following report for the equals() method:
What's interesting to note is that the test coverage says that the following code has been tested:
(getId() == macroId.getId() || (getId() != null && getId().equals(macroId.getId())))
    && (getSyntax() == macroId.getSyntax() || (getSyntax() != null && getSyntax().equals(
        macroId.getSyntax())))
However, the mutation testing is telling us a different story. It says that if you change the equals method code with negative conditions (i.e. testing for inequality), the test still reports success.
If we check the test code:
public void testEquality()
{
    MacroId id1 = new MacroId("id", Syntax.XWIKI_2_0);
    MacroId id2 = new MacroId("id", Syntax.XWIKI_2_0);
    MacroId id3 = new MacroId("otherid", Syntax.XWIKI_2_0);
    MacroId id4 = new MacroId("id", Syntax.XHTML_1_0);
    MacroId id5 = new MacroId("otherid", Syntax.XHTML_1_0);
    MacroId id6 = new MacroId("id");
    MacroId id7 = new MacroId("id");

    Assert.assertEquals(id2, id1);
    // Equal objects must have equal hashcode
    Assert.assertTrue(id1.hashCode() == id2.hashCode());

    Assert.assertFalse(id3 == id1);
    Assert.assertFalse(id4 == id1);
    Assert.assertFalse(id5 == id3);
    Assert.assertFalse(id6 == id1);

    Assert.assertEquals(id7, id6);
    // Equal objects must have equal hashcode
    Assert.assertTrue(id6.hashCode() == id7.hashCode());
}
We can indeed see that the test doesn't test for inequality. Thus in practice if we replace the body of the equals method by return true; then the test still passes.
That's interesting because that's something that test coverage didn't notice!
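The reason is that the assertFalse(id3 == id1) lines compare object references with ==, which never calls equals(). Here's a small self-contained sketch of the effect, without JUnit and with made-up names (PseudoTestedDemo, Id, weakTest, strongTest), so it's an illustration of the idea and not the XWiki code:

```java
import java.util.Objects;
import java.util.function.BiPredicate;

// Illustration: a test that never checks for inequality cannot detect the "return true" mutation.
final class PseudoTestedDemo {

    /** Simplified stand-in for a class such as MacroId. */
    static final class Id {
        final String id;

        Id(String id) {
            this.id = id;
        }
    }

    /** The real equality check. */
    static final BiPredicate<Id, Id> ORIGINAL = (a, b) -> Objects.equals(a.id, b.id);

    /** What the mutation does: the body of the method is replaced by "return true". */
    static final BiPredicate<Id, Id> MUTANT = (a, b) -> true;

    /** Only checks that equal objects are reported as equal. */
    static boolean weakTest(BiPredicate<Id, Id> equality) {
        return equality.test(new Id("id"), new Id("id"));
    }

    /** Also checks that different objects are reported as different. */
    static boolean strongTest(BiPredicate<Id, Id> equality) {
        return equality.test(new Id("id"), new Id("id"))
            && !equality.test(new Id("id"), new Id("otherid"));
    }

    public static void main(String[] args) {
        System.out.println("Weak test passes on the mutant: " + weakTest(MUTANT));
        System.out.println("Strong test passes on the mutant: " + strongTest(MUTANT));
    }
}
```

In the real test, something like Assert.assertFalse(id3.equals(id1)) would play the role of the strong check.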
More generally the report provides a summary of all mutations it has done and whether they were killed or not by the tests. For example on this class:
Here's what I learnt while trying to use Descartes on XWiki:
- It's being actively developed
- It's interesting to classify the results in 3 categories:
- strong pseudo-tested methods: no matter the return values of a method, the tests still pass. This is the worst offender since it means the tests really need to be improved. This was the case in the example above.
- weak pseudo-tested methods: the tests pass with at least one modified value. Not as bad as strong pseudo-tested but you may still want to check it out.
- fully tested methods: the tests fail for all mutations and thus can be considered rock-solid!
- So in the future, the generated report should provide this classification to help analyze the results and focus on important problems.
- It would be nice if the Maven plugin was improved and be able to fail if the mutation score was below a certain threshold (as we do for test coverage).
- Performance: It's quite slow compared to Jacoco execution time for example. In my example above it took 34 seconds to execute with all possible mutations (for a project with 14 test classes, 31 tests and 20 classes).
- It would be nice to have a Sonar integration so that PIT/Descartes could provide some stats on the Sonar dashboard.
- Big limitation: at the moment PIT (and/or Descartes) doesn't support being executed on a multi-module project. This means that right now you need to compute the full classpath for all modules and run all sources and tests as if they were a single module. This causes problems for all tests that depend on the filesystem and expect a given directory structure. It's also a tedious and error-prone process since the classpath order can have side effects.
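The 3 categories mentioned in the list above can be expressed as a small classification function. This is just a sketch of how I understand the categories, not code from Descartes; the names (PseudoTestedClassifier, Category, classify) are made up.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: classify a method from the outcome of the mutations applied to it.
final class PseudoTestedClassifier {

    enum Category { STRONG_PSEUDO_TESTED, WEAK_PSEUDO_TESTED, FULLY_TESTED }

    /**
     * @param mutationDetected one entry per mutation applied to the method, true if the tests failed for it
     */
    static Category classify(List<Boolean> mutationDetected) {
        if (mutationDetected.isEmpty()) {
            throw new IllegalArgumentException("At least one mutation is needed to classify a method");
        }
        boolean anyDetected = mutationDetected.contains(Boolean.TRUE);
        boolean anyUndetected = mutationDetected.contains(Boolean.FALSE);
        if (!anyDetected) {
            // The tests pass whatever the method returns
            return Category.STRONG_PSEUDO_TESTED;
        }
        if (anyUndetected) {
            // The tests pass for at least one modified value
            return Category.WEAK_PSEUDO_TESTED;
        }
        // The tests fail for all mutations
        return Category.FULLY_TESTED;
    }

    public static void main(String[] args) {
        System.out.println(classify(Arrays.asList(false, false)));
        System.out.println(classify(Arrays.asList(true, false)));
        System.out.println(classify(Arrays.asList(true, true)));
    }
}
```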
PIT/Descartes is very nice but I feel it would need to provide a bit more added-value out of the box for the XWiki open source project to use it in an automated manner. The test coverage reports we have are already providing a lot of information about the code that is not tested at all and if we have 5 hours to spend, we would probably spend them on adding tests rather than on further improving existing tests. YMMV. If you have a very strong suite of tests and you want to check its quality, then PIT/Descartes is your friend!
If Descartes could provide the build-failure-on-low-threshold feature mentioned above, that could be one way we could integrate it in the XWiki build. But for that to be possible PIT/Descartes needs to be able to run on multi-module Maven projects.
I'm also currently testing DSpot. DSpot uses PIT and Descartes but in addition it uses the results to generate new tests automatically. That would be even more interesting (if it can work well-enough). I'll post back when I've been able to run DSpot on XWiki and learn more by using it.
Now, the Descartes project could also use the information provided by line coverage to automatically generate tests to cover the spotted issues.
I'd like to thank Oscar Luis Vera Pérez who's actively working on Descartes and who's shown me how to use it and how to analyze the results. Thanks Oscar! I'll also continue to work with Oscar on improving Descartes and executing it on the XWiki code base.
Sep 17 2017
Using Docker + Jenkins to test configurations
On the XWiki project, we currently have automated functional tests that use Selenium and Jenkins. However, they exercise only a single configuration: HSQLDB, Jetty and Firefox (each at a fixed version).
XWiki SAS is part of the STAMP research project and one domain of this research is improving configuration testing.
As a first step I've worked on providing official XWiki images, but I've only provided 2 configurations (XWiki on Tomcat + MySQL and on Tomcat + PostgreSQL) and they're not currently exercised by our functional tests.
Thus I'm proposing below an architecture that should allow XWiki to be tested on various configurations:
- Various supported databases and versions
- Various Servlet containers and versions
- Various Browsers and versions
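To give an idea of how quickly the three dimensions above multiply, here is a small Java sketch that enumerates every combination. It is only an illustration: the class names are hypothetical, `mysql:5` and `tomcat:8` are the images used in this post, and the other values (`postgres:9`, `jetty:9`, `chrome`) are assumptions picked for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: enumerate all database / Servlet container / browser
// combinations that the functional tests would have to run on.
final class ConfigurationMatrix {

    // One combination to test.
    static final class Configuration {
        final String database;
        final String servletContainer;
        final String browser;

        Configuration(String database, String servletContainer, String browser) {
            this.database = database;
            this.servletContainer = servletContainer;
            this.browser = browser;
        }

        @Override
        public String toString() {
            return database + " + " + servletContainer + " + " + browser;
        }
    }

    // Cartesian product of the three dimensions, in declaration order.
    static List<Configuration> combine(List<String> databases,
            List<String> servletContainers, List<String> browsers) {
        List<Configuration> result = new ArrayList<>();
        for (String database : databases) {
            for (String servletContainer : servletContainers) {
                for (String browser : browsers) {
                    result.add(new Configuration(database, servletContainer, browser));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative values only; see the assumptions stated above.
        List<Configuration> all = combine(
            Arrays.asList("mysql:5", "postgres:9"),
            Arrays.asList("tomcat:8", "jetty:9"),
            Arrays.asList("firefox", "chrome"));
        all.forEach(System.out::println);
    }
}
```

With only two values per dimension this already gives 8 configurations, so in practice we would probably want to run only a selected subset on each build.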
Here's what I think it would mean in terms of a Jenkins Pipeline (note that at this stage this is pseudo code and should not be understood literally):
pipeline {
  agent {
    docker {
      image 'xwiki-maven-firefox'
      args '-v $HOME/.m2:/root/.m2'
    }
  }
  stages {
    stage('Test') {
      steps {
        docker.image('mysql:5').withRun('-e "MYSQL_ROOT_PASSWORD=my-secret-pw"') { c ->
          docker.image('tomcat:8').withRun('-v $XWIKIDIR:/usr/local/tomcat/webapps/xwiki').inside("--link ${c.id}:db") {
            wrap([$class: 'Xvnc']) {
              withMaven(maven: mavenTool, mavenOpts: mavenOpts) {
                sh "mvn ..."
              }
            }
          }
        }
      }
    }
  }
}
Some explanations:
- We would set up a custom Docker Registry so that we can prepare images that would be used by the Jenkins pipeline to create containers.
- Those images could themselves be refreshed regularly based on another pipeline that would use the docker.build() construct.
- We would use a Jenkins Agent dynamically provisioned from an image that would contain: sshd and a Jenkins user (so that Jenkins Master can communicate with it), Maven, VNC Server and a browser (Firefox, for example). We would have several such images, one per browser we want to test with.
- Note that since we want to support only the latest browser versions for FF/Chrome/Safari we could use apt to update (and commit) the browser version in the container prior to starting it, from the pipeline script.
- Then the pipeline would spawn two containers: one for the DB and one for the Servlet container. Importantly for the Servlet container, I think we should mount a volume that points to a local directory on the agent, which would contain the XWiki exploded WAR (done as a pre-step by the Maven build). This would save time and avoid having to recreate a new image every time there's a commit on the XWiki codebase!
- The build that contains the tests will be started by the Agent (and we would mount the Maven local repository as a volume in order to speed up build times across runs).
- Right now the XWiki build already knows how to run the functional tests by fetching/exploding the XWiki WAR in the target directory and then starting XWiki directly from the tests, so all we would need to do is to make sure we map this directory in the container containing the Servlet container (e.g. in Tomcat it would be mapped to [TOMCATHOME]/webapps/xwiki).
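To make the two-container step more concrete, here is a Java sketch that only builds the corresponding `docker run` command lines (it doesn't execute them). The class and method names are hypothetical, the container name `xwiki-db` and the host directory `/home/agent/xwiki` (standing in for `$XWIKIDIR`) are assumptions, and `--link` is kept only because the pseudo code above uses it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: build the "docker run" command lines for the DB
// container and for the Servlet container that mounts the exploded WAR.
final class ContainerCommands {

    // Starts the database container in the background with a root password.
    static List<String> databaseCommand(String name, String image, String rootPassword) {
        return Arrays.asList("docker", "run", "-d",
            "--name", name,
            "-e", "MYSQL_ROOT_PASSWORD=" + rootPassword,
            image);
    }

    // Starts the Servlet container, linked to the DB container under the
    // alias "db", with the exploded WAR directory mounted as the webapp.
    static List<String> servletContainerCommand(String image, String databaseName,
            String explodedWarDir, String webappDir) {
        return Arrays.asList("docker", "run", "-d",
            "--link", databaseName + ":db",
            "-v", explodedWarDir + ":" + webappDir,
            image);
    }

    public static void main(String[] args) {
        // "xwiki-db" and "/home/agent/xwiki" are assumed values for the example.
        System.out.println(String.join(" ",
            databaseCommand("xwiki-db", "mysql:5", "my-secret-pw")));
        System.out.println(String.join(" ",
            servletContainerCommand("tomcat:8", "xwiki-db",
                "/home/agent/xwiki", "/usr/local/tomcat/webapps/xwiki")));
    }
}
```

Since the WAR directory is mounted rather than baked into an image, a new commit only requires the Maven build to refresh that directory and the same `tomcat:8` image can be reused as is.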
This is just architecture at this stage. Now we need to put that into practice and find the gotchas (there always are some).
WDYT? Could this work? Are you doing this yourself?
Stay tuned, I should be able to report on how it went in the coming weeks/months.