Cucumber is (almost always) a waste of time

Most organizations will see no benefit from introducing Cucumber (or its cousins, SpecFlow and JBehave) into their testing process.

What is Cucumber?

Cucumber is a tool for "Behavior-Driven Development". If you already know what BDD is, because you are part of an organization that uses it successfully, this article does not apply to you. I'm sure you know better than I how useful Cucumber is or isn't to your development workflow. Thanks for visiting, and have a nice day :)

I used to be under the impression that BDD mostly meant writing prettier tests, and writing them earlier in the development workflow (for example, in a Backlog Refinement meeting). If this is your plan, Cucumber will not help you.

Why might I use Cucumber?

The pitch is this: Cucumber allows you to write very pretty tests. For example (straight from the Cucumber home page):


Scenario: Eating a few cucumbers is no problem
	Given Alice is hungry
	When she eats 3 cucumbers
	Then she will be full
	

Gorgeous, right? It's so straightforward, so matter-of-fact! Anyone could read this test, and everyone will agree what it means. In fact it's so minimal, anyone could write it!

If we introduce Cucumber, we wouldn't need the developers to write automated tests; the QA testers could do it themselves! We could even have Business Analysts or Product Owners write the tests, before development even starts! Project Managers and even Vice Presidents could read the tests, to make sure the software will do what they want, before it's even built!

That's all nonsense. Sorry.

Why wouldn't I use Cucumber?

Everyone will not agree what it means

Take the above example. What does it mean to say that "Alice is hungry"? Has a certain amount of time passed since her last meal? How much time? Did she skip a meal? Do we track her exercise along with her calorie intake? Do we abstract that away with a "satiation meter"? Where on the meter is the threshold for "hungry"? What if she's really hungry - is "starving" different from "hungry"? Does this specification apply to starving people as well? Are we modeling a system with multiple types of food? What if Alice eats two tomatoes and a cucumber? What if she eats three pickles?

My point is that programs can be very complicated. Putting a pretty test in front of that complexity does not make it go away; it only moves it out of sight. The true meaning is still known only to the developers, whose job it is to manage all that complexity.

Only developers can write it

Reading the example above, one might think it would be easy to write similar tests. You might copy it and change some details to see what happens. Perhaps:


	Given Alice is not hungry
	When she eats 1 cucumber
	Then she will be sick
	

It's true; anyone with a propensity for tinkering could write that, and be reasonably sure it's valid Cucumber syntax. But will it run? Only if the developers make it run.

Behind every set of Cucumber tests is a step definition program, written in a general-purpose programming language (like Ruby, C#, or Java). As the Cucumber engine reads the pretty Given-When-Thens, it will scan the step file for matching definitions, in order to figure out what to actually do. The steps behind our example tests might look like this:


[Binding]
public class HungerSteps
{
	private Person alice;

	[StepArgumentTransformation(@"( not)?")]
	public bool NotToBool(string maybeNot)
	{
		return maybeNot != " not";
	}

	[Given(@"(.*) is( not)? hungry")]
	public void GivenAPerson(string name, bool hungry)
	{
		alice = new Person(name, hungry);
	}

	[When(@".* eats (\d+) cucumbers")]
	public void WhenTheyEatSomeCucumbers(int count)
	{
		alice.EatCucumbers(count);
	}

	[Then(@".* will be (.*)")]
	public void ThenTheyWillHaveStatus(string status)
	{
		// using FluentAssertions
		alice.Status.Should().Be(status);
	}
}
	

You can see how the annotations on each method roughly correspond to the pretty syntax from the tests. You can also see how this code would be mind-numbing for anyone not steeped in C-like programming languages - not to mention the use of regular expressions, and the SpecFlow-specific StepArgumentTransformation method needed to translate "hungry" and "not hungry" to true/false values (which I absolutely had to look up, just to write this very simple example)!

And if you are very observant, you can also see the bug. My "tinkered" example test will not run, because the "When" clause does not match the regular expression annotation on the C# method.
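
If you would rather see the mismatch than take my word for it, here is a minimal sketch using plain .NET regular expressions, with no SpecFlow involved. It assumes the engine matches the pattern against the whole step text, so I have added the ^ and $ anchors myself. The names StepMatchDemo, Original, and Relaxed are made up for illustration, and the relaxed pattern is only one possible fix:

```csharp
using System;
using System.Text.RegularExpressions;

public static class StepMatchDemo
{
	// The pattern from the [When] binding above, anchored on the assumption
	// that the engine must match the entire step text.
	public const string Original = @"^.* eats (\d+) cucumbers$";

	// Hypothetical fix: make the plural "s" optional.
	public const string Relaxed = @"^.* eats (\d+) cucumbers?$";

	// Returns true if the step text would be picked up by the given pattern.
	public static bool Matches(string pattern, string stepText)
	{
		return Regex.IsMatch(stepText, pattern);
	}

	public static void Main()
	{
		Console.WriteLine(Matches(Original, "she eats 3 cucumbers")); // True
		Console.WriteLine(Matches(Original, "she eats 1 cucumber"));  // False
		Console.WriteLine(Matches(Relaxed, "she eats 1 cucumber"));   // True
	}
}
```

Notice that the fix lives in the developer's regular expression, not in the pretty test. The person who wrote "1 cucumber" did nothing wrong and has no way to repair it.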

I am not gatekeeping here. I would love to teach C#, regular expressions, and my Stack Overflow search strategy to anyone who will listen! But it should be clear that this is a very specific skillset, and most people find other skillsets more worthy investments of their valuable time. My point is that any non-developer attempting to write a pretty Cucumber test can only do so with the help of a developer writing the steps.

This mismatch in complexity can also compound the problem of understanding. When the behavior changes and the tests start to fail, the "right" way to fix them will be to change the Cucumber tests and the binding annotations and the backing code. The easy way to fix them will be to change the backing code only. When the easy way is not the right way, humans will be humans, and the quality of the tests will decay over time.

No one will read it

Have you ever written an email with two questions, and had only one of them answered? I have. It can feel like a slight. "Did you not read my whole email? Do you not care about my other question? Why do I bother writing to you at all, if you're not even going to read it?"

But the fact is, people are busy. With every sentence you write, you are asking a favor from the reader: a few more seconds of their precious time and attention. You cannot blame them for starting to scan, and then starting to skip.

Any system complicated enough to justify a test suite will require a lot of tests to fully specify. Every function, every branch, and every loop adds at least a test or two. So an exhaustive set of Cucumber tests will be a very pretty, very massive wall of text.

Can you really imagine a Product Owner, let alone a Vice President, spending an hour to carefully read each test, and coming back to point out an edge case that was missed? I used to be able to imagine that - until a missed edge case made its way to production and needed to be hotfixed. Introducing Cucumber will not change how the development team works.

Cucumber does not add anything

Every test written in Cucumber relies on step definitions written in another language. Therefore, anything that can be done in Cucumber can also be done in that other language.

One proposed counterexample might be Cucumber's syntax for generating tests from tables. I would counter that the parameters in a Cucumber table could just as easily be parameters to an MSTest, or instances of a custom class, or rows in a CSV. I'll omit any examples; I trust you to imagine clean solutions.
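
For anyone who would like a starting point anyway, here is a minimal sketch of the MSTest option, where each DataRow plays the part of one row in a Cucumber table. The Person class is a hypothetical stand-in (its rules are my assumption, chosen only so the example is complete), and I use the plain MSTest Assert here rather than FluentAssertions to keep the dependencies down:

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical stand-in for the Person class used in this article's examples.
public class Person
{
	private readonly bool hungry;

	public string Name { get; }
	public string Status { get; private set; }

	public Person(string name, bool hungry)
	{
		Name = name;
		this.hungry = hungry;
		Status = hungry ? "hungry" : "full";
	}

	// Assumed rule: eating while hungry fills you up; eating while full makes you sick.
	public void EatCucumbers(int count)
	{
		if (count <= 0) return;
		Status = hungry ? "full" : "sick";
	}
}

[TestClass]
public class HungerTableTests
{
	// Each DataRow is one "table row": hungry?, cucumbers eaten, expected status.
	[DataTestMethod]
	[DataRow(true, 3, "full")]
	[DataRow(false, 1, "sick")]
	public void EatingCucumbersSetsStatus(bool hungry, int count, string expected)
	{
		var alice = new Person(name: "Alice", hungry: hungry);
		alice.EatCucumbers(count: count);
		Assert.AreEqual(expected, alice.Status);
	}
}
```

Adding a case means adding one DataRow line, which is about as much work as adding a row to a Cucumber table, with no step definitions to keep in sync.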

If not Cucumber, then what?

Write your tests in the same language you wrote the application in. They can be made darn near as pretty, and take only a fraction of the effort:


[TestClass]
public class HungerTests
{
	[TestMethod]
	public void EatingAFewIsNoProblem()
	{
		var alice = new Person(name: "Alice", hungry: true);
		alice.EatCucumbers(count: 3);
		alice.Status.Should().Be("full");
	}

	[TestMethod]
	public void EatingWhenFullMakesYouSick()
	{
		var alice = new Person(name: "Alice", hungry: false);
		alice.EatCucumbers(count: 1);
		alice.Status.Should().Be("sick");
	}
}
	

Is that so bad?

It's the job of a developer to produce a fully-functional, bug-free application. It's the job of the developer to manage all of the details of the implementation, and then account for those details in the tests. We developers cannot do these jobs perfectly, but we cannot offload them to other roles and expect good results.

The creator's take

I would be remiss not to mention that the man who developed Cucumber would disagree with my premise. He would agree that Cucumber is not inherently better for writing tests than any other tool, because Cucumber is not a testing tool, but he would also say that my arguments miss the point of BDD, or are artifacts of poor implementations. That's true! The core of my argument is that most organizations will also miss the point, and implement BDD poorly.