Unclean Code

On this page I take an example from Uncle Bob's 2009 book Clean Code and give it a very detailed review. My summary review of the book is more positive overall; I will post it soon.

Normal code review, book code review

Before we dive in, let me say that I am not usually this mean. When I review a peer's code, I know that they are writing under constraints. They don't get to choose what problem to solve. They don't get to choose the deadline. They don't get to choose the language, the behavior, or often even the approach. They certainly don't get to choose to simply not do the work.

None of that is true for Uncle Bob. He chose a problem to solve with this example. He chose the language, the behavior, the approach. He chose to write a book in the first place, and that book is called Clean Code. Finally, he prefaced this example with the following quote:

Please read it very carefully. I worked hard on the style and structure and hope it is worth emulating.

So believe it's fair to hold this code to a higher standard than we typically might. And that's what I'm going to do.

The problem we're solving

Suppose you have written an application with some options. This app can run with logging turned on or off; it will listen to some port; and it will put its logs in some directory. If this program is invoked via the command line, it might look like this:


$ java myProgram -l -p 80 -d /home/logs
	

How will you turn the string array ["-l", "-p", "80", "-d", "/home/logs"] into


log = true;
port = 80;
directory = "/home/logs";
	

The correct answer is to use an argument-parsing library. Any library you can find will be known to dozens (if not hundreds or thousands or millions) of developers. I am confident it will meet your needs, because it will have already met the needs of those other developers. There is even a chance that your successor will already know how to use it. Friends, I urge you, use a library.

Bob's answer is to write his own library. It is known only to him. There is a 100% chance that every developer maintaining his code must first learn the rules of his argument parsing library. Let's be generous and say that Bob made this choice, not to write maintainable code, but purely as a teaching example. So let's have a look.

The code

Here is all of the code, as I transcribed it from the book. 1

Main.java (example usage)


public static void main(String[] args) {
	try {
		Args arg = new Args("l,p#,d*", args);
		boolean logging = arg.getBoolean('l');
		int port = arg.getInt('p');
		String directory = arg.getString('d');
		executeApplication(logging, port, directory);
	} catch (ArgsException e) {
		System.out.printf("Argument error: %s\n", e.errorMessage());
	}
}
	

Args.java


package com.objectmentor.utilities.args;

import static com.objectmentor.utilities.args.ArgsException.ErrorCode.*;
import java.util.*;

public class Args {
	private Map<Character, ArgumentMarshaler> marshalers;
	private Set<Character> argsFound;
	private ListIterator<String> currentArgument;

	public Args(String schema, String[] args) throws ArgsException {
		marshalers = new HashMap<Character, ArgumentMarshaler>();
		argsFound = new HashSet<Character>();

		parseSchema(schema);
		parseArgumentStrings(Arrays.asList(args));
	}

	private void parseSchema(String schema) throws ArgsException {
		for (String element : schema.split(","))
			if (element.length() > 0)
				parseSchemaElement(element.trim());
	}

	private void parseSchemaElement(String element) throws ArgsException {
		char elementId = element.charAt(0);
		String elementTail = element.substring(1);
		validateSchemaElementId(elementId);
		if (elementTail.length() == 0)
			marshalers.put(elementId, new BooleanArgumentMarshaler());
		else if (elementTail.equals("*"))
			marshalers.put(elementId, new StringArgumentMarshaler());
		else if (elementTail.equals("#"))
			marshalers.put(elementId, new IntegerArgumentMarshaler());
		else if (elementTail.equals("##"))
			marshalers.put(elementId, new DoubleArgumentMarshaler());
		else if (elementTail.equals("[*]"))
			marshalers.put(elementId, new StringArrayArgumentMarshaler());
		else
			throw new ArgsException(INVALID_ARGUMENT_FORMAT, elementId, elementTail);
	}

	private void validateSchemaElementId(char elementId) throws ArgsException {
		if (!Character.isLetter(elementId))
			throw new ArgsException(INVALID_ARGUMENT_NAME, elementId, null);
	}

	private void parseArgumentStrings(List<String> argsList) throws ArgsException
	{
		for (currentArgument = argsList.listIterator(); currentArgument.hasNext();)
		{
			String argString = currentArgument.next();
			if (argString.startsWith("-")) {
				parseArgumentCharacters(argString.substring(1));
			} else {
				currentArgument.previous();
				break;
			}
		}
	}

	private void parseArgumentCharacters(String argChars) throws ArgsException {
		for (int i = 0; i < argChars.length(); i++)
			parseArgumentCharacter(argChars.charAt(i));
	}

	private void parseArgumentCharacter(char argChar) throws ArgsException {
		ArgumentMarshaler m = marshalers.get(argChar);
		if (m == null) {
			throw new ArgsException(UNEXPECTED_ARGUMENT, argChar, null);
		} else {
			argsFound.add(argChar);
			try {
				m.set(currentArgument);
			} catch (ArgsException e) {
				e.setErrorArgumentId(argChar);
				throw e;
			}
		}
	}

	public boolean has(char arg) {
		return argsFound.contains(arg);
	}

	public int nextArgument() {
		return currentArgument.nextIndex();
	}

	public boolean getBoolean(char arg) {
		return BooleanArgumentMarshaler.getValue(marshalers.get(arg));
	}

	public String getString(char arg) {
		return StringArgumentMarshaler.getValue(marshalers.get(arg));
	}

	public int getInt(char arg) {
		return IntegerArgumentMarshaler.getValue(marshalers.get(arg));
	}

	public double getDouble(char arg) {
		return DoubleArgumentMarshaler.getValue(marshalers.get(arg));
	}

	public String[] getStringArray(char arg) {
		return StringArrayArgumentMarshaler.getValue(marshalers.get(arg));
	}
}
	

ArgumentMarshaler.java


public interface ArgumentMarshaler {
	void set(Iterator<String> currentArgument) throws ArgsException;
}
	

BooleanArgumentMarshaler.java


public class BooleanArgumentMarshaler implements ArgumentMarshaler {
	private boolean booleanValue = false;

	public void set(Iterator<String> currentArgument) throws ArgsException {
		booleanValue = true;
	}

	public static boolean getValue(ArgumentMarshaler am) {
		if (am != null && am instanceof BooleanArgumentMarshaler)
			return ((BooleanArgumentMarshaler) am).booleanValue;
		else
			return false;
	}
}
	

StringArgumentMarshaler.java


import static com.objectmentor.utilities.args.ArgsException.ErrorCode;

public class StringArgumentMarshaler implements ArgumentMarshaler {
	private String stringValue = "";

	public void set(Iterator<String> currentArgument) throws ArgsException {
		try {
			stringValue = currentArgument.next();
		} catch (NoSuchElementException e) {
			throw new ArgsException(MISSING_STRING);
		}
	}

	public static String getValue(ArgumentMarshaler am) {
		if (am != null && am instanceof StringArgumentMarshaler)
			return ((StringArgumentMarshaler) am).stringValue;
		else
			return "";
	}
}
	

IntegerArgumentMarshaler.java


import static com.objectmentor.utilities.args.ArgsException.ErrorCode;

public class IntegerArgumentMarshaler implements ArgumentMarshaler {
	private int intValue = 0;

	public void set(Iterator<String> currentArgument) throws ArgsException {
		String parameter = null;
		try {
			parameter = currentArgument.next();
			intValue = Integer.parseInt(parameter);
		} catch (NoSuchElementException e) {
			throw new ArgsException(MISSING_INTEGER);
		} catch (NumberFormatException e) {
			throw new ArgsException(INVALID_INTEGER, parameter);
		}
	}

	public static Integer getValue(ArgumentMarshaler am) {
		if (am != null && am instanceof IntegerArgumentMarshaler)
			return ((IntegerArgumentMarshaler) am).intValue;
		else
			return 0;
	}
}
	

ArgsException.java


import static com.objectmentor.utilities.args.ArgsException.ErrorCode.*;

public class ArgsException extends Exception {
	private char errorArgumentId = '\0';
	private String errorParameter = null;
	private ErrorCode errorCode = OK;

	public ArgsException() {}

	public ArgsException(String message) { super(message); }

	public ArgsException(ErrorCode errorCode) {
		this.errorCode = errorCode;
	}

	public ArgsException(ErrorCode errorCode, String errorParameter) {
		this.errorCode = errorCode;
		this.errorParameter = errorParameter;
	}

	public ArgsException(ErrorCode errorCode,
	                     char errorArgumentId, String errorParameter) {
		this.errorCode = errorCode;
		this.errorParameter = errorParameter;
		this.errorArgumentId = errorArgumentId;
	}

	public char getErrorArgumentId() {
		return errorArgumentId;
	}

	public void setErrorArgumentId(char errorArgumentId) {
		this.errorArgumentId = errorArgumentId;
	}

	public String getErrorParameter() {
		return errorParameter;
	}

	public void setErrorParameter(String errorParameter) {
		this.errorParameter = errorParameter;
	}

	public ErrorCode getErrorCode() {
		return errorCode;
	}

	public void setErrorCode(ErrorCode errorCode) {
		this.errorCode = errorCode;
	}

	public String errorMessage() {
		switch (errorCode) {
			case OK:
				return "TILT: Should not get here.";
			case UNEXPECTED_ARGUMENT:
				return String.format("Argument -%c unexpected.", errorArgumentId);
			case MISSING_STRING:
				return String.format("Could not find string parameter for -%c.",
				                     errorArgumentId);
			case INVALID_INTEGER:
				return String.format("Argument -$c expects an integer but was '%s'.",
				                     errorArgumentId, errorParameter);
			case MISSING_INTEGER:
				return String.format("Could not find integer parameter for -%c.",
				                     errorArgumentId);
			case INVALID_DOUBLE:
				return String.format("Argument -$c expects a double but was '%s'.",
				                     errorArgumentId, errorParameter);
			case MISSING_DOUBLE:
				return String.format("Could not find double parameter for -%c.",
				                     errorArgumentId);
			case INVALID_ARGUMENT_NAME:
				return String.format("'%c' is not a valid argument name.",
				                     errorArgumentId);
			case INVALID_ARGUMENT_FORMAT:
				return String.format("'%s' is not a valid argument format.",
				                     errorParameter);
		}

		return "";
	}

	public enum ErrorCode {
		OK, INVALID_ARGUMENT_FORMAT, UNEXPECTED_ARGUMENT, INVALID_ARGUMENT_NAME,
		MISSING_STRING,
		MISSING_INTEGER, INVALID_INTEGER,
		MISSING_DOUBLE, INVALID_DOUBLE}
}
	

DoubleArgumentMarshaler.java and StringArrayArgumentMarshaler.java were omitted from the book for brevity.

Did you read all that? Did you take your time? Do you feel like you understand how it works?

What does this code actually do?

Brian Will published a video in 2016 called Object-Oriented Programming is Embarrassing. In it, he takes 4 examples of Object-Oriented code and rewrites them in a more procedural style. One of those examples is a version of this code — although not the same version; Brian took it from a talk Bob gave, and not directly from this book. That section starts at about 19:40 in Brian's video. Now I agree with the video generally, but there's something pretty funny about it. Brian says that this code parses command line arguments that look like this:


$ java myProgram -l -p80 -d/home/temp
	

Now. You know that this isn't the format, because I told you up front it's this:


$ java myProgram -l -p 80 -d /home/logs
	

So why does Brian say this? I speculate that there are three reasons, and only one of them is Brian's fault.

Reason one would be that Brian never tested Bob's code. I can forgive this, because Brian (I assume) transcribed the code from a video, where Bob (understandably!) did not provide the test cases. The video compression made even transcribing the code a challenge, I'm sure.

Reason two would be that Bob said so! At 16:58 in Bob's video, we can see this example invocation. Perhaps this is the format Bob originally had in mind when he started the project? At any rate, it doesn't line up with the code he then shows.

Reason three would be that the code is not clear. Can we say anything more damning about a class than this? That a clearly competent programmer could copy down the entire thing, and still not see what it actually does? Even now that you know this, can you scroll up and find the code that defines the expected format?

Violations of Bob's Rules

Let's measure using the standards provided elsewhere in Clean Code. These are instances where I believe even Bob would agree that this code should be improved.

Dead Code

"Dead code is code that isn't executed... Delete it from the system." 2 The function Args.nextArgument() is never called from within this package, and there is no reason it would ever be called from outside (since it deals with an internal enumerator, which should be exhausted by the time the Args constructor finishes). The ArgsException class also has two dead constructors, three dead get methods, and two dead set methods.

Object design

"We do not want to expose the details of our data... The worst option is to blithely add getters and setters." 3 Let's look again at the ArgsException class. It has three properties: errorArgumentId, errorParameter, and errorCode. These are used within the class to build a friendly, readable error message. And all three of them have public getters and setters. I ask: if a consumer can call getErrorMessage(), why would they ever call getErrorParameter()? Perhaps to build their own error message, better than ours? But the public setters are even worse. There is no scenario where it makes sense for a consumer to change the ID of the error that we just threw.

Gratuitous Context

"Shorter names are generally better than longer ones, so long as they are clear. Add no more context to a name than is necessary." 4 Inside of the ArgsException class, why does every field begin with "error"? Again, we have errorArgumentId, errorParameter, errorCode — whose names are also baked into their getters and setters — and errorMessage() to round out the set. Why not just argumentId? Why not just getMessage()? The same applies to the intValue field of the the IntegerArgumentMarshaler class, the stringValue field of the StringArgumentMarshaler class, and so on.

Don't pass null

"You should avoid passing null in your code whenever possible." 5 We see two instances where Bob passes null as an argument. They are both calls to the ArgsException(ErrorCode, char, String) constructor, with null as the string value. Why not define a constructor overload with just the first two parameters?

Liskov Substitution Principle

"An object (such as a class) may be replaced by a sub-object (such as a class that extends the first class) without breaking the program" 6 The ArgsException class extends Java's base Exception class, which has a method getMessage(). ArgsException does have friendly, readable messages… but they are accessed through the errorMessage() method. Calling getMessage() will return null every time. 7

Use Intention-Revealing Names

"The name of a variable, function, or class, should answer all the big questions. It should tell you why it exists, what it does, and how it is used." 8 Let me ask you: what does the Args class do? From the name, we can only infer that it contains at least one "arg", and probably more. Why not call it an ArgumentParser? Or even a CommandLineArgumentParser? That would make the purpose and usage clearer, and also encourage us to name our variables better. Instead of Args arg = new Args(schema, args), it would feel natural to call it a parser, avoiding any potential arg/args confusion.

Surprising Behavior

One of the rules Bob applies throughout the book is the Principle of Least Suprise — that idea that code should do "pretty much what you expect". 9 This isn't always possible when you're writing code; sometimes an edge case leaves you with two bad options. It's also subjective — my "well of course" could easily be your "wait, what?" Here are the behaviors that surprised me.

Getting undefined arguments

Let's create an empty instance of this class, with no arguments defined and none provided:


Args arg = new Args("", new String[] {});
	

What would you expect to happen if we now call arg.getBoolean('x')? I expected an error, but the code instead returns false. Similarly getInt('x') returns zero, and getString('x') returns the empty string. Why?

Getting arguments of the wrong type

Let's define a string argument called 'x' with a value of "foo":


Args arg = new Args("x*", new String[] { "-x", "foo" });
	

What would you expect to happen if we now call arg.getInt('x')? I expected an error, but the code instead returns zero. Why?

Duplicate argument names

Let's try to force this code to throw an error. We'll define two arguments, an integer and a string, both called 'x':


Args arg = new Args("x#,x*", new String[] { "-x", "42" });
	

But there's no error here either. Instead the integer definition of 'x' is quietly overwritten by the string definition. If we now call getInt('x') we will get zero, and if we call getString('x') we will get "42". Why?

Combined Argument IDs

Let's define four arguments: two booleans, an integer, and a string.


Args arg = new Args("w,x,y#,z*", new String[] { "-w", "-x", "-y", "42", "-z", "foo" });
	

This works exactly as we expect! And if you have invoked a lot of Unix-y command line applications, you won't be surprised that both boolean flags can be combined, and passed in as -wx. However, I was surprised that all four arguments can be combined:


Args arg = new Args("w,x,y#,z*", new String[] { "-wxyz", "42", "foo" });
	

From here, getInteger('y') will correctly return 42, and getString('z') will correctly return "foo". 10 Perhaps Bob finds this perfectly natural; I was flabbergasted. 11

Interrupting the parse

Let's define two integer arguments, then attempt to pass in three.


Args arg = new Args("x#,y#", new String[] { "-x", "12", "34", "-y", "56" });
	

What do you expect here? Will 'x' be 12 or 34? Will 'y' be 56? Will we get an error? In fact, when the parser encounters the "34", it just stops — without an error! So getInt('x') gives us 12, and getInt('y') gives us zero. Why? 13

Provide an argument twice

Let's test out another edge case, with one argument given twice in the command line.


Args arg = new Args("x#", new String[] { "-x", "12", "-x", "34" });
	

What do you expect here? I would usually expect an error, 14 but instead 'x' quietly takes the value of 34.

Am I getting a little nitpicky here? Indeed I am. Surely Bob put this class in the book as an example of clean style, and not necessarily as an example of a foolproof library. So let's quote him again:

There is no replacement for due diligence. Every boundary condition, every corner case, every quirk and exception represents something that can confound an elegant and intuitive algorithm. Don't rely on your intuition. Look for every boundary condition and write a test for it. 15

Bad Design

Let's get into some of my more subjective complaints. These are principles that I've come to value over the years, which Bob might think are less important than his. I do think he'd be wrong, but these are ultimately only opinions.

Modifying exceptions after they're thrown

Within the parseArgumentCharacter() function, if a Marshaler throws an exception, we see this code:


catch (ArgsException e) {
	e.setErrorArgumentId(argChar);
	throw e;
}
	

This feels very unnatural to me. Bob does this because the argument ID is relevant to the exception, but it is not available to the Marshaler. One way to make it available would be to pass it as a second parameter to the Marshaler's set() function. I believe that would be worth it, to avoid this awkward behavior. Bob values minimizing function parameters more highly than I do.

Ideally, we would be able to find another way to make this context available at the site where the exception is originally thrown. Then Bob and I could both be happy about it.

What is a "Marshaler"?

This is another naming complaint. I include it in this section because ArgumentMarshaler is a less obviously bad name than Args. 16 Bob says that he needs a class to do three things: parse the schema, parse the arguments, and provide the argument values. My question is, why didn't we go for the obvious choice? What would be wrong with "Parser" instead of "Marshaler"? "Remember that the people who read your code will be programmers. So go ahead and use computer science terms... Choosing technical names for those things is usually the most appropriate course." 17

Meaningless Defaults

The ArgsException class has three fields, and it should not be possible to end up with an instance where those fields are not set to meaningful values. 18 Then what purpose is served by initializing errorArgumentId to the null character, errorParameter to null, and errorCode to OK? These values only clutter the code.

“Clever” For Loops

The usual format of a for loop declares a variable, specifies how long the loop should continue, and specifies something to change on each iteration. You've seen it before:


for (int i = 0; i < 10; i += 1) {
	// ...
}
	

Now, for loops are a good deal more flexible than this. Each part can be made arbitrarily complex, or omitted entirely. But if you're not doing the bog-standard index iteration, and you can't use foreach or for-in, I believe you should choose a while loop instead. Take a look at parseArgumentStrings():


for (currentArgument = argsList.listIterator(); currentArgument.hasNext();)
{
	String argString = currentArgument.next();
	

Don't these three statements all deserve their own line?


currentArgument = argsList.listIterator();
while (currentArgument.hasNext()) {
	String argString = currentArgument.next();
	

Strings Over Enums

I agree that there are times and places to use enum, but I believe that those times and places are much rarer than Bob thinks. In fact, I probably believe they're rarer than you think. Let's take a step back and look at how ArgsException currently works. We're trying to do some parsing, and we encounter something unparseable. At that moment, we know more about what went wrong than anyone else! 19

But do we say what went wrong? No. Instead we encode what went wrong, into a couple of primitive values. Therefore we must later decode those values, translating them back into a friendly error message:


// inside of Args.parseSchemaElement()
throw new ArgsException(INVALID_ARGUMENT_FORMAT, elementId, elementTail);

// inside of ArgsException.errorMessage()
case INVALID_ARGUMENT_FORMAT:
	return String.format("%s is not a valid argument format.",
	                     errorParameter);
	

What have we gained here? Is formatting an error message such a Responsibility that the Single Responsibility Principle forbids it from coexisting in a function with any other logic? 20 Is the 27-line switch statement in getErrorMessage() really worth keeping around? When a future developer adds support for a new type to Args, do we really want to force them to mess with the ErrorCode enum and the errorMessage() function, on top of the new Marshaler they're already writing?

Personally, I didn't even notice until writing out this example that the elementId is unused in the actual error message. I believe we would be better off with the message directly in-line.


// inside of Args.parseSchemaElement()
throw new ArgsException(String.format("%s is not a valid argument format.",
                                      elementTail));
	

Subclassing Exception

This is another point where my opinion goes against the grain, but I believe extending built-in exception classes is usually a waste of time. What do we gain from ArgsException? We gain three fields with getters and setters, and a function to build error messages from those fields. I've already argued we'd be better off without these things.

Our consumers gain the ability to distinguish between exceptions thrown from our code, and exceptions thrown from a lower level. When argument parsing fails, this will allow them to print the error message and gracefully exit the program. But what would they do for a lower-level exception? Hopefully they'll print the error message and gracefully exit the program!

There are many ways to classify errors. We can classify them by their source: did they come from one component or another? Or their type: Are they device failures, network failures, or programming errors? However, when we define exception classes in an application, our most important concern should be how they are caught. 21

On the other hand, what does ArgsException cost? It enables the awkward statefulness of being modifiable after construction. It's 89 lines of code — so far! And it will grow as types are added. But the worst cost is that it makes our library more complicated to use. Consumers aren't just able to catch ArgsException specifically, they must catch ArgsException specifically. And when they type:


catch (ArgsException e) {
	System.out.printf(e.
	

Their IDE will ask them, "Do you want e.getMessage() or e.errorMessage()?" And they will be forced to go read our code to learn which one will work!

Even if we change ArgsException to override its base class's getMessage(), fixing the Liskov Substitution Principle violation, I still don't think the extra complexity would be justified. Before you write a custom exception class, think twice! There's a good chance a fitting exception type already exists in your language's standard library. And in the end, a plain old Exception might be perfectly suitable.

Cluttered API / Presumptive Defaults

What purpose is served by the Args function has()? Under what circumstance would a developer call it? Generally, I shouldn't think that a developer would care if the user provided an argument on the command line, or omitted the argument to allow the default value to be used. I would prefer to delete has(), letting developers just call getBoolean(), getString(), and so on.

Alas, we cannot do this. There is a circumstance where has() is needed. Suppose we have an application with a throttle, forcing it to wait some number of milliseconds between each operation. And let's say that we want to wait half a second by default. With the way Args is currently written, we would need to write this code:


public static void main(String[] args) {
	Args arg = new Args("t#", args);
	int throttleMilliseconds = 500;
	if (arg.has('t')) {
		throttleMilliseconds = arg.getInt('t');
	}
	execute(throttleMilliseconds);
}
	

This is the only way for us to distinguish between "The user wants us to execute with zero throttle" and "The user doesn't care what the throttle is." And this is thanks to the default zero value for intValue hard-coded inside of IntegerArgumentMarshaler. And this leads us right into a discussion of missing features.

Inventing a new language

Let's go back to the very beginning. How would we tell an instance of Args:

My program expects four arguments: a boolean flag called l, an integer called p, a string called d, a double called t, and an array of strings called f.

We must encode this information as a string: "l,p#,d*,t##,f[*]". Why does the string look this way? Why are the type descriptors the characters they are? Well, that's just what made sense to Bob at the time.

And how, as consumers of this code, are we meant to learn this? Hopefully, Bob included some documentation along with his code. Hopefully, that documentation is kept up to date as new types are added. But my hopes aren't high.

This is the opposite of a discoverable API.

Missing features

Let's acknowledge again that Bob put this code in his book as a teaching example. It's meant to demonstrate good style, and not necessarily the best possible version of a command-line argument parser. It's still useful to ask how it might be extended. Or even, how it should be extended.

Following "The Principle of Least Surprise," any function or class should implement the behaviors that another programmer could reasonably expect. For example, consider a function that translates the name of a day to an enum that represents the day... We would expect the string "Monday" to be translated to Day.MONDAY. We would also expect the common abbreviations to be translated, and we would expect the function to ignore case.
When an obvious behavior is not implemented, readers and users of the code can no longer depend on their intuition about function names. They lose their trust in the original author and must fall back on reading the details of the code. 22

Default values

As detailed above, the only way to give an argument a default value (other than the one Bob chose on your behalf) is to write several lines of code including a call to has(). I believe the ability to provide defaults for each parameter is something we can reasonably expect a command-line parsing library to provide.

Long names

The convention for command line arguments is to name them with a word, and alias them with a character. The user can the provide the name with two dashes, or the the alias with one dash. So


$ java myApp --log --port 80
	

would be equivalent to


$ java myApp -l -p 80
	

I'd say this feature is a reasonable expectation. How would we fit it into Args?

Validation

The library does already have some validation, of course. But I expected a good deal more. I won't re-list my surprises here, but I believe most of them would be reasonable to include in a final version of this library. And all of them would be reasonable to expect in a production-ready command-line argument parser.

Required arguments

It would be nice if I could tell my parser that an argument is required. The ability to declare that parsing has failed when the user doesn't provide a crucial argument would currently rely on the use of has(). Since a developer would need to do this in main(), immediately after calling our code, we could reasonably say that argument parsing isn't really complete until all required arguments have values.

How would we fit this into Args as it currently stands?

Argument Descriptions and Help

This may be a bit of a reach, and perhaps my expectations have been colored by advances in command-line parsing since the book was published. 23 But if we've added all of the above features, then Args has enough information to automatically support java myApp --help

This feature has been supported by every command-line argument parser I've ever used. Is it reasonable to expect it in the next one I use?

If you're so smart, how would you write it?

If you've read this far, you've already endured a torturous amount of detail. Believe me, I appreciate your perseverance. However, I must include a "better" version of Args. I cannot sit on the sidelines and scoff at Uncle Bob, without offering my own code up for consideration.

Main.java (example usage)


public static void main(String[] args) {
	try {
		ArgumentParser parser = new ArgumentParser(
			args,
			new BooleanArgument('l'),
			new IntegerArgument('p'),
			new StringArgument('d')
		);
		boolean logging = parser.getBoolean('l');
		int port = parser.getInteger('p');
		String directory = parser.getString('d');
		executeApplication(logging, port, directory);
	} catch (Exception e) {
		System.out.printf("Error: %s\n", e.getMessage());
		return;
	}
}
	

ArgumentParser.java


public class ArgumentParser {
	private Map<Character, Argument> definitionsById;

	public ArgumentParser(String[] args, Argument ...argumentDefinitions) throws Exception {
		storeDefinitionsInMap(argumentDefinitions);
		parse(args);
	}

	private void storeDefinitionsInMap(Argument[] argumentDefinitions) throws Exception {
		definitionsById = new HashMap<Character, Argument>();
		for (Argument argument : argumentDefinitions) {
			char argumentId = argument.getId();
			if (definitionsById.containsKey(argumentId)) {
				throw new Exception(String.format("Argument -%c was specified more than once.",
				                                  argumentId));
			}
			definitionsById.put(argumentId, argument);
		}
	}

	private void parse(String[] args) throws Exception {
		ListIterator<String> argumentIterator = Arrays.asList(args).listIterator();
		while (argumentIterator.hasNext()) {
			String argumentString = argumentIterator.next();
			if (!argumentString.startsWith("-")) {
				throw new Exception(String.format("Expected an argument ID, found '%s'",
				                                  argumentString));
			}
			char[] argumentIds = argumentString.substring(1).toCharArray();
			for (char argumentId : argumentIds) {
				Argument argument = findDefinition(argumentId);
				argument.parse(argumentIterator);
			}
		}
	}

	private Argument findDefinition(char argumentId) throws Exception {
		if (!definitionsById.containsKey(argumentId)) {
			throw new Exception(String.format("Argument -%c was not defined.",
			                                   argumentId));	
		}
		return definitionsById.get(argumentId);
	}

	private <T extends Argument> T findDefinitionWithType(
		char argumentId, Class<T> type
	) throws Exception {
		Argument argument = findDefinition(argumentId);
		if (argument.getClass() == type) {
			return type.cast(argument);
		}
		else {
			throw new Exception(String.format("Argument -%c is not of type %s",
			                                  argumentId, type.getName()));
		}
	}

	public boolean getBoolean(char argumentId) throws Exception {
		return findDefinitionWithType(argumentId, BooleanArgument.class)
			.getValue();
	}

	public String getString(char argumentId) throws Exception {
		return findDefinitionWithType(argumentId, StringArgument.class)
			.getValue();
	}

	public int getInteger(char argumentId) throws Exception {
		return findDefinitionWithType(argumentId, IntegerArgument.class)
			.getValue();
	}
}
	

Argument.java


public abstract class Argument {
	protected char id;

	public Argument(char id) throws Exception {
		if (!Character.isLetter(id)) {
			throw new Exception(String.format("%c is not a valid argument name.",
			                                  id));
		}
		this.id = id;
	}

	public char getId() {
		return id;
	}

	public abstract void parse(ListIterator<String> args) throws Exception;
}
	

BooleanArgument.java


public class BooleanArgument extends Argument {
	private boolean value = false;

	public BooleanArgument(char id) throws Exception { super(id); }

	public void parse(ListIterator<String> args) throws Exception {
		value = true;
	}

	public boolean getValue() {
		return value;
	}
}
	

StringArgument.java


public class StringArgument extends Argument {
	private String value = "";

	public StringArgument(char id) throws Exception { super(id); }

	public void parse(ListIterator<String> args) throws Exception {
		try {
			value = args.next();
		} catch (NoSuchElementException e) {
			throw new Exception(String.format("Could not find string parameter for -%c",
			                                  id));
		}
	}

	public String getValue() {
		return value;
	}
}
	

IntegerArgument.java


public class IntegerArgument extends Argument {
	private int value = 0;

	public IntegerArgument(char id) throws Exception { super(id); }

	public void parse(ListIterator<String> args) throws Exception {
		String argumentValue = null;
		try {
			argumentValue = args.next();
			value = Integer.parseInt(argumentValue);
		}
		catch (NoSuchElementException e) {
			throw new Exception(String.format("Could not find integer parameter for -%c",
			                                  id));
		}
		catch (NumberFormatException e) {
			throw new Exception(String.format("Argument -%c expects an integer but was '%s'",
			                                  id, argumentValue));
		}
	}

	public int getValue() {
		return value;
	}
}
	

Like Bob, I have omitted doubles and string arrays. In fact, my code's structure is remarkably similar to his. And to examine the differences is to re-state most of my objections, so I'll be very brief:

Further work

This code still isn't perfect — obviously! 25 But I hope the path toward improvement is clear:

Conclusion

Why can't I recommend reading Clean Code? Because — despite containing some truly excellent, outstanding advice — it recommends writing code like this.

Friends, please. Do not write an argument parser.


  1. I did my best to be completely accurate in the name of fairness. Any brace placement or whitespace that seems odd is preserved from the book. We should not hold these against Bob, though. Fitting Java code onto the printed pages of a book is a constraint he had less control over. I'll ask you to be similarly kind when you get to the code I've written later on the page.
  2. Page 292.
  3. Page 95.
  4. Page 29.
  5. Page 111.
  6. This phrasing is from Wikipedia; I don't believe Bob defined the LSP in this book. However it is the "L" of the "SOLID" design principles, of which Bob is an enthusiastic proponent.
  7. It is possible to construct an ArgsException which will return a non-null result for getMessage(), but Bob never does so here.
  8. Page 18. I could also argue that this violates the Make Meaningful Distinctions rule on page 20, and the Use Searchable Names rule on page 22. Perhaps even the Don't Pun rule on page 26, since "args" is well-known to Java programmers as "the string array you pass into main()."
  9. Pages 11, 39, 288, and probably others.
  10. Incidentally, this is why the format used by Brian Will cannot work. Given the schema "x*,y" and the invocation $ java myApp -xy, our parser would not be able to tell whether to set 'x' to the string "y" and set 'y' to false, or to set 'y' to true and throw an error for the missing value of 'x'.
  11. Just to make sure this was actually odd behavior, and not just a result of my own inexperience with command line interfaces, I went and tested this behavior in three more widely-known parsers. 12 One of them supported this style, the other two did not. So Bob's choice is not unheard-of, but my surprise is forgivable too.
  12. Yargs, Eric Newton's excellent CommandLineParser, and Microsoft's System.CommandLine.
  13. We can dismiss the idea that he simply didn't think of it. This behavior is due to the break in parseArgumentStrings, which was put there deliberately. At 22:05 in the video linked above, Bob says, "If it doesn't start with a dash, it's not a command-line argument."
  14. Usually. But I've just been surprised by 5 non-errors in a row, so I'm getting less optimistic.
  15. Page 289, emphasis his.
  16. Although Bob himself admits Marshaler is not a great name, at 39:13 in the video linked above.
  17. Page 27.
  18. Of course it is possible, with the class as it's currently written. A future developer could construct an ArgsException with one of the currently-dead constructors, which would leave these properties unset. Or they could construct an instance that expects them to subsequently call setArgumentId(), and then fail to do so. Both of these are more serious issues than the meaningless defaults.
  19. Ideally. Again, the Marshalers currently have no knowledge of the argumentId, which would definitely be useful to report when an argument can't be parsed.
  20. One valid counterargument here might be internationalization; while formatting a single error message might not be a big deal to include inline, choosing an error message dependent on the current culture would certainly belong outside of the parsing function. This is valid, but I don't find it compelling — after all, Bob's code isn't internationalized. If and when we need internationalization, we can make the appropriate changes.
  21. Page 107, emphasis his. I believe this paragraph was actually written by Michael Feathers, because his name is at the start of the chapter. But the name on the cover of the book (twice) is Robert C. Martin, so we should hold him responsible.
  22. Page 288.
  23. Nope. The most popular (as far as I can tell) command-line argument parser for Java programs is Apache Commons CLI, and it shipped with built-in help formatting back in 2002 — seven years before Clean Code was published.
  24. "In general you should prefer nonstatic methods to static methods", page 296.
  25. You may be able to tell that I haven't used Java since I was a college student, over a decade ago.
  26. That's how far back the Apache Commons CLI changelog goes — so far!