We should format code on demand

Mar 1, 2022 (8 min read)

Currently our source code is saved to disk already formatted, and our editors display this saved format. There are many auto formatting tools, but the results always get saved back to disk. What happens if we save to a standardised text representation, and instead format code on demand, in the editor?

For this to work, most editors / IDEs would have to agree to save code in the same standardised representation (which would vary by language), but once they do there are a lot of benefits.

The standardised representation would probably be text, and valid compilable code, so that it is human readable and all existing tooling still works.
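A minimal sketch of the idea in Python, assuming the standardised representation is simply whatever `ast.unparse` emits. That choice is a stand-in for illustration only: a real canonical form would also have to preserve comments, which this round trip drops. Two differently formatted copies of the same function normalise to identical text, so only one form would ever reach disk.

```python
import ast


def to_canonical(source: str) -> str:
    """Parse the code and re-emit it in one fixed layout (assumed canonical form)."""
    return ast.unparse(ast.parse(source))


# The same function as two team members might type it.
compact = "def area(w,h): return w*h"
spacious = """
def area( w, h ):

    return w  *  h
"""

saved_a = to_canonical(compact)
saved_b = to_canonical(spacious)

# Both versions are saved as the same text, which is still valid Python.
print(saved_a)
print(saved_a == saved_b)
```

Each editor would then apply its own display format on top of the saved text when opening the file.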

There is no bikeshedding

There is no need for any bikeshedding, no need to discuss whether to use tabs or spaces, or semicolons, or brackets.

You don’t need to agree on a max line length or a preferred line length, and the line length could be responsive to the current size of your editor window.

The team doesn’t need to agree on which code formatter to use, or its settings. Style guides are unnecessary (at least for formatting and whitespace).

Authors of auto formatting tools don’t need to gain broad consensus on their formatting rules, which are often hotly debated.

Instead each team member can choose to view the code in whichever format they prefer, as long as the editor / auto formatting tool supports it.

No whitespace or formatting changes in Git

Because we are saving to a standardised representation, no whitespace or formatting changes are saved.

This in turn means that these changes do not show up in Git diffs, which makes reviewing easier, and reduces the chances of conflicts.

Also, if you view / review diffs in your editor (or in something else that supports it), then you can see the changes in whichever format you prefer.

The standard representation can be optimised for Git

Since the standardised representation is mostly used by the computer, we can optimise it to further improve Git diffs and reduce the chance of conflicts.

The standardised representation can lay out the code vertically, with everything on a separate line.

With the code layout below, one person could edit the function name, and another could add / delete / rename a parameter, and the changes would occur on different lines. This makes it easy to spot in the diff, and does not cause a merge conflict.

Usually people prefer to see such code written on one line, which does cause a conflict if one person edits the function name and another adds / deletes / renames a parameter.
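The difference between the two layouts can be sketched in Python. The helper names (`one_line`, `vertical`, `touched_lines`) and the `area` function are hypothetical, and counting which base lines each edit touches is only a rough proxy for a collision; Git's real merge algorithm is more involved.

```python
import difflib


def one_line(name, params):
    """The layout people usually prefer to read."""
    return [f"def {name}({', '.join(params)}):"]


def vertical(name, params):
    """A vertical layout: the name and each parameter get their own line."""
    return [f"def {name}(", *[f"    {p}," for p in params], "):"]


def touched_lines(base, edited):
    """Indices of base lines that an edit replaces or deletes.

    Pure insertions are not counted in this sketch.
    """
    matcher = difflib.SequenceMatcher(a=base, b=edited, autojunk=False)
    touched = set()
    for tag, start, end, _, _ in matcher.get_opcodes():
        if tag != "equal":
            touched.update(range(start, end))
    return touched


def overlap(layout):
    """Lines touched by both edits: one renames the function, one renames a parameter."""
    base = layout("area", ["width", "height"])
    rename_function = layout("surface_area", ["width", "height"])
    rename_parameter = layout("area", ["width", "depth"])
    return touched_lines(base, rename_function) & touched_lines(base, rename_parameter)


print(overlap(one_line))  # both edits land on the single signature line
print(overlap(vertical))  # the edits land on different lines
```

In the vertical layout the two edits also have an untouched line between them, which matters because merge tools can treat changes on adjacent lines as conflicting.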

You can use a compact format for an overview

When you are reading a large file for the first time, you could use a compact format, in order to see more code on each page, and maybe even see it all on one page. This format could show a minimal amount of whitespace, group code on the same line as much as possible, and potentially apply more aggressive transformations, such as:

Hide error / exception handling code to highlight the happy path
Hide function bodies / parameters for a really high level view
Just show JSX code in React, to highlight the component tree structure
Use a larger font to highlight function names
Replace Haskell operators (fmap, apply, mappend etc) with equivalent symbols (<$>, <*>, <> etc)

The code below was originally 90 lines long, but if we use a compact format and hide exception handling code it can be shown in 35. If we are really aggressive with whitespace it can even come down to 25.

This isn’t the format I would like to see when editing the code, or debugging the code, or trying to find a missing curly bracket, but it is very useful to get an overview. In the example above it is easy to see at a glance what the code is trying to do and how it is trying to do it.
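Hiding error handling to highlight the happy path can be sketched in Python with the standard `ast` module. The `read_config` function is a made-up example, and keeping only the body and `else` block of each `try` statement (dropping `except` and `finally`) is an assumption about what such a view would show. Nothing on disk changes; this is display only.

```python
import ast


class HideErrorHandling(ast.NodeTransformer):
    """Replace each try statement with its body plus any else block."""

    def visit_Try(self, node):
        self.generic_visit(node)  # simplify nested try statements first
        # Returning a list splices these statements in place of the try.
        return node.body + node.orelse


def happy_path_view(source: str) -> str:
    """Return the source as it would be displayed with error handling hidden."""
    tree = HideErrorHandling().visit(ast.parse(source))
    return ast.unparse(tree)


SOURCE = '''
def read_config(path):
    try:
        with open(path) as handle:
            text = handle.read()
    except OSError as error:
        print("could not read", path, error)
        return {}
    return parse(text)
'''

view = happy_path_view(SOURCE)
print(view)
```

The displayed version keeps the `with` block and the final `return`, and is several lines shorter than the saved one.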

You can use an expanded format for the details

When developing and especially when debugging it is often necessary to understand every little detail of the code, and for this it is useful to use an expanded format. There are many times when I have missed an important nuance of the code just because the code was difficult to scan and my eye skipped over something, or didn’t see something.

If we use an expanded format we can make the code easy to scan and make it difficult to miss any of the details. The code takes up a lot more vertical screen space, but this will usually be ok, as we will only be looking at a small section of code.

The code below makes it obvious that DesignTurbulence is calculated, but the specifics of the calculation are harder to read.

If we use an expanded format it is much easier to see what is going on, and which brackets match up. The format could even add brackets and indentation to make the precedence of the operators clear in cases where it is important. The format below has added brackets and indentation to 1.28 / windSpeed * 1.44, as division and multiplication have the same precedence and are evaluated left to right, so the expression means (1.28 / windSpeed) * 1.44 rather than 1.28 / (windSpeed * 1.44). This is something that you could easily miss or forget when viewing a more compact format.
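Adding brackets for display can be sketched in Python, whose grammar (like that of most C-family languages) gives / and * the same precedence and groups them left to right. The function names here are hypothetical, and the sketch only brackets arithmetic at the top of an expression; anything else is printed unchanged.

```python
import ast

# Symbols for the binary operators this sketch knows how to print.
SYMBOLS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/", ast.Pow: "**"}


def explicit(node) -> str:
    """Print a parsed expression with every binary operation bracketed."""
    if isinstance(node, ast.Expression):
        return explicit(node.body)
    if isinstance(node, ast.BinOp) and type(node.op) in SYMBOLS:
        symbol = SYMBOLS[type(node.op)]
        return f"({explicit(node.left)} {symbol} {explicit(node.right)})"
    return ast.unparse(node)  # names, numbers, calls and so on


def show_precedence(source: str) -> str:
    return explicit(ast.parse(source, mode="eval"))


print(show_precedence("1.28 / windSpeed * 1.44"))  # ((1.28 / windSpeed) * 1.44)
print(show_precedence("a + b * c"))                # (a + (b * c))
print(show_precedence("2 ** 3 ** 2"))              # (2 ** (3 ** 2))
```

The brackets come straight from the parse tree, so the display cannot disagree with how the expression is actually evaluated.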

And the editor could even show this code in a mathematical format, which most people find clearer. This will also make it easier to check the code against the original calculation, when the original is shown in the same format (which is common).

Example showing code displayed in a mathematical format

This is useful for everything, not just calculations. There are many cases where the precedence and associativity of operators isn’t immediately obvious. This table of TypeScript operator precedence, with over 60 operators, is enough to scare any programmer. Most languages require liberal use of various kinds of brackets, and in a complex expression it is often hard to see which brackets match up. In most editors you can hover over one to see its match, but this is slow and laborious. It requires you to remember the matches, and to specifically look for them. An expanded format can make all this immediately obvious, making it easy to fully understand the code with no extra effort.

The visual flow can match the logical flow

The code below reads top to bottom, but at 3 points in the code there are early return statements, and the logical flow jumps to the end. This is a fairly simple case, and it’s relatively easy to work out what is going on, but things can still be overlooked, and it still adds mental overhead that would be better spent on the actual problem we are trying to solve.

If we use a format that matches the visual flow with the logical flow we would get something like the example below. In this format we no longer have to work out the logical flow ourselves, and the code scans better visually, as everything is aligned horizontally. As a side benefit it also takes up less screen space.

Example showing visual flow matching logical flow for early returns

There are many other cases where this would be useful.

An if / else expression could show both branches side by side, probably with the true branch displayed on the left (assuming that true is the predominant case). Case statements could fan out horizontally when there is space. For loops could show the initial value, end condition and update step in a flow chart style format, which would make off-by-one errors easier to spot. Similar things could be done for while and other loops.

The example below shows how a binary search function in Python might look:

Example showing visual flow matching logical flow for binary search
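For reference, this is the kind of plain text source that such a flow-style view would be rendered from. It is a standard iterative binary search, not necessarily the code from the original example. The loop bounds and the two branch updates are where off-by-one errors usually hide, which is what a flow chart style display would bring forward.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:                 # loop condition: the range is not empty
        middle = (low + high) // 2
        if items[middle] == target:
            return middle              # early return: found
        if items[middle] < target:
            low = middle + 1           # discard the left half
        else:
            high = middle - 1          # discard the right half
    return -1                          # range is empty: not found


print(binary_search([1, 3, 5, 7, 9], 7))
print(binary_search([1, 3, 5, 7, 9], 4))
```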

Right to left languages

A string is an isogram if it contains no more than one of each letter, and the Haskell code below checks this.

In Haskell, composed functions are applied right to left, which is confusing for people who aren’t used to it, and most native left to right readers will find it easier to work out how the code below works. This isn’t legal in Haskell, but there is no reason why an editor couldn’t display it this way.
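The same contrast can be sketched in Python rather than Haskell. The helper names below are hypothetical, and `pipe` is a small utility written for this sketch, not a standard library function. The nested version is evaluated from the innermost call outwards, so it reads right to left; the piped version lists the same steps in reading order.

```python
from functools import reduce


def pipe(value, *steps):
    """Feed value through each step in reading order, left to right."""
    return reduce(lambda result, step: step(result), steps, value)


def letters_only(text):
    return [char for char in text if char.isalpha()]


def lower_case(chars):
    return [char.lower() for char in chars]


def no_repeats(chars):
    return len(set(chars)) == len(chars)


def is_isogram_nested(text):
    # Read from the innermost call outwards: letters_only, lower_case, no_repeats.
    return no_repeats(lower_case(letters_only(text)))


def is_isogram_piped(text):
    # The same steps, written in the order they run.
    return pipe(text, letters_only, lower_case, no_repeats)


print(is_isogram_piped("six-year-old"))
print(is_isogram_piped("Alphabet"))
```

Both functions compute the same result, so an editor could store one form and display the other.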

The opposite could also be done for all languages, to make it less confusing for people whose first language does read right to left.

Functional programming

I’m a big fan of functional programming, as are a lot of people that learn it. When the code is compiling and working, it is often very elegant, expressive and easy to read, even for beginners. However, if it isn’t working, I sometimes find it difficult to see where the problem is, and it can definitely be difficult for beginners to write.

Unless you are familiar with functional paradigms and the converge function from rambda, it is very hard to understand what this code does.

However, if the editor is aware of the converge function, or can follow the flow of the functions, it can display the code like this.

Example showing logical flow of functional code

This makes the flow of the code immediately obvious, and will highlight any problems. In this example, it is now clear that firstName and lastName should both be functions that take a single object parameter, and that the compliment function should take two parameters. We can also easily verify that the types returned by the firstName and lastName functions match those expected by compliment.
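For readers who have not met converge, its behaviour can be sketched in Python: each branching function receives the same arguments, and their results are passed, in order, to the converging function. The `converge` below is a stand-in written for this sketch, not the library's implementation, and the bodies of `first_name`, `last_name` and `compliment` are invented for illustration.

```python
def converge(after, branches):
    """Return a function that applies every branch to the same arguments,
    then passes the branch results to `after`."""
    def converged(*args):
        return after(*(branch(*args) for branch in branches))
    return converged


def first_name(person):
    return person["firstName"]


def last_name(person):
    return person["lastName"]


def compliment(first, last):
    return f"{first} {last}, you look great today!"


# person -> (first_name(person), last_name(person)) -> compliment(first, last)
greet = converge(compliment, [first_name, last_name])

print(greet({"firstName": "Ada", "lastName": "Lovelace"}))
```

Written out this way, the shapes are visible: each branch takes one object, and `compliment` takes exactly as many arguments as there are branches.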

Conclusion

Currently editors rigidly display text exactly as saved on disk, but if we allow some flexibility in how the code is displayed, we can use formats like the ones above, which bring significant improvements.

There is also the possibility to go halfway, and use on-demand formatting but without saving to a standardised representation. This makes adoption much easier, as you can try it without any commitment, without affecting the rest of the team, and only the tool you are using needs to support it. You still get a lot of the benefits, and just miss out on the Git optimisation. JetBrains MPS is worth a look in this space.

ITNEXT