
This works until someone tries to vertically align something like a table or a line that is wrapped.

Yes, this. The counterpoint is either "don't align things" or "use tabs for indentation and spaces for alignment".

And maybe you can enforce no alignment, but that's a hard fight to win.

As for tabs for indentation and spaces for alignment, I've found no practical way to enforce this via tooling/linting. A rule without enforcement gets applied inconsistently, which is how we get files full of mixed spaces and tabs, which is how people get frustrated with tabs and we decide to throw it all out.

And inevitably, that's part of how spaces "won".

Disallowing /[^\t]\t/ and /^ / is a good start.
These regexes don't solve what I think is one of the major common problems/complaints though: using extra tabs past the logical indent point as a shortcut to avoid typing so many spaces for alignment purposes.
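
For concreteness, here is roughly what those two checks amount to as a Python sketch. The name `check_line` and the message strings are made up for illustration; only the two patterns come from the comment above. Note that a line with extra tabs followed by a few spaces passes both checks, which is exactly the gap being described.

```python
import re

# A tab anywhere after a non-tab character (tab used for alignment).
TAB_AFTER_NON_TAB = re.compile(r"[^\t]\t")
# A line that starts with a space (spaces used for indentation).
LEADING_SPACE = re.compile(r"^ ")


def check_line(line: str) -> list[str]:
    """Return the names of the rules that this line violates."""
    problems = []
    if TAB_AFTER_NON_TAB.search(line):
        problems.append("tab after non-tab")
    if LEADING_SPACE.search(line):
        problems.append("leading space")
    return problems


print(check_line("\t" + " " * 11 + "baz);"))  # one tab, then alignment spaces: passes
print(check_line("\t\t\t   baz);"))           # extra tabs as an alignment shortcut: also passes
print(check_line("    baz);"))                # space indentation: flagged
print(check_line("\tfoo\t= 1;"))              # tab in the middle of a line: flagged
```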

To take this example from a sibling post:

  if (foo) {
  »   frobnicate(bar,
  »   ...........baz);
  }

Many people will wind up doing this:

  if (foo) {
  »   frobnicate(bar,
  »   »   »   ...baz);
  }

And then your alignment is all messed up for anyone with a different tab width setting.
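
A quick way to see the difference, sketched in Python with `str.expandtabs`. The two snippets are the ones above with real tabs and spaces substituted for the markers; `arg_columns` is a throwaway helper name, not anything standard. It reports the display column of `bar` and of `baz` at a given tab width.

```python
# Tabs for indentation, spaces for alignment.
GOOD = ["if (foo) {", "\tfrobnicate(bar,", "\t" + " " * 11 + "baz);", "}"]
# Extra tabs used as a shortcut for most of the alignment.
BAD = ["if (foo) {", "\tfrobnicate(bar,", "\t\t\t" + " " * 3 + "baz);", "}"]


def arg_columns(lines, tab_width):
    """Return the display columns of 'bar' and 'baz' at the given tab width."""
    rendered = [line.expandtabs(tab_width) for line in lines]
    return rendered[1].index("bar"), rendered[2].index("baz")


for width in (2, 4, 8):
    print(width, arg_columns(GOOD, width), arg_columns(BAD, width))
```

In the first version the two columns match at every width. In the second they match only at a tab width of 4, which is presumably what the author had set.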

Checking for that requires something more like a linter with a detailed understanding of the syntax parse tree.

There are degrees of “detailed”.

For example, I considered writing a small Awk program to check C code in response to another poster’s complaint about lack of tooling, but then quickly came to the conclusion that, with C’s insistence that (say) /??/<newline>* is a valid comment starter, getting this exactly correct probably does need an actual lexer that would go character by character. That sounded like it wouldn’t fit in an HN comment, so I stopped there.

(That said, that’s as far as you’d need to go in the majority of cases. A dishonorable mention is warranted for languages that use the same character as an operator and a paired delimiter simultaneously, that is C++, Java, C#, TypeScript, and Rust with their abuse of the less-than and greater-than symbols, because that would in fact require a parser. In C++ especially, you’ll need full semantic analysis with template expansion, name resolution, and consteval evaluation. Because C++.)

Yet you probably don’t actually need to be that accurate, do you? The majority of syntax highlighters aren’t, and they are still useful. You can usually afford to say that code that perverse deserves to lose, and in return I expect you should be able to gain a fair amount of language independence, which could be worth the tradeoff.

So instead of checking if things are aligned with what they should be, you would just check they are aligned with something, like a left word boundary preceded by a delimiter, and so on. I can already see unpleasant corner cases after thinking about it for a few minutes, but it doesn’t look hellish yet; it looks like something you could experiment with over a weekend to see if it was viable.
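
A very rough Python sketch of what such a heuristic might look like, to make the idea concrete. Everything here is an assumption for illustration: the delimiter set, the function names, and the decision to compare only against the line immediately above. It knows nothing about strings, comments, or any particular language, so expect false positives.

```python
import re

# Assumed set of characters that an aligned word may follow.
DELIMITERS = "([{,=:"


def split_indent(line):
    """Split a line into (leading tab count, following space count, rest)."""
    m = re.match(r"(\t*)( *)(.*)", line)
    return len(m.group(1)), len(m.group(2)), m.group(3)


def alignment_problems(lines):
    """Flag lines whose alignment spaces do not line up with the line above."""
    problems = []
    prev_tabs, prev_text = None, ""
    for number, line in enumerate(lines, start=1):
        tabs, spaces, body = split_indent(line.rstrip("\n"))
        if not body:
            continue  # blank lines carry no alignment information
        if spaces:
            before = prev_text[:spaces]
            if tabs != prev_tabs:
                # Alignment only survives a tab-width change if both
                # lines start with the same number of tabs.
                problems.append((number, "tab count differs from the line above"))
            elif len(prev_text) <= spaces or prev_text[spaces] == " ":
                problems.append((number, "aligned with nothing"))
            elif before.strip() and before.rstrip()[-1] not in DELIMITERS:
                problems.append((number, "not aligned after a delimiter"))
        # Keep alignment spaces so a third line can align with the second.
        prev_tabs, prev_text = tabs, " " * spaces + body
    return problems


good = ["if (foo) {", "\tfrobnicate(bar,", "\t" + " " * 11 + "baz);", "}"]
bad = ["if (foo) {", "\tfrobnicate(bar,", "\t\t\t   baz);", "}"]
print(alignment_problems(good))
print(alignment_problems(bad))
```

The tab-count comparison is what catches the extra-tabs shortcut; the delimiter check is the "aligned with something" part. Extra tabs with no spaces at all would still slip through.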

At some point you just have to flog people ;)

Anyhow, if code reviewers always use tools that highlight the tabs during the reviews, there's a good chance to catch these things.

Maybe you could also have the tab width set randomly at every review, to make these horrors stand out.

Or don't try to align lines with different indentation levels.

Adding a comment with the right number of tabs as a table header and aligning all fields with spaces after the tabs would do the trick.

It would be crazy if spaces "won" because of the silly idiosyncratic alignment that some people seem to like and have been teaching to others for decades.
That’s mostly editor braindamage (that has unfortunately leaked into some otherwise very good codebases, like LuaJIT). Indent things with tabs, align with spaces[1]:

  if (foo) {
  »   frobnicate(bar,
  »   ...........baz);
  }
Both camps will hate you, but things will work just as they should.

[1] https://www.emacswiki.org/emacs/SmartTabs

Actually for me this shows why tabs don’t deliver on their promise. As soon as the user’s tab size is small enough that baz doesn’t need to wrap, the user gets suboptimal formatting. As someone who prefers tabs of 2 and often views code authored with tabs of 4 I encounter this often.
Your team should settle on a maximum line length that is independent of anyone's tab settings.

Of course that will always be a compromise, either people who use narrow tab widths or those who use wide ones will have a (slightly) suboptimal experience.

Formatting by hand is probably only going to work for private hobby projects. In 99% of the other cases there needs to be a formatter that does the formatting.
That feels both largely irrelevant and false? False, because there are plenty of large projects that do have a house style but don’t use a formatter at code submission to enforce it. Even if you reject the Linux kernel as an atypical example, basically every piece of commercial software from 20 years ago also fits, simply because neither formatters nor presubmit checks were nearly as common, and there were some chonkers there (like, I don’t know, Windows XP). Irrelevant, because it’s perfectly possible for a formatter to follow this convention (gofmt does, for example).
This problem is solved by gofmt because it automatically aligns with spaces after the tabs, so humans don't mess up the whitespace.
Great, so now I need a special editor plugin, and a compatible editor just to be able to properly view code?

No wonder spaces "won".

I use tabs for indentation and spacing for alignment. Tables should be aligned with spaces. A wrapped line can be tabbed up to the start of the previous line and then spaced for alignment.
tabs for indentation, spaces for alignment
Meanwhile, I'm trying to get away from languages where whitespace has semantics.
It's not about semantics at all. Whitespace has (virtually) no semantics in C or C++, but there are few programmers who would feel comfortable reading such code without the suggestive hinting that indentation provides.
There are very few languages where you can delete or insert whitespace at will; in most it matters at least in the middle of identifiers.

I am fine with a (single) space being an important part of syntax, just don't use more than 1 please.

When there is a formatter you should not care. You can reformat the code if it bothers you.
Only a goblin would align code.

But if you must you can start with a new line and the right indent.

From my experience, it seems to be the people that learned PASCAL first for some reason.
