Anyway, I still have friends in that business. It hasn't really changed: too few people covering systems that are quite complex, and while there are checks and such, no one understands things end to end in enough detail to prevent all problems.
I will never invest directly in an investment bank. Through either carelessness or maliciousness I could easily have caused a 9-figure loss, if not more, and there were probably a thousand other people in the same position.
When I read the detailed writeup around this a few years back, I think by far the biggest issue was reusing a tag that had previously been used to denote which strategy to use. I understand why they may have chosen to do so: at the Big Bank I worked at, getting a new FIX tag passed through all the layers properly would involve at least two other teams, coordinated releases, and probably several weeks' worth of meetings. If you just reuse an old value you can avoid all that, since everything is already set up.
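To make the failure mode concrete, here is a hypothetical C sketch of that kind of reuse; the tag number and strategy names are made up, not Knight's actual values. The danger is that the wire value is the contract: any host still running the old build interprets the reused value with its old meaning.

#define TAG_STRATEGY 9999 /* hypothetical custom FIX tag carrying a strategy code */

/* Old schema, supposedly retired: */
#define STRAT_LEGACY_PEG 1 /* dead strategy; its handler code still ships */

/* New schema, reusing the same wire value to avoid cross-team coordination: */
#define STRAT_SMART_ROUTER 1 /* a host missing the new build runs the peg instead */

Saving a few weeks of meetings costs you the guarantee that every consumer of the tag agrees on what the value means.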
That said, I went back to finance to work at one of the premier hedge funds out there, and they actually lived up to expectations in terms of comp, though that place was more like a tech firm than any other firm I have ever worked at, aside from maybe Knight. 8% annual increases were normal there. You can look in my post history back to 2018 if you want the name; I recently left after 5 years there and just want to stay out of their crosshairs. They monitor social media aggressively and there is deferred comp at stake.
At big banks, only a very small number of people in tech are getting paid. You have to know which questions to ask: where is the bonus pool coming from? Are you "in the business" or in the tech pool, which is second-class citizenship? I would have to be in a pretty bad place to ever consider going back to a bank; it was borderline abusive, always dangling the prospect of that big check that would make it all worth it.
The top ones would spend all their time traveling the world to the offices, meeting with staff in each location, and then sending emails to the rest of the department about what the staff in that location were working on. They would harvest ideas from the staff as they went and then present those as their own, or approve projects that staff had suggested to them. I really didn't see how they were worth the $5M they were earning, since they didn't come up with the ideas for what would be done and didn't do any real work.
> Yeah, pay at the big banks is shit really
I would say that a $300,000/year salary is really quite far from "shit". Many jobs are stressful, demanding, or unrewarding, but rarely do they provide such a salary; quite often they offer hourly wages hovering around minimum wage with poor benefits. It's worth taking a step back once in a while to think about this, because it sounds like some people are a little out of touch with how the majority of the world lives.
And again, these are the cream of the crop; this is the team most people wanted to work on. I was one of the first to leave, but as far as I'm aware all those guys are now making multiples elsewhere. The point is that there are just better opportunities out there, where you get to actually enjoy your life. I hear you on other jobs being stressful and demanding, but I have not really worked at another job where I was wearing that stress 24/7. Well, typically nothing traded on Saturday night, so nothing could ruin that, but outside of that I could expect a call at literally any time of day; the markets were trading 23x6.
I often find myself complaining about things at work like competing priorities or wearing too many hats, and it helps me to ground myself by thinking about past jobs at minimum wage, or even watching what people making less than a fifth of my salary have to put up with on a daily basis.
Frankly, a lot of what people refer to as "prop trading firms" and "quant hedge funds" on here are just market-makers...they are taking very little to no risk, a lot are just riding the wave of ETF growth. Even the ones that are running alpha, I have heard of a few strategies, and they are largely what you would expect: low-edge, crowded trades (frankly, a lot of it is LTCM-style stuff).
That is why the business has become more tech-like: actually taking risk is... quite risky. If you are going for alpha, you are trying to hire one guy out of thousands, that person knows what they are worth, etc. It is far easier to hire lots of devs on low wages and do the grunt-work jobs that are less lucrative but don't require being able to work out whether someone is actually a decent earner.
Man Group has their own department at Oxford; I think Winton hires people out of their department at Cambridge... Man Group's shaky record is legendary (they are now doing everything that blew them up in 2008), and Winton was top tier... not anymore. AQR is another one, although a more traditionally finance quant approach... it all turned out to be pure beta. The hybrid approach of hiring devs or ML PhDs to generate alpha has only ever worked accidentally.
The firms that have been printing money from quant investing over the past five years have all been traditional hedge funds that incorporated quant methods into a fundamental process. And I expect that will continue because, bluntly, these firms know what the price of a security should be better than some busted factor model.
I heard, on multiple occasions from senior leadership, that industry practice was to categorize all dev work as IT so they could justify lower compensation. Pay was, just as you pointed out, not the main problem with this setup. There was no respect for other people and individuals were frequently berated or humiliated for no reason.
One interesting observation: despite the fact that I am very well qualified to work in the sector, it's very rare for fintech recruiters to reach out to me. They must know that I'm far out of their pay grade now and will never go back.
Now those firms are all in a completely competitive industry squeezing each other for basis points.
Meanwhile the definition of a FAANG is that it has an effective monopoly, and these companies are taking in way more money than the HFT industry. (Netflix is losing its monopoly but we can’t really drop N from the acronym without a replacement...)
Huh, yeah. That would be quite a GAAF. Gotta come up with something before Netflix is forced out of the FAANG club.
Where I work, the new-grad offers are slightly better than FAANG, and growth based on performance is very good; we also pay very well for people coming in from a competitor.
I can confirm, known firm, manager role, offer was $1.3M full cash.
Is there any realistic path for a demonstrably smart and hardworking person into that 7+zeros club? Evidence suggests no: leetcode grinders and FAANGers are not in that club, and most of them will never even make it into the 6+zeros club. Net wealth -- sure, but not income.
There's a good meta-lesson here which is that smart people will do dumb things if you make the smart thing require too much red tape or process.
Processes can block stupidity, but they can also block intelligence if not well designed.
There is a much more serious focus on having a defense in depth, and making sure that problems like this are noticed before they become an issue. Rollbacks are no longer the first action when something goes wrong: the kill switch comes first.
Dead code, tech debt, repurposed flags, and spotty test coverage are everywhere still.
I wouldn’t think of flags as expensive / effortful to make more of, but clearly they must be if people are tempted to reuse them. Can you help me understand what is meant by a flag in this context, and why it would be repurposed?
Often flags are local to a particular object. If there are lots of such objects, you want each to take as little space as possible. You should check out the contortions linux devs go through to make struct page small [0]. This is important, because there is one such struct per page of physical memory. The memory use is a near-constant percentage of your total memory, and you wouldn't want it to be any larger than necessary.
Even when there are not a lot of these objects, in low-latency software it's important to hit the cache. Your program should always just be as compact in memory as possible.
Semantically flags are booleans (is proposition P true of this object). They are stored compactly as bitsets, often implicitly, say:
#define FLAG_1 0x01
#define FLAG_2 0x02
/* ... */
#define FLAG_8 0x80
struct order {
    u32 qty;   /* 4 bytes */
    u16 id;    /* 2 bytes */
    u8  type;  /* 1 byte  */
    u8  flags; /* 1 byte: holds FLAG_1 through FLAG_8 */
};
This struct will fit into 8 bytes. This is great, as you probably won't waste space to alignment in many cases -- 8 is a good multiple. But if you wanted to add FLAG_9 here, your flags field would become a u16, and your struct would, frustratingly, stop fitting into 8 bytes. To avoid this, one might repurpose flags.

Another example of this is intrusive flagging: using, for example, the high or low bits of a pointer aligned to 2^n bytes (a sketch below). If you run out of bits there, not much you can do.
[0] https://github.com/torvalds/linux/blob/master/include/linux/...
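A minimal sketch of the intrusive-flagging idea above, stealing the low bits of an aligned pointer; the names and the dirty-bit example are hypothetical:

#include <stdint.h>
#include <assert.h>

#define PTR_FLAG_DIRTY 0x1u
#define PTR_FLAG_MASK  0x7u /* low 3 bits are free if objects are 8-byte aligned */

static inline void *tag_ptr(void *p, uintptr_t flags) {
    assert(((uintptr_t)p & PTR_FLAG_MASK) == 0); /* alignment gives us the bits */
    return (void *)((uintptr_t)p | flags);
}

static inline void *untag_ptr(void *p) {
    return (void *)((uintptr_t)p & ~(uintptr_t)PTR_FLAG_MASK); /* mask off before deref */
}

static inline int ptr_is_dirty(void *p) {
    return ((uintptr_t)p & PTR_FLAG_DIRTY) != 0;
}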
However, you can still do something about making this safe. For instance, the program could do some sort of version check on startup and panic if things weren't correct.
There's a bunch of stuff that needs to be done before the program reaches its low-latency steady state, and speed doesn't matter during that startup phase. Might as well add checks there to make sure things are correct.
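A hedged sketch of what such a startup check might look like; the schema string and environment variable are hypothetical. The point is to fail loudly before the hot path starts, when a panic is still cheap:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define EXPECTED_FLAG_SCHEMA "order-flags-v7" /* bumped whenever a flag changes meaning */

static void check_flag_schema(void) {
    const char *deployed = getenv("FLAG_SCHEMA"); /* hypothetical: set by deploy tooling */
    if (deployed == NULL || strcmp(deployed, EXPECTED_FLAG_SCHEMA) != 0) {
        fprintf(stderr, "fatal: flag schema mismatch: built for %s, got %s\n",
                EXPECTED_FLAG_SCHEMA, deployed ? deployed : "(unset)");
        abort(); /* refuse to start rather than misread repurposed flags */
    }
}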
Yet you don't have to hang around here long to be told that "Unit Testing is Overrated": https://tyrrrz.me/blog/unit-testing-is-overrated
Things have changed a lot since 2012, and at the same time haven't. Circuit breakers and position monitoring are number one in any sane market-making firm. What happened then I can't imagine happening now (accumulating a huge position for, what was it, 30 minutes? With nobody killing the algos within a couple of minutes?). On the other hand, the perfect world of "code hygiene" and 100% test coverage will never exist; things slip, and they slip frequently.

What's better, externally, is the availability of good tools for development and change reviews (Bitbucket taking hold, for example), automated deployments, containers, testing frameworks, and the like. This type of software, end to end, is incredibly complex and difficult to reason about when the unexpected happens (there was a TTL misconfig for multicast and we never got such-and-such update? Well, no one thought of that!), especially these days with the influx of ML algos for price generation.
Also, any code that does not need to be there is promptly removed.
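In the spirit of the circuit breakers and position monitoring mentioned above, a toy sketch; the limit, names, and structure are all made up, and a real system would monitor per-symbol, per-strategy, and firm-wide exposure out of process as well:

#include <stdbool.h>
#include <stdlib.h>

#define GROSS_LIMIT 1000000L /* shares; made-up firm-wide threshold */

static volatile bool kill_switch = false; /* checked before every outbound order */

/* Crude proxy for gross exposure: running total of absolute fill quantity. */
static void on_fill(long *gross, long fill_qty) {
    *gross += labs(fill_qty);
    if (*gross > GROSS_LIMIT)
        kill_switch = true; /* stop quoting first, investigate second */
}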
Where I work, we have a few people from KCG, i.e. the firm formed when Knight Capital merged with GETCO after this incident. The incident sometimes gets brought up, although I don't think any of them worked for Knight Capital before it.
*The fact I believe I could easily do it is probably exactly why I'd end up making some huge mistake. ;)
Maybe if you could translate from Coq.
Most of us (hopefully) have less devastating technical debt to deal with, but it is still a cautionary tale about what could happen if you ignore it for too long.
The extreme rigor, on the one hand, seems to require a value judgement about the real benefits of HFT that I'm not willing to make. "Nimbleness", on the other hand, is an odd word to use over agility or plain old-fashioned responsibility. The benefit is proportional to it, but not exclusively.
Rapidly evolving market conditions concern regular trade too; swift reactions are expected in any other systems application. "Almost impossible" is a weasel phrase. It's almost impossible to win except for the last man standing, is that it? And there's no practical upper limit to nimbleness, though conservative estimates indicate that less work is more.
What's missing is the perverse incentives, corrupt policies, sociopathic leadership, ...
Rolling back code is another thing I have no tolerance for anymore. The only option we entertain these days is a roll-forward. If your software takes so long to iterate/build that you need to go back to an old version in an emergency, you need to review your languages/tools/frameworks/processes. We maintain a contractual obligation to our customers for same-day code updates (in cases of production/regulatory emergencies) because we have enough confidence in our processes.
Your feature flag might be “SHOW_STRIKETHROUGH_PRICING_ON_CROSSSELL_OFFERS”; theirs is a bit mask macro to pick off the 5th bit from the 7th byte. (Why do they care? Because if they allow themselves to get fat and slow, a competitor will take the money.)
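For the curious, the kind of macro being gestured at might look something like this in C; the indices are an illustrative guess, counting bytes and bits from one:

/* Extract the 5th bit of the 7th byte of a raw message buffer. */
#define FIFTH_BIT_OF_SEVENTH_BYTE(buf) \
    ((((const unsigned char *)(buf))[6] >> 4) & 0x1u)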
Roll-forward only with a same-day SLA is probably right for your business, but it isn't for a company whose systems could be burning $1M every handful of seconds that bad code is running.
Different business problems call for different technical approaches. You should no more adopt theirs than they should yours.
Reusing the RAM occupied by flags might be doable in a clean way using guards like the structured enums in Rust, which permit unions (objects that occupy the same space) that always have the right type (i.e., the compiler knows what is in there at each point in time). This mechanism could in theory be extended beyond type safety to accommodate other systems-programming contexts where memory usage is extremely important.
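For comparison, here is the manual C version of that pattern, a discriminant plus a union; in C nothing stops you from reading the wrong member, whereas a Rust enum with match makes that a compile-time error. Names here are hypothetical:

enum slot_kind { SLOT_PRICE, SLOT_QTY };

struct slot {
    enum slot_kind kind;  /* tag: says which union member is currently live */
    union {
        double price;     /* valid only when kind == SLOT_PRICE */
        unsigned int qty; /* valid only when kind == SLOT_QTY */
    } u;                  /* members share the same storage */
};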
Sounds familiar. From everything listed above, it sounds like this must have been yet another one of those "Just F-ing push the change to the server now, dweeb, the traders are going crazy" environments. That is to say: the real problems were most likely cultural, not the sum of a certain set of bad practices.
Some of the issues mentioned include:
The article suggests improvements that could have prevented the chain of events.

For those here who are in HFT circles, have things improved after the Knight Capital Group debacle?