spyspy
Joined 2,285 karma
- The way I've set these things up, nothing talks directly to the identity service. The ID service is a backend behind your gateway like any other service, and any UI would have to have the request proxied through the gateway to reach it. Now, you can carve out certain rules (if you control the gateway) where requests headed to /users/* don't require the same authN steps other requests do, because they're already headed to the ID server. Internal UIs may or may not work the same; that's really up to you - they won't likely be super high scale. Often the support teams won't even be querying the real DB, but instead a view or copy so they can't affect real user data. A share code for users A->B would just be a request from the UI to the ID server via the gateway, authenticated as User A, and responding with the code for B if possible. Or, I've done it where you could have special logic in the gateway to query 2 servers and combine the responses. No need for services to make requests sideways. Hope that makes sense.
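The carve-out rule described above might look something like this minimal sketch, where `requires_authn` and `route` are hypothetical names and the path prefixes are illustrative, not any real gateway's API:

```python
# Sketch of a gateway routing rule: requests bound for the ID service
# skip the usual authN check; everything else needs a valid session.

def requires_authn(path: str) -> bool:
    """Carve-out: anything under /users/ is already headed to the ID
    server, so the gateway doesn't demand a prior authN step for it."""
    return not path.startswith("/users/")

def route(path: str, session_valid: bool) -> str:
    """Reject unauthenticated requests unless the carve-out applies."""
    if requires_authn(path) and not session_valid:
        return "401 Unauthorized"
    return f"proxy {path} upstream"

print(route("/users/login", session_valid=False))  # proxied straight through
print(route("/orders/42", session_valid=False))    # rejected at the gateway
```

The point being: the decision lives entirely in the gateway, so no service ever has to call another service sideways to figure out whether a request is allowed in.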
- Don't give them ideas.
- The trick is to have your gateway handle authn, and then proxy authz data upstream so those services can decide how to handle it without needing to make a second call to the identity service.
- The trick I've used is the N1 (gateway) service handles all AuthN and proxies that information to the upstream services to allow them to handle AuthZ. N+ services only accept requests signed by N1 - the original authentication info is removed.
- Treating N4 as a service is fair. I think the article was leaning more toward the idea of N4 being a database, which is a legitimately bad idea with microservices (in fact, it defeats the point entirely). My takeaway is that if you're going to have a service that many other services depend on, you can do it, but you need to be highly aware of that brittleness. Your N4 service needs to be bulletproof. Netflix ran into this exact issue with their distributed cache.
- I'm still convinced the vast majority of kafka implementations could be replaced with `SELECT * FROM mytable ORDER BY timestamp ASC`
- If you haven't been following space updates closely, the US is _already_ in a race with China, especially with regard to the Artemis (moon) missions. That being said, it's mostly being used as an excuse to keep SLS alive and prop up the legacy space contractors... It's hard to lose a contest you won 60 years prior...
- This reminds me of the journalist working for months on uncovering Trump's dirty business just for Trump himself to admit the entire thing in a tweet.
- If you do use terraform, for the love of god do NOT use Terraform Cloud. Up there with Github in the list of least reliable cloud vendors. I always have a "break glass" method of deploying from my work machine for that very reason.
- Off topic, and I don't want to knock the presenter here, but if you're ever going to give a public talk or presentation at work _please_ review the Death By Powerpoint slide deck[0] first.
[0] https://www.slideshare.net/slideshow/death-by-powerpoint/855...
- Like any company over a handful of years old, I'm sure they have super old, super critical systems running they dare not touch for fear of torching the entire business. For all we know they were trying to update one of those systems to be more resilient last night and things went south.
- I became one of the founding engineers at a startup, which worked for a little while until the team grew beyond my purview, and no good engineering plan survives contact with sales directors who lie to customers about capabilities our platform has.
- I've seen the exact same thing at multiple companies. The teams were always so proud of themselves for being "multi-cloud" and managers rewarded them for their nonsense. They also got constant kudos for their heroic firefighting whenever the system went down, which it did constantly. Watching actually good engineers get overlooked because their systems were rock-solid while those characters got all the praise for designing an unadulterated piece of shit was one of the main reasons I left those companies.
- Eh, the "best practices" that would've prevented this aren't trivial to implement and are definitely far beyond what most engineering teams are capable of, in my experience. It depends on your risk profile. When we had cloud outages at the freemium game company I worked at, we just shrugged and waited for the systems to come back online - nobody dying because they couldn't play a word puzzle. But I've also had management come down and ask what it would take to prevent issues like that from happening again, and then pretend they never asked once it was clear how much engineering effort it would take. I've yet to meet a product manager that would shred their entire roadmap for 6-18 months just to get at an extra 9 of reliability, but I also don't work in industries where that's super important.
- My only complaint about gofmt is that it’s not even stricter about some things.
- Cloud Run lets you cap the number of instances when you create a service. So you can just set max_instances to 1 and you never have to worry about a spambot or hug of death from blowing up your budget. I run all my personal sites like this and pay (generally) nothing.
- > Health Insurance CEO Reveals Key To Company’s Success Is Not Paying For Customers’ Medical Care [1]
1. https://theonion.com/health-insurance-ceo-reveals-key-to-com...
- You cannot trust your clients. Period. It doesn’t matter if they’re internal or external. If you design (and test!) with this assumption in mind, you’ll never have a bad day. I’ve really never understood why teams and companies have taken this defensive stance that their service is being “abused” despite having nothing even resembling an SLA. It seemed pretty inexcusable to not have a horizontally scaling service back in 2010 when I first started interning at tech companies, and I’m really confused why this is still an issue today.
- Something that was drilled into me early in my career was that you cannot expect your cache to be up 100% of the time. The logical extension of that is your main DB needs to be able to handle 100% of your traffic at a moment’s notice. Not only has this kind of thinking saved my ass on several occasions, but it’s also actually kept my code much cleaner. I don’t want to say rate limiters and circuit breakers are the mark of bad engineering, butttt they’re usually just good engineering deferred.
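The "cache is an optimization, never a dependency" idea above can be sketched like this (the `FlakyCache` class and `fetch_user` function are hypothetical stand-ins, not any real client library):

```python
class FlakyCache:
    """Stand-in for a cache that may be down at any moment."""
    def __init__(self):
        self.up = True
        self.store = {}

    def get(self, key):
        if not self.up:
            raise ConnectionError("cache unreachable")
        return self.store.get(key)

    def set(self, key, value):
        if not self.up:
            raise ConnectionError("cache unreachable")
        self.store[key] = value

def fetch_user(user_id, cache, db):
    """Any cache failure is treated exactly like a cache miss and falls
    straight through to the DB - which therefore has to be sized to take
    100% of traffic at a moment's notice."""
    try:
        cached = cache.get(user_id)
        if cached is not None:
            return cached
    except ConnectionError:
        pass  # cache down == cache miss
    value = db[user_id]
    try:
        cache.set(user_id, value)  # best-effort backfill
    except ConnectionError:
        pass
    return value
```

Written this way, killing the cache changes latency but never correctness, which is the property that keeps the code clean: there's one source of truth and one happy path.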