CEO @ https://urlscan.io Twitter: @heipei
- I still don't get how folks can hype Postgres with every second post on HN, yet there is no simple batteries-included way to run a HA Postgres cluster with automatic failover like you can do with MongoDB. I'm genuinely curious how people deal with this in production when they're self-hosting.
- The client is supposed to monitor availability themselves, that is how these contracts work.
- I'd rather he'd still be working on Nomad to be honest, but Ghostty is a good consolation prize ;)
- Obligatory reminder that "GPU utilisation" as a percentage is a meaningless metric and does not tell you how well your GPU is utilised.
Does not change the usefulness of this dashboard, just wanted to point it out.
- Yeah, that was the first thing I checked as well. Being suited for small / tiny files is a great property of the SeaweedFS system.
- Sounds complicated. I just use autossh from the CLI and it reconnects if my laptop (or the remote machine) wakes up again.
- If you want to see what the phishing site (npmjs[.]help) looks like: https://urlscan.io/result/01992a3e-4f8c-72bb-90a9-c13826f2d8... - Was still up and running 2 hours ago.
- I agree, and I don't get where the claims that ES is hard to operate originate from. Yeah, if you allow arbitrary aggregations that exceed the heap space, or if you allow expensive queries that effectively iterate over everything you're gonna have a bad time. But apart from those, as long as you understand your data model, your searches and how data is indexed, ES is absolutely rock-solid, scales and performs like a beast. We run a 35-node cluster with ~ 240TB of disk, 4.5TB of RAM, and about 100TB of documents and are able to serve hundreds of queries. The whole thing does not require any maintenance apart from replacing nodes that failed from unrelated causes (hardware, hosting). Version upgrades are smooth as well.
The only bigger issue we had was when we initially added 10 nodes to double the initial capacity of the cluster. Performance tanked as a result, and it took us about half a day until we finally figured out that the new nodes were using dmraid (Linux RAID0) and as a result the block devices had a really high default read-ahead value (8192) compared to the existing nodes, which resulted in heavy read amplification. The ES manual specifically documents this, but since we hadn't run into this issue ourselves it took us a while to realise what was at fault.
- ScyllaDB discontinued its free and open source version, so I personally wouldn't build anything new on it.
- Counterpoint: I once wrote a paper on accelerating blockciphers (AES et al) using CUDA and while doing so I realised that most (if not all) previous academic work which had claimed incredible speedups had done so by benchmarking exclusively on zero-bytes. Since these blockciphers are implemented using lookup tables, this meant perfect cache hits on every block to be encrypted. Benchmarking on random data painted a very different, and in my opinion more realistic, picture.
- In my experience this affects techie users just as much. Especially when there is a UI that has been crafted and slowly perfected over the years, and where any remaining idiosyncrasy has long been learned by the user, changing that UI has profound negative impact on the productivity of anyone using the platform.
I have rarely seen UI changes where users were genuinely excited to have a new UI with the understanding that they'd have to learn new paradigms. Most web apps should still be Bootstrap apps, but of course then you can't put that on a giant dashboard wall at a conference ;)
- No it's not, not if you want to win customers from the US. Their annual budgets are in USD, so they don't have the flexibility to pay more next year just because the foreign exchange rate has shifted. You take the foreign exchange risk by listing prices in USD, but it could just as well be a windfall, and your customers pay stable prices in return.
- I just wanted to say thank you for your work on Nomad. It's one of the most pleasant and useful pieces of software I have ever worked with. Nomad allowed us to build out a large fleet of servers with a small team while still enjoying the process.
- I don't know why / how messages should be ordered. NSQ is a message queue and not a log. Some messages take longer to process than others, and some messages need to be re-queued and re-tried out of order, and that is a very common use-case.
I would love to be able to use a distributed log like Kafka/Redpanda since it's HA out of the box, but it simply does not fit that use-case.
- I could say the same thing about NSQ which is a distributed message queue with very simple semantics and a great HTTP API for message publishing and control actions. What it doesn't offer natively is HA though.
- It is not straightforward, and it is complicated by a number of factors. The first would be bad "brand hygiene": If a company has dozens of legitimate domains across different TLDs, different providers and different geographical locations then it's already more complicated than just one canonical .com domain. If teams within the company are permitted to spin up their own domains (e.g. marketing campaigns, branch offices) then it gets 10x worse. Lastly if a legitimate brand frequently changes its appearance, it will be harder to pin down the true brand identity.
But even if you follow all of these best practices there are still powerful attack vectors. A threat actor could host their phishing page on an unrelated (compromised) domain with good domain reputation; in that case you wouldn't even know about that site until the first email or SMS hits your customers. Or the threat actor could use one of the many file-hosting or website services to create their site and host it on a shared third-party domain with perfect domain reputation (e.g. amazonaws.com).
And then there's incentive: It's not the companies that suffer financial losses, it is their customers. If you were talking about their employees being phished that would be a different story. Same thing for Google Safe Browsing: Their incentive is to protect against most of the obvious phishing, without any false positives, ever. If they are slow to detect something they won't suffer any losses. If they generate a false positive, their Chrome browser might suffer significant reputational damage if a popular legitimate domain is blocked.
- Seconding this. Evading detection has become a real cake-walk since threat actors are able to sign up for a free Cloudflare account and then put their phishing site on their 2-hour-old domain behind a level of protection backed by a $20B company. Funny that you almost never see phishing on Akamai ;)
Disclaimer: We operate in this space so we obviously have an interest in being able to detect these threats going forward.
- Here's hoping they don't run great tools like Consul and Nomad into the ground somehow. If I'm ever forced to ditch Nomad and work with a pile of strung-together components like k8s I might just quit tech altogether.
- Everything written in this article is true, but there are as usual more nuances to all of these. One of the primary reasons for me to go with a GmbH (and a holding) was the expectation of excess profits that I'd want to reinvest instead of drawing a salary. The other neat aspect of stacking a holding company on top of the operating company is that capital gains are not taxed in the event of a sale. That goes for investing into ETFs for example, but more importantly if the holding company ever sells the operating company then that sale is tax-free for the holding company. The ultimate owner is then able to pay out the proceeds and only be taxed using his personal capital gains tax rate (25%-ish).
- To the people dismissing the idea of binarising vectors: Fair criticism, but consider the fact that you can also train a model with a loss function that approaches a binary behaviour, i.e. so that the magnitude per dimension plays an insignificant role and only the sign of the dimension carries information. In that case you can use the binary vector for search and ranking.
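To make the binary-vector idea a bit more concrete, here is a minimal sketch in pure Python. It assumes the embeddings already behave roughly as described, i.e. only the sign per dimension matters. The function names (`binarise`, `hamming`, `search`) and the toy vectors are made up for illustration and are not taken from any particular library or model.

```python
def binarise(vec):
    """Pack the sign of each dimension into a single integer, one bit per dimension."""
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:  # assumption: positive -> 1, zero or negative -> 0
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of dimensions whose sign differs between two packed vectors."""
    return bin(a ^ b).count("1")


def search(query, corpus, k=2):
    """Return the indices of the k corpus vectors closest to the query by Hamming distance."""
    q = binarise(query)
    scored = sorted((hamming(q, binarise(v)), idx) for idx, v in enumerate(corpus))
    return [idx for _, idx in scored[:k]]


# Toy data: magnitudes vary, but only the signs are used for ranking.
corpus = [
    [0.9, -0.8, 0.7, -0.6],
    [-0.9, 0.8, -0.7, 0.6],
    [0.9, -0.8, -0.7, -0.6],
]
query = [1.0, -1.0, 1.0, -1.0]
nearest = search(query, corpus, k=2)
```

In a real system you would binarise the corpus once at index time and store the packed bits, so that each comparison is just an XOR plus a popcount.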