
10000truths
2,902 karma

  1. The biggest risk to this business model isn't the government, but the payment processor. Anonymity makes it easy for unsavory characters to use stolen credit cards to buy your compute. The inevitable barrage of chargebacks will then cause Stripe to cut you off. Hell, if you're particularly unlucky, your payment processor might even cut you off proactively, if they decide that your lack of KYC makes you a risk.
  2. > In practice one of the things that happens very often is that you compress a file filled with null bytes. Such files compress extremely well, and would trigger your A/B threshold.

    Sure, if you expect to decompress files with high compression ratios, then you'll want to adjust your knobs accordingly.

    > On the other hand, zip bomb described in this blog post relies on decompressing the same data multiple times - so it wouldn't trigger your A/B heuristics necessarily.

    If you decompress the same data multiple times, then you increment A multiple times. The accounting still works regardless of whether the data is the same or different. Perhaps a better description of A and B in my post would be {number of decompressed bytes written} and {number of compressed bytes read}, respectively.

    > Finally, A just means "you can't compress more than X bytes with my file format", right? Not a desirable property to have. If deflate authors had this idea when they designed the algorithm, I bet files larger than "unreasonable" 16MB would be forbidden.

    The limitation is imposed by the application, not by the codec itself. The application doing the decompression is supposed to process the input incrementally (in the case of DEFLATE, reading one block at a time and inflating it), updating A and B on each iteration, and aborting if a threshold is violated.

  3. For any compression algorithm, you keep track of A = {uncompressed bytes processed} and B = {compressed bytes processed} while decompressing, and bail out when either of the following occurs (a sketch follows the list):

    1. A exceeds some unreasonable threshold

    2. A/B exceeds some unreasonable threshold
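    A minimal sketch of that bookkeeping (codec-agnostic; the struct, field names, and limits here are mine, to be tuned per application):

    ```rust
    /// Zip-bomb guard for a streaming decompressor: tracks
    /// A = uncompressed bytes written, B = compressed bytes read.
    struct BombGuard {
        a: u64,          // uncompressed bytes produced so far
        b: u64,          // compressed bytes consumed so far
        max_output: u64, // threshold 1: absolute cap on A
        max_ratio: u64,  // threshold 2: cap on A/B
    }

    impl BombGuard {
        /// Call after each incremental step (e.g. one DEFLATE block).
        fn update(&mut self, bytes_in: u64, bytes_out: u64) -> Result<(), &'static str> {
            self.b += bytes_in;
            self.a += bytes_out;
            if self.a > self.max_output {
                return Err("uncompressed size exceeds threshold");
            }
            // A / B > max_ratio, rewritten to avoid dividing by zero.
            if self.a > self.max_ratio.saturating_mul(self.b.max(1)) {
                return Err("expansion ratio exceeds threshold");
            }
            Ok(())
        }
    }
    ```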

  4. That's not a cycle - service B isn't writing any new data to A.
  5. FreeBSD and OpenBSD explicitly mention the prefaulting behavior in the mlock(2) manpage. The Linux manpage alludes to it in that you have to explicitly pass the MLOCK_ONFAULT flag to the mlock2() variant of the syscall in order to disable the prefaulting behavior.
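    A small sketch of the two behaviors (Linux-specific, assuming the libc crate; error handling kept minimal):

    ```rust
    use std::io;

    /// Lock `buf`'s pages into RAM. With `on_fault = false`, mlock()
    /// prefaults every page before returning; with `on_fault = true`,
    /// mlock2(MLOCK_ONFAULT) defers each fault to the first access.
    fn lock_pages(buf: &[u8], on_fault: bool) -> io::Result<()> {
        let ptr = buf.as_ptr() as *const libc::c_void;
        let rc = unsafe {
            if on_fault {
                libc::mlock2(ptr, buf.len(), libc::MLOCK_ONFAULT as _)
            } else {
                libc::mlock(ptr, buf.len())
            }
        };
        if rc == 0 { Ok(()) } else { Err(io::Error::last_os_error()) }
    }
    ```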
  6. Recursion: That's easy, don't. At least, not with a call stack. Instead, use a stack container backed by a bounded allocator, and pop->process->push in a loop (see the sketch below). What would have been a stack overflow is now an error.OutOfMemory value that you can catch and handle as desired. All that said, there is a proposal that addresses making recursive functions friendlier to static analysis [0].

    Function pointers: Zig has a proposal for restricted function types [1], which can be used to enforce compile-time constraints on the functions that can be assigned to a function pointer.

    [0]: https://github.com/ziglang/zig/issues/1006

    [1]: https://github.com/ziglang/zig/issues/23367
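    The explicit-stack pattern, sketched in Rust rather than Zig (the names and the bound are mine; in Zig the bounded allocator would surface error.OutOfMemory instead):

    ```rust
    struct Node {
        value: u32,
        children: Vec<Node>,
    }

    /// Iterative traversal with an explicitly bounded work stack.
    /// Exceeding the bound is a recoverable error, not a stack overflow.
    fn sum_tree(root: &Node, max_stack: usize) -> Result<u64, &'static str> {
        let mut total = 0u64;
        let mut stack = Vec::with_capacity(max_stack);
        stack.push(root);
        while let Some(node) = stack.pop() {
            total += u64::from(node.value); // pop -> process -> push
            for child in &node.children {
                if stack.len() == max_stack {
                    return Err("work stack limit exceeded");
                }
                stack.push(child);
            }
        }
        Ok(total)
    }
    ```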

  7. Which is why I said "allocate and pin". POSIX systems have mlock()/mlockall() to prefault allocated memory and prevent it from being paged out.
  8. Sure, but you can do the next best thing, which is to control precisely when and where those allocations occur. Even if the possibility of crashing is unavoidable, there is still huge operational benefit in making it predictable.

    The simplest example is to allocate and pin all your resources on startup. If the program crashes, it does so immediately and with a clear error message, so the fix is as straightforward as "pass a bigger number to the --memory flag" or "spec out a larger machine".
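    A sketch of that startup pattern (assumes the libc crate; the arena layout and message are illustrative):

    ```rust
    use std::io;

    /// Allocate and pin all working memory up front. Any capacity failure
    /// happens here, at startup, with an actionable message - not later,
    /// mid-request, at peak load.
    fn init_pinned_arena(bytes: usize) -> io::Result<Vec<u8>> {
        let arena = vec![0u8; bytes]; // writing zeroes prefaults every page
        // Pin current and future pages so nothing is paged out later.
        if unsafe { libc::mlockall(libc::MCL_CURRENT | libc::MCL_FUTURE) } != 0 {
            return Err(io::Error::last_os_error()); // often RLIMIT_MEMLOCK too low
        }
        Ok(arena)
    }
    ```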

  9. ...over the course of 8.5 months, which is way too short for a meaningful result. If their strategy could outperform the S&P 500's 10-year return, they wouldn't be blogging about it.
  10. The reason I really like Zig is that there's finally a language that makes it easy to gracefully handle memory exhaustion at the application level. No more praying that your program isn't unceremoniously killed just for asking for more memory - all allocations are assumed fallible and failures must be handled explicitly. Stack space is not treated like magic - the compiler can reason about its maximum size by examining the call graph, so you can pre-allocate stack space to ensure that stack overflows are guaranteed never to happen.

    This first-class representation of memory as a resource is a must for creating robust software in embedded environments, where it's vital to frontload all fallibility by allocating everything needed at start-up, and to allow the application the freedom to use whatever mechanism is appropriate (backpressure, load shedding, etc.) to handle excessive resource usage.
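    Rust offers an opt-in version of the same discipline through try_reserve; a minimal sketch (the function name is mine):

    ```rust
    use std::collections::TryReserveError;

    /// Fallible allocation: exhaustion comes back as a value to handle
    /// (shed load, apply backpressure, ...) rather than aborting.
    fn alloc_frame(len: usize) -> Result<Vec<u8>, TryReserveError> {
        let mut buf = Vec::new();
        buf.try_reserve_exact(len)?; // Err on exhaustion, no abort
        buf.resize(len, 0); // cannot allocate: capacity is already reserved
        Ok(buf)
    }
    ```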

  11. Disclosing an individual student's information to third parties without express consent is a violation of FERPA.
  12. This is addressed in the "information asymmetry" section of the article.
  13. What do you believe needs improving and why?
  14. I think the ambiguity was deliberate.
  15. Indeed, at some point, you can't lower tail latencies any further without moving closer to your users. But of the 7 round trips that I mentioned above, you have control over 3 of them: 2 round trips can be eliminated by supporting HTTP/3 over QUIC (and adding HTTPS DNS records to your zone file), and 1 round trip can be eliminated by server-side rendering. That's a 40-50% reduction before you even need to consider a CDN setup, and depending on your business requirements, it may very well be enough.
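    For reference, the DNS side is a single HTTPS record in the zone file advertising h3 support, which is what lets the browser skip the TCP and TLS handshakes on first contact (the hostname and TTL here are illustrative):

    ```
    example.com. 3600 IN HTTPS 1 . alpn="h3,h2"
    ```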
  16. The table is a bit misleading. Most of the resources of a website are loaded concurrently and are not on the critical path of the "first contentful paint", so latency does not compound as quickly as the table implies. For web apps, much of the end-to-end latency hides lower in the networking stack. Here's the worst-case latency for a modern Chrome browser performing a cold load of an SPA website:

    * DNS-over-HTTPS-over-QUIC resolution: 2 RTTs

    * TCP handshake: 1 RTT

    * TLS v1.2 handshake: 2 RTTs

    * HTTP request/response (HTML): 1 RTT

    * HTTP request/response (bundled JS that actually renders the content): 1 RTT

    That's 7 round trips. If your connection crosses a continent, that's easily a 1-2 second time-to-first-byte for the content you actually care about. And no amount of bandwidth will decrease that, since the bottlenecks are the speed of light and router hop latencies. Weak 4G/WiFi signal and/or network congestion will worsen that latency even further.
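    (Putting numbers on it: at a typical cross-continent RTT of 100-250 ms, 7 round trips come to roughly 0.7-1.75 seconds, which is where the 1-2 second figure comes from.)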

  17. My rule of thumb is that unchecked access is okay in scenarios where both the array/map and the indices/keys are private implementation details of a function or struct, since an invariant is easy to manually verify when it is tightly scoped as such. I've seen it used in:

    * Graph/tree traversal functions that take a visitor function as a parameter

    * Binary search on sorted arrays

    * Binary heap operations

    * Probing buckets in open-addressed hash tables

  18. Yes? Funnily enough, I don't often use indexed access in Rust. Either I'm looping over elements of a data structure (in which case I use iterators), or I'm using an untrusted index value (in which case I explicitly handle the error case). In the rare case where I'm using an index value that I can guarantee is never invalid (e.g. graph traversal where the indices are never exposed outside the scope of the traversal), then I create a safe wrapper around the unsafe access and document the invariant.
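    A sketch of that wrapper pattern (the function and graph representation are mine, not from the comment): the invariant is checked once at the boundary, so the traversal loop can use unchecked access:

    ```rust
    /// Visits every reachable node of a graph stored as adjacency lists.
    /// Invariant (checked once): every edge target is a valid node index.
    fn visit_all(adj: &[Vec<usize>], mut visit: impl FnMut(usize)) {
        assert!(adj.iter().flatten().all(|&n| n < adj.len()));
        let mut seen = vec![false; adj.len()];
        let mut stack: Vec<usize> = (0..adj.len()).collect();
        while let Some(node) = stack.pop() {
            // SAFETY: `node` is either from 0..adj.len() or an edge target,
            // both of which are in bounds by the assert above.
            if unsafe { *seen.get_unchecked(node) } {
                continue;
            }
            unsafe { *seen.get_unchecked_mut(node) = true };
            visit(node);
            for &next in unsafe { adj.get_unchecked(node) } {
                stack.push(next);
            }
        }
    }
    ```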
  19. I realize that this is meant as an exercise to demonstrate a property of variance. But most investors are risk-averse when it comes to their portfolio - for the example given, a more practical target to optimize would be worst-case or near-worst-case return (e.g. p99). For calculating that, a summary measure like variance or mean does not suffice - you need the full joint distribution of the RoR of assets A and B, and to find the value of t that optimizes the p99 of At+B(1-t).
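    A minimal sketch of that computation (the paired samples are assumed to come from elsewhere, e.g. historical data; "p99" is read here as the return that 99% of outcomes beat):

    ```rust
    /// Sweep the allocation fraction t and return the (t, p01) pair that
    /// maximizes the ~1st-percentile return of the portfolio t*A + (1-t)*B.
    fn best_allocation(a: &[f64], b: &[f64]) -> (f64, f64) {
        assert_eq!(a.len(), b.len());
        let p01 = |t: f64| -> f64 {
            let mut r: Vec<f64> =
                a.iter().zip(b).map(|(x, y)| t * x + (1.0 - t) * y).collect();
            r.sort_by(|x, y| x.partial_cmp(y).unwrap());
            r[r.len() / 100] // 99% of sampled outcomes beat this value
        };
        (0..=100)
            .map(|i| i as f64 / 100.0)
            .map(|t| (t, p01(t)))
            .max_by(|(_, p), (_, q)| p.partial_cmp(q).unwrap())
            .unwrap()
    }
    ```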
  20. They are absolutely aware of these sorts of abuses. I'll bet my spleen that it shows up as a line item in the roadmapping docs of their content integrity/T&S teams.

    The root problem is twofold: the inability to reliably automate distinguishing "good actor" from "bad actor", and a lack of will to throw serious resources at solving the problem via manual, high-precision moderation.

