- “In a library on a burner laptop” - but I’ll narrow it down to people who have handed in their notice on a specific day.
- A useful way to frame this isn’t “is it worth tens of hours to avoid a future reinstall” but “where do I want my entropy to live”. You’re going to invest time somewhere: either in a slowly-accumulating pile of invisible state (brew, manual configs, random installers) or in a config that you can diff, review and roll back. The former feels free until you hit some cursed PATH/SSL/toolchain issue at 11pm and realize you’ve been paying that tax all along, just in tiny, forgotten increments.
I think where Nix shines isn’t “one laptop every 6 years” but when your environment needs to be shared or recreated: multiple machines, a team, or a project with nasty native deps. At that point, nix-darwin + dev shells becomes infrastructure, not a hobby. You don’t have to go all-in on “my whole Mac is Nix” either: keep GUI apps and casual tools imperative, and treat Nix as the source of truth for the stuff that actually blocks you from doing work. That hybrid model matches what the article hints at and tends to give you most of the upside without turning your personal laptop into a second job.
- What I like about this writeup is that it quietly demolishes the idea that you need DeepMind-scale resources to get “superhuman” RL. The headline result is less about 2048 and Tetris and more about treating the data pipeline as the main product: careful observation design, reward shaping, and then a curriculum that drops the agent straight into high-value endgame states so that it actually sees them in the first place. Once your env runs at millions of steps per second on a single 4090, the bottleneck is human iteration on those choices, not FLOPs.
The happy Tetris bug is also a neat example of how “bad” inputs can act like curriculum or data augmentation. Corrupted observations forced the policy to be robust to chaos early, which then paid off when the game actually got hard. That feels very similar to tricks in other domains where we deliberately randomize or mask parts of the input. It makes me wonder how many surprisingly strong RL systems in the wild are really powered by accidental curricula that nobody has fully noticed or formalized yet.
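To make the curriculum trick concrete, here’s a rough Python sketch of the “reset into saved endgame states” idea. This is not the authors’ code: the `env.reset(state=...)` API, the state format, and the mixing probability are all assumptions.

```python
import random

# Sketch: mix "start from scratch" episodes with episodes that reset directly
# into pre-collected high-value endgame snapshots, so the policy sees rare
# late-game situations long before it could reach them on its own.
# The env.reset(state=...) API and the snapshot format are assumptions.

def sample_start_state(endgame_states, p_endgame=0.5):
    """With probability p_endgame, start from a saved endgame snapshot."""
    if endgame_states and random.random() < p_endgame:
        return random.choice(endgame_states)
    return None  # None = normal reset from the initial position

def run_episode(env, policy, endgame_states):
    start = sample_start_state(endgame_states)
    obs = env.reset(state=start) if start is not None else env.reset()
    done, total_reward = False, 0.0
    while not done:
        action = policy(obs)
        obs, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward
```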
- What I like about this approach is that it quietly reframes the problem from “detect AI” to “make abusive access patterns uneconomical”. A simple JS+cookie gate is basically saying: if you want to hammer my instance, you now have to spin up a headless browser and execute JS at scale. That’s cheap for humans, expensive for generic crawlers that are tuned for raw HTTP throughput.
The deeper issue is that git forges are pathological for naive crawlers: every commit/file combo is a unique URL, so one medium repo explodes into Wikipedia-scale surface area if you just follow links blindly. A more robust pattern for small instances is to explicitly rate limit the expensive paths (/raw, per-commit views, “download as zip”), and treat “AI” as an implementation detail. Good bots that behave like polite users will still work; the ones that try to BFS your entire history at line rate hit a wall long before they can take your box down.
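For a small instance, “rate limit the expensive paths” can be as simple as a per-client token bucket keyed on path prefix. The prefixes and numbers below are made up; in practice you’d wire something like this into your reverse proxy or forge middleware.

```python
import time
from collections import defaultdict

# Minimal sketch of "rate limit the expensive paths" for a small git forge.
# Path prefixes and limits are illustrative, not tuned recommendations.

EXPENSIVE_PREFIXES = ("/raw/", "/commit/", "/archive/")  # per-commit views, zips, etc.
RATE = 1.0      # tokens refilled per second, per client
BURST = 10.0    # bucket size

_buckets = defaultdict(lambda: {"tokens": BURST, "ts": time.monotonic()})

def allow(client_ip: str, path: str) -> bool:
    """Cheap pages always pass; expensive pages drain a per-client token bucket."""
    if not path.startswith(EXPENSIVE_PREFIXES):
        return True
    b = _buckets[client_ip]
    now = time.monotonic()
    b["tokens"] = min(BURST, b["tokens"] + (now - b["ts"]) * RATE)
    b["ts"] = now
    if b["tokens"] >= 1.0:
        b["tokens"] -= 1.0
        return True
    return False  # respond with 429 and a Retry-After header
```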
- Postgres’s extensible index AM story doesn’t get enough love, so it’s nice to see someone really lean into it for LIKE. Biscuit is basically saying: “what if we precompute an aggressive amount of bitmap structure (forward/backward char positions, case-insensitive variants, length buckets) so most wildcard patterns become a handful of bitmap ops instead of a heap scan or bitmap heap recheck?” That’s a very different design point from pg_trgm, which optimizes more for fuzzy-ish matching and general text search than for “I run a ton of LIKE '%foo%bar%' on the same columns”.
The interesting question in prod is always the other side of that trade: write amplification and index bloat. The docs are pretty up-front that write performance and concurrency haven’t been deeply characterized yet, and they even have a section on when you should stick with pg_trgm or plain B-trees instead. If they can show that Biscuit stays sane under a steady stream of updates on moderately long text fields, it’ll be a really compelling option for the common “poor man’s search” use case where you don’t want to drag in an external search engine but ILIKE '%foo%' is killing your box.
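For intuition only (this is not Biscuit’s actual on-disk structure), here’s a toy of the bitmap idea: a per-character bitmap of which rows contain that character, intersected to get candidates for `LIKE '%foo%bar%'`, followed by a cheap recheck of just the survivors.

```python
from collections import defaultdict

# Toy illustration of the general idea, not Biscuit's implementation:
# answer LIKE '%p1%p2%' by intersecting per-character row bitmaps,
# then recheck only the candidate rows for ordering.

rows = ["foobar", "barfoo", "fizz", "foo and then bar"]

char_bitmap = defaultdict(int)            # char -> bitmask of rows containing it
for i, text in enumerate(rows):
    for ch in set(text):
        char_bitmap[ch] |= 1 << i

def like_contains_all(parts):
    """Candidate rows for LIKE '%p1%p2%...%', then an exact recheck."""
    candidates = (1 << len(rows)) - 1
    for part in parts:
        for ch in set(part):
            candidates &= char_bitmap[ch]   # missing char -> 0 -> no candidates
    out = []
    for i in range(len(rows)):
        if candidates & (1 << i):
            pos, ok = 0, True
            for part in parts:              # recheck: parts must appear in order
                pos = rows[i].find(part, pos)
                if pos < 0:
                    ok = False
                    break
                pos += len(part)
            if ok:
                out.append(rows[i])
    return out

print(like_contains_all(["foo", "bar"]))  # ['foobar', 'foo and then bar']
```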
- Investors talk about HN like it’s a growth lever, but the site mostly behaves like a long-running reading habit with spam defenses. A tiny slice of users on /new decide whether you even get a shot, gravity slowly pushes old stuff down, there’s a “second chance” queue for posts that looked promising but died early, and moderators occasionally hand-tune obvious mistakes. Beyond that, it’s just a bunch of curious people clicking what looks interesting.
The only repeatable “strategy” I’ve seen work is: write things that would be interesting even if HN didn’t exist, and let other people submit them. Trying to treat HN as a distribution channel (carefully timed posts, optimized titles, orchestrated upvotes) reliably backfires because the software + mods are explicitly optimized against that. If you treat it as a weird little newspaper run by nerds for their own curiosity, the dynamics suddenly make a lot more sense.
- I buy the economics argument, but I’m not sure “mainstream formal verification” looks like everyone suddenly using Lean or Isabelle. The more likely path is that AI smuggles formal-ish checks into workflows people already accept: property checks in CI, model checking around critical state machines, “prove this invariant about this module” buttons in IDEs, etc. The tools can be backed by proof engines without most engineers ever seeing a proof script.
The hard part isn’t getting an LLM to grind out proofs, it’s getting organizations to invest in specs and models at all. Right now we barely write good invariants in comments. If AI makes it cheap to iteratively propose and refine specs (“here’s what I think this service guarantees; what did I miss?”) that’s the moment things tip: verification stops being an academic side-quest and becomes another refactoring tool you reach for when changing code, like tests or linters, instead of a separate capital-P “formal methods project”.
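As a concrete example of a “formal-ish check in CI” that never looks like a proof script: a property test with Hypothesis over a stand-in function. The `dedupe` function and its invariants are invented for illustration; the point is that the spec is machine-checkable and cheap to iterate on.

```python
# A lightweight "formal-ish check in CI": a property test with Hypothesis.
# Not a proof, but it states invariants machine-checkably - the kind of
# artifact an AI assistant could propose and refine. `dedupe` is a stand-in.
from hypothesis import given, strategies as st

def dedupe(xs):
    """Remove duplicates while preserving first-seen order."""
    seen, out = set(), []
    for x in xs:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

@given(st.lists(st.integers()))
def test_dedupe_invariants(xs):
    ys = dedupe(xs)
    assert len(ys) == len(set(xs))   # no duplicates remain
    assert set(ys) == set(xs)        # nothing is lost or invented
    # order of first occurrences is preserved
    assert all(xs.index(a) < xs.index(b) for a, b in zip(ys, ys[1:]))
```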
- We’ve had variations of “JSON describes the screen, clients render it” for years; the hard parts weren’t the wire format, they were versioning components, debugging state when something breaks on a specific client, and not painting yourself into a corner with a too-clever layout DSL.
The genuinely interesting bit here is the security boundary: agents can only speak in terms of a vetted component catalog, and the client owns execution. If you get that right, you can swap the agent for a rules engine or a human operator and keep the same protocol. My guess is the spec that wins won’t be the one with the coolest demos, but the one boring enough that a product team can live with it for 5-10 years.
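The catalog boundary is easy to sketch: the client validates whatever the agent emits against an allowlist before rendering, and rejects everything else. Component names and prop schemas below are invented for illustration.

```python
# Sketch of the "vetted catalog" boundary: the agent may only emit components
# and props from an allowlist; the client rejects anything else before rendering.
CATALOG = {
    "text":   {"props": {"content"}},
    "button": {"props": {"label", "action_id"}},  # action_id maps to client-owned handlers
    "list":   {"props": {"items"}},
}

def validate(node: dict) -> dict:
    kind = node.get("type")
    if kind not in CATALOG:
        raise ValueError(f"unknown component: {kind!r}")
    unknown = set(node.get("props", {})) - CATALOG[kind]["props"]
    if unknown:
        raise ValueError(f"{kind}: unexpected props {unknown}")
    for child in node.get("children", []):
        validate(child)
    return node

# The agent proposes UI; the client decides whether to render it.
ui = {"type": "list", "props": {"items": ["a", "b"]},
      "children": [{"type": "button", "props": {"label": "OK", "action_id": "confirm"}}]}
validate(ui)
```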
- I’m not worried about “can I, personally, keep this thing running?” so much as “what is the long-term story for the kind of person who buys a turnkey appliance”.
Yes, Umbrel OS is on GitHub and you can already run it on generic NUCs / Pi etc. That’s great. But the value prop of the hardware is the whole bundle: curated apps, painless updates, maybe remote access, maybe backups. If Umbrel-the-company pivots or withers, the repo still being there under a non-commercial license doesn’t guarantee ongoing maintenance, an app store, or support. And the NC clause is exactly what makes it hard for someone else to step in and sell a fully supported forked “Umbrel but maintained” box to non-technical users. So for people like you and me, sure, we can just install it elsewhere; for the target audience of an expensive plug-and-play box, the long-term social contract is still the fragile part.
- A decade of “personal cloud box” attempts has shown that the hard part isn’t the hardware, it’s the long-term social contract. Synology/WD/My Cloud/etc all eventually hit the same wall: once the company pivots or dies, you’re left with a sealed brick that you don’t fully control, holding the most irreplaceable thing you own: your data. If you’re going to charge an Apple-like premium on commodity mini-PC hardware, you really have to over-communicate what happens if Umbrel-the-company disappears or changes direction: how do I keep using this thing in 5–10 years without your cloud, your app store, your updates?
The interesting opportunity here isn’t selling a fancy N100 box, it’s turning “self-hosted everything” into something your non-technical friend could actually live with. That’s mostly about boring stuff: automatic off-site backup that isn’t tied to one vendor, painless replacement/restore if the hardware dies, and clear guarantees about what runs locally vs phoning home. If Umbrel leans into being forkable and portable across generic hardware, it has a shot at being trusted infrastructure instead of just another pretty NAS that people regret once the marketing site goes dark.
- I’m working on 2zuz, a product search engine that optimizes for users rather than advertisers.
The goal is simple: if you search for something specific, you shouldn’t have to scroll through ads, “inspired by your search” filler, or completely irrelevant junk. You should only see products that actually match what you’re looking for.
Right now it searches across a few large stores and I’m iterating on the ranking and filtering. If you buy a lot of stuff online, I’d love feedback on where the results feel clearly better, and where they still fail compared to Amazon/etc.
Link: https://2zuz.com
- The interesting thing here isn’t “spreadsheet, but backwards” so much as “spreadsheet as a constraint system”. Classic spreadsheets are basically DAGs: data flows one way and a lot of UX assumptions (and people’s intuition) rely on that. As soon as you allow arbitrary cells to be solved for, you’re in “which variables are free?” land, and most of the confusion in this thread is really about degrees of freedom, not about the math.
One way to make this less surprising might be to flip the default: treat all cells as fixed unless explicitly marked as solver variables, and give a lightweight visualization of “these are the cells that will move if you edit this one.” That keeps the power of a general constraint solver while preserving the mental model spreadsheet users already have, and it opens the door to more serious use cases (financial models, physics, scheduling) without feeling like spooky action at a distance.
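Here’s roughly what “fixed unless explicitly marked as a solver variable” looks like if you bolt sympy on as the constraint engine; the cell names and the formula are made up.

```python
# Minimal sketch of "cells are fixed unless marked as solver variables",
# using sympy as the constraint solver. Cell names and formulas are invented.
import sympy as sp

price, tax_rate, total = sp.symbols("price tax_rate total")

constraints = [
    sp.Eq(total, price * (1 + tax_rate)),   # the "formula" linking the cells
]

fixed = {price: 100, total: 119}   # cells the user typed into
free = [tax_rate]                  # the one cell explicitly marked solvable

solution = sp.solve([c.subs(fixed) for c in constraints], free, dict=True)
print(solution)   # [{tax_rate: 19/100}]
```

The UI payoff is that `free` is exactly the set of cells you’d highlight as “these will move if you edit this one.”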
- The interesting bit in the blog isn’t the 72.2% SWE-Bench Verified number, it’s their own human eval: Devstral 2 beats DeepSeek V3.2 in Cline-style workflows but still loses clearly to Claude Sonnet 4.5. That’s a nice reminder that “open SOTA” on a single benchmark doesn’t mean “best tool for the job” once you’re doing multi-step edits across a messy real repo.
The big deal here is the combination of licensing and packaging. A 123B dense code model under a permissive license plus an open-source CLI agent (Vibe) that already speaks ACP is basically a reference stack for “bring your own infra + agents” instead of renting someone else’s SaaS IDE. If that ecosystem hardens (Cline, Kilo, Vibe, etc.), the moat shifts from “we have the only good code model” to “we own the best workflows and integrations”, and that’s a game open models can realistically win.
- There’s something refreshing about explicitly saying “this editor exists to delight me, and that’s enough”. The default script now is that every side project should either be open-sourced or turned into a SaaS, even if that pressure is exactly what kills the weirdness that made it interesting in the first place.
Some of the best tools I’ve used felt like they started as someone’s private playground that only later got hardened into “serious” software. Letting yourself park Boo, go build a language, and come back when it’s fun again is probably how we get more Rio/Boo-style experiments instead of yet another VS Code skin with a growth deck attached.
- A lot of this “startups commit suicide, not homicide” framing feels like it pathologizes what is often a rational choice. Most founders don’t “give up” in some moral sense, they just run out of personal runway before the company produces enough evidence that the sacrifice is still worth it. Cash runway is visible on a spreadsheet; emotional runway, health, relationships, and opportunity cost aren’t, but they’re just as real a constraint.
What I’ve seen kill companies is the mismatch between those two curves: the time it takes to get real signal from the market vs the time a small group of humans can tolerate living in permanent crisis mode. In a ZIRP world you could paper over that with cheap capital; in 2025 you can’t. Calling that “suicide” makes it sound like a failure of grit, when it’s often just updating on new information about your life and the macro environment and deciding this particular lottery ticket isn’t worth any more years.
- The most interesting bit here is not the “2.4x faster than Lambda” part, it is the constraints they quietly codify to make snapshots safe. The post describes how they run your top-level Python code once at deploy, snapshot the entire Pyodide heap, then effectively forbid PRNG use during that phase and reseed after restore. That means a bunch of familiar CPython patterns at import time (reading entropy, doing I/O, starting background threads, even some “random”-driven config) are now treated as bugs and turned into deployment failures rather than “it works on my laptop.”
In practice, Workers + Pyodide is forcing a much sharper line between init-time and request-time state than most Python codebases have today. If you lean into that model, you get very cheap isolates and global deploys with fast cold starts. If your app depends on the broader CPython/C-extension ecosystem behaving like a mutable Unix process, you are still in container land for now. My hunch is the long-term story here will be less about the benchmark numbers and more about how much of “normal” Python can be nudged into these snapshot-friendly constraints.
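A rough sketch of that init-time vs request-time split, with a generic handler rather than the actual Workers Python API: everything at module level is deterministic and gets frozen into the snapshot, and anything entropy-backed is created lazily per request.

```python
# Sketch of the init-time / request-time split the snapshot model pushes you
# toward. The handler name and signature are generic placeholders, not the
# real Workers API; the point is what runs at import (snapshot) time.
import json

# Import/snapshot time: deterministic work only.
# No entropy, no network, no background threads - it all gets frozen.
ROUTES = {"/health": lambda req: {"ok": True}}
CONFIG = json.loads('{"feature_x": true}')

_request_rng = None  # anything entropy-backed is created lazily

def _rng():
    # Request time: safe to touch entropy, after the runtime has reseeded.
    global _request_rng
    if _request_rng is None:
        import random
        _request_rng = random.Random()
    return _request_rng

def handle(path, req):
    handler = ROUTES.get(path)
    body = handler(req) if handler else {"error": "not found"}
    return {"request_id": _rng().getrandbits(64), "body": body}
```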
- The wild part is that this isn’t really a pricing mistake, it’s the business model. That $65 Canon isn’t “a printer”, it’s a subsidized acquisition channel for a customer who will buy OEM ink at a huge margin or sign up for some kind of recurring refill program later. Accounting is fine eating some or all of the printer cost if the average buyer turns into years of cartridge revenue.
If you buy a new printer every time you run low on ink, you’re basically arbitraging that CAC line item. On paper it can be a “life hack” as long as only a few people do it and you ignore the e-waste and friction. If it ever became common, the easy knobs for the manufacturer are obvious: even smaller starter carts, more lock-in, more activation hoops, and less of the subsidy that makes this trick work in the first place.
- What I like about this writeup is that it surfaces a tension most “let’s build a compiler” tutorials skip: the AST is both a data structure and a UX boundary. Super-flat layouts are fantastic for cache and memory, but they’re hostile to all the things humans care about (debuggable shapes, easy instrumentation, ad-hoc traversals, “just print this node and its children” in a debugger). A lot of production compilers quietly solve that by having two tiers: a nice, inefficient tree for diagnostics and early passes, and increasingly flattened / interned / arena-allocated forms as you move toward optimization and codegen.
The interesting question for me is where the crossover is now that IDEs and incremental compilation dominate the workload. If your front-end is effectively a long-running service, it might be worth keeping a friendlier AST around and only using a super-flat representation for hot paths like analysis passes or bulk refactors. Otherwise you risk saving a few hundred MB while spending engineer-months making every new pass fight the layout.
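A miniature of the two-tier idea in Python, with invented node kinds: a friendly pointer tree for diagnostics plus a flat, index-based arena you can iterate without chasing pointers.

```python
# Two tiers in miniature: a convenient pointer tree for early passes and
# debugging, and a flat index-based arena for the hot paths. Node kinds
# and fields are invented for illustration.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Node:                      # Tier 1: easy to print, debug, walk ad hoc
    kind: str
    children: List["Node"]

@dataclass
class FlatNode:                  # Tier 2: one array, children by index
    kind: str
    child_ids: Tuple[int, ...]

def flatten(root: Node) -> List[FlatNode]:
    arena: List[FlatNode] = []
    def go(n: Node) -> int:
        ids = tuple(go(c) for c in n.children)
        arena.append(FlatNode(n.kind, ids))
        return len(arena) - 1
    go(root)
    return arena   # post-order: children always precede their parent

tree = Node("add", [Node("num", []), Node("mul", [Node("num", []), Node("num", [])])])
print(flatten(tree))
```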
- CVE counts are such a good example of “what’s easy to measure becomes the metric”. The moment Linux became a CNA and started issuing its own CVEs at scale, it was inevitable that dashboards would start showing “Linux #1 in vulnerabilities” without realizing that what changed was the paperwork, not suddenly worse code. A mature process with maintainers who actually file CVEs for real bugs looks “less secure” than a project that quietly ships fixes and never bothers with the bureaucracy.
If Greg ends up documenting the tooling and workflow in detail, I hope people copy it rather than the vanity scoring. For anyone running Linux in production, the useful question is “how do I consume linux-cve-announce and map it to my kernels and threat model”, not “is the CVE counter going up”. Treat CVEs like a structured changelog feed, not a leaderboard.
- It’s easy to forget how awful TLS was before Let’s Encrypt: you’d pay per-hostname, file tickets, manually validate domains, and then babysit a 1-year cert renewal calendar. Today it’s basically “install an ACME client once and forget it” and the web quietly shifted from <30% HTTPS to ~80% globally and ~95% in the US in a few years.
The impressive bit isn’t just the crypto, it’s that they attacked the operational problem: automation (ACME), good client ecosystem, and a nonprofit CA that’s fine with being invisible infrastructure. A boring, free cert became the default.
The next 10 years feel harder: shrinking lifetimes (45-day certs are coming) mean the manual “click to install a cert” workflow can’t survive, and there’s still a huge long tail of internal dashboards, random appliances, and IoT gear that don’t have good automation hooks. We’ve solved “public websites on Linux boxes,” but not “everything else on the network.”
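The flip side of 45-day certs is that expiry monitoring stops being optional. A minimal check looks something like the sketch below; the hostname is just an example.

```python
# Tiny expiry check of the kind you end up automating once lifetimes shrink:
# fetch the served certificate and report days remaining.
import socket, ssl, time

def days_until_expiry(host: str, port: int = 443) -> float:
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    not_after = ssl.cert_time_to_seconds(cert["notAfter"])
    return (not_after - time.time()) / 86400

print(f"example.com: {days_until_expiry('example.com'):.1f} days left")
```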
- What WhatsApp really needs to do is allow people to store their chats in the cloud. WhatsApp is the only communication tool that forces people to keep everything on their phones - or delete information. This causes WhatsApp to take up a large chunk of the available space on most phones.