joatmon-snoo
Joined 2,358 karma
hackernews[at]sxlijin[dot]com

  1. The argument for not using electric sharpeners is that they (1) cut down the lifetime of your knife substantially and (2) do a mediocre job of sharpening.

    Mechanically, it's just high-abrasive motorized spinning discs at preset angles. So rather than getting a good edge by taking a few microns of material off by doing it manually, you get an OK edge by taking 0.2mm off at a time. (If 0.2mm doesn't sound like a lot, think about how many mm wide your knife is.)
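To put rough numbers on that, here is a back-of-the-envelope sketch. The 0.2mm per pass is the figure above; the 5 microns for a manual pass (my reading of "a few microns") and the 10mm of blade width you can afford to lose are assumptions picked purely for illustration. Everything is in whole microns to keep the arithmetic exact.

```python
def sharpenings_until_worn(usable_width_um: int, removed_per_pass_um: int) -> int:
    """How many sharpenings fit before the usable blade width is ground away."""
    return usable_width_um // removed_per_pass_um


USABLE_WIDTH_UM = 10_000   # assumption: 10 mm of width can be lost
ELECTRIC_PASS_UM = 200     # 0.2 mm per electric sharpening (figure from the comment)
MANUAL_PASS_UM = 5         # assumption: "a few microns" per manual sharpening

electric = sharpenings_until_worn(USABLE_WIDTH_UM, ELECTRIC_PASS_UM)
manual = sharpenings_until_worn(USABLE_WIDTH_UM, MANUAL_PASS_UM)
print(f"electric: {electric} sharpenings, manual: {manual} sharpenings")
```

Under those assumptions the manual approach gets you roughly 40x as many sharpenings out of the same blade.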

    ---

    I'm personally 50-50 on this advice: most people don't sharpen their knives at all, and I think people are better off getting 10 OK years out of a knife than 50 terrible years out of it.

    I'm also not willing to learn how to use a whetstone, so I landed in the middle on this: https://worksharptools.com/products/precision-adjust-knife-s...

  2. There's a lot of tooling built on static binaries:

    - google-wide profiling: the core C++ team can collect data on how much of fleet CPU % is spent in absl::flat_hash_map re-bucketing (you can find papers on this publicly)

    - crashdump telemetry

    - dapper stack trace -> codesearch

    Borg literally had to pin the bash version because letting the bash version float caused bugs. I can't imagine how much harder debugging L7 proxy issues would be if I had to follow a .so rabbit hole.

    I can believe shrinking binary size would solve a lot of problems, and I can imagine ways to solve the .so versioning problem, but for every problem you mention I can name multiple other probable causes (eg was startup time really execvp time, or was it networked deps like FFs).

  3. (author here) To be more specific, here's a benchmark that we ran last year, where we compared schema-aligned parsing against constrained decoding (then called "Function Calling (Strict)", the orange ƒ): https://boundaryml.com/blog/sota-function-calling
  4. This setting is new and was introduced in response to the first round of shai hulud attacks.
  5. Google never asked a volunteer for a fix.

    This is part of Google’s standard disclosure policy: it gets disclosed within 90 days starting from confirmation+contact.

    If ffmpeg didn’t want to fix it, they could’ve just let the CVE get opened.

  6. No, this is the unfortunate reality of “ffmpeg is maintained by volunteers” and “CVE discovered on specific untrusted input”.

    Google’s AI system is no different than the oss-fuzz project of yesteryear: it ensures that the underlying bug is concretely reproducible before filing the bug. The 90-day disclosure window is standard disclosure policy and applies equally to hobby projects and Google Chrome.

  7. This is great to see, much appreciated for the disclosure!
  8. We’ve had a lot of success implementing schema-aligned parsing in BAML, a DSL that we’ve built to simplify this problem.

    We actually don’t like constrained generation as an approach - among other issues, it limits your ability to use reasoning - and instead the technique we’re using is algorithm-driven error-tolerant output parsing.

    https://boundaryml.com/
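To give a flavor of what "error-tolerant output parsing" can mean, here is a deliberately tiny sketch. It is not BAML's actual algorithm; the function names (`extract_first_json_object`, `tolerant_parse`) and the flat `schema` dict are hypothetical, made up for this example. It lets the model write free-form reasoning around its answer, pulls out the first balanced JSON object, forgives trailing commas, and coerces fields to the types the schema expects.

```python
import json
import re


def extract_first_json_object(text: str) -> str:
    """Return the first balanced {...} span, ignoring braces inside strings."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found")
    depth = 0
    in_string = False
    escaped = False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    raise ValueError("unbalanced JSON object")


def tolerant_parse(text: str, schema: dict[str, type]) -> dict:
    """Parse model output against a flat schema, forgiving common mistakes."""
    raw = extract_first_json_object(text)
    # Drop trailing commas before } or ]. Simplification: this would also
    # rewrite a literal ",}" inside a string value.
    raw = re.sub(r",\s*([}\]])", r"\1", raw)
    data = json.loads(raw)
    # Coerce each expected field to the schema's type; extra keys are ignored.
    return {key: typ(data[key]) for key, typ in schema.items()}
```

For example, `tolerant_parse('Let me think... {"name": "Ada", "age": "36",} Done.', {"name": str, "age": int})` tolerates the surrounding prose, the trailing comma, and the stringified number.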

  9. For folks who don't know what Magic Lantern is:

    > Magic Lantern is a free software add-on that runs from the SD/CF card and adds a host of new features to Canon EOS cameras that weren't included from the factory by Canon.

    It also backports new features to old Canon cameras that aren't supported anymore, and is generally just a really impressive feat of both (1) reverse engineering and (2) keeping old hardware relevant and useful.

  10. Strings are like time objects: most people and languages only ever deal with simplified versions of them that skip a lot of edge cases around how they work.

    Unfortunately going from most languages to Rust forces you to speedrun this transition.

  11. nit: Colossus* for Google.
  12. > if you proceed with care

    Yes, but that is _incredibly_ time consuming. You have to set up asan, msan, tsan, and valgrind. If you want linting you need to do shenanigans to wire up clang-tidy.

    I also like simple mental models. I like not having to figure out the cmake modifications to pull in a new library. I like having a search engine when I need a new library for x. I like when libraries return Result<Ok, Err> instead of ping ponging between C libraries which indicate errors using retval flags or C++ libraries that throw std::runtime_error(). I like not dealing with void* pointer casting.

  13. An unfortunate problem with using awk: there are three different versions of awk, and it is frighteningly easy to use a feature that exists in one but not the others.

    (source: I have written unit tests against different versions of awk. That was... unpleasant.)

  14. Pretty sure dedupe is done manually by u/dang.
  15. Super cool! We at BAML had been thinking about doing something like this for our ecosystem as well - we’d love to add BAML models to this repo!

    If you haven’t heard of us, we provide a language and runtime that enable defining your schemas in a simpler syntax, and allow usage with _any_ model, not just those that implement tool calling or json mode, by relying on schema-aligned parsing. Check it out! https://github.com/BoundaryML/baml

  16. bidi streaming screws with a whole bunch of assumptions you rely on in usual fault-tolerant software:

    - there are multiple ways to retry - you can retry establishing the connection (e.g. say DNS resolution fails for a 30s window) _or_ you can retry establishing the stream

    - your load-balancer needs to persist the stream to the backend; it can't just re-route per single HTTP request/response

    - how long are your timeouts? if you don't receive a message for 1s, OK, the client can probably keep the stream open, but what if you don't receive a message for 30s? this percolates through the entire request path, generally in the form of "how do I detect when a service in the request path has failed"
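The timeout question can be made concrete with a small sketch. This is only an illustration: the `IdleMonitor` class and its status names are hypothetical, and the 1s/30s thresholds are just the numbers from the bullet above, not a recommendation. Each hop in the request path would need some version of this decision.

```python
from dataclasses import dataclass


@dataclass
class IdleMonitor:
    """Classifies a stream by how long it has been since the last message."""

    soft_timeout_s: float = 1.0    # short gap: keep the stream open
    hard_timeout_s: float = 30.0   # long gap: assume something upstream failed
    last_message_at: float = 0.0

    def on_message(self, now: float) -> None:
        # Call this whenever any message (or heartbeat) arrives on the stream.
        self.last_message_at = now

    def status(self, now: float) -> str:
        idle = now - self.last_message_at
        if idle < self.soft_timeout_s:
            return "healthy"
        if idle < self.hard_timeout_s:
            return "idle"  # still open, but worth sending a ping
        return "presumed_failed"  # tear down and retry the stream or connection
```

Passing `now` in explicitly (rather than reading a clock inside the class) keeps the logic easy to test; a real client would feed it a monotonic clock reading.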

  17. ...how is _this_ the insight that you come away from this post with?

    This post is a commentary on product quality issues, the underlying cost models (both goods and services), and the interplay with American culture. There's like 20+ company/product anecdotes in there - a mistake about one technical detail of one product is wildly uninteresting.

  18. > This seems like exactly the sort of question the market will quickly decide over the next couple of years and not worth arguing over.

    Discussions like this are _how_ the market decides whether or not this achievement is real.

  19. If Google adopts an existing OSS technology, it usually takes the form of Google contributors joining a core team for the OSS in question. The OSS community generally isn't a fan of single companies taking _over_ a project and generally prefers ownership changes in the other direction (e.g. Kubernetes getting transferred to the CNCF).

    That being said, the most noticeable example here that I can think of is Google migrating its internal C++ toolchain from using gcc/g++ to clang.

  20. Different DBs implement locks differently.

    Postgres allows obtaining advisory locks at either the session _or_ transaction level. If it's session-level, then you have, ergo, a connection-level lock.

    https://www.postgresql.org/docs/current/explicit-locking.htm...

  21. This is what lockfiles are for.
  22. Good file watching that provides flexible primitives absolutely requires:

    - ok, a single ext4 file inode changes, and its filename matches my hardcoded string

    - oh, you don’t want to match against just changes to “package.json” but you want to match against a regex? voila, now you need a regex engine

    - what about handling a directory rename? should that trigger matches on all files in the renamed directory?

    - should the file watcher be triggered once per file, or just every 5ms? turns out this depends on your use case

    - how do symlinks fit into this story?

    - let’s say i want to handle once every 5ms: how do i actually wait for 5ms? do i yield the thread? do i allow other async contexts to execute while i’m waiting? how do those contexts know when to execute and when to yield back to me? now you have an async runtime with timers

    - how does buffering work? are there limits on how many file change events can be buffered? do i dynamically allocate more memory as more file changes get buffered? now you need a vector/arraylist implementation

    And this is before you look at what this looks like on different platforms, or if you want polling fallbacks.

    Can you do it with fewer dependencies? Probably, if you start making hard tradeoffs and adding even more complexity about what features you activate - but that only adds lines of code, it doesn’t remove them.

    What you describe is ideologically nice, but in practice it’s over-optimizing for a goal that most people don’t really care about.
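To show how quickly even the "every 5ms" and buffering questions turn into real code, here is a minimal sketch of just that one piece. It is not taken from any real file-watching library; the `DebouncedEvents` class, its method names, and the 1024-event cap are all hypothetical. It batches changes into 5ms windows, dedupes repeated paths, and refuses to grow without bound.

```python
class DebouncedEvents:
    """Collects file-change events and releases them in batches per time window."""

    def __init__(self, window_s: float = 0.005, max_buffered: int = 1024):
        self.window_s = window_s
        self.max_buffered = max_buffered
        self._pending: dict[str, None] = {}  # dict keeps first-seen order
        self._window_start: float | None = None

    def record(self, path: str, now: float) -> None:
        # The window opens at the first event after the previous flush.
        if self._window_start is None:
            self._window_start = now
        if path not in self._pending:
            if len(self._pending) >= self.max_buffered:
                raise OverflowError("too many buffered file events")
            self._pending[path] = None

    def flush_if_due(self, now: float) -> list[str]:
        # Nothing buffered, or the window has not elapsed yet.
        if self._window_start is None or now - self._window_start < self.window_s:
            return []
        batch = list(self._pending)
        self._pending.clear()
        self._window_start = None
        return batch
```

Note what it leaves out: something still has to call `flush_if_due` on a timer (the async runtime question), and renames, symlinks, regex matching, and per-platform backends are all untouched.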

  23. It's much easier to say "I'm going to make it impossible for us to have to worry about the Australian government filing a lawsuit against $my-state-agency, because legal said so" than "Well, if we allow Australian IPs to access this website, there's a 0.x% chance that we get sued by Australia, but it's worth it for the sake of the 0.00x% of American expats in Australia."

    Here's an analogous real example from current US-Ukraine policy:

    > For example, one current social goal in the U.S., given the geopolitical conflict with Russia, is to avoid facilitating activities that could aid the adversary. As Russia has invaded Ukraine, the U.S. has positioned itself in opposition to Russia but not Ukraine. Banks, therefore, need to align with these geopolitical stances, leading to decisions that might catch some individuals in the crossfire, even if they’re not directly involved.

    > Financial institutions often interpret this as: if they're not deeply specialized in doing business in Ukraine, they should avoid it altogether. They fear they won’t be able to consistently ensure compliance with these complex directives from the government [especially because there's a chance those directives might change in a week, or a month, or 3 months].

    > This creates a split-brain problem within U.S. decision-making. The government intends to say, "Please cut down on oligarch money laundering that supports Russia’s war effort." However, financial institutions hear this as, "Under no circumstances should you fund anything related to Ukraine," including, for example, scholarships for Ukrainian high schoolers—a slight exaggeration, but not far from the reality in some cases.

    (source: https://www.complexsystemspodcast.com/episodes/true-crime-ba...)

  24. Assuming you're the founder, this is the type of BS comment that makes the rest of us hate AI founders.

    It's vacuous, makes vague claims that don't leave room for proof/disproof, and doesn't offer any reason that it's any better than a prompt that asks GPT4o "was this generated by AI y/n"

  25. Yep, it’s Chrome’s incognito mode. The lawsuit describes it as “private browsing”.
