- bonobocop: The MIT course on Raft (with the lectures on YouTube) was a great way for me to learn it: http://nil.lcs.mit.edu/6.824/2020/labs/lab-raft.html
- The durability and transformation reasons are definitely more compelling, but the article doesn’t mention those reasons.
It’s mainly focused on insert batching, which is why I was drawing attention to async_insert.
I think it’s worth highlighting the incremental transformation that CH can do via materialised views too. That can often replace the need for a full-blown streaming transformation pipeline.
IMO you can get a surprisingly long way with “just” a ClickHouse instance these days. I’d definitely be interested in articles that talk about where that stops being true!
- Sure, but the article doesn’t talk about that; it seemed to be focused on CH alone, in which case async insert involves far fewer moving parts.
If you need to guarantee super-durable writes, then it’s worth considering, but I really don’t think it’s something you need to reach for at first.
- Why add RedPanda/Kafka over using async insert? https://clickhouse.com/docs/optimize/asynchronous-inserts
It’s recommended in the docs over the Buffer table, and is pretty much invisible to the end user.
At ClickHouse Inc itself, this scaled far beyond millions of rows per second: https://clickhouse.com/blog/building-a-logging-platform-with...
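To make the “invisible to the end user” point concrete, here is a minimal sketch of using async_insert via ClickHouse’s HTTP interface, where `async_insert` and `wait_for_async_insert` are passed as query-string settings so the *server* does the batching. The host, table name, and row shape are made up for illustration.

```python
# Sketch: server-side batched inserts via ClickHouse's HTTP interface.
# The async_insert / wait_for_async_insert settings are real ClickHouse
# settings; the URL, table, and data below are hypothetical.
import json
import urllib.parse

CLICKHOUSE_URL = "http://localhost:8123/"  # assumed local instance


def build_insert_request(table: str, rows: list) -> tuple:
    """Build the URL and body for an insert the server will buffer/batch."""
    params = {
        "query": f"INSERT INTO {table} FORMAT JSONEachRow",
        "async_insert": "1",           # buffer and batch on the server side
        "wait_for_async_insert": "0",  # fire-and-forget; use "1" for a durable ack
    }
    url = CLICKHOUSE_URL + "?" + urllib.parse.urlencode(params)
    body = "\n".join(json.dumps(r) for r in rows).encode()
    return url, body


url, body = build_insert_request("events", [{"ts": 1, "msg": "hello"}])
# POST `body` to `url` with any HTTP client; no client-side batching needed.
```

The durability trade-off lives in `wait_for_async_insert`: with `1`, the client blocks until the buffered batch is flushed, which is the knob to reach for before adding a queue in front.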
- Yeah, handles all the OTel signals
- Not OP, but to me, this reads fairly similar to how ClickHouse can be set up, with Bloom filters, MinMax indexes, etc.
A way to “handle” partial substrings is to break your input data into tokens (e.g. substrings split on spaces or dashes) and then break your search string up the same way.
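The token approach above can be sketched in a few lines: split indexed values and the search string with the same rule, then match on whole tokens (roughly what a tokenised Bloom-filter index like ClickHouse’s `tokenbf_v1` does). The splitting rule and sample data are illustrative.

```python
# Sketch: tokenise both the stored value and the query the same way,
# then a "partial substring" match becomes a whole-token subset check.
import re


def tokenize(s: str) -> set:
    """Split on spaces, dashes, and any other non-alphanumeric runs."""
    return {t for t in re.split(r"[^a-zA-Z0-9]+", s.lower()) if t}


row = "req-4f2a9c host-eu-west-1 GET /api/users"
query = "eu-west-1"

# The query matches if every one of its tokens appears among the row's tokens.
matches = tokenize(query) <= tokenize(row)
```

The limitation is the same as for the index: you can only match at token boundaries, so `"eu-west"` works but a mid-token fragment like `"f2a9"` does not.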
- Quite like Cloudprober for this tbh: https://cloudprober.org/docs/how-to/alerting/
Easy to configure, easy to extend with Go, and slots into alerting.
- If you choose not to travel, then you are eligible for a refund, rather than Delay Repay: https://www.nationalrail.co.uk/help-and-assistance/compensat...
There are effectively differences in consumer rights between a refund and compensation (like DR).
- Claiming delay compensation if you don’t have intent to travel is the fraud part.
Easiest example is if you have a season ticket, but you have the day off. You weren’t going to take the train to work that day, so there was no intent to travel. Claiming DR for that day’s delay would be fraud.
- I don’t think it’s fraud on the DR side if you actually take the trains and intend to travel.
If you didn’t actually intend to travel, then claiming DR is fraud.
- The DR system doesn’t look at ticket scans alone. It also builds a profile per customer based on a number of data points.
It will flag up quite quickly if you are “sniping” delayed trains at different times.
- A small question on the schema: I noticed that you only have “_now” in the ORDER BY (which will also serve as the primary key). Do you expect many cross-tenant queries?
My feeling is that I’d add the tenant ID before the timestamp, as it should filter the parts more effectively.
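A toy illustration of the point above, with made-up rows: when the tenant ID leads the sort key, one tenant’s rows land contiguously, so a tenant-filtered query can skip most of the sorted data instead of touching ranges scattered throughout it.

```python
# Toy model of the two sort keys. Data is invented; "contiguous" stands in
# for "a tenant filter can prune to one narrow range of the sorted data".
rows = [
    ("t2", 100), ("t1", 101), ("t2", 102), ("t1", 103), ("t3", 104),
]

by_time = sorted(rows, key=lambda r: r[1])             # ~ ORDER BY (_now)
by_tenant = sorted(rows, key=lambda r: (r[0], r[1]))   # ~ ORDER BY (tenant, _now)


def contiguous(sorted_rows, tenant):
    """True if the tenant's rows form one unbroken run in the sort order."""
    idx = [i for i, r in enumerate(sorted_rows) if r[0] == tenant]
    return idx == list(range(idx[0], idx[-1] + 1))


# With timestamp alone, tenant t1's rows are interleaved with other tenants';
# with (tenant, timestamp), they form a single run.
```

The flip side, as the question implies, is that cross-tenant time-range queries get worse with the tenant ID first, which is why the expected query mix matters.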
- Thoughts on stuff like ClickHouse with JSON column support? Less upfront knowledge of columns needed.
- https://docs.aws.amazon.com/organizations/latest/APIReferenc...
Now has an API (with some caveats)
- I think some of the best technical writing I've enjoyed is: https://aws.amazon.com/builders-library/
Clear and concise articles that really dig into some of the hard technical problems with working at scale.
Has honestly made me a much better systems programmer since starting to read them.
- TBH, as container-image based Lambdas have a 10GB image size limit, I'm not sure the size matters as much.
It's certainly easier to figure out what's going on in a smaller container though! I've had to debug some nasty situations with layers in Python Lambdas before, and it's not fun...
- Apparently Python rules the roost for Lambda: https://mobile.twitter.com/julian_wood/status/14427755423742...
But NodeJS is second!
- I used BPF (using _both_ of Brendan Gregg's recent books!) on our Jenkins builds recently to figure out why `TRUNCATE` statements were taking longer than I expected.
The underlying cause was that the `ext4` filesystem was journalled and the `fsync`s were waiting on `jbd2_log_wait_commit`, which off-CPU sampling let me pick up.
I don't think I would have been able to trace the kernel call and link it all the way back to the application call without BPF (at least not nearly as easily).