- Good / bad / unclassified.
It makes sense for unclassified to smell worse than good, and it'd probably be the biggest category by a long stretch.
(Pure speculation.)
- You could set a CloudWatch billing alarm that scuttles your IAM permissions and effectively pulls the plug on your stack. Or something like that.
- Those models are terrible. I tried one of the "best" ones and it told me the US Constitution was AI generated.
- My laptop at Amazon was also covered in stickers, although I shied away from the more politically charged ones.
- I think it's simpler than that, and they're just trying to avoid liability.
With all of their claims about how GPT can pass the bar exam, medical licensing exams, etc., I wouldn't be surprised if they're eventually held accountable for some of the advice their model gives out.
This way, they're covered. Similarly, Epic could still build that feature, but they'd have to add a disclaimer like "AI generated, this does not constitute legal advice, consult a professional if needed".
- Damn, I goofed, thanks for calling that out. Can't edit the comment anymore unfortunately.
- European here.
One of the problems is that many of our society's systems are predicated on a growing population. Social security and pensions, for example, are structured not unlike a pyramid scheme: for every old person we should have more than one working young person, because on average people take out more than they put in. Fixing that will be painful, but possible.
More worrying is how many countries' birth rates have fallen below the replacement rate. Some East Asian countries are interesting case studies here (Japan, South Korea), but it's not looking good, and much of western Europe is heading in the same direction. Maybe the worry is overblown and populations will eventually stabilize at a lower point, but currently it seems like a declining population will just add to the stressors that are putting people off from having children, so it could just as well keep snowballing.
All that's to say, I don't worry too much about over/underpopulation, but I do worry about a shrinking population.
- Is there any way to get notified if/when an Android app becomes available?
Great work, app looks great!
- There's no inductive bias for a world model in multiheaded attention. LLMs are incentivized to learn the most straightforward interpretation/representation of the data you present.
If the data you present is low entropy, it'll memorize. You need to make the task sufficiently complex so that memorisation stops being the easiest solution.
- Agreed.
I've found some AI assistance to be tremendously helpful (Claude Code, Gemini Deep Research) but there needs to be a human in the loop. Even in a professional setting where you can hold people accountable, this pops up.
If you're using AI, you need to be that human, because as soon as you create a PR or HackerOne report, it stops being the AI's PR/report; it's yours. That means the responsibility for parsing and validating it is on you.
I've seen some people (particularly juniors) just act as a conduit between the AI and whoever is next in the chain. It's up to more senior people like me to push back hard on that kind of behaviour. AI-assisted whatever is fine, but your role is to take ownership of the code/PR/report before you send it to me.
- The first Iron Man comic book came out eight years before Elon was born.
EDIT: and the movies are pretty faithful to the comic books.
- I was just thinking about how LLM agents are both unabashedly confident (Perfect, this is now production-ready!) and sycophantic when contradicted (You're absolutely right, it's not at all production-ready!)
It's a weird combination and sometimes pretty annoying. But I'm sure it's preferable over "confidently wrong and doubling down".
- This relates to one of my biggest pet peeves.
People interpret "statistically significant" to mean "notable"/"meaningful". I detected a difference, and statistics say that it matters. That's the wrong way to think about things.
Significance testing only tells you how unlikely the measured difference would be if there were no real effect. With a certain degree of confidence, you can say "the difference exists as measured".
Whether the measured difference is significant in the sense of "meaningful" is a value judgement that we / stakeholders should impose on top of that, usually based on the magnitude of the measured difference, not the statistical significance.
It sounds obvious, but this is one of the most common fallacies I observe in industry and a lot of science.
For example: "This intervention causes an uplift in [metric] with p<0.001. High statistical significance! The uplift: 0.000001%." Meaningful? Probably not.
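To make that concrete, here's a minimal sketch using a plain two-proportion z-test on made-up A/B numbers (the sample size, rates, and uplift are all assumptions): with a huge sample, a practically meaningless uplift still comes out "highly significant".

```python
import math

# Hypothetical A/B test (illustrative numbers, not real data).
n = 300_000_000        # users per arm
p_control = 0.1000     # baseline conversion rate
p_variant = 0.1001     # +0.0001 absolute (0.1% relative) uplift

# Two-proportion z-test with a pooled standard error.
pooled = (p_control + p_variant) / 2
se = math.sqrt(pooled * (1 - pooled) * (2 / n))
z = (p_variant - p_control) / se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

print(f"z = {z:.2f}, p = {p_value:.1e}")    # p << 0.001: "significant!"
print(f"absolute uplift: {p_variant - p_control:.4f}")  # ...of 0.0001
```

The p-value only says the 0.0001 difference is probably real; whether 0.0001 matters is the stakeholders' call.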
- The problem is that ChatGPT doesn't really see letters; it reads and writes in word pieces (BPE tokens), each of which may be one or more letters.
For example, "running" might get tokenized as "runn" + "ing", just two tokens as far as ChatGPT is concerned.
It learns to infer some of these letter-level facts over the course of training, but only to a limited extent.
Same reason it's not great at arithmetic.
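A toy greedy longest-match tokenizer shows the effect (the vocabulary here is made up for illustration; real BPE merges are learned from data and the actual GPT tokenizer works differently in detail):

```python
# Made-up subword vocabulary; real BPE vocabularies are learned.
VOCAB = {"runn", "ing", "run", "r", "u", "n", "i", "g"}

def tokenize(word: str) -> list[str]:
    """Greedily match the longest vocabulary piece at each position."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first.
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary piece matches at position {i}")
    return tokens

print(tokenize("running"))  # ['runn', 'ing']
```

From the model's point of view "running" is two opaque symbols, so questions like "how many n's are in this word?" require it to have memorized letter-level facts about each piece.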
- The problem I see with (1) is that it becomes a little bit too easy to regenerate public keys and circumvent free tier metering.
- In general? In the past I've known ASS (Advanced SubStation Alpha) to be used a lot for things like anime, but less for live action shows.
- At least for English, those "fansubs" aren't typically burnt into the movie*, but ride along in the video container (MP4/MKV) as subtitle streams. They can typically be extracted as SRT files (plain text with sentence level timestamps).
*Although it used to be more common for AVI files in the olden days.
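For text-based subtitle streams, something like the following ffmpeg/ffprobe invocation does the extraction (the file name and stream index are assumptions; adjust to your container):

```shell
# List streams to find the subtitle stream(s).
ffprobe -v error -show_entries stream=index,codec_type,codec_name movie.mkv

# Extract the first subtitle stream as a plain-text SRT file.
# 0:s:0 = first subtitle stream of input 0.
ffmpeg -i movie.mkv -map 0:s:0 subs.srt
```

This only works for text subtitle formats (SRT/ASS/etc.); image-based formats like PGS would need OCR instead.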
- I've been in organisations with great developers but no leadership. It's a shit show.
- Damage is usually done in aggregate. Leaded gasoline didn't have people dropping like flies, but still caused significant damage.
Although it seems this needs more research, I'd be wary dismissing it out of hand just because people haven't been having an acute reaction.
- A product manager can definitely say things that would make me lose a bit of respect for a fellow senior engineer.
I can also see how juniors have more leeway to weigh in on things they absolutely don't understand. Crazy ideas and constructive criticism are welcome from all corners, but at some level I also start expecting some more basic competence.