ImageXav · 212 karma

  1. Agreed. One-at-a-time (OAT) testing has been outdated for almost a century at this point. Factorial and fractional factorial experiments have been around for about that long and give detailed insights into the effects of not just single changes but also the interactions between changes, which means you can supercharge your learning, as many variables in DL do in fact interact (a quick sketch at the end of this comment shows the idea).

    Or, more modern Bayesian methods if you're more interested in getting the best results for a given hyperparameter sweep.

    However, that is not to detract from the excellent effort made here and the great science being investigated. Write-ups like this offer so much gold to the community.
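
    To make the contrast with OAT concrete, here is a minimal sketch (in Python, with illustrative factor names) of how a half-fraction 2^(3-1) design is built using the defining relation C = AB:

    ```
    from itertools import product

    # Half-fraction 2^(3-1) design: run factors A and B at every
    # combination of low (-1) and high (+1), and set the third
    # factor via the defining relation C = A*B. Three factors are
    # covered in four runs instead of the full factorial's eight.
    runs = [(a, b, a * b) for a, b in product((-1, +1), repeat=2)]
    for a, b, c in runs:
        print(f"A={a:+d}  B={b:+d}  C={c:+d}")
    ```

    The trade-off is aliasing: here C is confounded with the AB interaction, so you choose the fraction according to which interactions you believe are negligible; a full factorial resolves all of them.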

  2. I guess that's the crux of it. From an individual perspective it makes sense to stay in a stable environment, especially if a family is involved. However, I think from a societal perspective it is desirable to have people who gamble on creating new products which can raise the bar in their given industries.

    Also, just because the startup fails doesn't mean it was a waste of time. If you manage to provide employment for even just 3 or 4 people for a few years, and help them and yourself develop, that is a valuable success.

  3. I would add an aspect that is not covered here and is often overlooked: strong labour protection laws result in a mentality where, if you get a good job, you are much less likely to want to take risks, e.g. start your own business. There was a post on the HENRY (high earner, not rich yet) UK subreddit the other day from someone who had a wealth of experience and had the opportunity to join a startup as CTO. It honestly sounded like a great chance to initiate change. All of the comments were telling the poster that they had it good, that 99% of startups fail, that the hours would be gruelling. I feel as though the conversation would have been quite different in a US subreddit.

    A term they like to use is 'crabs in a bucket'.

  4. This is an interesting point. I've been trying to think about something similar recently but don't have much of an idea how to proceed. I'm gathering periodic time series data and am wondering how to factor the frequency of my sampling into the statistical tests. I'm not sure how to assess the difference between 50Hz and 100Hz on the outcome, given that my periods are significantly longer. Would you have an idea of how to proceed? The person I'm working with currently just bins everything into hour-long buckets and uses the mean for comparison between time series, but this seems flawed to me.
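
    One way to sanity-check the binning question on toy data is sketched below (minimal and illustrative; the 0.5Hz sinusoid and the rates are stand-ins for your actual signal):

    ```
    import numpy as np
    from scipy import signal

    fs = 100.0                       # original sample rate, Hz
    t = np.arange(0, 10, 1 / fs)     # 10 s of toy data
    x = np.sin(2 * np.pi * 0.5 * t)  # slow periodic signal

    # Proper downsampling 100 Hz -> 50 Hz: decimate() low-pass
    # filters first, then keeps every 2nd sample, so nothing above
    # the new Nyquist frequency (25 Hz) aliases into the band.
    x_50 = signal.decimate(x, 2)

    # Naive mean-binning (per second here, per hour in your case):
    # this averages away all within-bin structure, not just noise.
    x_binned = x.reshape(-1, int(fs)).mean(axis=1)
    ```

    Whether 50Hz vs 100Hz changes your test outcomes then largely reduces to whether anything - signal or noise - lives above the lower rate's Nyquist frequency.
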
  5. I've had the complete opposite experience, and feel the complete opposite way. What is there to learn from failing a LeetCode problem? It feels like the luck of the draw - I didn't study that specific problem type and so I failed. Also, there is an up-front cost of several months to cover and study a wide array of LeetCode problems.

    With a take-home I can demonstrate how I would perform at work. I can sit on it, think things over in my head, come up with an attack plan and execute it. I can demonstrate how I think about problems, and my own value, more clearly. Using a take-home as a test indicates to me that a company cares a bit more about its hiring pipeline and is being careful not to put candidates under arbitrary pressure.

  6. Yes, especially as models are known to have a preference for the outputs of models in the same family. I suspect this leaderboard would change dramatically with different models as the judge.
  7. Thanks for sharing that. Interesting that the leaderboard is dominated by Anthropic, Google and DeepSeek. OpenAI doesn't even register.
  8. How did you achieve that? I was looking into it and $0.006/min is quoted everywhere.
  9. I feel as though it also reflects the fact that contributors are less invested in the project. There was a small study done a few years back hypothesising that the number of swear words correlated somewhat with code quality [0], due to the emotional involvement of the codebase authors. I can imagine this to be somewhat true. I would love to see this study redone on pre-ChatGPT repos now that LLMs are widespread (as I suspect that repos created using LLMs are going to be very sanitised).

    [0] https://cme.h-its.org/exelixis/pubs/JanThesis.pdf

  10. Even better, Python has named tuples [0]. So if you have a tuple that you are sure will always have the same fields, you can declare it:

    ```
    from collections import namedtuple

    Point = namedtuple('Point', 'x y')
    pt1 = Point(1.0, 5.0)
    ```

    And then access the x or y coordinate either by index (pt1[0], pt1[1]) or by field name (pt1.x, pt1.y).

    This can be a really handy way to help people understand your code, as what you are accessing becomes a lot more explicit.

    [0] https://stackoverflow.com/questions/2970608/what-are-named-t...

  11. Not necessarily. Interpretability of a system used to make decisions is more important in some contexts than in others. For example, a black-box AI used to make judiciary decisions would completely remove transparency from a system that requires careful oversight. It seems to me that the intent of the legislation is to prevent such cases from popping up, so that people can contest decisions that would have a material impact on them, and so that organisations can provide traceable reasoning.
  12. Something that stuck out to me in the updated blog [0] is that Demon Adam performed much better than even AdamW, with very interesting learning curves. I'm wondering now why it didn't become the standard. Anyone here have insights into this?

    [0] https://johnchenresearch.github.io/demon/
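
    As I remember it from the linked write-up, Demon decays the momentum parameter so that each gradient's total contribution to all future updates falls off linearly over training. A small sketch of the schedule (parameter names are mine):

    ```
    def demon_beta(t, T, beta_init=0.9):
        # Demon momentum schedule: beta decays from beta_init at t=0
        # to 0 at t=T, so a gradient's cumulative influence on future
        # updates shrinks linearly as training progresses.
        frac = 1.0 - t / T
        return beta_init * frac / ((1.0 - beta_init) + beta_init * frac)

    T = 10_000
    print([round(demon_beta(t, T), 3) for t in range(0, T + 1, 2_500)])
    # [0.9, 0.871, 0.818, 0.692, 0.0]
    ```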

  13. I've found that YOLOv7 [0] tends to perform better across the board than anything Ultralytics has produced, without the horrendous licensing.

    [0] https://github.com/WongKinYiu/yolov7

  14. I think it is you who have misunderstood the Nyquist-Shannon theorem. Aliasing and noise are real concerns. Tim Wescott explains it very well [0] (Figures 3, 10 and 11). If your signal is below half the sample rate but the noise isn't, you'll lose information about the signal. If your signal's phase is shifted with respect to the sampling, you'll lose information. If your sampling period isn't representative, you'll lose information. These are not implementation details.

    [0] https://www.wescottdesign.com/articles/Sampling/sampling.pdf
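
    To make the aliasing point concrete with numbers: a 60Hz component sampled at 100Hz is indistinguishable from a 40Hz one, so anything above half the sample rate folds straight onto the signal band. A quick check in Python:

    ```
    import numpy as np

    fs = 100.0                    # sample rate, Hz; Nyquist = 50 Hz
    t = np.arange(0, 1, 1 / fs)

    tone_60 = np.sin(2 * np.pi * 60 * t)  # above Nyquist
    tone_40 = np.sin(2 * np.pi * 40 * t)  # its alias, in band

    # The 60 Hz tone folds down to |100 - 60| = 40 Hz: the sampled
    # sequences are identical up to sign, so no processing applied
    # after sampling can ever tell them apart.
    print(np.allclose(tone_60, -tone_40))  # True
    ```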

  15. Actually, I was misled by the video example. They do in fact keep the background information: they use a temporal encoding so that the information is propagated through. Very interesting and well thought out.
  16. As far as I can tell, though the core idea is the same - to focus on the differences - the implementation is different. The Differential Transformer 'calculates attention scores as the difference between two separate softmax attention maps', so it must still process the redundant areas. This approach removes them altogether, which would significantly reduce compute. Very neat idea.

    However, I do think that background information can sometimes be important. I reckon a mild improvement on this model would be to leave the background in the first frame, and perhaps every x frames, so that the model gets better context cues. This would also more accurately replicate video compression.
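
    For anyone comparing the two approaches, here is a rough single-head numpy sketch of differential attention as I understand it from the paper (shapes are simplified, and lambda is a learned scalar in the real model):

    ```
    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def diff_attention(x, Wq1, Wk1, Wq2, Wk2, Wv, lam=0.8):
        # Two full softmax attention maps are computed over the whole
        # sequence and subtracted, cancelling weight on redundant
        # context - but every position is still processed.
        d = Wk1.shape[1]
        a1 = softmax((x @ Wq1) @ (x @ Wk1).T / np.sqrt(d))
        a2 = softmax((x @ Wq2) @ (x @ Wk2).T / np.sqrt(d))
        return (a1 - lam * a2) @ (x @ Wv)

    rng = np.random.default_rng(0)
    x = rng.normal(size=(16, 32))  # 16 tokens, width 32
    W = [rng.normal(size=(32, 32)) * 0.1 for _ in range(5)]
    print(diff_attention(x, *W).shape)  # (16, 32)
    ```

    The subtraction changes the weights but not the amount of compute, which is exactly the contrast with dropping redundant patches outright.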

  17. I thought the same, but the description of the cat picture is pretty spot on. I wonder if this is a dataset issue. Cat pictures are far more prevalent than abstract art on the internet so might well be overrepresented. Can vision LLMs deal with a long tail of underrepresented objects when they are small, or can they only do so at scale?
  18. Here's another one that I feel is often overlooked by traditional A/B testers: if you have multiple changes, don't simply test them independently. Learn about fractional factorial experiments and interactions, and design your experiment accordingly. You'll get a much more relevant result (a minimal sketch follows at the end of this comment).

    My impression is that companies like to add/test a lot of features separately - and individually these features are good, but together they form complex clutter and end up being a net negative.
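
    Here is a minimal sketch of estimating the interaction with statsmodels (the 2x2 setup and the effect sizes are made up for illustration):

    ```
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 4_000

    # 2x2 full factorial: every user is assigned one of the four
    # combinations of features A and B.
    df = pd.DataFrame({"A": rng.integers(0, 2, n),
                       "B": rng.integers(0, 2, n)})
    # Toy outcome: each feature helps alone, but they clash together.
    df["y"] = (0.2 * df.A + 0.3 * df.B - 0.4 * df.A * df.B
               + rng.normal(0, 1, n))

    # 'A * B' expands to A + B + A:B; the A:B coefficient is the
    # interaction that two independent A/B tests would never see.
    print(smf.ols("y ~ A * B", data=df).fit().summary().tables[1])
    ```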

  19. I don't really see why a leap to indentured servitude is necessary here. There are obviously many other ways of implementing a training relationship - see the other comments regarding how it is done in Germany.

    The point I'm making is that society as a whole is worse off if companies are too risk-averse to hire and train juniors up to a certain level of quality. This results in too few people capable of doing a specific, presumably valuable, job well. The consequence is a society that underperforms its true potential in the long run. In monetary terms, since you raise the question, it means that less taxable value is generated. So society as a whole pays the price.

  20. The most interesting point in this article for me lies towards the end: "the role of the Guild was not to form rules, mores, regulations, and laws with respect to their crafts; their role was to introduce a system of art or craft to a new individual, to instill in them the idea of standards, quality, consistency, and perfection".

    A common complaint nowadays is that it is very difficult for juniors with no experience to get hired unless they have a degree from a prestigious university, and even then that's not often a guarantee. It seems that companies are more averse than guilds were to taking the risk of training someone up to industry standards.

    I believe that it would be very beneficial to society to create schemes that encourage learning within a similar system. Mentors can sometimes fulfil this role, but that relationship is far more informal.

  21. As a not-so-active user, I find this tool rather inaccurate. It seems to have focussed on the one question I asked about JPEG XL, which is the topic I know the least about.

    I suspect a bias towards more common topics might be occurring.

  22. A more niche but nonetheless interesting method that I was hoping to see discussed was magnetism. Gravitational waves are expected to decay into photons in intense magnetic fields. Or so I was told by one of my physics professors back in the day. I did understand the math somewhat back then, but it is beyond me now. It does however seem as though some people are still exploring this avenue [0].

    [0] https://indico.cern.ch/event/1074510/contributions/4519384/a....

  23. I think the best insult I ever read actually related to him, describing him as a bloviating buffoon, making for a very pleasant alliteration.
  24. Ahhh, I see, that makes sense. Thank you for clarifying, I appreciate it. I made the mistake of assuming that the paper in the documentation was the paper of interest. I will take the time to properly delve into it once the paper is released - do you have any idea when that might be? In the meantime, I look forward to testing the method on some toy examples I have.
  25. So, this looks really interesting and I look forward to delving into the methodology in order to understand the algorithm better. However, what I immediately noticed from the paper linked in the documentation was that LinearBoost has a worse F1 score on average than the mentioned classifiers. Where it shines is energy consumption. Would it be possible to edit the title to reflect this? It's a huge gain in energy efficiency for a relatively small F1 loss, so kudos for that, but I think people might be expecting something a bit different from the title.
  26. I think it's important to point out, for people who might be interested in this comment, that a few things here are wrong:

    1. Standard JPEG compression uses the Discrete Cosine Transform, not the Fourier Transform.

    2. It is easy to be dismissive of any technology by saying that it is 'just' X with Y, Z, etc. on top.

    3. Vision transformers allow for much longer-range context - the magic comes in part from the ability to relate between patches, as well as from the learned features, which JPEG does not do.
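
    On point 1, the distinction is easy to see in code: JPEG's 8x8 block transform is a type-II DCT, which is real-valued, e.g. with scipy:

    ```
    import numpy as np
    from scipy.fft import dctn

    rng = np.random.default_rng(0)
    block = rng.integers(0, 256, (8, 8)).astype(float) - 128  # one 8x8 block

    # Type-II DCT of the block: a real cosine basis gives real
    # coefficients, unlike a Fourier transform, whose output
    # would be complex (magnitude and phase).
    coeffs = dctn(block, norm="ortho")
    print(coeffs[0, 0])  # DC term = 8 * block mean
    ```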
