lopuhin
karma: 153
I work at https://www.zyte.com/ on https://www.zyte.com/automatic-extraction/
GitHub: https://github.com/lopuhin/ Kaggle: https://www.kaggle.com/lopuhin/
- Context window size of 400k is not new: gpt-5, 5.1, 5-mini, etc. have the same. But they do claim they improved long-context performance, which, if true, would be great.
- you can rent them for less than $2/h in a lot of places (maybe not in the drawer)
- I find OpenAI's new flex processing more attractive: it has the same 50% discount but lets you use the same API as regular chat mode, so you can still do things where the Batch API won't work (e.g. evaluating agents). In practice I found it to work well enough when paired with client-side request caching (sketch below): https://platform.openai.com/docs/guides/flex-processing?api-...
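A minimal sketch of that pairing, assuming the openai Python SDK, which accepts a `service_tier="flex"` parameter on chat completions; the cache layout, hashing scheme, and model name are illustrative, not from the docs:

```python
import hashlib
import json
from pathlib import Path

from openai import OpenAI

client = OpenAI()
CACHE_DIR = Path(".openai_cache")  # hypothetical on-disk cache location
CACHE_DIR.mkdir(exist_ok=True)

def cached_completion(**kwargs):
    # Key the cache on the full request, so any parameter change is a miss.
    key = hashlib.sha256(
        json.dumps(kwargs, sort_keys=True).encode()
    ).hexdigest()
    path = CACHE_DIR / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text())
    # Flex requests can queue longer than regular ones, hence the generous
    # timeout; retry-on-timeout logic is left out for brevity.
    resp = client.chat.completions.create(
        service_tier="flex", timeout=900.0, **kwargs
    )
    path.write_text(resp.model_dump_json())
    return json.loads(path.read_text())

result = cached_completion(
    model="gpt-5-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Say hello"}],
)
```

The point of the cache is that re-running an evaluation only pays (and waits) for requests that actually changed, which softens the latency cost of flex processing.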
- it's pretty difficult to package native Python dependencies for wasmtime or other WASI runtimes, e.g. lxml
- Crazy amount of breakage...
Here is a PR which reverts this: https://github.com/pypa/setuptools/pull/4911
Interesting that the setuptools maintainers only postponed the deprecation date by a year, so we can probably expect more issues like this in the future.
- Congrats on the launch! How much does it cost? And what is the sandboxing technology?
- I find it strange that the author is really happy with the quality of the string comparison here: https://pgaleone.eu/ai/coding/2025/01/26/using-ai-for-coding... While it would kind of work, it's a very weird piece of code from an ML standpoint. It trains a TF-IDF vectorizer on just the two strings being compared, which at best changes nothing (unless the same word is repeated within one product name); for better quality you'd want to fit it on some corpus, or not bother with TF-IDF at all. It also compares the two strings as bags of words, which is not the end of the world but maybe not what the author wants here, and if it is, this isn't the easiest way to do it. So it takes techniques that can be useful when comparing texts (TF-IDF and cosine similarity) but applies them in a way that doesn't let them show their strengths (sketch below).
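To make the criticism concrete, here's a minimal sketch of the pattern in question, assuming scikit-learn; the product names are made up. With only a two-document "corpus" to fit on, the IDF weights carry almost no information, so this is roughly plain bag-of-words cosine similarity:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical product names standing in for the strings being compared.
a = "apple iphone 13 pro 128gb black"
b = "iphone 13 pro max 256gb apple"

# The questionable part: the vectorizer is fitted on just the two strings
# being compared, so IDF is computed from a two-document "corpus".
vectors = TfidfVectorizer().fit([a, b]).transform([a, b])
print(cosine_similarity(vectors[0], vectors[1])[0, 0])
```

To let IDF do anything useful, you'd fit the vectorizer on a larger product corpus so that rare, discriminative tokens get up-weighted; and if plain word overlap is all you want, a set-based similarity over tokens is simpler and more transparent.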
- It's a 600B+ mixture-of-experts model, and yes, it's described in the paper, GitHub, etc.
- Why is this doubtful? Did you spot anything suspicious in their paper? They make the weights and a lot of training details open as well, which leaves much less room for making things up: e.g. you could sanity-check the training compute requirements from the active weight size (which they can't fake, since they released the weights) and the fp8 training they used (back-of-envelope sketch below).
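For illustration, here's a back-of-envelope version of that check, using the common rule of thumb that training FLOPs ≈ 6 × active parameters × tokens. The parameter and token counts are the ones reported for DeepSeek-V3 (the base of R1); the sustained per-GPU throughput is a rough assumption on my part, not a reported figure:

```python
# Numbers reported in the DeepSeek-V3 technical report:
active_params = 37e9   # activated parameters per token
tokens = 14.8e12       # training tokens
train_flops = 6 * active_params * tokens  # ~3.3e24 FLOPs

# Assumed sustained throughput per H800 for their mixed fp8/bf16 training;
# this is a guess, not a reported number.
sustained_flops = 3.5e14  # ~350 TFLOP/s
gpu_hours = train_flops / sustained_flops / 3600
print(f"{train_flops:.2e} FLOPs -> ~{gpu_hours / 1e6:.1f}M GPU-hours")
```

This lands at roughly 2.6M GPU-hours, in the same ballpark as the ~2.79M H800 GPU-hours they report, which is exactly the kind of consistency check that's hard to fake once the weights are public.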
- With the distilled models being released, it's very likely they'll soon be served by other providers at good price and performance, unlike the full R1, which is very big and much harder to serve efficiently.
- I don't think so: what they show in the CS video is exactly the Dust2 map, not just something similar to or inspired by it.
- I think GraalPython does have a GIL, see https://github.com/oracle/graalpython/blob/master/docs/contr... - and if by "there is no such thing on those platforms" you mean the JVM/CLR not having a GIL: C does not have a GIL either, yet CPython does.
- Curious which model was used? Sorry if I missed it. That seems like an important detail to mention when doing an evaluation.
- Also, I don't think you can use NIM packages in production without a subscription, and I wasn't able to find the cost without signing up. The NIM package for Mistral Nemo isn't available yet anyway.
- The README says they plan to add llama.cpp support, which should cover a lot of targets, and I think they already have tinygrad integrated.
- Not quite the same: OpenAI was initially quite open, while Ilya is currently very explicitly against opening up or open-sourcing research, e.g. see https://www.theverge.com/2023/3/15/23640180/openai-gpt-4-lau...
- yes, looks like a bug in the example to me, feel free to report it to https://github.com/fchollet/ARC-AGI/issues :)
- Really nice to see no-GIL Python getting closer and closer to reality.
- I think they mean "chemical composition", i.e. how much hydrogen / helium / other elements there are in the star, deduced from its spectrum.