- The new large model uses the DeepseekV2 architecture. Zero mention of that anywhere on the page lol.
It's a good thing that open source models use the best arch available. K2 does the same but at least mentions "Kimi K2 was designed to further scale up Moonlight, which employs an architecture similar to DeepSeek-V3".
---
vllm/model_executor/models/mistral_large_3.py
```python
from vllm.model_executor.models.deepseek_v2 import DeepseekV3ForCausalLM


class MistralLarge3ForCausalLM(DeepseekV3ForCausalLM):
    ...  # rest of the class body elided; the point is it subclasses the DeepSeek-V3 implementation
```
"Science has always thrived on openness and shared discovery." btw
Okay, I'll stop being snarky now and try the 14B model at home. Vision is a nice additional capability on Large.
- https://saucenao.blogspot.com/2021/04/recent-events.html
Mildly related incident where a Canadian child-protection agency uploaded CSAM to a reverse image search engine and then reported the site for the temporarily stored images.
- > I don't like how closed the frontier US models are, and I hope the Chinese kick our asses.
For imagegen, agreed. But for textgen, Kimi K2 thinking is by far the best chat model at the moment from my experience so far. Not even "one of the best", the best.
It has frontier level capability and the model was made very tastefully: it's significantly less sycophantic and more willing to disagree in a productive, reasonable way rather than immediately shutting you out. It's also way more funny at shitposting.
I'll keep using Claude a lot for multimodality and artifacts, but much of my usage has shifted to K2. Claude's sycophancy in particular is tiresome. I don't use ChatGPT/Gemini because they hide the raw thinking tokens, which is really cringe.
- Groq does quantise. Look at this benchmark from Moonshot AI for K2, where they compare their official implementation against third-party providers.
https://github.com/MoonshotAI/K2-Vendor-Verifier
Groq is one of the lowest-rated providers on that table.
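For anyone curious what that comparison looks like mechanically, here's a rough sketch of the idea (this is not the repo's actual harness; the endpoints, model names and the weather tool below are placeholders): send the same tool-call prompt to each OpenAI-compatible endpoint and count how often the returned arguments even parse as valid JSON.
```python
# Rough sketch of a K2-Vendor-Verifier-style check. Endpoints, model names
# and the tool schema are placeholders, not the repo's actual configuration.
import json
from openai import OpenAI

PROVIDERS = {
    "official": ("https://api.moonshot.ai/v1", "kimi-k2"),            # placeholder
    "some-provider": ("https://example-provider.com/v1", "kimi-k2"),  # placeholder
}

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def valid_tool_call_rate(base_url: str, model: str, n: int = 20) -> float:
    """Fraction of runs where the model emits tool calls with parseable JSON args."""
    client = OpenAI(base_url=base_url, api_key="...")
    ok = 0
    for _ in range(n):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "What's the weather in Paris?"}],
            tools=TOOLS,
        )
        calls = resp.choices[0].message.tool_calls or []
        try:
            if calls:
                for call in calls:
                    json.loads(call.function.arguments)
                ok += 1
        except json.JSONDecodeError:
            pass
    return ok / n

for name, (url, model) in PROVIDERS.items():
    print(name, valid_tool_call_rate(url, model))
```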
- The French comic pirate scene has an interesting rule where they keep a ~6 month time lag on what they release. The scene is small enough that the rule generally works.
It's a really good trade-off. I would never have gotten into these comics without piracy but now if something catches my eye, I don't mind buying on release (and stripping the DRM for personal use).
Most of my downloading is closer to collecting/hoarding/cataloguing behaviour but if I fully read something I enjoy, I'll support the author in some way.
- Merge comments? https://www.hackerneue.com/item?id=44331150
I'm really getting bored of Anthropic's whole song and dance with 'alignment'. Krackers in the other thread explains it in better words.
- Agree completely. When I read the Gemma 3 paper (https://arxiv.org/html/2503.19786v1) and saw an entire section dedicated to measuring and reducing the memorization rate I was annoyed. How does this benefit end users at all?
I want the language model I'm using to have knowledge of cultural artifacts. Gemma 3 27B was useless at a question about grouping Berserk characters by potential Baldur's Gate 3 classes; Claude did fine. The methods used to reduce memorisation rate probably also degrade performance in other ways that don't show up on benchmarks.
- > Finally, we've introduced thinking summaries for Claude 4 models that use a smaller model to condense lengthy thought processes. This summarization is only needed about 5% of the time—most thought processes are short enough to display in full. Users requiring raw chains of thought for advanced prompt engineering can contact sales about our new Developer Mode to retain full access.
Extremely cringe behaviour. Raw CoTs are super useful for debugging errors in data extraction pipelines.
After Deepseek R1 I had hope that other companies would be more open about these things.
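To make the debugging point concrete, here's a minimal sketch assuming DeepSeek's OpenAI-compatible API, which (per their docs) returns the raw reasoning in a reasoning_content field next to the final answer; the extraction prompt is a made-up example.
```python
# Minimal sketch: log the raw chain of thought when an extraction fails,
# assuming an API that actually exposes it (field name per DeepSeek's docs).
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="...")

resp = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{
        "role": "user",
        "content": 'Extract {"invoice_total": number} from: "Total due: $1,204.50". Reply with JSON only.',
    }],
)

msg = resp.choices[0].message
try:
    data = json.loads(msg.content)
except json.JSONDecodeError:
    # This is exactly what a summarized CoT takes away: the model's own
    # account of how it arrived at the malformed answer.
    print("extraction failed, raw reasoning was:\n", msg.reasoning_content)
```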
- The linked blog is down. But agreed, I would especially like to see this particular thing fixed.
> Property ordering
> When you're working with JSON schemas in the Gemini API, the order of properties is important. By default, the API orders properties alphabetically and does not preserve the order in which the properties are defined (although the Google Gen AI SDKs may preserve this order). If you're providing examples to the model with a schema configured, and the property ordering of the examples is not consistent with the property ordering of the schema, the output could be rambling or unexpected.
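If anyone else is fighting this: the schema does accept an explicit ordering. A sketch with the google-genai SDK (model name and fields are illustrative; check the current SDK docs for the exact field spelling):
```python
# Sketch: pin the property order on a Gemini response schema so the JSON
# output matches your few-shot examples instead of being alphabetized.
# Model name and schema fields are illustrative.
from google import genai
from google.genai import types

client = genai.Client(api_key="...")

schema = types.Schema(
    type=types.Type.OBJECT,
    properties={
        "title": types.Schema(type=types.Type.STRING),
        "year": types.Schema(type=types.Type.INTEGER),
        "authors": types.Schema(
            type=types.Type.ARRAY,
            items=types.Schema(type=types.Type.STRING),
        ),
    },
    # Without this the API may emit keys alphabetically (authors, title, year),
    # which can clash with the ordering used in your prompt examples.
    property_ordering=["title", "year", "authors"],
)

resp = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Extract the paper metadata from the following abstract: ...",
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=schema,
    ),
)
print(resp.text)
```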
- Actually you can't do "system" roles at all with OpenAI models now.
You can use the "developer" role which is above the "user" role but below "platform" in the hierarchy.
https://cdn.openai.com/spec/model-spec-2024-05-08.html#follo...
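The swap itself is trivial; a minimal sketch with the Python SDK (model name illustrative):
```python
# Minimal sketch: send instructions as a "developer" message instead of
# "system". Model name is illustrative.
from openai import OpenAI

client = OpenAI(api_key="...")

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "developer", "content": "Answer only in JSON."},
        {"role": "user", "content": "List three prime numbers."},
    ],
)
print(resp.choices[0].message.content)
```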
I assume it has something to do with the underlying constraint grammar/token masks becoming too long/taking too long to compute. But as end users we have no way of figuring out what the actual limits are.
OpenAI has more generous limits on the schemas and clearer docs. https://platform.openai.com/docs/guides/structured-outputs#s....
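For comparison, the OpenAI version of the same structured-output setup, which is what I mean by clearer docs (schema and model name illustrative):
```python
# Sketch of an OpenAI structured-output call: strict mode constrains the
# reply to the declared JSON schema. Schema and model name are illustrative.
from openai import OpenAI

client = OpenAI(api_key="...")

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Extract the event: 'Standup at 9am Friday'."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "event",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "day": {"type": "string"},
                    "time": {"type": "string"},
                },
                "required": ["name", "day", "time"],
                "additionalProperties": False,
            },
        },
    },
)
print(resp.choices[0].message.content)
```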
You guys closed this issue for no reason: https://github.com/googleapis/python-genai/issues/660
Other than that, good work! I love how fast the Gemini models are. The current API is significantly less of a shitshow compared to last year with property ordering etc.