I'm actually ehsanul (http://news.ycombinator.com/user?id=ehsanul) here on HN, but I switched Google accounts.
- But I doubt you can opt in to them training on that data coming in via OpenCode.
- The DB specifically, or the concept of event sourcing? Event sourcing is not a new approach and has a lot of similarities with Temporal's approach, though Temporal events are not necessarily business events, and deterministic event replay is required with Temporal. In the general case of event sourcing, arbitrary processing might be done on the event stream to produce some final state or do whatever needs to happen for your use case. As long as you're persisting the events and using them as the basis for your business logic and state, you're doing event sourcing (see the sketch after this comment).
I don't know anything about this specific DB though, if that's what you were wondering about; that's more of an implementation-level detail. The Temporal server just uses regular MySQL and supports multiple storage backends.
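To make the general pattern concrete, here's a minimal TypeScript sketch of event sourcing, not tied to any particular DB or to Temporal; the event and state types are made up for illustration:

    // Minimal event-sourcing sketch: state is derived by folding over persisted events.
    type AccountEvent =
      | { type: 'Opened'; owner: string }
      | { type: 'Deposited'; amount: number }
      | { type: 'Withdrawn'; amount: number };

    interface AccountState {
      owner: string;
      balance: number;
    }

    // Pure reducer: replaying the same events always yields the same state.
    function apply(state: AccountState, event: AccountEvent): AccountState {
      switch (event.type) {
        case 'Opened':
          return { owner: event.owner, balance: 0 };
        case 'Deposited':
          return { ...state, balance: state.balance + event.amount };
        case 'Withdrawn':
          return { ...state, balance: state.balance - event.amount };
      }
    }

    // The event log is the source of truth; current state is just a projection.
    const events: AccountEvent[] = [
      { type: 'Opened', owner: 'alice' },
      { type: 'Deposited', amount: 100 },
      { type: 'Withdrawn', amount: 30 },
    ];

    const state = events.reduce(apply, { owner: '', balance: 0 });
    console.log(state); // { owner: 'alice', balance: 70 }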
- Using a Research->Plan->Implement flow is orthogonal, though I notice parts of those do exist as skills too. But you sometimes need to do other things as well, e.g. debugging in the course of implementing, or specific techniques to improve brainstorming/researching.
Some of these skills are probably better as programmed workflows that the LLM is forced to go through, rather than English guiding the LLM and trusting it to follow the prescribed set of steps; that's what I've found improves reliability/consistency in my own agents. Some mix of LLMs (choosing skills, executing the fuzzy parts of them) and plain code (orchestrating the skills) seems like the best bet to me and is what I'm pursuing (rough sketch below).
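As a toy illustration of that split, a hypothetical TypeScript sketch; callLLM, the skill names, and the prompts are all made up, and a real agent would need much more structure:

    // Plain code owns the control flow; the LLM only handles the fuzzy steps
    // (choosing a skill, doing the open-ended work). `callLLM` is a stand-in
    // for whatever model client is actually in use.
    async function callLLM(prompt: string): Promise<string> {
      throw new Error('wire up a real model client here');
    }

    type Skill = { name: string; run: (task: string) => Promise<string> };

    const skills: Skill[] = [
      { name: 'research', run: (t) => callLLM(`Research this and list findings:\n${t}`) },
      { name: 'debug', run: (t) => callLLM(`Find the likely root cause:\n${t}`) },
    ];

    async function runTask(task: string): Promise<string> {
      // Fuzzy step: the model picks a skill from a fixed menu.
      const choice = await callLLM(
        `Task: ${task}\nPick exactly one skill from: ${skills.map((s) => s.name).join(', ')}`
      );
      const skill = skills.find((s) => choice.includes(s.name)) ?? skills[0];

      // Deterministic step: code, not prose, enforces the order run -> check -> retry.
      const result = await skill.run(task);
      const check = await callLLM(`Does this result address the task? Answer yes or no.\n${result}`);
      return check.trim().toLowerCase().startsWith('yes') ? result : skill.run(`${task}\n(retry)`);
    }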
- I see no conflict between AGPL and SaaS: https://opensource.stackexchange.com/a/12988
- It's hard to attribute PR merge rate to higher tool quality here. Another likely factor is the complexity of the task. Just looking at the first PR I saw from the GitHub search for Codex PRs, it was a one-line change that any tool, even years ago, could have easily accomplished: https://github.com/maruyamamasaya/yasukaribike/pull/20/files
- Where I work, our legal department requires that LLMs be used only through our own contractual relationships with model providers. Given that, BYOK is table stakes for me at least.
LiteLLM is what we use internally: it lets us support any LLM backend from any open source tool, and create virtual keys for each developer to monitor and manage usage limits, etc. (Roughly what that looks like from a tool's side is sketched below.)
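Assuming a LiteLLM proxy at an internal URL and a per-developer virtual key (the base URL and model alias below are placeholders), any OpenAI-compatible tool can just be pointed at the proxy; a minimal TypeScript example:

    import OpenAI from 'openai';

    // Point the standard OpenAI client at the internal LiteLLM proxy instead of a
    // vendor endpoint. Base URL and model alias are placeholders.
    const client = new OpenAI({
      baseURL: 'https://litellm.internal.example.com/v1', // LiteLLM proxy
      apiKey: process.env.LITELLM_VIRTUAL_KEY,            // per-developer virtual key
    });

    const response = await client.chat.completions.create({
      model: 'claude-sonnet', // whatever alias the proxy maps to a contracted provider
      messages: [{ role: 'user', content: 'Summarize this PR.' }],
    });

    console.log(response.choices[0].message.content);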
- There seem to be a couple of field-specific journals of negative results for similar purposes. It seems like there should be value in citing negative results to inform current research. Perhaps if there were more journals dedicated to this, or a single one not limited to specific fields, there would still be some incentive to publish there, provided the effort required was low enough (another area where AI might be applied: writing it up).
- Essentially, you don't need to think about time and space. You just write more or less normal-looking code using the Temporal SDK (roughly sketched after this comment). Except it can actually resume from arbitrarily long pauses, waiting as long as it needs to for some signal, without any special effort beyond using the SDK. You also automatically get great observability into all running workflows, seeing inputs and outputs at each step, etc.
The cost is that you have to be careful to make new versions of a workflow backwards compatible; the backcompat requirements are hard to understand and easy to mess up. There's also additional infra you need to run: the Temporal server. Temporal Cloud isn't cheap at scale, but it does reduce that burden.
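For a sense of what that "normal-looking code" is like, here's a rough sketch using Temporal's TypeScript SDK; the activities module, activity names, and signal are made up, and the exact API details should be checked against the docs rather than taken from this:

    import { proxyActivities, defineSignal, setHandler, condition } from '@temporalio/workflow';
    import type * as activities from './activities'; // hypothetical activities module

    const { sendReminderEmail, finalizeOrder } = proxyActivities<typeof activities>({
      startToCloseTimeout: '1 minute',
    });

    export const approveSignal = defineSignal('approve');

    // Reads like ordinary async code, but every await below can survive process
    // restarts and arbitrarily long waits, because state is rebuilt by replaying
    // the workflow's event history.
    export async function approvalWorkflow(orderId: string): Promise<string> {
      let approved = false;
      setHandler(approveSignal, () => { approved = true; });

      // Wait indefinitely for a human to signal approval, nudging every 7 days.
      while (!approved) {
        const signaled = await condition(() => approved, '7 days');
        if (!signaled) await sendReminderEmail(orderId);
      }

      await finalizeOrder(orderId);
      return `order ${orderId} approved`;
    }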
- That was my initial position too, but I think there is a search-efficiency story here as well. CoT comes in many flavors and improves when tailored to the problem domain. If the LLM can instead figure out the right problem-solving strategy for a given problem, this may improve performance per unit of compute versus discovering it at inference time.
Tailoring prompts is likely still the best way to maximize performance when you can, but in broader domains you'd work around this through strategies like asking the LLM to combine predefined reasoning modules, creating multiple reasoning chains and merging/comparing them, explicit MCTS, etc. (A toy sketch of the merged-chains idea is below.) I think those strategies will still be useful for a good while, but pieces of that search process, especially directing the search more efficiently, will move into the LLMs over time as they get trained on this kind of data.
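A toy TypeScript sketch of the "multiple chains, then merge/compare" strategy; callLLM is a placeholder for any chat-completion call, and sampling details (temperature etc.) are elided:

    // Sample several independent reasoning chains, then have the model merge them.
    // `callLLM` is a placeholder for whatever completion API is in use.
    async function callLLM(prompt: string): Promise<string> {
      throw new Error('plug in a real model client');
    }

    async function solveWithMergedChains(problem: string, n = 3): Promise<string> {
      // Independent attempts at the same problem (ideally sampled with temperature > 0).
      const chains = await Promise.all(
        Array.from({ length: n }, () =>
          callLLM(`Think step by step and answer:\n${problem}`)
        )
      );

      // Merge step: compare the chains, resolve disagreements, produce one answer.
      return callLLM(
        `Here are ${n} independent attempts at the same problem:\n\n` +
          chains.map((c, i) => `Attempt ${i + 1}:\n${c}`).join('\n\n') +
          `\n\nCompare them, resolve any disagreements, and give one final answer.`
      );
    }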
- I've only read the abstract, but I also find this strange. I wonder if this is just tapping into the computational chains that are already available when tokens are further away, due to the positional encodings being trained that way. If so, that makes the reasoning/modeling powers of LLMs even more impressive and inscrutable.
- I've used usearch successfully for a small project: https://github.com/unum-cloud/usearch/