In fact, so far they consistently fail in exactly these scenarios, glossing over random important details whenever you double-check the results in depth.
You might have found models, prompts, or workflows that work for you, though. If so, I'm interested.