- This is what I got from Tao's post as well.
- >Single-digit FPS can _absolutely_ be playable if you're a desperate enough ten-year-old...
And somehow, games were more mesmerizing then than they feel to play now. To be a kid again.
- I know, I am saying they should be links, as that is what one would expect from an article like this.
- It may be just my system, but the times look like hyperlinks but aren't for some reason. It is especially disappointing that the commit hashes don't link to the actual commit in the kernel repo.
- Exactly, imagine if someone gave you a 100k LOC project and said improve anything you can.
- Unless you are going to be more specific, that criticism applies to all benchmarks that are connected to a positive gain, not just AI coding benchmarks.
- >the code will continue to be boring.
Why would you not want your code to be boring?
- >establish benchmarks that make sense and are reliable
How aren't current LLM coding benchmarks reliable?
- Are you actually asking someone for a source for their opinion?
- People look at laws like Chat Control and ask, "How could anyone have thought that it was a good idea?" But then you see comments like this, and you can actually see how such viewpoints can blossom in the wild. It's baffling to see in real time.
- That's ironic, as most evidence-based medicine says the complete opposite. There is a clear connection between violence and exogenous testosterone use.
- Yes, different strokes I guess, but I can't imagine just using flashcards for random information. And you'd think someone who wrote (maybe vibe coded?) a flashcard program would be able to calculate that by hand.
- That is a normal, run of the mill sentence.
- LLMs can detect sarcasm easily; they wouldn't be tricked by something like this.
- >Gen 0: expertsexchange.com
No way.
- Does all of that stuff have to be done by developers though, or project managers?
- > On the down side they also help the extroverts and the confident, and you have to be careful about preventing a bias towards those.
This is true, but it is also why it is important to have an actual expert proctor the exam. Confidence is good and should count in your favor, but if you are confident about a point the examiner knows is completely incorrect, you may dig yourself into an inescapable hole, as it becomes very difficult for the examiner to ascertain that you actually know the other parts you were confident (much less unconfident) about.
- I agree with you and the other posters actually, but I think the efficiency of typed work compared with handwriting is the reason it's seeing such slow adoption. Another thing to remember is that there is always a mild Jevons paradox at play; while it's true that it was possible in previous centuries, teacher expectations have also increased, which strains the time teachers would have for grading handwritten work.
- Yes, I hate oral exams, but they are definitely better at getting a whole picture of a person's understanding of topics. A lot of specialty boards in medicine do this. To me, the two issues are that it requires an experienced, knowledgeable, and empathetic examiner, who is able to probe the examinee about areas they seem to be struggling in, and paradoxically, its strength is in the fact that it is subjective. The examiner may have set questions, but how the examinee answers the questions and the follow-up questions are what differentiate it from a written exam. If the examiner is just the equivalent of a customer service representative and is strictly following a tree of questions, it loses its value.
- The issue is that it is not scalable, unless there is some dependable, automated way to convert handwriting to text.
- I agree. It is definitely readable, but formatting decisions were clearly made to keep the line count down at the expense of the flow of the code. It isn't even just the variable names; things like putting a function with a single return statement on one line are legal, but I cannot imagine why someone would do that outside of a one-off script.
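For illustration, a minimal sketch of the style being described (the function names here are invented for the example, not taken from the project under discussion):

```python
# Legal but cramped: the whole function body collapsed onto the def line
# to save a line of vertical space.
def is_adult(age): return age >= 18

# The conventional form costs one extra line but reads more naturally:
def is_adult_clear(age):
    return age >= 18
```

Both definitions behave identically; the objection is purely about readability, and style guides like PEP 8 discourage the compound single-line form for exactly that reason.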
- Since you said you read the post, explain this to me:
>To me, this project perfectly encapsulates the uselessness of AI, small projects like this are good learning or relearning experience and by outsourcing your thinking to AI you deprive yourself of any learning, ownership, or the self fulfillment that comes with it. Unless, of course, you think engaging in "tedious" activities with things you enjoy have zero value, and if getting lost in the weeds isn't the whole point.
In the context of his first paragraph
>I own more books than I can read. Not in a charming, aspirational way, but in the practical sense that at some point I stopped knowing what I owned. Somewhere around 500 books, memory stopped being a reliable catalog.[...] For years, I told myself I would fix this. Nothing elaborate, nothing worthy of a startup idea. A spreadsheet would have been enough. I never did it, not because it was hard, but because it was tedious.
Wouldn't your statement be completely moot because he plainly said the purpose of the project was to create a system to handle his books, and the only reason he hadn't done it yet was because it was tedious? (Hint: if you need the paragraph or the rest of his article broken down for you further, I suggest asking ChatGPT to give you a summary).
- That is exactly right!
- I don't think you fully understood the purpose of the project. He wanted an end product (the bookshelf app) that he had been putting off due to the time commitment. He did not say he wanted to learn about how to program in general, nor did he even say he liked programming. People care about results and the end product. If you like to program as a hobby, LLMs in no way stop you from doing this. At the end of the day, people with your viewpoint fall short of convincing people against using AI because you are being extremely condescending to the desires of regular people. Also, it is quite ironic that you attempted to make a point about him not reading all 500 books on his bookshelf, yet you don't seem to have read (or understood) the opening section of the post.
- That is correct for the collective as a whole, but in his instance, if this wasn't connected to a public GitHub, it would have been substantially more difficult to prove he used an LLM.
- One can do that with an LLM as well. Honestly, I almost always just save the command if I think I am going to use it later. Also, I can just look back at the chat history.
- I get using LLMs to help proofread or copywrite a blog post, but what's the point of just regurgitating a ChatGPT answer in a blog post and posting it here? If it's really that interesting, couldn't you just post the prompt so that we can see it for ourselves?
- I rescind my previous statement. Also, people have to stop putting everything on GitHub.
- To me, unless it is egregious, I would be very sensitive to avoid false positives before saying something is LLM aided. If it is clearly just slop, then okay, but I definitely think there is going to be a point where people claim well-written, straightforward posts as LLM aided. (Or even the opposite, which already happens, where people purposely put errors in prose to seem genuine).
- I think it's because it comes off as if you are saying we should move off of GenAI, and a lot of people say LLM when they mean GenAI.