Comment by fakedang - Hacker Neue

Aight you win fam, I was trippin fr. You're absolutely bussin, no cap. Harvard should be taking notes.

(^^ alien language that was developed in less than a decade)

The existence of common slang which isn't used in the sort of formal writing that grammar linting tools are typically designed to promote is more of a weakness of learning grammar by a weighted model of the internet vs formal grammatical rules than a strength.

Not an insurmountable problem, ChatGPT will use "aight fam" only in context-sensitive ways and will remove it if you ask to rephrase to sound more like a professor, but RLHFing slang into predictable use is likely a bigger potential challenge than simply ensuring the word list of an open source program is sufficiently up to date to include slang whose etymology dates back to the noughties or nineties, if phrasing things in that particular vernacular is even a target for your grammar linting tool...

chrisweekly 3 days ago

Huh, this is the first time I've seen "noughties" used to describe the first decade of the 2000s. Slightly amusing that it's surely pronounced like "naughties". I wonder if it'll catch on and spread.

nailer 3 days ago

‘Noughties’ was popular in Australia from 2010 onwards. Radio stations would “play the best from the eighties nineties noughties and today”.

notahacker 3 days ago

Common in Britain too, also appears in the opening lines of the Wikipedia description for the decade and the OED.

harvey9 3 days ago

The fact that you never saw it before suggests it did not catch on and spread during the last 25 years.

dmoy 3 days ago

Pedantically,

aight, trippin, fr (at least the spoken version), and fam were all very common in the 1990s (which was the last decade I was able to speak like that without getting jeered at by peers).

afeuerstein 3 days ago

I don't think anyone has the need to check such a message for grammar or spelling mistakes. Even then, I would not rely on an LLM to accurately track this "evolution of language".

fakedang OP 3 days ago

What if you're writing emails to GenZers?

dpassens 3 days ago

As a zoomer, I'd rather not receive emails that sound like they're written by a moron.

bombcar 3 days ago

Attempting to write like a GenZ when you’re not gets you “hello fellow kids” and “Boomer” right away.

phoe-krk 3 days ago

Yes, precisely. This "less than a decade" is orders of magnitude above the hours or days that it would take to manually add those words and idioms to proper dictionaries and/or write new grammar rules to accommodate aspects like skipping "g" in continuous verbs to get "bussin" or "bussin'" instead of "bussing". Thank you for illustrating my point.

Also, it takes at most a few developers to write those rules into a grammar checking system, compared to millions and more that need to learn a given piece of "evolved" language as it becomes impossible to avoid learning it. It's not only fast enough to do this manually, it's also much less work-intensive and more scalable.
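To make the "g"-skipping rule concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions, not how Harper or any real checker implements it: the word lists and the `is_accepted` name are made up for the example.

```python
# Hypothetical mini word list; a real checker would load a full dictionary.
KNOWN_WORDS = {"bussing", "tripping", "running", "nothing"}
# Hypothetical slang entries that someone added by hand.
SLANG_WORDS = {"aight", "fam", "fr"}


def is_accepted(word: str) -> bool:
    """Return True if the word is listed or is a g-dropped form of a listed -ing word."""
    w = word.lower()
    if w in KNOWN_WORDS or w in SLANG_WORDS:
        return True
    # Rule: accept "bussin" or "bussin'" when "bussing" is a known word.
    stem = w.rstrip("'")
    if stem.endswith("in") and (stem + "g") in KNOWN_WORDS:
        return True
    return False
```

With these toy lists, `is_accepted("bussin'")` and `is_accepted("trippin")` pass because "bussing" and "tripping" are listed, while a made-up word like "blorpin" is still flagged.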

fakedang OP 3 days ago

Not exactly. It takes time for those words to become mainstream for a generation. While you'd have to manually add those words in dictionaries, LLMs can learn these words on the fly, based on frequency of usage.

phoe-krk 3 days ago

At this point we're already using different definitions of grammar and vocabulary - are they discrete (as in a rule system, vide Harper) or continuous (as in a probability, vide LLMs)? LLMs, like humans, can learn them on the fly, and, like humans, they'll have problems and disagreements judging whether something should be highlighted as an error or not.

Or, in other words: if you "just" want a utility that can learn speech on the fly, you don't need a rigid grammar checker, just a good enough approximator. If you want to check if a document contains errors, you need to define what an error is, and then if you want to define it in a strict manner, at that point you need a rule engine of some sort instead of something probabilistic.
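The discrete vs continuous distinction can be sketched like this. It is a toy illustration only: a plain frequency count stands in for a probabilistic model (it is not how an LLM works), and `rule_check`, `frequency_score` and the sample data are all hypothetical.

```python
from collections import Counter


def rule_check(word: str, dictionary: set) -> bool:
    """Discrete view: a word is either in the rule set or it is an error."""
    return word in dictionary


def frequency_score(word: str, corpus_counts: Counter) -> float:
    """Continuous view: relative frequency of the word in an observed corpus."""
    total = sum(corpus_counts.values())
    return corpus_counts[word] / total if total else 0.0


# Hypothetical data for the example.
dictionary = {"you", "win"}
corpus_counts = Counter("aight fam you win fam".split())

strict_verdict = rule_check("fam", dictionary)      # False: not in the word list
soft_score = frequency_score("fam", corpus_counts)  # 2 of 5 tokens
```

The rule check gives a yes/no answer that only changes when someone edits the word list. The score changes as usage changes, but someone still has to pick a cutoff before it can say whether "fam" is an error.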

efitz 3 days ago

I’m glad we have people at HN who could have eliminated decades of effort by tens of thousands of people, had they only been consulted first on the problem.

phoe-krk 3 days ago

Which effort? Learning a language is something that can't be eliminated. Everyone needs to do it on their own. Writing grammar checking software, though, can be done a few times and then copied.
