k8si
203 karma

  1. I'm not sure people outside of Greater Boston would care, but those of us who do live there probably find it exceedingly strange that this occurred in Brookline of all places.
  2. Well, currently we have a ton of Congresspeople who are primarily motivated by their "good financial sense" (for obvious reasons e.g. this study). So, I think we could do with a few more Congresspeople with less financial sense and more genuine motivation to improve the lives of their constituents.
  3. "Only rich kids should get to choose what they study in school, poor kids are too dumb to make their own choices"
  4. Maybe this is a nitpick but CoNLL NER is not a "challenging task". Even pre-LLM systems were getting >90 F1 on that as far back as 2016.

    Also, just in case people want to lit review further on this topic: they call their method "programmatic data curation" but I believe this approach is also called model distillation and/or student-teacher training.
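
    For the unfamiliar, a minimal sketch of the student-teacher idea (everything here is illustrative; in practice the teacher is an expensive model and the student is a small trainable one):

        # The "teacher" (e.g. an LLM prompted for NER) labels unlabeled
        # text; the cheaper "student" is then trained on those
        # pseudo-labels as if they were human annotations.
        def teacher_predict(token: str) -> str:
            # Stand-in for an expensive model; a real teacher would be
            # an LLM or a large fine-tuned tagger.
            return "PER" if token.istitle() else "O"

        unlabeled = ["Alice", "met", "Bob", "yesterday"]
        pseudo_labeled = [(tok, teacher_predict(tok)) for tok in unlabeled]
        print(pseudo_labeled)  # [('Alice', 'PER'), ('met', 'O'), ...]
        # train_student(pseudo_labeled)  # hypothetical training step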

  5. I believe many high-quality embedding models are still based on BERT, even recent ones, so I don't think it's entirely fair to characterize it as "deprecated".
  6. Please put concrete examples right at the top of the page you're publicizing!
  7. Is it actually more feasible now? Do LLMs actually make this problem easier to solve?

    Because I have a hard time believing they can actually extract time increments and higher-level tasks from log data without a ton of pre/post-processing. But then the problem is just as much work as it was 5 years ago when you might have been using plain old BERT.

  8. Why? Do they not know where the data in their own system, that they built, is being sent?
  9. I suggest going through the exercise of seeing whether this is true quantitatively. Get a business-relevant NER dataset together (not CoNLL, preferably something that your boss or customers would care about), run it against Mistral/etc, look at the P/R/F1 scores, and ask "does this solve the problem that I want to solve with NER". If the answer is 'yes', and you could do all those things without reading the book or other NLP educational sources, then yeah you're right, job's finished.
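
    For the "look at the P/R/F1 scores" step, entity-level scoring is a few lines once you have gold and predicted spans (a minimal sketch; the (start, end, type) span format is an assumption, use whatever your dataset provides):

        # Exact-match entity-level precision/recall/F1 over
        # (start, end, type) spans.
        def prf1(gold: set, pred: set) -> tuple:
            tp = len(gold & pred)  # spans that match exactly
            p = tp / len(pred) if pred else 0.0
            r = tp / len(gold) if gold else 0.0
            f = 2 * p * r / (p + r) if (p + r) else 0.0
            return p, r, f

        gold = {(0, 5, "PER"), (10, 17, "ORG")}
        pred = {(0, 5, "PER"), (20, 24, "LOC")}
        print(prf1(gold, pred))  # (0.5, 0.5, 0.5)
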
  10. Why do people pretend that alignment of AI is the important problem to solve, rather than alignment of the companies that run AI products with the wellbeing of humanity?
  11. Do something more ambitious, something bigger and more impactful than closing JIRA tickets.
  12. You are a teenager who needs oral contraceptives because you are sexually active. You don't want your parents to find out. Since you're a teenager, you have a few constraints:

    - you have no car, how do you get to your doctor's appointment without asking parents for a ride?

    - you are a minor, do you have any guarantee that your doctor won't tell your parents? You can't risk them finding out, they are very conservative

    - you may not have ever made a doctor appointment for yourself before, maybe don't have access to insurance information etc

    Planned Parenthood provides BCPs at a price you can afford with your teenager job (also guarantees privacy) but the closest one is hours away...

    What do you do?

  13. Certain brands (e.g. SkinnyPop) advertise their bags as "chemical-free" (SkinnyPop claims theirs is free of PFOAs). Can anyone help me understand/verify these kinds of claims?
  14. Does anyone know of a good piece of writing about what has made TSMC so successful, what makes their management so good, etc.? Seems like an operational exemplar I'd like to learn more about.
  15. What we really need is PraaS (Praat as a Service). Praat Cloud Edition. Etc.
  16. Communication rates are very similar across languages: https://www.science.org/doi/10.1126/sciadv.aaw2594

    See also (great read): https://pubmed.ncbi.nlm.nih.gov/31006626/

    wrt your Spanish example: grammatical gender adds information redundancy to make it easier to process spoken language (e.g. helps with reference resolution). This redundancy enables Spanish speakers to speak at a relatively fast rate without incurring perception errors. English has fewer words but a slower speech rate. It's an optimization problem.

    The speech rate issue isn't as obvious if you're only looking at text, but I'd argue/speculate that lossless speech as an evolutionary constraint on language has implications for learnability.

    tl;dr there is no communication tax; languages are basically equivalent wrt information rate, they just solved the optimization problem of compactness vs speech rate differently
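
    To make that tradeoff concrete (the numbers below are illustrative, not the paper's exact figures, but the Science Advances study linked above finds languages converging around ~39 bits/s):

        # Information rate = syllables/second * bits/syllable.
        # Dense-but-slow and sparse-but-fast strategies land in the
        # same place. Figures are illustrative.
        languages = {
            "English": (6.2, 6.3),  # slower speech, denser syllables
            "Spanish": (7.8, 5.0),  # faster speech, sparser syllables
        }
        for name, (syl_per_sec, bits_per_syl) in languages.items():
            print(name, round(syl_per_sec * bits_per_syl, 1), "bits/s")
        # Both come out near 39 bits/s despite different speech rates.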

  17. "word" isn't a useful concept in a lot of languages. Words are obvious in English because English is analytic: https://en.wikipedia.org/wiki/Analytic_language

    But there are tons of languages (not just CJK languages) that use either compounding or combos of root + prefix/suffix/infix to express what would be multiple words in English. E.g. German 'Schadenfreude'. It's actually way more useful to tokenize this as separate parts because e.g. 'Freude' might be part of a lot of other "words" as well. So you can share that token across a lot of words, thereby keeping the vocab compact.
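
    A toy greedy longest-match tokenizer makes the sharing concrete (the vocab and the matching scheme are made up; real systems learn subwords with BPE or a unigram LM):

        # Greedy longest-match subword tokenizer over a toy vocab.
        # 'freude' is a single shared token, reusable across compounds,
        # which keeps the vocabulary compact.
        VOCAB = {"schaden", "freude", "vor"}

        def tokenize(word: str) -> list:
            word, out = word.lower(), []
            while word:
                for end in range(len(word), 0, -1):  # longest first
                    if word[:end] in VOCAB:
                        out.append(word[:end])
                        word = word[end:]
                        break
                else:
                    out.append(word[0])  # fall back to characters
                    word = word[1:]
            return out

        print(tokenize("Schadenfreude"))  # ['schaden', 'freude']
        print(tokenize("Vorfreude"))      # ['vor', 'freude'] (reused)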

  18. I don't know what you mean by compiler terms but basically, worse tokenizer = worse LM performance. This is because worse tokenizer means more tokens per sentence so it takes more FLOPs to train on each sentence, on average. So given a fixed training budget, English essentially gets more "learning per token" than other languages.
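
    Concretely (token counts are made up for illustration):

        # Same sentence, different token counts per language
        # ("fertility"). With a fixed token budget, higher fertility
        # means fewer sentences seen during training.
        tokens_per_sentence = {"English": 15, "OtherLang": 30}
        budget = 1_000_000_000  # fixed training budget, in tokens

        for lang, fertility in tokens_per_sentence.items():
            print(lang, budget // fertility, "sentences seen")
        # English sees ~67M sentences, OtherLang ~33M: half the
        # "learning per FLOP" at the sentence level.
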
  19. Companies have been hand-wringing about the tech labor shortage for the last 10 years. People went to school and got degrees in a job sector they thought would be pretty safe. Supply/demand.
  20. For GPT4: "Pricing is $0.03 per 1,000 “prompt” tokens (about 750 words) and $0.06 per 1,000 “completion” tokens (again, about 750 words)."

    Meanwhile, there are off-the-shelf models that you can train very efficiently, on relevant data, privately, and you can run these on your own infrastructure.

    Yes, GPT4 is probably great at all the benchmark tasks, but models have been great at all the open benchmark tasks for a long time. That's why they have to keep making harder tasks.

    Depending on what you actually want to do with LMs, GPT4 might lose to a BERTish model in a cost-benefit analysis--especially given that (in my experience), the hard part of ML is still getting data/QA/infrastructure aligned with whatever it is you want to do with the ML. (At least at larger companies, maybe it's different at startups.)
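
    Back-of-envelope with those quoted prices (the request sizes are assumptions):

        # Cost per request at the quoted GPT-4 prices, assuming
        # 1,000 prompt tokens and 500 completion tokens per request
        # (made-up sizes; substitute your own).
        prompt_price = 0.03 / 1000      # $ per prompt token
        completion_price = 0.06 / 1000  # $ per completion token

        cost = 1000 * prompt_price + 500 * completion_price
        print(f"${cost:.2f} per request")                   # $0.06
        print(f"${cost * 1_000_000:,.0f} per 1M requests")  # $60,000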

  21. Seems like accepting lots of part-time work would be a great way to help employers pay us less and take away benefits/working-hours flexibility. I don't wanna be a gig worker. I want a salary, benefits, and a 4-day work week.
  22. How do people feel about the GreenPan products?
  23. Very hard to make a business case, for the reasons you mentioned, plus the costs are very front-loaded because ontologies are so damn hard to build, even for very well-contained problems. Without a clear payoff, why bother?
  24. Incredibly irritating how much of everyone's time Musk has wasted on this
  25. What should Twitter do to fix the problem?
  26. Not for people who are homeless or don't have an ID for whatever reason, and need access to social services.
  27. Again, vegans almost always have to read the ingredients/labels on every processed food product they plan to consume. The little 'vegan' icon on the back is new and not consistently used. Choosing a plant-based lifestyle is A LOT more burdensome than not doing that. I know because I've switched back and forth many times and am married to a vegan. Whey, casein, random cream, honey: they're in everything.

    Even so: reading ingredients is honestly not that hard.
