Because of benchmarking, LLMs have also been pushed towards fluency in Python and related frameworks like Django and Flask. For example, nearly 50% of SWE-Bench Verified consists of Django framework PR tasks: https://epoch.ai/blog/what-skills-does-swe-bench-verified-ev...

It will be interesting to see how durable these biases are as labs work towards more capable small models that rely less on memorized information. My naive instinct is that these biases will become less salient over time as context windows improve and models become increasingly capable of processing documentation as part of their code-writing loop. Even so, absent instructions to the contrary, I expect models will keep favoring these tools by default for quite some time.

