- words_on_lines = [ret for line in text.splitlines() for ret in [line.split()]]
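The `for ret in [line.split()]` part is the old single-element-list trick for binding a name inside a comprehension (pre-walrus). Here it buys nothing, since `ret` is used only once, but a minimal sketch (`text` is made up) shows where the binding pays off:

```python
text = "one two\nthree four five\n"  # hypothetical input

# Bind `words` once per line, then reuse it in both the filter and the result.
long_lines = [words for line in text.splitlines()
              for words in [line.split()]
              if len(words) > 2]
print(long_lines)  # [['three', 'four', 'five']]
```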
- Dask satisfies exactly these requirements, just in the Python ecosystem. A pattern I have been using at multiple workplaces for the last decade is to start a dask cluster (maybe 10 years ago it would have been an ipyparallel cluster) on any node that does computation, then spin up new nodes and connect them as needed. This gives dynamic, effectively unlimited scalability with almost no overhead or even code debt - the dask interfaces are great even without distributed computing. When I wasn't allowed to use containers, I would sneakily add code to other machines to join my dask clusters. I would connect any and all computing devices. One company pushed us to use Databricks and Spark, and I never got it - why would we commit to a cluster size before we started a computation?
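A minimal sketch of that pattern (the hostname and the computation are placeholders; the `dask-scheduler`/`dask-worker` commands and the `dask.distributed` Client are the standard pieces):

```python
# On the first compute node:    dask-scheduler                 # serves tcp://node0:8786
# On each node spun up later:   dask-worker tcp://node0:8786   # joins the running cluster
from dask.distributed import Client

client = Client("tcp://node0:8786")  # node0 is a placeholder hostname

def simulate(seed):
    return seed ** 2  # stand-in for the real computation

futures = client.map(simulate, range(1000))  # fans out over however many workers have joined
results = client.gather(futures)
```

Workers can join or leave mid-run and the scheduler rebalances, which is exactly the no-committed-cluster-size property.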
- The innovation is to find traces of a global cosmic-ray event with which to connect the dating of objects in one local area (Greece), where the dendrochronological data is not continuous, with those in faraway areas, for example England/Ireland, where we have continuous dendrochronological data.
- I get the sentiment, but personally I can easily imagine myself writing an autocompleter that would work fine with SELECT before FROM. (I don't write much SQL, so I haven't.)
Just to clarify, my point is that when we do write SQL, most of us start by writing the FROM part anyway, and even if we didn't, I could just offer all columns from all tables I know about, with some heuristic for their order, when autocompleting the SELECT part.
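Something like this toy version (the schema and the ranking are invented for illustration):

```python
# Hypothetical schema: table -> columns. A real tool would read this from the catalog.
SCHEMA = {
    "users": ["id", "name", "email"],
    "orders": ["id", "user_id", "total"],
}

def select_candidates(prefix: str) -> list[str]:
    """Offer every known column, qualified, with prefix matches ranked first."""
    candidates = [f"{table}.{col}" for table, cols in SCHEMA.items() for col in cols]
    return sorted(candidates, key=lambda c: (not c.split(".")[1].startswith(prefix), c))

print(select_candidates("na"))  # users.name first, the rest alphabetically
```

Once a FROM clause exists you narrow the candidates to its tables; before then, offering everything is still perfectly usable.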
- I must go on compiling / You can't break that which isn't yours / I must go on packaging / I'm not my own, it's not my choice
- This is lovely, I didn't know. I guess this is what Kuhn was talking about: we write history in retrospect, sorting it out and preferring narrative over fact.
- Lots of people are saying that having large files in a repo is wrong, bad, bad design, incorrect usage.
Forget that you know git, github, git-lfs, even software engineering for a moment. All you know is that you're developing a general project on a computer, you are using files, and you want version history on everything. What's wrong with that?
The major issue with big files is resources: storage and network bandwidth. But for both of these it is the sum of all object sizes in a repo that matters, not any particular file, so it's weird to be harping on big files being bad design or evil.
- This definitely fits with Grothendieck's philosophy: he basically ignored all work in this area, implicitly claiming it was trivial, while some of his closest friends and his most famous student made huge strides with actual hard work - not quite things falling into place. In fact, the paper most famous for proving the Weil conjectures has as an explicit target the coefficients of a modular form, draws its inspiration from the theory of automorphic forms, and is infamously Grothendieck's greatest disappointment.
There is rich structure in this area of maths that goes well beyond just sections of some sheaf, or at least this is what Serre, Deligne, Langlands, Mazur, Katz, Hida, Taylor, Wiles and many others seem to think.
- Well, they're not _just_ that, right?
First, they can be differential forms, not only functions. Second, and importantly, we don't look only at things over C: for example, specifically in the context of Fermat's Last Theorem, we need Hida's theory of p-adic families of modular forms. Much of the arithmetic of modular forms comes from the modular curves being algebraic and (almost) defined over the integers.
- Still waiting to see the first new irrationality proof with these ideas, wishing you lots of good luck!
- Actually, AFAIU the TLA+ proof is only for a few small cluster sizes - not for all sizes. And the number of nodes in the painting is definitely above that checked by TLA+...
- I didn't know about task spooler. Is it better than using xargs with a parallel pool?
xargs -L1 -P20 git clone --bare < repositories.txt
(one repository URL per line in repositories.txt; up to 20 clones run in parallel)
- I have a hot take on this, which I hope will resonate with at least a few people: duplication, even of blocks of up to a few long statements, rarely bothers me, because I remember all the duplications as a single instance. I have an extraordinary memory, and this makes a huge difference in how I think of and write code. Or anything, really. I save everything I've ever written - like bash history, but for everything - and refer back to it and copy-paste it somewhere else. I wonder if anyone else has this. This doesn't affect how I think of production code, but it hugely affects my workflow.
- I hope this isn't received too badly on HN, but Feynman was way too smug sometimes. This speech is essentially a philosophy of science piece, at the intellectual stage of at least one hundred years prior, and probably more like three hundred.
It's too bad that he so diminished philosophy of science, and at the same time put so much undeveloped thought and prose into it.
- > Before if you started with thread and then realised you were GIL-limited then switching from the threading module to the multiprocessing module was a complete change
Is this true?
I've been switching back and forth between multiprocessing.Pool and multiprocessing.dummy.Pool for a very long time. Super easy, barely an inconvenience.
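For anyone who hasn't seen it: `multiprocessing.dummy` exposes the same `Pool` interface backed by threads, so the switch is a one-line import change. A minimal sketch:

```python
from multiprocessing import Pool          # process-backed: sidesteps the GIL
# from multiprocessing.dummy import Pool  # thread-backed: same interface, shared memory

def work(x):
    return x * x

if __name__ == "__main__":  # required for process pools on spawn-based platforms
    with Pool(4) as pool:
        print(pool.map(work, range(10)))
```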
- Similarly, in 2003, they had some code-under-the-cap promotion. I wrote a script to submit thousands of random codes to the website, and subsequently someone from Coca-Cola NZ called my home. They calmed down when my dad said I wasn't home but at school.
Mind you, in 2002 no one called, and I got a free shirt and a folding chair.
- What's the complexity of computing the nth Fibonacci number? Make a graph of computation time for n = 1..300 that visualizes your answer.
There are those who very quickly reply "linear" but admit they can't get a graph to corroborate it, and there are those who very quickly say "linear" and even produce the graph! (though without correct Fibonacci numbers...)
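A self-contained version of the exercise (timings printed rather than plotted, to stay dependency-free). Note the loop does n additions, but fib(n) has about 0.21·n digits, so each addition on exact integers costs O(n) and the total is O(n²) - though at n ≤ 300 the effect is subtle, which is part of why a clean corroborating graph is hard to produce:

```python
import timeit

def fib(n: int) -> int:
    """Iterative Fibonacci: n additions on ever-growing big integers."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

for n in (1, 50, 100, 150, 200, 250, 300):
    t = timeit.timeit(lambda: fib(n), number=10_000)
    print(f"n={n:3d}  {t:.4f}s per 10k calls")
```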