ibgeek
Joined 149 karma

  1. This seems like a great way to group semantically-related statements, reduce variable leakage, and reduce the potential to silently introduce additional dependencies on variables. Seems lighter weight (especially from a cognitive load perspective) than lambdas. Appropriate for when there is a single user of the block -- avoids polluting the namespace with additional functions. Can be easily turned into a separate function once there are multiple users.
  2. They are analyzing models trained on classification tasks. At the end of the day, classification is about (a) engineering features that separate the classes and (b) finding a way to represent the boundary. It's not surprising to me that they would find these models can be described using a small number of dimensions and that they would observe similar structure across classification problems. The number of dimensions needed is basically a function of the number of classes. Embeddings in 1 dimension can linearly separate 2 classes, 2 dimensions can linearly separate 4 classes, 3 dimensions can linearly separate 8 classes, etc.
  3. Time to fork and bring back removed features. :). An advantage of it being AGPL licensed.
  4. Maybe two different things here: SBCs that run Linux versus microcontrollers (MCUs).

    MCUs are lower power, have less overhead, and can perform hard real-time tasks. Most of what Arduino focuses on are MCUs. The equivalent is the Raspberry Pi Pico.

    In my experience, the key thing is the library ecosystem for the C++ runtime environment. There are a large number of Arduino and third-party high-level libraries provided through their package management system that make it really easy to use sensors and other hardware without needing to write intermediate level code that uses SPI or I2C. And it all integrates and works together. The Pico C/C++ SDK is lower level and doesn’t have a good library / package management story, so you have to read vendor data sheets to figure out how to communicate with hardware and then write your own libraries.

    It’s much more common for less experienced users to use MicroPython. It has a package management and library ecosystem. But it’s also harder to write anything of any complexity that fits within the small RAM available without calling gc.collect() in every other line.

  5. I'm not sure if I'm understanding correctly, but it reminds me of the kernel trick. The distances between the training samples and a target sample are computed, the distances are scaled through a kernel function, and the scaled distances are used as features.

    https://en.wikipedia.org/wiki/Kernel_method
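
    To make that concrete, here is a minimal sketch of how I read it. The Gaussian (RBF) kernel, the gamma value, and the function names are my own assumptions for illustration, not anything taken from the linked work:

```python
import math

def rbf_kernel(distance, gamma=1.0):
    """Scale a distance into a similarity in (0, 1] using a Gaussian (RBF) kernel."""
    return math.exp(-gamma * distance ** 2)

def kernel_features(training_samples, target, gamma=1.0):
    """Return one feature per training sample: the kernel-scaled distance to the target."""
    features = []
    for sample in training_samples:
        distance = math.dist(sample, target)  # Euclidean distance between the two points
        features.append(rbf_kernel(distance, gamma))
    return features

# Hypothetical toy data: three training points and a target at the origin.
training = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
features = kernel_features(training, (0.0, 0.0))
```

    Nearby training samples end up with features close to 1 and distant ones close to 0, so a downstream linear model sees the target in terms of its similarity to the training set.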

  6. I really wish you guys would change the name since the product has moved so far away from the goals and concepts in the original publication. :). I love the product and what you are doing -- it's definitely needed and valuable.
  7. This isn’t BTRFS
  8. Thanks! I’ll take a look at quadlet.

    I find that I tend to package one-off tasks as containers as well. For example, creating database tables and users. Compose supports these sorts of things. Ansible actually makes it easy to use and block on container tasks that you don’t detach.

    I’m not interested in running kubernetes, even locally.

  9. Ok, one more to add that is kind of an abuse of containers: Some compute cluster solutions (like those used for HPC) are using containers to manage software installations on the clusters. They are trying to unify containers with the standard Unix environment, however, so that users still see their home directory (mounted in the container) and other paths so that running applications in the container is the same experience as running them directly on the host OS. This is just a TERRIBLE solution. I much prefer Environment Modules or something like Python's virtual environments (if it worked for arbitrary software installs) as a solution.

    https://en.wikipedia.org/wiki/Environment_Modules_(software)

  10. One of the goals of containers is to unify the development and deployment environments. I hate developing and testing code in containers, so I develop and test code outside them and then package and test it again in a container.

    Containerized apps need a lot of special boilerplate to determine how much CPU and memory they are allowed to use. It’s a lot easier to control resource limits with virtual machines because the system resources are all dedicated to the application.

    Orchestration of multiple containers for dev environments is just short of feature complete. With Compose, it’s hard to bring down specific services and their dependencies so you can then rebuild and rerun. I end up writing Ansible playbooks to start and stop components that are designed to be executed in particular sequences. Ansible makes it hard to detach a container, wait a specified time, and see if it’s running. Compose just needs to be updated to support management of shutting down and restarting containers, so I can move away from Ansible.

    Services like Kafka that query the host name and broadcast it are difficult to containerize since the host name inside the container doesn’t match the external host name. Requires manual overrides which are hard to specify at run time because the orchestrators don’t make it easy to pass in the host name to the container. (This is more of a Kafka issue, though.)
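
    As an example of the resource-limit boilerplate I mean, here is a rough sketch that reads the limits from cgroup v2. It assumes the unified hierarchy is mounted at /sys/fs/cgroup and that the container runtime exposes memory.max and cpu.max there; the function names are hypothetical, and cgroup v1 hosts would need different paths:

```python
import os
from pathlib import Path

CGROUP_ROOT = Path("/sys/fs/cgroup")  # assumption: cgroup v2 at the default mount point

def parse_memory_max(text):
    """Parse memory.max contents: a byte count, or None when the value is 'max' (unlimited)."""
    value = text.strip()
    return None if value == "max" else int(value)

def parse_cpu_max(text):
    """Parse cpu.max contents ('<quota> <period>'): CPUs allowed, or None when unlimited."""
    quota, period = text.split()
    return None if quota == "max" else int(quota) / int(period)

def read_limits(root=CGROUP_ROOT):
    """Return (memory_bytes_or_None, cpus), falling back to the host CPU count."""
    try:
        memory = parse_memory_max((root / "memory.max").read_text())
    except (OSError, ValueError):
        memory = None  # file missing or unreadable: treat as no known limit
    try:
        cpus = parse_cpu_max((root / "cpu.max").read_text())
    except (OSError, ValueError):
        cpus = None
    if cpus is None:
        cpus = float(os.cpu_count() or 1)  # no quota set: use what the host reports
    return memory, cpus
```

    An app would then size its thread pools and caches from read_limits() instead of trusting the host totals, which is exactly the extra step a VM doesn't need.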

  11. Good write up. The only real bias I can detect is that the author seems to conflate their (lack of) familiarity with ease of use. I bet if they spent a few months using DuckDB and Polars on a daily basis, they might find some of the tasks just as easy or easier to implement.
  12. The Meta post is particularly interesting. Thanks for sharing!
  13. I think the title is misleading. This isn't really about either language in production environments. As other commenters mentioned, a post about production would cover topics like whether there were any tooling / dependency updates that broke a build, whether they encountered any noticeable bugs in production caused by libraries / run time, and how efficiently the run times handle high load (e.g., with GC).

    This is more about syntax differences. Even then, I'd be curious how well both languages accommodate themselves to teams and long term projects. In both cases, you will have multiple people working on parts of the code base. Are people able to read and modify code they haven't written -- for example, when fixing bugs? When incorporating new sub components, how well did the type systems prevent errors due to refactoring? It would be interesting to know if Haskell prevents a number of practical problems that occurred with OCaml or if, in practice, there was no difference for the types of bugs they encountered.

    This blog post feels more like someone is comparing basic language features found in reviews for new users rather than sharing deep experience and gotchas that only come from long-term use.

  14. It's not clear from the article whether actors offer significant benefits (or disadvantages) for data modeling versus the traditional OO paradigm. The article reads more like an introduction that describes the problem and teases a solution rather than a complete article that offers a solution and an evaluation of it.
  15. The article seems to be smashing together two (seemingly) unrelated topics and doesn't offer much in the way of a solution. What alternative design does the author propose? Is it possible to solve the problem with traditional object-oriented design techniques? It's not clear that the issues presented require or substantially benefit from the actor model without seeing a best-in-class OO example.
  16. This is very cool!
  17. Biological trees don’t make predictions. Second or third sentence contains the phrase “randomized tree ensembles not only make predictions.”
  18. Multi-tenant stuff is very interesting to me.

    Do you provide any per-tenant resource limits or prioritization (storage, memory, network [rates plus total], CPU)? Anything to limit the impact of noisy neighbors?

    Do you provide per-tenant accounting (for billing) capabilities?

  19. Most universities publish a high-level breakdown of their expenses through the government IPEDS database. For teaching-focused institutions like mine, the vast majority of the money is spent on salaries. We are committed to keeping class sizes to 20 students or fewer and hiring experts to keep teaching quality high. That means that there haven’t been any increases in efficiency other than extracting more labor from faculty. This seems to be true in other areas as well. Startups solve business process needs by purchasing services (e.g., Bamboo HR, expense reporting software, DocuSign, etc.). Universities still have people doing all of these things manually.
  20. That helps. Thanks!

    Giving the programmer control over the checkpoints makes sense. Also, limiting it to “workflows” (not necessarily entire programs) and requiring that these be deterministic makes sense.

    I did some work with Folding@Home in grad school. All of the simulations would save their state to disk every N iterations to allow restarts. The state was relatively small (lots of compute, not a lot of data).

    The DBOS papers focused on implementing core OS kernel functionality on top of a distributed relational database. Trying to make a true distributed OS.

    This product seems like a pretty different direction (reliable execution and enhanced observability). Do your future plans tie back to the original project goals or are you going to keep going in a different direction?
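
    For reference, the save-every-N-iterations pattern I mentioned looks roughly like the toy sketch below. This is not Folding@Home's actual code or file format; the JSON state, the function name, and the stop_after "crash" switch are all assumptions for illustration:

```python
import json
import os
import tempfile

def run_simulation(total_steps, checkpoint_path, every_n=100, stop_after=None):
    """Run a toy deterministic simulation, saving state every `every_n` steps.

    `stop_after` simulates a crash: the run returns None once that step is reached.
    """
    state = {"step": 0, "value": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume from the last saved state
    while state["step"] < total_steps:
        state["step"] += 1
        state["value"] += state["step"]  # stand-in for the real computation
        if state["step"] % every_n == 0:
            tmp_path = checkpoint_path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(state, f)
            os.replace(tmp_path, checkpoint_path)  # rename so a crash never leaves a partial file
        if stop_after is not None and state["step"] >= stop_after:
            return None
    return state["value"]

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "state.json")
    run_simulation(1000, path, every_n=100, stop_after=250)  # "crash" at step 250
    resumed = run_simulation(1000, path, every_n=100)        # restarts from step 200
    uninterrupted = sum(range(1, 1001))
```

    Because the steps are deterministic, the resumed run redoes steps 201 to 250 and ends with the same result as an uninterrupted run, which is the property that makes restarts safe.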

  21. I love the idea, especially from reproducibility and debugging standpoints.

    It seems like there would be a significant performance hit, even if doing this on a single machine.

    If the application does external I/O, it seems impossible to reproduce state if it depends on the state of an external system.

    Can you give more details on the pros and cons and your ideas for ideal use cases?

    Thanks!

  22. That's quite a list of impressive improvements from the team. Kudos!
  23. TIL thanks!
  24. I was hoping that the blog post would actually spell out examples of problems. Is it just me or have there been a lot of shorter blog posts on HN lately that are really no more than an introduction section rather than an actual full article?
  25. This is absolutely correct. For public universities, the amount of state funding hasn't kept up with the increase in the number of students.

    While tuition at private institutions has also gone up, so has the amount of financial aid (scholarships) that they've made available:

    https://www.brookings.edu/articles/college-prices-arent-skyr...

    The result is that the net price hasn't increased all that much for the average student.

  26. Adding to this: we should be looking at the actual amount paid by students receiving financial aid. Analyses of tuition minus average financial aid show that the actual amount paid has been relatively flat at private institutions. A lot of private institutions (mine included) give a 50% scholarship to most students.

    Increased costs at public universities are driven largely by lack of state funding. This is also in part because the number of students attending public universities has gone up tremendously but state funding hasn't kept up.

    https://www.brookings.edu/articles/college-prices-arent-skyr...

  27. This is a neat project. The documentation could be clearer, however. I interpreted the phrase "create a cluster of thousands of nodes in seconds" to mean that it offered a significant speed up in initializing real clusters. Rather, this appears to be a tool for mocking kubelet to enable development and testing at scale. Or am I misunderstanding?
  28. I wish the article addressed integration with seaborn and other libraries. I’m sure Polars can do most of what I want in isolation, but I’m not sure how well it integrates with everything else.
  29. Ah! Okay, that’s what I was missing. I found the blog posts on using malloc and free from D. I knew the GC could be disabled, but I wasn’t aware of how practical it was to manually manage memory. Thanks!
