Yup! My whole inspiration for this came from a friend explaining ECS to me and me thinking "wouldn't that work for a JS engine?"

I've seen this brought up a couple of times now, but I never get it. Why would ECS fit a JS engine? The ECS pattern optimizes for iterating over tons of data, but a JS engine does the opposite of that: it needs to interpret instruction by instruction, and each instruction could access random data.
Indeed, there's no guarantee that it will fit: I think it will but I don't know and want to find out.

There are strong (IMO) reasons to think it will fit, though. User code can indeed do whatever but it rarely does. Programs written in JS are no less structured and predictable than ones written in C++ or Rust or any other language: they mostly operate on groups of data running iterations, loops, and algorithms over and over again. So the instructions being interpreted are likely to form roughly ECS System-like access patterns.

Furthermore, data that came into the engine at one time (eg. from a single JSON.parse call or fetch result) is likely to be iterated through together later. Thus, if the engine can ensure that such data is and stays temporally colocated, then it is statistically likely that the interpreter's memory access patterns will not only come from System-like algorithms, but will also access Component-array-like memory.

So: JS objects (and other heap allocated data) are Entities, their data is laid out in arrays of Components (TODO laying out object properties in Component arrays, at least in some cases), and the program forms the Systems. ECS :)
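
To make the mapping concrete, here's a minimal Rust sketch of the idea; the type names and fields are illustrative, not Nova's actual definitions:

```rust
/// The Entity: a JS value is a tag plus an index into a per-kind heap vector.
#[derive(Clone, Copy)]
enum Value {
    Object(u32),
    Array(u32),
    String(u32),
    /// Small values like numbers can also live inline in the tag itself.
    Number(f64),
}

/// The Component arrays: the heap owns one vector per kind of data,
/// rather than one individually tracked allocation per object.
struct Heap {
    objects: Vec<ObjectData>,
    arrays: Vec<ArrayData>,
    strings: Vec<String>,
}

struct ObjectData { /* shape, property storage, ... */ }
struct ArrayData { /* length, elements, ... */ }
```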

Disclaimer: I’m way out of my depth on the theoretical front, despite similarly taking interest in ECS in unconventional places. I’m responding from the perspective of most of my career being in JS/TS.

I think your instincts about program structure are mostly right, but the outliers are pretty far out there.

I’m much less optimistic about how you’re framing arbitrary data access. In my experience, it’s very common for JS code (marginally less common when authored as TS) to treat JSON (or other I/O bound data) as a perpetual blob of uncertainty. Data gets partially resolved into program interfaces haphazardly, at seemingly random points downstream, often with tons of redundancy and internal contradictions.

I’m not sure how much that matters for your goals! But if I were taking on a project like this I’d be looking at that subset of non-ideal patterns frequently to reassess my assumptions.

Hey, thank you for the viewpoint. I'm a career JS/TS programmer myself, and I do appreciate that the lived reality is quite varied.

The partial resolving and haphazardness of JSON data usage shouldn't matter too much. I don't mean to make JSON-parsed objects into some special class, per se, or for the memory layout to depend on access patterns on said data. I only force data that was created together to be close together in memory (this is what real production engines already do, but only when possible) and force that data to stay together (again, production engines do this but only as far as is reasonable; I force the issue). So I explicitly choose temporal coherence. Beyond that, I use interface inheritance / removal of structural inheritance to reduce memory usage. eg. Plain Arrays (used in the common way) I can push down to 9 bytes, or even 8 bytes if I accept that Arrays with a length larger than 2^24 are always pessimised. ECS / Struct-of-Arrays data storage then further allows me to choose to move some data onto separate cache lines.
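
For illustration only, here is one hypothetical way an 8-byte Array record could be packed under that 2^24 length constraint; the field split and names are my assumptions, not the actual layout:

```rust
/// Hypothetical 8-byte Array record: a 32-bit index into elements storage,
/// plus a 32-bit word holding a 24-bit length and 8 bits of flags.
/// Arrays longer than 2^24 - 1 don't fit and take a pessimised slow path.
struct CompactArray {
    elements: u32,
    len_and_flags: u32,
}

impl CompactArray {
    const LEN_MASK: u32 = 0x00FF_FFFF;

    /// Fast-path length; huge Arrays would be handled elsewhere.
    fn len(&self) -> u32 {
        self.len_and_flags & Self::LEN_MASK
    }
}
```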

But it's definitely true that some programs will just ruin all reasonable access patterns and do everything willy-nilly and mixed up. I expect Nova to perform worse in those kinds of cases: as I add indirection to uncommon cases and split up data onto multiple cache lines to improve common access patterns, I pessimise the uncommon cases further and further down the drain. I guess I just want to see what happens if I kick those uncommon cases to the curb and say "you want to be slow? feel free." :) I expect I will pay for that arrogance, and I look forward to that day <3

Thank you for your response! I’ve been loosely following the project already and now my interest is piqued even more. Your explanation and approach makes a lot of sense to me, now I’m curious to see how it plays out!

Hmm, a compacting garbage collector that would try to put live data together, according to its access patterns, might be fun to consider. Along these lines, it could even split objects' attributes along ECS-friendly lines, working in concert with a profiler.

Nova's GC doesn't use access patterns for this, but this is basically what we do, or in some cases aim to do.

Arrays, Objects, ArrayBuffers, Numbers, Strings, BigInts, ... all have their data allocated in different heap vectors. These heap vectors will eventually be SoA vectors to split objects' attributes along ECS-friendly lines; eg. an Array's length might be split from its elements storage pointer, an Object's shape pointer split from its property storage pointer, etc.

Importantly, what we already do is that an Array does not hold all of an Object's attributes but instead holds an optional pointer to a "backing Object". If an Array is used like an Object (eg. `array.foo = "something"`) then a backing Object is created and the Array's backing Object pointer is initialised to point to that data. Because we use an SoA structure, that backing Object pointer can be stored in a sparse column, meaning that Arrays that don't have a backing Object also do not initialise the memory to hold the pointer.
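
In sketch form, the sparse-column idea might look like this (names are illustrative, and a real engine would likely use something denser than a HashMap for the sparse column):

```rust
use std::collections::HashMap;

/// Dense columns hold data every Array has; the backing Object pointer is
/// a sparse column, so Arrays never used like Objects don't pay for it.
struct ArrayHeap {
    lengths: Vec<u32>,
    elements: Vec<u32>,                 // indices into separate elements storage
    backing_objects: HashMap<u32, u32>, // Array index -> backing Object index
}

#[derive(Default)]
struct ObjectData { /* shape, property storage, ... */ }

impl ArrayHeap {
    /// Called on Object-like use, eg. `array.foo = "something"`: the
    /// backing Object is created lazily on first touch.
    fn backing_object(&mut self, array: u32, objects: &mut Vec<ObjectData>) -> u32 {
        *self.backing_objects.entry(array).or_insert_with(|| {
            objects.push(ObjectData::default());
            (objects.len() - 1) as u32
        })
    }
}
```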

I'm also interested in maybe splitting Object properties so that they're stored in ECS-friendly lines (at least if eg. they're Objects parsed from an Array in JSON.parse).

Our GC is then a compacting GC on these heap vectors where it simply "drops" data from the vector and moves items down to perform compaction. This also means it gets to perform the compaction in a trivially parallel manner <3
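
A sketch of that per-vector compaction, assuming each heap vector carries a mark bitset from the preceding mark phase:

```rust
/// Compact one heap vector in place: unmarked entries are dropped, marked
/// ones slide down, and a remap table records old index -> new index so
/// references can be fixed up afterwards. Each heap vector can be handed
/// to its own thread, which is what makes the compaction trivially parallel.
fn compact<T>(items: &mut Vec<T>, marks: &[bool]) -> Vec<Option<u32>> {
    let mut remap = vec![None; items.len()];
    let mut write = 0;
    for read in 0..items.len() {
        if marks[read] {
            items.swap(write, read);
            remap[read] = Some(write as u32);
            write += 1;
        }
    }
    items.truncate(write);
    remap
}
```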

I had the impression ECS would boost performance mainly by allowing the systems to run in parallel over the entities. Isn't this kinda moot in a single-threaded runtime?

This is definitely the more meaningful/influential performance benefit of ECS in game development, I believe. JavaScript will not allow for that, as you point out. Perhaps a sufficiently crazy JIT might claw some of those benefits back, though? Not sure.

But: the lesser but still impactful performance benefit of ECS is the usage of Struct-of-Array vectors for data storage. JavaScript can still ruin that benefit by always accessing all parts and features of an Object every time it touches one, but it is a less likely thing to happen. So, there is a benefit that JavaScript code itself can enjoy.

Finally, there is one single "true System" in a JavaScript engine's ECS: the garbage collector. The GC will run through a good part of the engine heap, and you can fairly easily write it as a batched operation where eg. "all newly found ordinary Objects" are iterated through in memory access order, their marks checked, and their referents gathered up if they were unmarked. Rinse and repeat to find all live/reachable objects by constantly iterating mostly sequential memory in batches. This can also be parallelised, though then the batch queue needs to become shareable across threads.
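
A sketch of that batched mark phase; `referents` here stands in for whatever per-object reference extraction the real heap does:

```rust
/// Each batch of newly found object indices is visited in memory order,
/// and objects that were unmarked contribute their referents to the next
/// batch. Repeats until no new reachable objects are found.
fn mark(referents: &[Vec<u32>], marks: &mut [bool], mut batch: Vec<u32>) {
    while !batch.is_empty() {
        // Sorting the batch makes the pass walk mostly sequential memory.
        batch.sort_unstable();
        let mut next = Vec::new();
        for index in batch.drain(..) {
            let i = index as usize;
            if !marks[i] {
                marks[i] = true;
                next.extend_from_slice(&referents[i]);
            }
        }
        batch = next;
    }
}
```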

The sweep of the heap after this is then a True-True System where all items are iterated in order, unmarked ones are ignored, marked ones are copied to their post-compaction location, and any references they hold are shifted down to account for the locations of items changing post-compaction.
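
Continuing the compaction sketch from earlier, the reference-shifting step could use the remap tables that `compact` produced:

```rust
/// Rewrite the references a surviving item holds to their targets'
/// post-compaction indices.
fn fix_up_references(references: &mut [u32], remap: &[Option<u32>]) {
    for r in references.iter_mut() {
        // A marked (live) item can only reference other live items.
        *r = remap[*r as usize].expect("live item references swept item");
    }
}
```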

"a sufficiently crazy JIT might claw some of those benefits back"

Good point.

If you know the data can't be accessed in parallel by the user code, that safety guarantee might allow the JIT to do it anyway.

ECS's main performance benefit comes from reducing cache misses.

Memory that's accessed together is stored together.

So for example, if you're calculating physics in a game, you perform all the needed physics operations on values stored contiguously in memory. Great cache locality means a huge reduction in cache misses and a big performance benefit. If instead you have millions of entities and perform each entity's full set of operations (rendering, IO, AI, physics, etc.) entity by entity, you might get a lot of cache misses.
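
In sketch form (illustrative names; Rust for consistency with the engine discussion above):

```rust
/// Array-of-structs: a physics pass over these drags sprite and AI data
/// through the cache even though it only needs position and velocity.
struct EntityAos {
    position: [f32; 3],
    velocity: [f32; 3],
    sprite: u32,
    ai_state: u32,
}

/// Struct-of-arrays: the physics System touches only the two columns it
/// needs, so nearly every byte of every loaded cache line is useful.
struct WorldSoa {
    positions: Vec<[f32; 3]>,
    velocities: Vec<[f32; 3]>,
    sprites: Vec<u32>,
    ai_states: Vec<u32>,
}

fn physics_step(world: &mut WorldSoa, dt: f32) {
    for (p, v) in world.positions.iter_mut().zip(&world.velocities) {
        for axis in 0..3 {
            p[axis] += v[axis] * dt;
        }
    }
}
```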

There's an entire talk by Bob Nystrom (of Crafting Interpreters and Game Programming Patterns) arguing that you likely don't need ECS unless you have exactly this problem of a high cache-miss rate.

I'll be checking this project out! I'm a big fan of ECS and have lofty goals to use it for a data processing project I've been thinking about for a long time that has a lot in common with a programming language, enough that I've basically been considering it as one this whole time. So it's always cool to see ECS turn up somewhere I wouldn't otherwise expect it.
