
Another major caveat to this benchmark is that it doesn't include any significant marshalling costs. For example, passing strings or arrays from Java to C is much, much slower than passing a single integer. The same is going to be true for a lot (all?) of the GC'd languages, and especially true for strings when the language isn't UTF-8 natively (as in, even though Java can store in UTF-8 internally, it doesn't expose that publicly, so JNI doesn't benefit).

adgjlsfhk1
Julia (one of the 2 fastest languages here) is GCed. GC only makes C interop hard if you move objects.
kllrnohj OP
But as I said, this benchmark doesn't include any meaningful marshalling. Julia is the second fastest here when a single int is passed. Julia is very unlikely to still be second fastest when an array (or string) is passed.
adgjlsfhk1
You only need marshalling if one of your languages is using a bad data layout. Julia stores strings as a length plus a pointer to UInt8 data. Julia structs have the same layout as C (you may need to specify padding in C, but that's easy enough). Arrays of immutable structs are also usually stored inline. There are definitely some types of objects that you might need to marshal (e.g. Dicts or other more complicated data structures), but for all of the basic stuff (lists of floats etc.), Julia can still just pass a pointer.
cbkeller
For an array, you'd have to worry about row- vs column-major orientation if multidimensional, but for simple numeric vectors and base strings (which are just a collection of UInt8s in memory), it appears to be sufficient to merely pass the pointer to C:

  julia> a = "hello there!"
  "hello there!"

  julia> p = pointer(a)
  Ptr{UInt8} @0x000000010a431458

  julia> ccall(:puts, Int, (Ptr{UInt8},), p)
  hello there!
  10
For vectors of structs, then of course you'd need to know the layout of each struct when operating on them from the other language, but still doable enough in principle. Vectors of union types would be trickier though.
jimmaswell
How is that possible? It's not just passing pointers?
lelanthran
> How is that possible? It's not just passing pointers?

No. A Java string is a "pointer" to an array of 16-bit integers (each element is a 2-byte character). A C string is a pointer to an array of 8-bit integers.

You have to first convert the Java string to UTF8, then allocate an array of 1-byte unsigned integers, then copy the UTF8 into it, and only then can you pass it to a C function that expects a string.
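
To make the steps above concrete, here is a minimal Java-side sketch of the transcode-and-copy work. The class and method names (`CStringMarshal`, `toCString`) are made up for illustration, and this is not what JNI does internally; it only shows why passing a string costs an allocation and a copy rather than a pointer hand-off.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class CStringMarshal {
    // Returns the UTF-8 encoding of s followed by one terminating 0 byte,
    // which is the layout a C function taking `const char *` expects.
    static byte[] toCString(String s) {
        // Step 1: transcode from the String's internal form to UTF-8 (allocates).
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        // Step 2: copy into a buffer one byte longer; copyOf zero-fills the
        // extra slot, so the last byte is the NUL terminator.
        return Arrays.copyOf(utf8, utf8.length + 1);
    }
}
```

Note that a string containing U+0000 would look truncated to C code reading this buffer, since the first zero byte ends a C string.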

vvanders
Let's not forget that it's modified UTF-8[1] you get back from JNI lest you think that you'll be able to use the buffer as-is.

[1] https://docs.oracle.com/javase/10/docs/specs/jni/types.html#...
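
A rough way to see the difference from plain Java: `DataOutputStream.writeUTF` is documented to emit modified UTF-8 (after a 2-byte length prefix), so it can be compared against standard UTF-8. This sketch assumes that encoding matches what JNI hands back, which is how the two specifications describe it; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class ModifiedUtf8Demo {
    // Encodes s as modified UTF-8 via writeUTF and drops the 2-byte length prefix.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(out)) {
            data.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        byte[] withPrefix = out.toByteArray();
        return Arrays.copyOfRange(withPrefix, 2, withPrefix.length);
    }

    // Encodes s as standard UTF-8 for comparison.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

The two encodings agree for ordinary text but diverge in two cases: U+0000 becomes the two bytes C0 80 instead of a single zero byte, and characters outside the BMP are written as two 3-byte surrogate encodings (6 bytes) instead of one 4-byte sequence.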

ReactiveJelly
Qt (C++ framework) is also UTF-16, so maybe if you're lucky you could pass strings between Java and Qt without transcoding?
lelanthran
> Qt (C++ framework) is also UTF-16, so maybe if you're lucky you could pass strings between Java and Qt without transcoding?

Probably not; the other fields in the string will be different. The length field might be unsigned in Qt while it's almost certainly signed in Java, and Java strings may have other fields that are not present in the Qt string (and vice versa).

Why would signedness be a problem? If you reinterpret a non-negative two's complement integer as unsigned, you get the same value.
lelanthran
> Why would signedness be a problem? If you reinterpret a non-negative two's complement integer as unsigned, you get the same value.

It won't be a problem if the string being passed from Java to C is const. It will be if the C code increases the size of the string enough to set the highest bit. Then Java will be looking at negative length strings.
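
The high-bit case can be sketched in a few lines of Java. This only illustrates the 32-bit reinterpretation, with a hypothetical helper name; it says nothing about how either runtime actually lays out its string header.

```java
// Hypothetical helper, for illustration only.
final class LengthSignDemo {
    // Reinterprets the low 32 bits of an unsigned value as Java's signed int,
    // which is what happens when Java reads a length field written as unsigned.
    static int asSignedInt(long unsigned32) {
        return (int) unsigned32;
    }
}
```

For any value below 2^31 the two views agree. Once the native side writes 2^31 or more, `asSignedInt` yields a negative number, while `Integer.toUnsignedLong` on that result recovers the value the native side intended.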

spullara
Will be interesting to see how Project Panama does on this kind of benchmark.

https://openjdk.java.net/projects/panama/

jimmaswell
Guess I was missing the context, I thought this was just within Java.
kllrnohj OP
Others already covered the string issue, but broadly you can't have a compacting GC if you need to also have stable C pointers. Can't move the data around to compact it at that point.

In theory this is why JNI has Get<PrimitiveType>ArrayElements and GetPrimitiveArrayCritical. The Critical variant could block the GC from running at all for the duration or disable compaction (hence why you also can't make other JNI calls in the interim). In practice the way I've found that's most consistently fast is to actually use the Get<PrimitiveType>ArrayRegion methods. You're paying for a copy, but you're often paying for one anyway. So at least you can avoid the release JNI call, and copy to memory you've allocated (and could then also reuse).

glouwbug
Likely a malloc’d copy to appease the 8-bit char ABI.
