Conversation
Context: c6c487b Context: 312fbf4 Context: 2197579 Context: dotnet/android#7276 There is a desire to remove the "marshal-ilgen" component from .NET Android, which is responsible for all non-blittable type marshaling within P/Invoke (and related) invocations. The largest source of such non-blittable parameter marshaling was with string marshaling: `JNIEnv::GetFieldID()` was "wrapped" by `java_interop_jnienv_get_field_id`: JI_API jfieldID java_interop_jnienv_get_field_id (JNIEnv *env, jthrowable *_thrown, jclass type, const char* name, const char* signature); which was P/Invoked within `JniEnvironment.g.cs`: partial class NativeMethods { internal static extern unsafe IntPtr java_interop_jnienv_get_field_id (IntPtr jnienv, out IntPtr thrown, jobject type, string name, string signature); } and `string` parameter marshaling is *not* blittable. Turns out™ that this particular usage of non-blittable parameter marshaling was fixed and rendered moot by: * 312fbf4: C#9 function pointer backend for `JNIEnv` invocations * c6c487b: "Standalone" build config to use C#9 function pointers * 2197579: Standalone build config is now the default That said, this code path felt slightly less than ideal: the "top-level abstraction" for member lookups is an "encoded member", a string containing the name of the member, a `.`, and the JNI signature of the member, e.g.: _members.InstanceFields.GetBooleanValue("propogateFinallyBlockExecuted.Z", this) The "encoded member" would need to be split on `.`, and with c6c487b the name and signature would be separately passed to `Marshal.StringToCoTaskMemUTF8()`, which performs a memory allocation and converts the UTF-16 string to UTF-8. Meanwhile, [C# 11 introduced UTF-8 string literals][0], which allows the compiler to deal with UTF-8 conversion and memory allocation. Enter `JniMemberInfoLookup``: public ref struct JniMemberInfoLookup { public string EncodedMember {get;} public ReadOnlySpan<byte> MemberName {get;} public ReadOnlySpan<byte> MemberSignature {get;} public JniMemberInfoLookup (string encodedMember, ReadOnlySpan<byte> memberName, ReadOnlySpan<byte> memberSignature); } `JniMemberInfoLookup` removes the need to call `Marshal.StringToCoTaskMemUTF8()` entirely, at the cost of a more complicated member invocation: // Old and busted: bool value = _members.InstanceFields.GetBooleanValue("propogateFinallyBlockExecuted.Z", this); // Eventual new hawtness: var lookup = new JniMemberInfoLookup ( "propogateFinallyBlockExecuted.Z", "propogateFinallyBlockExecuted"u8, "Z"u8); bool value = _members.InstanceFields.GetBooleanValue(lookup, this); Is It Worth It™? *Maybe*; see the new `JniFieldLookupTiming.FieldLookupTiming()` test, which allocates a new `JniPeerMembers` instance and invoke `members.InstanceFields.GetFieldInfo(string)` and `members.InstanceFields.GetFieldInfo(JniMemberInfoLookup)`. (A new `JniPeerMembers` instance is required because `GetFieldInfo()` caches the field lookup.) Using `JniMemberInfoLookup` is about 4% faster. # FieldLookupTiming Timing: looking up JavaTiming.instanceIntField 10000 times # .InstanceMethods.GetFieldInfo(string): 00:00:02.2780667 # .InstanceMethods.GetFieldInfo(JniMemberInfoLookup): 00:00:02.2016146 I'm not sure if this is *actually* worth it, especially as this will imply an increase in code size. TODO: * Update `JniPeerMembers.*.cs` to use `JniMemberInfoLookup`, so that e.g. the above `_members.InstanceFields.GetBooleanValue()` overload exists. * `generator` changes to use `JniMemberInfoLookup` [0]: https://learn.microsoft.com/dotnet/csharp/whats-new/csharp-11#utf-8-string-literals
…loc-member-lookup
Commit d1a9951 showed that *for just field lookup*, the idea of a `ref struct JniMemberInfoLookup` *might* be a good idea. Now that we've expanded `JniMemberInfoLookup` plumbing to include *method* lookup, we can throw it into `TimingTests.cs` and see how it compares! The answer is that `ref struct`s and `Span<T>` are *not* magical performance sauce with magical JIT support, and this is with CoreCLR! Method Lookup + Invoke Timing: Traditional: 00:00:00.0175778 No caching: 00:00:00.0202369 Dict w/ lock: 00:00:00.0181357 ConcurrentDict: 00:00:00.0220411 JniPeerMembers: 00:00:00.0209174 JPM+Lookup: 00:00:00.0186421 (I)I virtual+traditional: 00:00:00.0000600 (I)I virtual+JniPeerMembers: 00:00:00.0000588 (I)I virtual+JPM+Lookup: 00:00:00.0007137 The new timings are `JPM+Lookup` and `virtual+JPM+Lookup`. // JniPeerMembers return _members.InstanceMethods.InvokeVirtualObjectMethod("toString.()Ljava/lang/String;", this, null); // JPM+Lookup var member = new JniMemberInfoLookup("toString.()Ljava/lang/String;", "toString"u8, "()Ljava/lang/String;"u8); ReadOnlySpan<JniArgumentValue> args = null; return _members.InstanceMethods.InvokeVirtualObjectMethod (member, this, args); We see that JPM+Lookup is 11% *faster* when no arguments are involved. Nice! Throw an argument into the mix: // (I)I virtual+JniPeerMembers var args = stackalloc JniArgumentValue [1]; args [0] = new JniArgumentValue (value); return _members.InstanceMethods.InvokeVirtualInt32Method ("VirtualIntMethod1Args.(I)I", this, args); // (I)I virtual+JPM+Lookup var member = new JniMemberInfoLookup ( "VirtualIntMethod1Args.(I)I", "VirtualIntMethod1Args"u8, "(I)I"u8 ); var args = stackalloc JniArgumentValue [1]; args [0] = new JniArgumentValue (value); return _members.InstanceMethods.InvokeVirtualInt32Method (member, this, new ReadOnlySpan<JniArgumentValue> (args, 1)); and we're now a *whole order of magnitude worse*, taking 12.1x longer. Which quickly makes this idea as-is unworkable. Maybe it's the ReadOnlySpan<T> usage, and if I went back to straight `JniArgumentValue*` values it would be better?
abfe931 to
840f04a
Compare
|
There isn't much point in continuing this effort, other than to show the idea and -- more importantly -- how it performs; see 840f04a. Trying to use What's here uses 0.0000439 to 0.0004603 is a 10.4x increase in time: an order of magnitude. Without pulling out a profiler, my guess is that this is due to the use of |
Context: c6c487b
Context: 312fbf4
Context: 2197579
Context: dotnet/android#7276
There is a desire to remove the "marshal-ilgen" component from .NET Android, which is responsible for all non-blittable type marshaling within P/Invoke (and related) invocations.
The largest source of such non-blittable parameter marshaling was with string marshaling:
JNIEnv::GetFieldID()was "wrapped" byjava_interop_jnienv_get_field_id:which was P/Invoked within
JniEnvironment.g.cs:and
stringparameter marshaling is not blittable.Turns out™ that this particular usage of non-blittable parameter marshaling was fixed and rendered moot by:
JNIEnvinvocationsThat said, this code path felt slightly less than ideal: the "top-level abstraction" for member lookups is an "encoded member", a string containing the name of the member, a
., and the JNI signature of the member, e.g.:The "encoded member" would need to be split on
., and with c6c487b the name and signature would be separately passed toMarshal.StringToCoTaskMemUTF8(), which performs a memory allocation and converts the UTF-16 string to UTF-8.Meanwhile, C# 11 introduced UTF-8 string literals, which allows the compiler to deal with UTF-8 conversion and memory allocation.
Enter `JniMemberInfoLookup``:
JniMemberInfoLookupremoves the need to callMarshal.StringToCoTaskMemUTF8()entirely, at the cost of a more complicated member invocation:Is It Worth It™? Maybe; see the new
JniFieldLookupTiming.FieldLookupTiming()test, which allocates a newJniPeerMembersinstance and invokemembers.InstanceFields.GetFieldInfo(string)andmembers.InstanceFields.GetFieldInfo(JniMemberInfoLookup). (A newJniPeerMembersinstance is required becauseGetFieldInfo()caches the field lookup.) UsingJniMemberInfoLookupis about 4% faster.I'm not sure if this is actually worth it, especially as this will imply an increase in code size.
TODO:
Update
JniPeerMembers.*.csto useJniMemberInfoLookup, so that e.g. the above_members.InstanceFields.GetBooleanValue()overload exists.generatorchanges to useJniMemberInfoLookup