Skip to content

Expose the refcounted GCHandle from the Objective-C interop layer#47832

Merged
AaronRobinsonMSFT merged 558 commits intodotnet:feature/objc-interopfrom
AaronRobinsonMSFT:reftracker
Feb 27, 2021
Merged

Expose the refcounted GCHandle from the Objective-C interop layer#47832
AaronRobinsonMSFT merged 558 commits intodotnet:feature/objc-interopfrom
AaronRobinsonMSFT:reftracker

Conversation

@AaronRobinsonMSFT
Copy link
Member

@AaronRobinsonMSFT AaronRobinsonMSFT commented Feb 4, 2021

This API accepts a callback for begin/end "Reference Tracking" phase, a callback for "is reference", and a callback for object has been placed in the finalizer queue. An special refcounted GCHandle can be created for an object that has the TrackedNativeReferenceAttribute in its type hierarchy.

Added a flag to the MethodTable to indicate a type has TrackedNativeReferenceAttribute and a finalizer. There is also a check during creation for this state and in the "enter finalizer queue" callback path.

/cc @jkoritzinsky @elinor-fung @jkotas @VSadov @davidwrighton

@ghost
Copy link

ghost commented Feb 4, 2021

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@davidwrighton
Copy link
Member

@lambdageek Could you comment? Or bring in the appropriate Mono expert here to comment?

Looking at the api it exposes this functionality to developers as a GCHandle. Unfortunately, I believe that the equivalent logic to this in Mono is not actually modeled as a GCHandle. We should reach out to the Mono folks and identify if it is feasible for them to expose this api, or if we really ought to provide a struct type such as ReferenceTrackingHandle that has a Free method and Target property (and presumably also conversion to/from IntPtr.) so that it is straightforward to implement in Mono as well.

@lambdageek
Copy link
Member

@davidwrighton The way I understood the Bridge API is that CreateReferenceTrackingHandle would subsume parts of NSObject.Initialize (specifically the CreateManagedRef call) - ie it will create a GC Handle that will be registered on the native side in xamarin-macios. That seems ok.

@davidwrighton if you're thinking of registering objects for toggleref processing, (ie NSobject.MarkDirty) I guess the Bridge API would now force all objects to be registered for toggleref processing early at Initialize time. Does that sound right @rolfbjarne @AaronRobinsonMSFT ?

@@ -31,6 +31,9 @@ void Interop::OnGCStarted(_In_ int nCondemnedGeneration)
#ifdef FEATURE_COMWRAPPERS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe should compare with GetMaxGeneration.
It will be 2 for the foreseeable future, but it might be cleaner to not assume that,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmm. I may want to chat offline about this. I think I need to get more educated in this area first.

Copy link
Member

@VSadov VSadov Feb 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I see in the code, you check for the max generation just to be sure it is not nested. That seems correct and practically it will be 2.
On the other hand GetMaxGeneration would be cheap enough to call once per-GC and not have a literal in the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps even simpler would be to move Interop::OnGCStarted invocation under if (condemned == max_gen) in the GcStartWork

@AaronRobinsonMSFT
Copy link
Member Author

Does that sound right @rolfbjarne @AaronRobinsonMSFT ?

@lambdageek Yes, that is what I was understanding from our conversations.

}
CONTRACTL_END;

if (g_BeginEndCallback != NULL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that xamarin-macios will take a global lock inside this callback: https://github.com/xamarin/xamarin-macios/blob/daa7e123e212eb3c9e0f30d951b66c0fe254d754/runtime/runtime.m#L337 .

Taking this global lock while the background GC is in progress will frequently turn the background GC into blocking GC. The execution will tend to get blocked on this global lock that will be only released once the background GC is finished.

This may be as a shortcut to just hack some prototype together, but I do not think it acceptable as a shipping design.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not very familiar with the purpose - is this to block prevent something while GC in progress or to do something while GC is in progress? I think it might be the second, since we ignore ephemeral GCs.

Then, perhaps this begin/end should be around the blocking stage of background GC (when we do dependent handles and such) , or around entire GC when it is a blocking gen2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rolfbjarne Yep. That one i have been able to understand - the comments are very helpful. The one I am more confused about is https://github.com/xamarin/xamarin-macios/blob/daa7e123e212eb3c9e0f30d951b66c0fe254d754/runtime/trampolines.m#L112-L114. This one, https://github.com/xamarin/xamarin-macios/blob/daa7e123e212eb3c9e0f30d951b66c0fe254d754/runtime/trampolines.m#L103-L106, seems similar but doesn't have the lock so the use case isn't obvious to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If anybody can think of another solution than the lock, I'm happy to try it out.

Copy link
Member Author

@AaronRobinsonMSFT AaronRobinsonMSFT Feb 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but that's not a problem after retain

@rolfbjarne Okay. So I get everything you said above. But if that is true, then what is the purpose of surrounding the call with the lock? It would seem that locking isn't going to ensure that i isn't deallocated here.

xamarin_framework_peer_lock ();
[i retain];
xamarin_framework_peer_unlock ();

I started by looking for all instances where the "peer lock" was utilized. There seem to be two instances. The one above and the one in the managed ref release function. The one above is confusing me because it is a retain and not a release.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a retain with a problem too, that goes like this:

a1) Thread A starts a GC
a2) Thread A reads the retainCount of object i in the toggle ref callback
a3) Thread A collects everything collectible.
b1) Thread B retains i
b2) Thread B returns from the function xamarin_marshal_return_value_impl (leaving no other references to the managed object on the stack).

The race condition goes like this:

a1. GC starts
a2. The toggle ref callback sees that retainCount = 1, so according to the toggle ref callback the object is collectible)
b1. retainCount is now 2.
b2. The managed i disappears from the stack.
a3. The GC now collects i (because it's nowhere to be seen on the stack), and it has a retainCount of 2.

This isn't a problem here: https://github.com/xamarin/xamarin-macios/blob/daa7e123e212eb3c9e0f30d951b66c0fe254d754/runtime/trampolines.m#L103-L106, because in that case there is no managed wrapper object for the NSArray in question, we're just marshaling a managed array to an NSArray directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke offline. The above scenario is described as follows:

Conceptually, it's these actions:

  1. Create managed wrapper object for a native object.
  2. Create a handle to this managed object in native code in some form (for Mono we use a MonoObject* pointer, but it can be any kind of representation), where this handle is the only thing keeping the GC from collecting the managed object.
  3. Call retain on the native object.
  4. Release/destroy the handle from 2.
  5. The toggleref callback checks the retainCount of the native object.
  6. The GC collects everything it can.

The following order is the race condition:

1-2-5-3-4-6

  1. Create managed wrapper object for a native object.
  2. Create a handle to this managed object in native code in some form (for Mono we use a MonoObject* pointer on the stack, but it can be any kind of representation), where this handle is the only thing keeping the GC from collecting the managed object.
  3. The toggleref callback checks the retainCount of the native object (getting 1 and telling the GC that native code does not have a reference to the managed object).
  4. Call retain on the native object.
  5. Release/destroy the handle from 2.
  6. The GC collects everything it can (which will collect the managed object, because the toggle ref callback said "don't need it", and there's nothing else keeping it alive)

Copy link
Member

@jkotas jkotas Feb 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create a handle to this managed object in native code in some form (for Mono we use a MonoObject* pointer

Is it important that the handle is created in native code? In other words, is this any different from just keeping the managed object reference in managed code? There should be no difference for CoreCLR. There may be a difference for Mono due to how conservative GC stack scanning and embedding APIs work.

Copy link
Member

@rolfbjarne rolfbjarne Feb 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it doesn't matter exactly where the handle is created, but I think that's missing the point.

The problem is the synchronization between releasing that handle, and code in the toggle ref callback. For Mono the handle is implicitly released by returning from the stack frame (this line of code makes sure the handle is considered alive up until that point: https://github.com/xamarin/xamarin-macios/blob/daa7e123e212eb3c9e0f30d951b66c0fe254d754/runtime/trampolines.m#L118 - other types of handles might need explicit release at that point).

Let's say we use a GCHandle (where the managed function returned a GCHandle of the managed object that was created), in which case this code block:

https://github.com/xamarin/xamarin-macios/blob/daa7e123e212eb3c9e0f30d951b66c0fe254d754/runtime/trampolines.m#L108-L119

could look something like this (retval_gchandle is the GCHandle the managed function returned):

id i = xamarin_get_handle (retval_gchandle, exception_gchandle);
if (*exception_gchandle != 0) {
	mono_gchandle_free (retval_gchandle);
	return NULL;
}

xamarin_framework_peer_lock ();
[i retain];
xamarin_framework_peer_unlock ();
if (!retain)
	[i autorelease];

mono_gchandle_free (retval_gchandle);
return i;

Without the lock, the following scenario is possible:

a) Thread A is just about to execute [i retain].
b) Thread B is running the GC, and calls the toggle ref callback.
c) Thread B: The toggle ref callback sees that the retainCount of i is 1, and tells the GC that the object can be collected.
d) Thread A calls[i retain], to prevent the object from being collected, and frees retval_gchandle.
e) Thread B: The GC collects the object.

/// <summary>
/// Attribute used to indicate a class is tracked from the native environment.
/// </summary>
[SupportedOSPlatform("macos")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not an immediate concern, it's a few steps ahead, but we'll need maccatalyst too to support MAUI (and likely Blazor) too
dotnet/designs#174

@davidwrighton
Copy link
Member

@lambdageek, this api is somewhat different from toggleref processing in mono I think, in that in Mono, an object is opted into toggleref processing, and when it is cleaned up, the processing will just go away. In this model, the reference tracker handle is a separate thing (internally in our system, a GCHandle) which works like a weak/strong gchandle. In this case, the user of this api must explicitly cleanup the reference tracker handle at some point, or there will be a leak of the handle.

}
CONTRACTL_END;

if (g_BeginEndCallback != NULL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a retain with a problem too, that goes like this:

a1) Thread A starts a GC
a2) Thread A reads the retainCount of object i in the toggle ref callback
a3) Thread A collects everything collectible.
b1) Thread B retains i
b2) Thread B returns from the function xamarin_marshal_return_value_impl (leaving no other references to the managed object on the stack).

The race condition goes like this:

a1. GC starts
a2. The toggle ref callback sees that retainCount = 1, so according to the toggle ref callback the object is collectible)
b1. retainCount is now 2.
b2. The managed i disappears from the stack.
a3. The GC now collects i (because it's nowhere to be seen on the stack), and it has a retainCount of 2.

This isn't a problem here: https://github.com/xamarin/xamarin-macios/blob/daa7e123e212eb3c9e0f30d951b66c0fe254d754/runtime/trampolines.m#L103-L106, because in that case there is no managed wrapper object for the NSArray in question, we're just marshaling a managed array to an NSArray directly.

}
CONTRACTL_END;

if (g_BeginEndCallback != NULL)
Copy link
Member

@rolfbjarne rolfbjarne Feb 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it doesn't matter exactly where the handle is created, but I think that's missing the point.

The problem is the synchronization between releasing that handle, and code in the toggle ref callback. For Mono the handle is implicitly released by returning from the stack frame (this line of code makes sure the handle is considered alive up until that point: https://github.com/xamarin/xamarin-macios/blob/daa7e123e212eb3c9e0f30d951b66c0fe254d754/runtime/trampolines.m#L118 - other types of handles might need explicit release at that point).

Let's say we use a GCHandle (where the managed function returned a GCHandle of the managed object that was created), in which case this code block:

https://github.com/xamarin/xamarin-macios/blob/daa7e123e212eb3c9e0f30d951b66c0fe254d754/runtime/trampolines.m#L108-L119

could look something like this (retval_gchandle is the GCHandle the managed function returned):

id i = xamarin_get_handle (retval_gchandle, exception_gchandle);
if (*exception_gchandle != 0) {
	mono_gchandle_free (retval_gchandle);
	return NULL;
}

xamarin_framework_peer_lock ();
[i retain];
xamarin_framework_peer_unlock ();
if (!retain)
	[i autorelease];

mono_gchandle_free (retval_gchandle);
return i;

Without the lock, the following scenario is possible:

a) Thread A is just about to execute [i retain].
b) Thread B is running the GC, and calls the toggle ref callback.
c) Thread B: The toggle ref callback sees that the retainCount of i is 1, and tells the GC that the object can be collected.
d) Thread A calls[i retain], to prevent the object from being collected, and frees retval_gchandle.
e) Thread B: The GC collects the object.

vargaz and others added 15 commits February 11, 2021 11:00
…1.1 (dotnet#48164)

[master] Update dependencies from mono/linker
…erate add_intrinsic (dotnet#47731)

Co-authored-by: Nathan Ricci <naricc@xam-emag-01.redmond.corp.microsoft.com>
On the runtime side, intitialize the schema field so that the jit does not
internally think all optimized builds have pgo data with mismatched IL.

Quiet down some of the jit dumping when doing edge instrumentation.

Remove a few unused COMPlus vars from the jit.
* Add Activity Trace Id Custom Generation Support

* Address the feedback

* Update src/libraries/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs

Co-authored-by: Stephen Toub <stoub@microsoft.com>

Co-authored-by: Stephen Toub <stoub@microsoft.com>
* flush CachedDirectoryStoreProvider on X509Store changes

* reset _forceRefresh
* Unavoidable refresh by Visual Studio of the csproj file.

* Backport Matrix3x2

* Backport Matrix4x4

* Backport Plane

* Backport Quaternion

* Backport Vector

* Backport Vector2

* Backport Vector3

* Backport Vector4

* Manually fix Plane &lt; and &gt;

* Manually fix Vector wrong param names

* Manually fix Vector2 &lt; and &gt;

* Manually fix Vector3 &lt; and &gt;

* Manually fix Vector4 &lt; and &gt;

* Add GenerateDocumentationFile to csproj

* Revert csproj GenerateDocumentationFile change

* Revert file fully

* Fix typo "Multiples" => "Multiplies"

* Use <typeparamref name="T" /> instead of <c>T</c>

* Matrix3x2 adjust remarks.

* Quaternion address remarks.

* Plane adjust remarks.

* Vector4 adjust remarks.

* Matrix4x4 adjust remarks.

* Vector2 adjust remarks.

* Vector3 adjust remarks.

* Address tannergooding and gewarren suggestions.

* Matrix4x4 tries to invert.

* Bring back GenerateDocumentationFile in csproj

* Revert sln change

Co-authored-by: carlossanlop <carlossanlop@users.noreply.github.com>
* add asserts that we don't create void asg and lclVars.

* Fix the issue.

* Add test repro.

* update the comments.
…roject dotnet/icu dotnet/arcade dotnet/xharness (dotnet#48054)

[master] Update dependencies from dotnet/runtime-assets dotnet/llvm-project dotnet/icu dotnet/arcade dotnet/xharness


 - Update arcade tfm package deps to netcoreapp3.1
@ahsonkhan
Copy link
Contributor

ahsonkhan commented Feb 25, 2021

Wow, is this intentional or does it require some cleanup/rebase of the PR?
image

@filipnavara
Copy link
Member

@ahsonkhan Merge from main branch :/

@AaronRobinsonMSFT
Copy link
Member Author

Merging this into the feature branch for now.

@AaronRobinsonMSFT AaronRobinsonMSFT merged commit f1d3046 into dotnet:feature/objc-interop Feb 27, 2021
@AaronRobinsonMSFT AaronRobinsonMSFT deleted the reftracker branch February 27, 2021 02:09
@ghost ghost locked as resolved and limited conversation to collaborators Mar 29, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.