Skip to content

Crash caused by orphaned locks #44071

@kevingosse

Description

@kevingosse

I just run into an issue on .NET Framework, but it impacts .NET Core as well. In both cases, it causes the app to crash with an access violation.

Repro code:

static void Main(string[] args)
{
    var locks = new object[4];

    for (int i = 0; i < locks.Length; i++)
    {
        locks[i] = new object();
    }

    for (int i = 0; i < 4; i++)
    {
        var thread = new Thread(state =>
        {
            Monitor.Enter(locks[(int)state]);
        });

        thread.Start(i);
        thread.Join();
    }

    // Force the thread to be collected, orphaning the lock
    GC.Collect(2, GCCollectionMode.Forced, true, true);
    GC.WaitForPendingFinalizers();
    GC.Collect(2, GCCollectionMode.Forced, true, true);

    // Promote to a syncblock
    for (int i = 0; i < locks.Length; i++)
    {
        _ = locks[i].GetHashCode();
    }

    // Trigger the crash
    for (int i = 0; i < 4; i++)
    {
        var thread = new Thread(state =>
        {
            _ = Monitor.IsEntered(locks[(int)state]);
        });

        thread.Start(i);
        thread.Join();
    }
}

First, a thread locks an object. A thin-lock is used. The thread dies without releasing the lock.
Later, the thin-lock is promoted to a syncblock (either because of a call to GetHashCode, or because another thread tries to acquire the lock). When it does, the runtime tries to map the thin-lock id to the address of the thread: https://github.com/dotnet/runtime/blob/master/src/coreclr/src/vm/syncblk.cpp#L2214
If the cleaned up thread was the last in the map, it causes the IdToThreadWithValidation method to return NULL (that's the if ((size_t)result <= m_idToThreadCapacity) check) :

Thread *IdToThreadWithValidation(DWORD id)
{
WRAPPER_NO_CONTRACT;
CrstHolder ch(&m_Crst);
Thread *result = NULL;
if (id <= m_highestId)
result = m_idToThread[id];
// m_idToThread may have Thread*, or the next free slot
if ((size_t)result <= m_idToThreadCapacity)
result = NULL;
_ASSERTE(result == NULL || ((size_t)result & 0x3) == 0 || ((Thread*)result)->GetThreadId() == id);
return result;
}

If it does, then GetSyncBlock sets the thread address in the syncblock to -1:

if (pThread == NULL)
{
// The lock is orphaned.
pThread = (Thread*) -1;
}

Later, another thread tries to acquire the lock. ObjectNative::IsLockHeld is called, which in turn calls GetThreadOwningMonitorLock (https://github.com/dotnet/runtime/blob/master/src/coreclr/src/vm/syncblk.cpp#L1804). The method fetches the thread address from the syncblock, checks if it's null, but omits to check if it's equal to -1. This causes the access violation when trying to dereference it to fetch the thread id:

Thread* pThread = psb->GetMonitor()->GetHoldingThread();
if(pThread == NULL)
{
*pThreadId = 0;
*pAcquisitionCount = 0;
return FALSE;
}
else
{
*pThreadId = pThread->GetThreadId();
*pAcquisitionCount = psb->GetMonitor()->GetRecursionLevel();
return TRUE;
}

Metadata

Metadata

Assignees

Labels

area-System.Threadingbugin-prThere is an active PR which will close this issue when it is mergedtenet-reliabilityReliability/stability related issue (stress, load problems, etc.)

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions