Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Add PathAssemblyResolver feature to System.Reflection.MetadataLoadContext#33734

Merged
jkotas merged 4 commits into
dotnet:masterfrom
steveharter:PathAssemblyResolver
Dec 1, 2018
Merged

Add PathAssemblyResolver feature to System.Reflection.MetadataLoadContext#33734
jkotas merged 4 commits into
dotnet:masterfrom
steveharter:PathAssemblyResolver

Conversation

@steveharter

@steveharter steveharter commented Nov 29, 2018

Copy link
Copy Markdown
Contributor

Based on original PR #33201 we deferred PathAssemblyResolver until we had better requirements especially around duplicate assemblies.

Fixes #33323 - see this for requirements.

context.LoadFromAssemblyPath(path);
}

// Remove the path so we don't process it again.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really needed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes otherwise we'll try to load the same assembly(s) again later when a different AssemblyName is passed in assuming something changed like Version or PublicKeyToken from the previous load(s).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What prevents multiple threads going through this path in parallel and hitting the problem you just described?

(I assume that the MetadataLoadContext / PathAssemblyResolver are safe for concurrent use from multiple threads.)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no harm in loading the same assembly other than perf, so this avoids a possible perf hit. A multi-threaded case may end up loading the same assembly again.

@jkotas jkotas Nov 29, 2018

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going to end up with two Assembly objects instead of one? Do the two different Assembly objects in the system cause bad things to happen?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original proposal included a func callback which did not have this issue. However that didn't work well for code re-use \ built-in policy cases thus the separate class.

Ultimately, someone could layer on top of a func-based MetadataLoadContext and do whatever they want regarding caching or exposing only certain Load methods.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not rely on loading the same path twice in the same context will return the same Assembly instance?

Yes you can rely on that. There are two "caches" in MetadataLoadContext:
_loadedAssemblies: All canonical assemblies that were loaded. Keyed on "name" (Name, Version, PublicKeyToken, Culture). When LoadFromStream\Path\Byte is called, an existing assembly is returned and newly read one is ignored if "name" matches 100%. If "name" matches but the Mvid is different then an exception is thrown.

_binds: All mapped assemblies populated by Resolve(). Also keyed on "name". Prevents Resolve() from being called unnecessary.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So then I'm leaning towards

  1. Leave the API surface as-is
  2. Don't enumerate all loaded assemblies, just enumerate all candidates, load them and pick the best one.
  3. Don't remove anything from the candidate lists

This doesn't require any caching of assembly instances, we can rely on the lower level caches

For the build scenario at least, I don't think we need to bind to one-off loads outside the full set of paths provided to the resolver constructor.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK based on offline mtg we agreed to above.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest commit has these semantics.

Also firmed up rules about matching Assembly.PublicKeyToken depending on incoming AssemblyName.PublicKeyToken.

Also I don't believe AssemblyName.Version can ever be null when the AssemblyName is obtained from an existing assembly, so I removed the NormalizeVersion method.

@jkotas

jkotas commented Nov 30, 2018

Copy link
Copy Markdown
Member

Build breaks...

public Assembly LoadFromStream(System.IO.Stream assembly) { throw null; }
}

public class PathAssemblyResolver : MetadataAssemblyResolver

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahsonkhan can you verify this still works for you? I've done this before but apparently genapi is failing trying to load System.Reflection.TypeExtensions (for this assembly plus others I tried)

@ahsonkhan ahsonkhan Nov 30, 2018

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you run it from the src directory or ref directory? I see the GenApi error only on the ref directory. Run from src directory.

The following worked fine and updated the reference assembly (for what's already in master):

E:\GitHub\Fork\corefx\src\System.Reflection.MetadataLoadContext\src> msbuild /t:GenerateReferenceSource

image

I recommend running it. Let me know if you are still seeing issues.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@steveharter, were you able to auto-generate the ref? Do you plan to address that outside of #33992?

@ahsonkhan ahsonkhan left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting some changes + nits.


private ReadOnlySpan<byte> NormalizePublicKeyToken(ReadOnlySpan<byte> publicKeyToken)
{
if (publicKeyToken == null)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A span can't really be null, this is doing a hidden implicit cast from array to span and then equality comparison.

I think you want either span.IsEmpty or if span == default. No?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to inline cast. A "null" span means the same as empty\default so really a coding convention

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

span == null and span == default are similar/identical, but span.IsEmpty is much cheaper.

https://sharplab.io/#v2:EYLgtghgzgLgpgJwDQxNMAfAAgBgARYCMA3ALABQWAzAQEx4DCeA3hXuwTcAPbcA2eALKEAFAGUADhAB2AHmABPeAD48UKdICULNhz1YA7Go14AvKbzSArnz5lyegL672Lznh78htcRvlK4VXUZbVYHPQ5DYxkAOgBJKABRMAkYBXsnNzdqD14BQSpfGX8VaK0dcIiCI2DpMwsAEzgAMwgbGAyOZ3JHIA===

public class C {
    public bool M1(Span<byte> span) {
        return span == null;
    }
    
    public bool M2(Span<byte> span) {
        return span.IsEmpty;
    }
    
    public bool M3(Span<byte> span) {
        return span == default;
    }
}
C.M1(System.Span`1<Byte>)
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: cmp dword [ebp+0x10], 0x0
    L0007: jnz L0030
    L0009: cmp dword [ebp+0x8], 0x0
    L000d: jnz L0014
    L000f: mov eax, [ebp+0xc]
    L0012: jmp L0024
    L0014: mov eax, [ebp+0x8]
    L0017: cmp al, [eax+0x4]
    L001a: lea edx, [eax+0x4]
    L001d: mov eax, [ebp+0xc]
    L0020: add edx, eax
    L0022: mov eax, edx
    L0024: xor edx, edx
    L0026: cmp eax, edx
    L0028: setz al
    L002b: movzx eax, al
    L002e: jmp L0032
    L0030: xor eax, eax
    L0032: pop ebp
    L0033: ret 0xc

C.M2(System.Span`1<Byte>)
    L0000: cmp dword [esp+0xc], 0x0
    L0005: setz al
    L0008: movzx eax, al
    L000b: ret 0xc

C.M3(System.Span`1<Byte>)
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: cmp dword [ebp+0x10], 0x0
    L0007: jnz L0030
    L0009: cmp dword [ebp+0x8], 0x0
    L000d: jnz L0014
    L000f: mov eax, [ebp+0xc]
    L0012: jmp L0024
    L0014: mov eax, [ebp+0x8]
    L0017: cmp al, [eax+0x4]
    L001a: lea edx, [eax+0x4]
    L001d: mov eax, [ebp+0xc]
    L0020: add edx, eax
    L0022: mov eax, edx
    L0024: xor edx, edx
    L0026: cmp eax, edx
    L0028: setz al
    L002b: movzx eax, al
    L002e: jmp L0032
    L0030: xor eax, eax
    L0032: pop ebp
    L0033: ret 0xc

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, span.IsEmpty is not doing the same thing as span == null or span == default. The later two check whether the span is empty and points to null pointers - it is typically not what you want to check for empty span.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the perf benefit of using IsEmpty.

The later two check whether the span is empty and points to null pointer

Can you provide an example where == null doesn't equate to IsEmpty? It seems like they should always have the same end semantics to me. I see using == null will create a span whose ctor calls this == default.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide an example where == null doesn't equate to IsEmpty?

class Program
{
    static void Main(string[] args)
    {
       Span<byte> span = new byte[0];
       Console.WriteLine($"span.IsEmpty : {span.IsEmpty}");
       Console.WriteLine($"span == null : {span == null}");
    }
}

Result:

span.IsEmpty : True
span == null : False

@steveharter

Copy link
Copy Markdown
Contributor Author

I believed I addressed all feedback. Any reason to not commit?

@jkotas jkotas left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jkotas jkotas merged commit 1b3a095 into dotnet:master Dec 1, 2018
jlennox pushed a commit to jlennox/corefx that referenced this pull request Dec 16, 2018
…text (dotnet#33734)

* Add PathAssemblyResolver

* Treat PathAssemblyResolver as immutable
@karelz karelz added this to the 3.0 milestone Dec 21, 2018
@steveharter steveharter deleted the PathAssemblyResolver branch August 29, 2019 18:09
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…text (dotnet/corefx#33734)

* Add PathAssemblyResolver

* Treat PathAssemblyResolver as immutable



Commit migrated from dotnet/corefx@1b3a095
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants