Skip to content
This repository was archived by the owner on Nov 1, 2020. It is now read-only.

Add public API for ILVerify#5186

Merged
jkotas merged 6 commits intodotnet:masterfrom
jcouv:verify-roslyn
Jan 10, 2018
Merged

Add public API for ILVerify#5186
jkotas merged 6 commits intodotnet:masterfrom
jcouv:verify-roslyn

Conversation

@jcouv
Copy link
Member

@jcouv jcouv commented Jan 3, 2018

@ArztSamuel @jkotas @VSadov Here is another stab at a public API for ILVerify.
The ILVerify program, the ILVerify tests and the Roslyn tests all rely on the same API, summarized below. There are two APIs that are internal and used by the ILVerify tests.

namespace ILVerify
{
    public delegate bool ShouldVerifyMethod(string name);

    public interface IResolver
    {
        PEReader Resolve(AssemblyName name);
    }

    public class Verifier
    {
        public ShouldVerifyMethod ShouldVerifyMethod { set; } // Used by ILVerify program to filter which methods should be verified vs. skipped
        public Verifier(IResolver resolver);
        internal Verifier(SimpleTypeSystemContext context); // Used by the ILVerify tests
        public void SetSystemModuleName(AssemblyName name);
        public VerificationResult Verify(AssemblyName moduleToVerify);
    }

    public class VerificationResult
    {
        public int NumErrors = 0;
        public string Message = string.Empty;
        internal IEnumerable<VerifierError> _errors; // Note: there may be fewer errors recorded here than counted in NumErrors, which also counts exceptions. Used by the ILVerify tests
    }

    public class VerifierException : Exception
    {
        public VerifierException(string message) : base(message)
        {
        }
    }
}

Note 1: there is auto-discovery feature for the system module. It is required for Roslyn tests which create their own core library, but the test framework doesn't know when that is the case. If this feature feels wrong, I can further explore a Roslyn-side solution.

Note 2: the SimpleTypeSystemContext keeps a cache of EcmaModules (_modules). That is currently keyed on simple name instead of full name. That's because I don't know how to get a full name from an EcmaModule or MetadataReader (see CreateModule method). Any tips?

If the API looks good, I will clean up this PR (removing strong name using Roslyn key, switch back to an exe).

Fixes #3734

@jcouv jcouv self-assigned this Jan 3, 2018
return null;
}

public virtual ModuleDesc ResolveModule(AssemblyName name, bool throwIfNotFound = true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to add support for multi-module assemblies, it should be done in separate PR.

@jkotas
Copy link
Member

jkotas commented Jan 3, 2018

there is auto-discovery feature for the system module

It would be nice to submit this in separate PR as well. #Resolved

@jkotas
Copy link
Member

jkotas commented Jan 3, 2018

That's because I don't know how to get a full name from an EcmaModule

Modules in multimodule assemblies do not have assembly names. They are always resolved in the context of the owning assembly. #Resolved

@jkotas
Copy link
Member

jkotas commented Jan 3, 2018

There are two APIs that are internal and used by the ILVerify tests.

These are not the only internal APIs used by the ILVerify tests... #Resolved

@jcouv
Copy link
Member Author

jcouv commented Jan 3, 2018

Updated the PR with Jan's feedback (removed the multi-module and auto-discovery features).
Any feedback on the public API design?

Also, I'm thinking the exe should be split into exe+dll. Jared mentioned that using exe's as library often causes problems. And the split should happen before release packages get published.

Copy link
Member

@VSadov VSadov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have much to add. So LGTM.
I am fairly sure, though, that we will have to tweak the API as we get more information from the actual use.

@jkotas
Copy link
Member

jkotas commented Jan 3, 2018

Any feedback on the public API design?

  • ShouldVerifyMethod: I think it would be better to have a method that takes a metadata token and that you call to verify a single method.
  • VerificationResult: Why is this not a simple array of errors?
  • VerifierException: Does it need to be part of the public API?

Also, I'm thinking the exe should be split into exe+dll

Agree that these should be split. #Resolved

@jcouv jcouv changed the title Add public API and fix some issues to allow verifying Roslyn tests Add public API for ILVerify Jan 4, 2018
continue;
}

var methodName = method.ToString();
Copy link
Member

@MichalStrehovsky MichalStrehovsky Jan 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ToString() on the type system objects is a debugging helper, not an API. We reserve the right to change the output any time to aid debuggability. I recently overhauled it in #5065. #Resolved

@ArztSamuel
Copy link
Collaborator

ArztSamuel commented Jan 4, 2018

I noticed that the tests were not executable any longer after these changes. This is caused by the method tester now resolving the test assemblies in a different way, which enforces assembly name == file name. Some older test assemblies did not respect this. I have submitted a PR fixing it: #5198 #Resolved

return new VerificationResult();
}

internal VerificationResult VerifyMethod(MethodDesc method, MethodIL methodIL, string moduleName)
Copy link
Member

@jkotas jkotas Jan 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these methods need to take moduleName? You should be always able to get the module name from MethodDesc. #Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One can also get the MethodDesc from the MethodIL.

Copy link
Member Author

@jcouv jcouv Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should rename the moduleName. This parameter is used to maintain the old command-line output, which inclues file paths.
But when the verifier is used in-memory, I'm passing in the module name. #Closed

}
catch (VerificationException)
{
numErrors++;
Copy link
Collaborator

@ArztSamuel ArztSamuel Jan 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VerificationExceptions are thrown to either abort the entire method or a single basic block. When such an event happens, the rule, which caused the abort, should already have reported an error, therefore I think increasing numErrors here is redundant.
It also breaks some test cases.
#Resolved

Copy link
Member Author

@jcouv jcouv Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.
It seems I've not been running all the tests locally and CI isn't either :-( Investing in test automation would surely help.
#Resolved

@ArztSamuel
Copy link
Collaborator

ArztSamuel commented Jan 4, 2018

I think it would be better to have a method that takes a metadata token and that you call to verify a single method.

I agree. As discussed in the corresponding issue, it would make sense to me, to have:
Verify(PEReader) for verifying entire modules
Verify(PEReader, MethodDefinitionHandle) for verifying a specific method
Verify(PEReader, TypeDefinitionHandle) for verifying a specific type

As far as the auto-discovery is concerned: Would it make sense to add a property SystemModuleName with getter to IResolver? Or do you think this does not belong into the Resolver?
The ILVerify Program would simply return _systemModule for it, while a resolver implementation for the Roslyn tests could perform auto-discovery.
#Resolved

private string _systemModule = DefaultSystemModuleName;
private Dictionary<string, string> _inputFilePaths = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
private Dictionary<string, string> _referenceFilePaths = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
private AssemblyName _systemModule;
Copy link
Collaborator

@ArztSamuel ArztSamuel Jan 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be set to AssemblyName("mscorlib") as a default?
#Closed

Copy link
Member Author

@jcouv jcouv Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't know if the caller is using simple names or full names as identifiers. #Closed

Copy link
Member Author

@jcouv jcouv Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. You were right :-) #Closed

var result = VerifyMethod(method, methodIL, path);
if (result.NumErrors > 0)
{
return result;
Copy link
Collaborator

@ArztSamuel ArztSamuel Jan 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Immediately returning here will abort the module verification after the first method has reported an error. This is different from how we and PEVerify previously handled this (keep verifying as long as sensible). #Resolved

public int NumErrors = 0;
public string Message = string.Empty;
internal IEnumerable<VerifierError> _errors; // Note: there may be fewer errors recorded here than counted in NumErrors, which also counts exceptions
}
Copy link
Collaborator

@ArztSamuel ArztSamuel Jan 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @jkotas. I think it would make sense to simply have an enumeration of VerificationErrors (which can be an array internally).
I think a new VerificationError class would be helpful for this, which could look like this:

class VerificationError
{
    public readonly string ModuleName;
    public readonly string MethodName;
    public readonly VerificationErrorArgs ErrorArgs;
}

It might even make sense to use a MethodDefinitionHandle and PEReader instead of strings, to be more consistent with the proposed Verify methods.
I am currently working on a use case, where having the VerificationErrorArgs for each reported error is vital. Creating the error string can then also be implemented by the application using the Verifier rather than the Verifier having to implement it.
#Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with strings. Feel free to adjust to your need, after this PR but before we publish.

@jcouv
Copy link
Member Author

jcouv commented Jan 5, 2018

@jkotas

ShouldVerifyMethod: I think it would be better to have a method that takes a metadata token and that you call to verify a single method.

Just to confirm your proposal: the user of the API should be responsible for enumerating all method definitions that it cares about and verify them one by one (using a public API like VerifyMethod(MethodDesc method, MethodIL methodIL) rather than Verify(AssemblyName moduleToVerify) with a callback for filtering methods). #Resolved

@jkotas
Copy link
Member

jkotas commented Jan 5, 2018

the user of the API should be responsible for enumerating all method definitions that it cares about and verify them one by one

Yes - when the user wants to filter which methods to verify.

sing a public API like VerifyMethod(MethodDesc method, MethodIL methodIL)

The public API should be: void VerifyMethod(System.Reflection.PortableExecutable.PEReader peReader, System.Reflection.Metadata.MethodDefinitionHandle methodHandle) as proposed in #5186 (comment)

MethodDesc is internal implementation detail. It should not show up in any public APIs. We should make sure that it is not marked as public in the final version of the .dll. #Resolved

_typeSystemContext.SetSystemModule(_typeSystemContext.GetModule(name));
}

public VerificationResult Verify(AssemblyName moduleToVerify)
Copy link
Member Author

@jcouv jcouv Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. As discussed in the corresponding issue, it would make sense to me, to have:
Verify(PEReader) for verifying entire modules
Verify(PEReader, MethodDefinitionHandle) for verifying a specific method
Verify(PEReader, TypeDefinitionHandle) for verifying a specific type

If the API accepts PEReader, then we have to give up caching in SimpleTypeSystemContext, since PEReaders can't be keys in a dictionary. New EcmaModules will be recreated multiple times.
I don't know if that is just a performance concern or if that might cause identity problems as well... #Closed

Copy link
Member

@jkotas jkotas Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since PEReaders can't be keys in a dictionary

Why not? We can use their object identity. #Closed

Copy link
Member Author

@jcouv jcouv Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PEReaders returned by IResolver.Resolve are currently not cached, so we're getting new ones. I'll add caching then object identity should work. #Closed

@jcouv
Copy link
Member Author

jcouv commented Jan 5, 2018

Pushed a commit that addresses the feedback so far.
I think this project needs some infrastructure for compiling IL in unittests. That will help testing the APIs. The core logic is already shared and therefore tested by existing tests. #Resolved

@MichalStrehovsky
Copy link
Member

I think this project needs some infrastructure for compiling IL in unittests

We have all the scaffolding there. In #3725 the ILVerify project made the jump to netcoreapp2.0 before the repo was able to deal with it, so the build and testing got disconnected from how the rest of the repo is built. As you note, we don't run the tests in the CI, and we don't even build the project in the CI.

At this point we should be able to hook it up into the build again and make whatever changes are necessary for it to build with build.cmd (in a separate pull request). A sample of a test that has assets in IL is the typesystem test.

return null;
}

public abstract PEReader ResolveCore(AssemblyName name);
Copy link
Collaborator

@ArztSamuel ArztSamuel Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is only meant to be called by the resolver itself, right? Would it make sense to make it protected then? #Resolved

var results = _verifier.Verify(peReader, methodHandle);
if (results.Count() > 0)
{
return results;
Copy link
Collaborator

@ArztSamuel ArztSamuel Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Immediately returning here will abort the verification of the entire assembly after the first method has reported an error, right? This is different from how we and PEVerify previously handled this (keep verifying as long as sensible).
It might make sense to build up a list of VerificationResults instead. #Resolved

{
}
PEReader peReader = _resolver.Resolve(name);
string actualSimpleName =peReader.GetSimpleName();
Copy link
Collaborator

@ArztSamuel ArztSamuel Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: missing space after =. #Resolved


private SimpleTypeSystemContext _typeSystemContext;

private static VerificationResult s_noSystemModuleResult = new VerificationResult() { Message = "No system module specified" };
Copy link
Collaborator

@ArztSamuel ArztSamuel Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to also throw an exception for this case instead? #Resolved

@ArztSamuel
Copy link
Collaborator

ArztSamuel commented Jan 5, 2018

It seems I've not been running all the tests locally and CI isn't either :-( Investing in test automation would surely help.

I totally agree. That's not your fault.
Automating the test execution is actually the next thing on my list for ILVerify. I have been assembling the tests with a cmd script locally for quite a while now, I just didn't have time to implement it properly yet.
Of course I am also happy, if anyone is able to implement it before I finally get around to it ;) #Resolved

@jcouv
Copy link
Member Author

jcouv commented Jan 5, 2018

@ArztSamuel Roslyn has some test infrastructure to compile IL on the fly (see "CompileIL"), if that helps. Takes a string as input (and maybe some references?) and produces an array of bytes. #Resolved


namespace ILVerify
{
class SimpleTypeSystemContext : MetadataTypeSystemContext
Copy link
Member

@jkotas jkotas Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to rename this to ILVerifyTypeSystemContext. It is becoming very IL-verify specific. #Resolved

Copy link
Member Author

@jcouv jcouv Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I'll rename as last thing (and also split into exe+dll) once review is ok, so that the diff remains readable. #Resolved

Copy link
Member

@jkotas jkotas Jan 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and also split into exe+dll

I would do the split in separate PR.

#Resolved

foreach (KeyValuePair<EcmaModule, ModuleData> entry in _moduleData)
EcmaModule module = CreateModule(peReader);
string simpleName = peReader.GetSimpleName();
if (module is null && throwIfNotFound)
Copy link
Member

@jkotas jkotas Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can module be null here? #Resolved

{
public class VerificationResult
{
public string ModuleName { get; internal set; }
Copy link
Member

@jkotas jkotas Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need ModuleName here? The verification granularity is on module, so it will be always same for the given invocation.

If we need it here, it should be PEReader that the caller gets the ModuleName from if it needs it. #Resolved

{
public string ModuleName { get; internal set; }
public string TypeName { get; internal set; }
public string Method { get; internal set; }
Copy link
Member

@jkotas jkotas Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be MethodDefinitionHandle . #Resolved


public IEnumerable<VerificationResult> Verify(PEReader peReader, TypeDefinitionHandle typeHandle)
{
if (peReader is null)
Copy link
Member

@jkotas jkotas Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some places use is null, other places use == null. It would be nice to be consistent. Personally, is null reads harder to me than == null. #Resolved


if (_typeSystemContext.SystemModule is null)
{
yield return s_noSystemModuleResult;
Copy link
Member

@jkotas jkotas Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this should be verification error. It should just throw exception - it is invalid use. #Resolved

{
// This is called once for every assembly that should be verified, so linear search is acceptable.
foreach (KeyValuePair<EcmaModule, ModuleData> entry in _moduleData)
string actualSimpleName = peReader.GetSimpleName();
Copy link
Member

@jkotas jkotas Jan 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will create MetadataReader and throw it away immediately. The GetModule method a few lines below will create the MetadataReader again.

It would be better to move this validation to after the GetModule call (and delete the GetSimpleName method because of it should not be needed anymore). #Resolved

{
public class VerificationResult
{
public string TypeName { get; internal set; }
Copy link
Member

@jkotas jkotas Jan 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not need TypeName. It can be fetched from MethodDefinitionHandle. #Resolved

Copy link
Member Author

@jcouv jcouv Jan 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed TypeName #Resolved

Write(" : ");
Write(result.TypeName);
Write("::");
Write(result.Method);
Copy link
Member

@jkotas jkotas Jan 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this will print a nice name of the method. #Resolved

Copy link
Member Author

@jcouv jcouv Jan 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks
Fixed this and verified manually. Here's an example of output:

> ILVerify.exe  D:\issues\Console46\Console46\bin\Debug\New.exe  -r C:\WINDOWS\Microsoft.Net\assembly\GAC_32\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll

[IL]: Error: [D:\issues\Console46\Console46\bin\Debug\New.exe : Program::set_Property(Internal.TypeSystem.Ecma.EcmaType)][offset 0x00000002] Fall through end of the method without returning.
1 Error(s) Verifying D:\issues\Console46\Console46\bin\Debug\New.exe
``` #Resolved

@jkotas
Copy link
Member

jkotas commented Jan 6, 2018

LGTM modulo comments

}

message.Append(" ");
var numErrors = results.Count();
Copy link
Collaborator

@ArztSamuel ArztSamuel Jan 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think since VerifyAssembly is implemented using yield return, this will verify the entire assembly a second time, just to get the error count. It might make sense to simply increase a counter in the above foreach instead. #Resolved

Copy link
Member Author

@jcouv jcouv Jan 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely. Thanks! #Resolved

using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should not be necessary anymore.

_typeSystemContext.SetSystemModule(_typeSystemContext.GetModule(_typeSystemContext._resolver.Resolve(name)));
}

public EcmaModule GetModule(PEReader peReader)
Copy link
Member

@jkotas jkotas Jan 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EcmaModule should not referenced in the public surface of the eventual ILVerify library. Could you please make it internal? #Resolved

@jkotas
Copy link
Member

jkotas commented Jan 10, 2018

Could please fix the few remaining nits, and undo the strongname signing hacks? I think we can merge this, and then start new PR for the additional changes.

@jkotas jkotas merged commit d84fb99 into dotnet:master Jan 10, 2018
@jcouv
Copy link
Member Author

jcouv commented Jan 10, 2018

Cool. Thanks a lot for the review and feedback.
I'll go ahead with a PR to split to exe+dll.

Also tagging @A-And as FYI since he's working on nuget package.

@jcouv jcouv deleted the verify-roslyn branch January 10, 2018 21:20
am11 pushed a commit to am11/corert that referenced this pull request Jan 29, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ILVerify] Define and Implement a public API surface

5 participants