[java] New type resolution#2689
Conversation
Generated by 🚫 Danger |
Bug with check for null type args instead of empty list
This way it doesn't need to be synchronized, and is garbage collected on each file.
This fixes the remaining problems with scoping in qualified anon classes.
(But implicitly, every getTypeMirror call may trigger type resolution, it's the entire point) Also uniformize the exception, it's an assert
adangel
left a comment
There was a problem hiding this comment.
Looks good, thanks for the huge work!!
I especially like, that we use ASM now to load the types/classes rather than loading the class vial classloader/reflection. That should make it really possible, to use Java8 as runtime to analyze a newer java version.
I'll add the small stuff (like deprecations for 700 and TODO comments) and merge it then.
| * BSD-style license; for more info see http://pmd.sourceforge.net/license.html | ||
| */ | ||
|
|
||
| package net.sourceforge.pmd.lang.java.rule.security; |
There was a problem hiding this comment.
Maybe we could move this later to package "internal" - security doesn't seem to fit. Maybe a ruleset/category "diagnostics" is a better name?
| * from {@link JTypeDeclSymbol symbols}, a layer of abstraction above reflection | ||
| * classes. | ||
| * | ||
| * <p>Type mirrors can be obtained {@linkplain TypesFromReflection from reflected types}, |
There was a problem hiding this comment.
Should we allow this in the future - TypesFromReflection? I think, we should remove this possibility in the end to avoid mixing auxclasspath and PMD's runtime classpath...
There was a problem hiding this comment.
Actually I think TypesFromReflection doesn't use the Class instances directly. It will ask the TypeSystem to resolve the class name, and gives up if that class is not on the classpath of the typesystem.
This is a remnant of an early version of the framework, where it was still using the reflected symbol implementation and not ASM. So this was effectively depended on by the reflected symbol implementation to get the type of eg a method parameter, and was important.
I figured, if we keep it we can build complex types with eg typeSystem.fromReflect(new TypeLiteral<List<String>>() { }) (which would produce the type for List<String>) without having to go through hoops with parameterize and such. But it's not a crucial piece of the framework and could be removed. I think, we should decide when we try to port rules, to see if it carries its weight.
|
Thanks for merging ! 🎉 |
Main changes:
Symbols
Instead of using Class instances to back symbols, we use ASM to read class files directly.
This has several advantages
Class::isRecordis only there since JDK 14, so we can't use it if we target JDK 8)@SuppressWarningswith constants instead of literals #520)Implementation is tricky, because we have to lazy load referenced classes (to prevent stuff like #2685), but we have to do it in a thread-safe way.
This replaces PMDAsmClassLoader. The AsmSymbolResolver uses a classloader as input, but only to resolve URLs to class files.
The reflected symbol implementation is removed. It doesn't carry its weight.
Using ASM means, we don't have access to Class instances at runtime anymore, only symbols.
TypeNode::getTypeis deprecated for removal. I think this is a win: we have a single API to get class information.Type representation
The core of the framework is a set of interfaces to represent types. The root interface is
JTypeMirror.Type mirrors are created by a
TypeSystemobject. This object is analysis-global (eventually it should be part of the Language instance, #2518)TypeSystemhas a bunch of final fields that represent important types, egObject, primitive types,void, etc. The reason those are not static constants, is it makes initialization clearer. It also makes us able to use theObjectclass that's loaded from the auxclasspath.JTypeMirrorobjects give access to operations like subtyping checks, that are aware of type arguments, can test if they are reifiable or erased, etc. This API is meant for rules, though currently it's only been fleshed out enough to support type inference. How rules will use it, we can figure out later.StringToStringRulehas been ported to use the new framework as a proof of concept:This should precisely fit the contract of the rule, and if there are FPs/FNs, then it's not the fault of the rule, it's the fault of type resolution. Being able to depend on these components is essential to write simpler, more maintainable, and more precise rules. This PR brings us to that stage.
I wanted to keep out
nullfrom the API, so there's a specialUNRESOLVED_TYPEas a field ofTypeSystem, that is used as a sentinel.Type resolution
Resolution of types is done lazily. See
LazyTypeResolver, which is the "better ClassTypeResolver". Laziness is probably not a great benefit, but it at least makes implementation very clear.The logic in
LazyTypeResolveronly cares about the so-called "standalone expressions", that is, expressions whose type does not depend on context (eg literals). Expressions that are not standalone are called "poly expressions", these include lambdas, method references, generic method/constructor calls, conditional expressions. Resolving the type of poly expressions is complicated, and in the general case, it needs type inference.Type inference
A huge chunk of the code additions is the type inference engine. The entry point for this API is the class
Infer. AnInferobject is created per file as it's not thread-safe.The engine's crucial responsibility is to do overload selection. This may involve doing type inference when methods are generic, or for diamond invocation. Sometimes type inference is not needed (method is not generic).
The engine doesn't interact with the AST directly. It has its own set of interfaces (
ExprMirror), whose implementations wrap AST nodes. This decouples the AST from the engine, for example, we can do overload selection for enum constants as if they were regular constructor calls.There are sophisticated logging strategies to inspect a run of the inference engine. Some tests in the maven build show a sample output.
All this is internal API, though I hope to use it in some privileged rules to detect unnecessary cast, unnecessary boxing, unnecessary type arguments, while being aware of maybe casts/boxings that disambiguate between some overloads.
Symbol table
Symbol tables resolve not only symbols now, they resolve typed symbols. For example, if you're in a class that inherits from
AbstractList<String>, then the methodgetthat the class inherits is not justT get(int), it'sString get(int). Same thing for fields, and inherited nested types.The typed equivalent for symbols are:
JTypeParameterSymbol->JTypeVarJClassSymbol->JClassTypeJVariableSymbol->JVariableSigJMethodSymbol->JMethodSigCode maturity
TypeResTestRulein categorysecurity. This is just a rule I used to test the type resolution on an entire codebase and catch bugs. I'm leaving it in this PR, in case you want to do the same. We should remove it before releasingPMD_DEBUG_LEVEL,FINEorFINEST). This is also just a dev-only feature, that one can use to inspect a run. Logic is inJavaAstProcessor. I expect this can become a forever-experimental language property. Note that levelFINESTis too much detail, usually only appropriate for only a single overload resolution.PMDASMClassLoader: by comparing the classloader. Logic is inJavaAstProcessor.Many tests depend on thethis is done nowtoStringofJTypeMirror. I haven't had the courage to update them yet...ClassTypeResolver, because some rules use it. It doesn't work anymore, this is just so that the rules compile. We can fix that laterpackage-infoof thetypespackage.List<K,V>. Fault tolerance can be explored later. I won't add anything more to this PRBlockers
@InternalApiisExactlyNone, and all methods which take TypedNameDeclaration parameters. This class is deprecated. Remaining is only methodsis[Exactly]Athat take aTypeNodeparameter.(this needs a big refactoring of the new things introduced here and I'd rather do this in this PR)