Support Rust types by retrieving them from debug info #307

Merged
wsmoses merged 30 commits into EnzymeAD:main from cychen2021:rust
Dec 1, 2021
Conversation

@cychen2021

@cychen2021 cychen2021 commented Aug 21, 2021

Contributor

Support Rust types by retrieving them from debug info

This is the pull request for my GSoC project, and the last commit for this project is c96bc56.

What we have done

We wrote a parser for the type information in the debug info generated by rustc, the Rust compiler. This debug info describes the types of the data used in a Rust program, and we use those types to construct the initial type trees for Enzyme's type analysis. This lets Enzyme use Rust types when synthesizing differentiated functions.

How to use it

The API consists of two functions. The first is declared in TypeAnalysis/RustDebugInfo.h:

TypeTree parseDIType(DbgDeclareInst& I, DataLayout& DL);

It extracts the type info from an instruction's debug info and builds the corresponding type tree. It does not care which location the type tree is associated with; it only returns the type tree that matches the structure of the debug info type.

We also added a member function to the TypeAnalyzer class defined in TypeAnalysis/TypeAnalysis.h:

void considerRustDebugInfo();

It looks for calls to the LLVM intrinsic llvm.dbg.declare, which rustc uses to mark the declaration of a local variable, computes type trees for those variables, and merges the trees into the analysis data for the values they describe. The type info is then propagated.
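The scan-and-merge flow described above can be illustrated with a small, LLVM-free model. This is only a sketch under stated assumptions: `ToyInst`, `ToyTree`, `toyParse`, and `considerToyDebugInfo` are hypothetical names invented for illustration, the "instructions" are plain structs, and the real considerRustDebugInfo operates on LLVM IR and Enzyme's own TypeTree rather than on these stand-ins.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR instruction: the called function, the
// local variable it describes, and the name of its debug info type.
struct ToyInst {
  std::string callee;
  std::string variable;
  std::string debugType;
};

// Simplified type tree: byte offset -> kind of the data at that offset.
using ToyTree = std::map<int, std::string>;

// Hypothetical stand-in for parseDIType; it only knows two scalar types
// and returns an empty tree for everything else.
ToyTree toyParse(const std::string &debugType) {
  ToyTree tree;
  if (debugType == "f32")
    tree[0] = "Float";
  else if (debugType == "f64")
    tree[0] = "Double";
  return tree;
}

// Visit every instruction, keep only llvm.dbg.declare calls, and merge the
// parsed tree into what is already known about the declared variable.
std::map<std::string, ToyTree>
considerToyDebugInfo(const std::vector<ToyInst> &insts) {
  std::map<std::string, ToyTree> known;
  for (const ToyInst &inst : insts) {
    if (inst.callee != "llvm.dbg.declare")
      continue; // not a variable declaration marker
    ToyTree parsed = toyParse(inst.debugType);
    known[inst.variable].insert(parsed.begin(), parsed.end());
  }
  return known;
}
```

In this model the returned map plays the role of the seed information that a later propagation step would spread to other values; propagation itself is not modeled here.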

Supported types

We now support data of the following types in Rust:

  • Scalars (u8, i8, f32, f64, ...)
  • Structs (defined by the struct keyword)
  • Arrays (e.g. [f32; 4])
  • Vectors (Vec<T>)
  • Boxes (Box<T>)
  • Pointers (*const T, *mut T)
  • References (&T, &mut T)
  • Unions (defined by the union keyword)

Implementation

For anyone interested in the implementation details, here is a brief sketch. Debug info types are recursive, so parsing them amounts to determining the types at the different offsets of the current layer from the current node, then traversing all sub-nodes to get the types at the offsets in their layers. To implement this, we wrote separate functions for the different kinds of debug info type nodes in TypeAnalysis/TypeAnalysis.cpp. They don't affect how the parser is used, so we did not expose them in the API.
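As a rough illustration of the recursive scheme described above, here is a minimal, LLVM-free sketch. Everything in it is an assumption made for illustration: `ToyType`, `ToyTree`, `Path`, `parseToyType`, and the helper constructors are hypothetical, and the offset-path encoding is a simplification, not Enzyme's actual TypeTree or LLVM's DIType hierarchy.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct ToyType;
using Children = std::vector<std::pair<int, std::shared_ptr<ToyType>>>;

// Hypothetical stand-in for a debug info type node (NOT LLVM's DIType).
struct ToyType {
  enum Kind { Scalar, Struct, Array, Pointer };
  Kind kind;
  std::string name;  // scalar kind such as "Float"; unused otherwise
  int size;          // size of the whole type in bytes
  int count;         // element count; used by arrays only
  Children children; // (byte offset, type); one entry for arrays/pointers
};

using Path = std::vector<int>; // one offset per pointer layer
using ToyTree = std::map<Path, std::string>;

std::shared_ptr<ToyType> scalar(const std::string &name, int size) {
  return std::make_shared<ToyType>(ToyType{ToyType::Scalar, name, size, 0, {}});
}
std::shared_ptr<ToyType> structOf(int size, Children members) {
  return std::make_shared<ToyType>(
      ToyType{ToyType::Struct, "", size, 0, std::move(members)});
}
std::shared_ptr<ToyType> arrayOf(std::shared_ptr<ToyType> elem, int count) {
  int total = elem->size * count;
  return std::make_shared<ToyType>(
      ToyType{ToyType::Array, "", total, count, Children{{0, elem}}});
}
std::shared_ptr<ToyType> pointerTo(std::shared_ptr<ToyType> pointee) {
  return std::make_shared<ToyType>(
      ToyType{ToyType::Pointer, "", 8, 0, Children{{0, pointee}}});
}

// Record the types at the offsets of the current layer, then recurse into
// sub-nodes and shift (or nest) their offsets accordingly.
ToyTree parseToyType(const ToyType &t) {
  ToyTree out;
  switch (t.kind) {
  case ToyType::Scalar:
    out[Path{0}] = t.name;
    break;
  case ToyType::Struct:
    for (const auto &child : t.children)
      for (const auto &entry : parseToyType(*child.second)) {
        Path p = entry.first;
        p[0] += child.first; // shift by the member's offset
        out[p] = entry.second;
      }
    break;
  case ToyType::Array: {
    const ToyType &elem = *t.children[0].second;
    const ToyTree elemTree = parseToyType(elem);
    for (int i = 0; i < t.count; ++i)
      for (const auto &entry : elemTree) {
        Path p = entry.first;
        p[0] += i * elem.size; // shift by the element's position
        out[p] = entry.second;
      }
    break;
  }
  case ToyType::Pointer:
    out[Path{0}] = "Pointer";
    for (const auto &entry : parseToyType(*t.children[0].second)) {
      Path p{0}; // the pointee lives one layer below offset 0
      p.insert(p.end(), entry.first.begin(), entry.first.end());
      out[p] = entry.second;
    }
    break;
  }
  return out;
}
```

For example, in this model a pointer to a struct holding a 4-byte float at offset 0 and an 8-byte double at offset 8 yields three entries: the pointer itself at path [0], and the two members at paths [0, 0] and [0, 8].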

A special case when constructing type trees: when the type to be parsed is *u8, we ignore it and return an empty tree, and considerRustDebugInfo does nothing when it receives that empty tree. This is because in Rust any pointer type can be cast to *u8, which may cause a mismatch between the debug info type and the actual type of the underlying data.
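The *u8 rule can be pictured with a small self-contained model. This is a sketch under assumptions, not the code in this PR: `ToyPointer`, `ToyTree`, `parseToyPointer`, and `mergeIfInformative` are hypothetical names, and the string keys are a simplified stand-in for offsets in a real type tree.

```cpp
#include <map>
#include <string>

// Hypothetical description of a pointer type taken from debug info.
struct ToyPointer {
  std::string pointeeName; // e.g. "u8" or "f32"
  std::string pointeeKind; // e.g. "Integer" or "Float"
};

// Simplified type tree: offset path (as text) -> kind.
using ToyTree = std::map<std::string, std::string>;

// A pointer to u8 may really point at anything, so report nothing for it.
ToyTree parseToyPointer(const ToyPointer &p) {
  ToyTree tree;
  if (p.pointeeName == "u8")
    return tree; // empty tree: the debug info type is not trusted
  tree["[0]"] = "Pointer";
  tree["[0,0]"] = p.pointeeKind;
  return tree;
}

// Caller side: an empty tree carries no information, so leave the
// existing analysis data untouched. Returns true if anything was merged.
bool mergeIfInformative(const ToyTree &parsed, ToyTree &known) {
  if (parsed.empty())
    return false;
  known.insert(parsed.begin(), parsed.end());
  return true;
}
```

Returning an empty tree instead of a guessed one keeps the analysis conservative: no information is weaker than wrong information, which could otherwise conflict with what the analysis later deduces about the underlying data.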

TODOs

The most urgent tasks are extending the parser to more types and testing it thoroughly. The TODOs below are listed in order of urgency.

  • Support slices
  • Support enums
  • Support traits
  • Test and debug

In the future, we may add predefined derivatives for frequently used Rust functions to make compiling and running the differentiated functions more efficient. But that is another story.

Comment thread enzyme/Enzyme/TypeAnalysis/TypeAnalysis.cpp Outdated
Comment thread enzyme/Enzyme/TypeAnalysis/TypeAnalysis.cpp Outdated
@wsmoses

wsmoses commented Aug 23, 2021 via email

Member

…ked when the Rust type option is switched off
@wsmoses

wsmoses commented Sep 3, 2021

Member

@cychen2021 Can you add corresponding tests within the TypeAnalysis test folders that verify the given functionality? When that's done this should be ready to merge.

@cychen2021

Contributor Author

@cychen2021 Can you add corresponding tests within the TypeAnalysis test folders that verify the given functionality? When that's done this should be ready to merge.

I'm working on this and will finish it soon.

@cychen2021

cychen2021 commented Sep 5, 2021

Contributor Author

@wsmoses Hi, I ran into a problem when writing test cases. To test the Rust type parser, I first write the test cases in Rust and compile them to LLVM IR. I then use opt with Enzyme to do the automatic differentiation. However, I have to specify the location of Rust's std lib to link the differentiated code. Since the test cases live in the Enzyme project, they shouldn't need to know which Rust toolchain the user has or where it is installed. How can I deal with that?

@wsmoses

wsmoses commented Sep 5, 2021

Member

You can just have the LLVM code here and don't need to do a full end-to-end test. For example, look at the existing TypeAnalysis tests here which just validate that type analysis works as expected.

@cychen2021

cychen2021 commented Sep 7, 2021

Contributor Author

You can just have the LLVM code here and don't need to do a full end-to-end test. For example, look at the existing TypeAnalysis tests here which just validate that type analysis works as expected.

@wsmoses There are two problems:

  1. The code generated by rustc is complicated, so it's hard to extract the functions to be tested, together with their type info, from it. It's also hard to write a test case by hand that mimics the code and type info generated by rustc, since Rust types nest in very complex ways, and hard to analyze the generated code by hand to get the expected analysis result.
  2. Even if we could write the test cases by hand, the code generated by rustc covers more cases because of its complex structure. For example, it helped me discover that I had missed the implementation for enums, which appear inside an internal type. A hand-written test case might not have caught that, since I would have forgotten to include enums from the beginning.

@cychen2021

cychen2021 commented Sep 26, 2021

Contributor Author

@wsmoses Hi, I have added a test case for the f32 type in Rust. One small problem is that the type analysis result with Rust type parsing enabled (with the option '-enzyme-rust-type') and the result with it disabled (without the option) are exactly the same, although the intermediate results during analysis differ. Does that matter?
If this test case is OK, I'll add more test cases covering the other Rust types over the next several days.

@cychen2021

cychen2021 commented Oct 3, 2021

Contributor Author

@wsmoses I have reduced the f32 test case to the minimum. Could you check whether it's OK? If so, I'll add similar test cases for the other types.

@wsmoses

wsmoses commented Oct 3, 2021

Member

LGTM go ahead and add the other cases

Comment thread .gitignore Outdated
@cychen2021

Contributor Author

I'm done adding tests for all types we support now. Please check if this can be merged. @wsmoses

@cychen2021

Contributor Author

@wsmoses Hi, please check whether this can be merged when it is convenient for you.

@wsmoses wsmoses self-requested a review November 16, 2021 05:14
@cychen2021

cychen2021 commented Nov 28, 2021

Contributor Author

@wsmoses Hi, I ran into some problems. I noticed that the Rust type parser was incompatible with LLVM versions below 9. I used the macro LLVM_MAJOR_VERSION to resolve this. However, I found that the test cases are also incompatible with LLVM versions below 9, because the newer LLVM I use generates different IR. Do you know how to deal with this?

@cychen2021

Contributor Author

@wsmoses Hi, I ran into some problems. I noticed that the Rust type parser was incompatible with LLVM versions below 9. I used the macro LLVM_MAJOR_VERSION to resolve this. However, I found that the test cases are also incompatible with LLVM versions below 9, because the newer LLVM I use generates different IR. Do you know how to deal with this?

I've solved this by simply deleting the flags that are not supported by LLVM versions below 9.

@cychen2021 cychen2021 requested a review from wsmoses November 30, 2021 12:02
@ZuseZ4

ZuseZ4 commented Dec 1, 2021

Collaborator

@wsmoses What do you propose here? The failing tests seem to be unrelated; they also affect other open PRs.

@wsmoses

wsmoses commented Dec 1, 2021

Member

Those failures are currently expected and this is good to merge.

3 participants