[IR][Pass] Refactor the fusion implementation #164
Merged
yaoyaoding merged 11 commits into hidet-org:main on Apr 7, 2023
Conversation
vadiklyutiy pushed a commit that referenced this pull request on Jul 22, 2024
Introduces a `SyncLLM` and `AsyncLLM` interface to interact with the LLM, closes #164.

### SyncLLM.generate

Takes in 1 or a list of n prompts, and 0, 1, or a list of n sampling parameters.

- If no sampling parameter is provided, greedy sampling is used.
- If 1 prompt and 1 sampling parameter is provided, the return is a single `SequenceOutput`.
- If a list of n prompts and 1 sampling parameter is provided, the sampling parameter is applied to all prompts and the return is a list of `SequenceOutput`.
- If a list of n prompts and a list of n sampling parameters is provided, the sampling parameters are applied respectively to each prompt.
- Any other configuration is invalid.

### AsyncLLM.generate

Takes in 1 prompt and 0 or 1 sampling parameters. The same default as the synchronous version applies if no sampling parameters are provided. _Without blocking_, returns an async iterator over `SequenceOutput`, which is updated with every token generated.

### Usage

Here's an example script to demonstrate the API.
```py
import asyncio
import random

from hidet.apps.llm import create_llm
from hidet.apps.llm.sampler import SamplingParams


async def _demo_async():
    llm = create_llm("meta-llama/Llama-2-7b-chat-hf", is_async=True)
    prompts = [
        "Hello, how are you?",
        "How do you feel about the current political climate?",
        "What is your favorite food?",
        "What is your favorite color?",
        "What is your favorite movie?",
        "What is your favorite book?",
        "What is your favorite song?",
        "What is your favorite animal?",
        "What is your favorite hobby?",
        "When is your birthday?",
    ]
    coros = []
    for prompt in prompts:
        async def f(prompt):
            await asyncio.sleep(random.randint(1, 60))
            print("Incoming request: ", prompt)
            params = SamplingParams(temperature=0.0, max_tokens=random.randint(10, 100))
            stream = llm.generate(prompt, sampling_params=params)
            final = None
            async for output in stream:
                # print(output.tokens)
                final = output
            print("=====")
            print("Completed request: ", prompt)
            print("Output: ", final.output_text)
            print("=====")
        coros.append(f(prompt))
    await asyncio.gather(*coros)


def demo_async():
    asyncio.run(_demo_async())


def demo_sync():
    llm = create_llm("meta-llama/Llama-2-7b-chat-hf", is_async=False)
    prompts = [
        "Hello, how are you?",
        "How do you feel about the current political climate?",
        "What is your favorite food?",
        "What is your favorite color?",
        "What is your favorite movie?",
        "What is your favorite book?",
        "What is your favorite song?",
        "What is your favorite animal?",
        "What is your favorite hobby?",
        "When is your birthday?",
    ]
    for output in llm.generate(prompts):
        print("=====")
        print("Completed request: ", output.prompt)
        print("Output: ", output.output_text)
        print("=====")


if __name__ == "__main__":
    demo_sync()
    # demo_async()
```

---------

Co-authored-by: Yaoyao Ding <dingyaoyao.cs@gmail.com>
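The prompt/parameter pairing rules above can be sketched as a small normalization helper. This is a hypothetical illustration, not hidet's actual implementation; the function name `normalize_args` and the `"greedy"` placeholder for the default sampling parameters are assumptions.

```python
# Hypothetical sketch of the SyncLLM.generate dispatch rules; names are
# illustrative only and do not come from the hidet codebase.
from typing import Any, List, Tuple, Union


def normalize_args(
    prompts: Union[str, List[str]],
    params: Union[None, Any, List[Any]],
) -> Tuple[bool, List[Tuple[str, Any]]]:
    """Pair prompts with sampling parameters: 1-to-1, n-to-1, or n-to-n.

    Returns (is_single, [(prompt, param), ...]); is_single tells the caller
    to unwrap the result list into a single output.
    """
    is_single = isinstance(prompts, str)
    prompt_list = [prompts] if is_single else list(prompts)
    if params is None:
        # No sampling parameter: fall back to greedy sampling for all prompts.
        param_list = ["greedy"] * len(prompt_list)
    elif not isinstance(params, list):
        # One sampling parameter: apply it to every prompt.
        param_list = [params] * len(prompt_list)
    else:
        # A list of parameters must pair 1:1 with a list of prompts.
        if is_single or len(params) != len(prompt_list):
            raise ValueError("invalid prompt/sampling-parameter combination")
        param_list = list(params)
    return is_single, list(zip(prompt_list, param_list))
```

The same helper makes the "any other configuration is invalid" rule explicit: a single prompt paired with a list of parameters, or mismatched list lengths, raise an error.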
vadiklyutiy pushed a commit that referenced this pull request on Jul 23, 2024 (same commit message as above)
vadiklyutiy pushed a commit that referenced this pull request on Dec 26, 2024 (same commit message as above)
This PR refactors the implementation of post-scheduling fusion.
Previously, we used a `TaskGraph` to store the sub-graph inside each `Task`.
After this refactor, we remove the `TaskGraph` attribute of `Task` and instead introduce a new kind of operator to represent the fused sub-graph. This allows us to support more kinds of fusion and makes the IR cleaner.
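The representational change can be pictured with a minimal sketch. These are hypothetical classes for illustration only, not hidet's actual `Task` or operator definitions; the `anchor`/`fused_tasks` field names are assumptions.

```python
# Illustrative sketch only: hypothetical classes showing the idea of moving
# the fused sub-graph out of Task and into a dedicated operator kind.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    # After the refactor, a plain task no longer carries a nested
    # TaskGraph attribute describing a fused sub-graph.
    name: str


@dataclass
class FusedOperator:
    # A dedicated operator now represents the fused sub-graph, keeping
    # ordinary tasks free of fusion bookkeeping.
    anchor: Task                                            # task whose schedule is reused
    fused_tasks: List[Task] = field(default_factory=list)   # surrounding fused ops


fused = FusedOperator(anchor=Task("matmul"), fused_tasks=[Task("relu")])
```

Keeping the sub-graph in one operator kind, rather than on every `Task`, is what lets the pass support more fusion patterns without complicating the core IR.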
Some other updates: