
feat: integrates Aws bedrock into golem#27

Merged
mschuwalow merged 21 commits into golemcloud:main from iambenkay:aws-bedrock
Jul 19, 2025

Conversation

@iambenkay
Contributor

@iambenkay iambenkay commented Jun 12, 2025

What's New?

Integrates AWS Bedrock into Golem LLM.

Implementation details

  1. Uses wasi-async-runtime and a custom wasi reqwest client to run aws-sdk operations.
  2. Introduces a nopoll feature flag to disable polling-related features for the llm-bedrock crate (polling is not needed for the aws bedrock integration, since the SDK does not expose a compatible pollable type).
  3. Adds get_config_key_or_none to golem_llm for optional config variables.
  4. Introduces the infer crate to infer MIME types.
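
The optional-config helper in point 3 could look something like this minimal sketch. The name matches the PR description; the body is an assumption (reading from environment variables, as provider config commonly does), not the PR's actual code:

```rust
use std::env;

// Hypothetical sketch of the `get_config_key_or_none` helper added to
// golem_llm: unlike a required-config getter that fails hard, this returns
// None when the variable is absent or empty. The env-var backing is an
// assumption for illustration.
pub fn get_config_key_or_none(key: &str) -> Option<String> {
    env::var(key).ok().filter(|v| !v.is_empty())
}
```

A caller can then fall back to a default instead of erroring on missing optional settings.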

Showcase

bedrock-showcase.mov

Test 6 Fix

test6-fix.mov

/claim #2
closes #2

@algora-pbc algora-pbc bot mentioned this pull request Jun 13, 2025
@iambenkay changed the title from "feat: integrations Aws bedrock into golem" to "feat: integrates Aws bedrock into golem" Jun 13, 2025
@iambenkay iambenkay marked this pull request as ready for review June 13, 2025 23:30
@iambenkay
Contributor Author

There was one cargo check failing. I have fixed that now!

@iambenkay iambenkay force-pushed the aws-bedrock branch 3 times, most recently from bd5725e to 023723e Compare June 15, 2025 23:15
@iambenkay
Contributor Author

@jdegoes @vigoo Kindly find the time to review this PR. I included some readability-related changes over the weekend, added more trace logs, and cleaned up unnecessary unwraps.

@iambenkay
Contributor Author

@jdegoes and @vigoo This PR is still waiting for a review. Check it out.

@iambenkay
Contributor Author

iambenkay commented Jun 19, 2025

While running test case 6, I noticed that the retry_prompt implementation doesn't work as fluidly as intended: it sometimes repeats parts of the previous message stream, and at other times returns an outright incorrect resumption stream.

I changed it to be more accurate. Video attached below

test6-fix.mov
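
The overlap problem described above can be sketched with a small hypothetical helper (not the PR's actual fix): before forwarding the resumed stream, drop whatever prefix of it was already delivered in the previous stream.

```rust
// Hypothetical sketch of de-duplicating a resumed stream: if the provider
// replays a suffix of what was already sent, trim that overlap before
// forwarding. `already_sent` is the text delivered so far; `resumed` is the
// first chunk of the retried stream.
fn trim_overlap<'a>(already_sent: &str, resumed: &'a str) -> &'a str {
    let max = already_sent.len().min(resumed.len());
    // Try the longest candidate overlap first, shrinking until a match.
    for k in (1..=max).rev() {
        if resumed.is_char_boundary(k) && already_sent.ends_with(&resumed[..k]) {
            return &resumed[k..];
        }
    }
    resumed
}
```

For example, if "Hello, wor" was already streamed and the retry resumes with "world!", only "ld!" should be forwarded.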

@iambenkay
Contributor Author

@jdegoes @vigoo can you take a look at this PR for AWS Bedrock?

@iambenkay
Contributor Author

I have migrated the bedrock implementation to work with the new folder structure, @vigoo and @jdegoes.

Contributor

@mschuwalow mschuwalow left a comment


@vigoo, this one looks good from my side

@iambenkay
Contributor Author

iambenkay commented Jul 4, 2025

@mschuwalow I checked out the library and I can't believe I didn't know about that! 🔥

I switched to using this runtime instead; all integration tests are still passing 🔥 and it does feel snappier!
Thanks for the recommendation!

@vigoo
Collaborator

vigoo commented Jul 9, 2025

> So my follow-up question is: since we are in a single-threaded env anyway, why is it bad to call Pollable::block?

By using the async runtime's block_on, all the pollables are collected in the reactor and polled together. This lets them advance concurrently, which is what makes concurrency possible for the Rust futures. If you block on a single pollable, the whole thread blocks and nothing else can advance.
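
The effect vigoo describes can be illustrated outside wasi with a toy single-threaded driver in plain stdlib Rust (no reactor or Pollable involved, all names here are hypothetical): polling two futures in one loop lets both make progress, whereas blocking on one to completion would serialize them.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A no-op waker: good enough for a busy-poll demonstration.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker { RawWaker::new(std::ptr::null(), &VTABLE) }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// Stand-in for a pollable that becomes ready after `n` polls of host progress.
struct CountDown(u32);
impl Future for CountDown {
    type Output = &'static str;
    fn poll(mut self: Pin<&mut Self>, _: &mut Context<'_>) -> Poll<Self::Output> {
        if self.0 == 0 { Poll::Ready("done") } else { self.0 -= 1; Poll::Pending }
    }
}

// Toy analogue of a reactor's block_on over joined work: each iteration polls
// *both* futures, so they advance together on one thread instead of the first
// blocking the second.
fn drive_both<A: Future, B: Future>(a: A, b: B) -> (A::Output, B::Output) {
    let (mut a, mut b) = (Box::pin(a), Box::pin(b));
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let (mut ra, mut rb) = (None, None);
    loop {
        if ra.is_none() {
            if let Poll::Ready(v) = a.as_mut().poll(&mut cx) { ra = Some(v); }
        }
        if rb.is_none() {
            if let Poll::Ready(v) = b.as_mut().poll(&mut cx) { rb = Some(v); }
        }
        if ra.is_some() && rb.is_some() {
            return (ra.take().unwrap(), rb.take().unwrap());
        }
    }
}
```

This is only an analogy: in the wasi runtime the reactor additionally waits on all registered pollables at once, which is exactly what a lone Pollable::block forfeits.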

@iambenkay
Contributor Author

iambenkay commented Jul 9, 2025

I understand that. I will use block_on for the sake of future-proofing (I think I read somewhere that efforts are being made to bring multithreading to golem later on), but I was more curious whether this can benefit a single-threaded system like the golem worker runtime.

Edit:

I got the answer to my question: even though my scenario does not show it, if I were running multiple pollables from the same callback concurrently, I would need this block_on.

@iambenkay
Contributor Author

iambenkay commented Jul 9, 2025

Edit:
I found the root cause and I described it in a later comment.


So @vigoo and @kmatasfp, something is wrong with using wait_for on the pollable. It is not apparent what the issue is, but it panics randomly (it succeeds other times, and when it panics golem automatically recovers and actually runs it to the end without any further panic). It happens especially for tests that stream responses from bedrock. The error is weird because I am explicitly passing a pollable.

The attached image is from test2, a test case where we are not simulating a panic.

image

Below is the implementation of WasiSleep that leads to the erratic panic I described above:

// Imports as used on the PR branch (for context):
//   use aws_smithy_async::rt::sleep::{AsyncSleep, Sleep};
//   use wasi::clocks::monotonic_clock;
impl AsyncSleep for WasiSleep {
    fn sleep(&self, duration: std::time::Duration) -> Sleep {
        let reactor = self.reactor.clone();

        let fut = async move {
            let nanos = duration.as_nanos() as u64;
            // Subscribe a timer pollable on the wasi monotonic clock.
            let pollable = monotonic_clock::subscribe_duration(nanos);

            // transmute bridges the `Pollable` types of two different wasi
            // bindings so the reactor will accept this one.
            reactor
                .clone()
                .wait_for(unsafe { std::mem::transmute(pollable) })
                .await;
        };
        Sleep::new(Box::pin(UnsafeFuture::new(fut)))
    }
}

// Sound only while the wasm guest runs single-threaded.
unsafe impl Send for WasiSleep {}
unsafe impl Sync for WasiSleep {}

#[derive(Clone)]
pub struct UnsafeFuture<Fut> {
    inner: Fut,
}

impl<F> UnsafeFuture<F>
where
    F: Future,
{
    pub fn new(inner: F) -> Self {
        Self { inner }
    }
}

unsafe impl<F> Send for UnsafeFuture<F> where F: Future {}
unsafe impl<F> Sync for UnsafeFuture<F> where F: Future {}

impl<F> Future for UnsafeFuture<F>
where
    F: Future,
{
    type Output = F::Output;

    fn poll(
        mut self: std::pin::Pin<&mut Self>,
        cx: &mut std::task::Context<'_>,
    ) -> std::task::Poll<Self::Output> {
        let pinned_future = unsafe { self.as_mut().map_unchecked_mut(|this| &mut this.inner) };
        pinned_future.poll(cx)
    }
}

When I switch to calling .block() directly (as in the PR), all tests pass without any panic.

@iambenkay
Contributor Author

I got an idea why, and I am going to implement it and update you guys on my findings.

Overview

I will implement the HttpClient trait coming from aws-config rather than rely on the implementation in aws_smithy_wasm, since that one does not utilize a reactor.

I will implement it around the concrete http client provided here: golemcloud/reqwest/blob/9e0c586a3f2fc2f9fe32ddf46c2a49152777693b/src/wasi/async_client.rs#L65

I have learnt so much from this. I am actually excited

@iambenkay
Contributor Author

iambenkay commented Jul 9, 2025

This was such an interesting process for me 🔥. I have changed from using the default aws-smithy-wasm provided client to the async wasi one linked above.

I had to refactor how the runtime was instantiated by moving it to the top level.

After my migration, I ran all the tests and they are still working beautifully!

Thank you everyone, those were really great improvements! cc @vigoo @kmatasfp @mschuwalow

Let me know if you need me to make a new video showcasing what is going on.

Contributor

@mschuwalow mschuwalow left a comment


@iambenkay could you please rebase on top of the latest changes in master?

The unsafe impls for Sync and Send are a bit unfortunate but should not be a problem (until wasi p3 :D). Thank you for your work on this.

@kmatasfp I'll wait for an approval from you as well as you already looked into it a lot. Will merge once you give a +1

@iambenkay
Contributor Author

Alright @mschuwalow, I will rebase and give you the green light once it is done, thank you!

@iambenkay
Contributor Author

iambenkay commented Jul 17, 2025

Alright @mschuwalow this is good to go!

By WASI p3, the wasi runtime we are currently using will need to become thread-sendable anyway, which would automatically make the unsafe impls unnecessary. We will see by then!

@mschuwalow mschuwalow merged commit 3a3a0c0 into golemcloud:main Jul 19, 2025
5 checks passed


Development

Successfully merging this pull request may close these issues.

Add AWS Bedrock Provider
