Skip to content

feat(common): introduce async stack trace#4383

Merged
mergify[bot] merged 50 commits into
mainfrom
bz/debug-stack
Aug 8, 2022
Merged

feat(common): introduce async stack trace#4383
mergify[bot] merged 50 commits into
mainfrom
bz/debug-stack

Conversation

@BugenZhao

@BugenZhao BugenZhao commented Aug 2, 2022

Copy link
Copy Markdown
Contributor

I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR introduces a brand new tracing of async stack trace, which allows one to check where all tasks are pending and why it's pending. Note that multiple futures can be polled in a single task with select and join, so the pending status snapshot is actually a tree instead of a stack.

Samples

To make it intuitive, here're some samples.

  • The chain actor has been running for 1s. The rearranged chain steals a barrier 427ms ago, while the batch query executor on the other arm has waited for the epoch for 829ms and is still unfinished yet. Since there're multiple iterators for different vnodes and we try_join_all on them, they're in the same level in the tree.
Actor 5170 [1.005471416s]
  Epoch 2771234005188608 [427.634708ms]
    RearrangedChain (actor 5170, executor 181) [427.632666ms]
      BatchQuery (actor 5170, executor 180) [829.7615ms]
        store_wait_epoch [829.638ms]
        store_wait_epoch [829.655416ms]
        store_wait_epoch [829.576041ms]
        ...
      ReceiverExecutor (actor 5170, executor 0) [427.644333ms]
        LocalInput (actor 4111) [427.642458ms]
  • The source executor received a barrier of 2777905903042560 156ms ago, while there are no new chunks received for 8s.
Actor 8 [8.000397291s]
  Epoch 2777905903042560 [156.222541ms]
    SourceExecutor 800000001 (actor 8, executor 1) [156.219166ms]
      source_recv_barrier [156.218875ms]
      source_recv_chunk [!!! 7.996535125s]
  • This actor tried to dispatch the barrier to the downstream, but has been back-pressured for 7.5s! Will there be some problems in actor 68?
Actor 122 [13.001253875s]
  Epoch 2776266980392960 [!!! 7.793291666s]
    dispatch_barrier [!!! 7.550658708s]
      LocalOutput (actor 68) [!!! 7.5506555s]
  • For actor 68, there's only one arm in the trace of hash join's barrier aligning, which means the other side has been ready for a while. However, we've waited for 11s on the remote inputs and no messages have been received. There must be some issues.
    The epoch number in this actor 2667 is also smaller than the actor 122 2669, showing that the concurrent checkpoint is stuck. (A real case from here 😄)
Actor 68 [14.501581875s]
  Epoch 2776266751148032 [!!! 11.714676875s]
    ProjectExecutor 4400000082 (actor 68, executor 130) [!!! 11.714678s]
      FilterExecutor 4400000080 (actor 68, executor 128) [!!! 11.714679208s]
        HashJoinExecutor 440000007E (actor 68, executor 126) [!!! 11.714680208s]
          hash_join_barrier_align [!!! 11.714681125s]
            MergeExecutor (actor 68, executor 124) [!!! 11.714596916s]
              RemoteInput (actor 141) [!!! 11.714595583s]
              RemoteInput (actor 142) [!!! 11.714596458s]
              RemoteInput (actor 143) [!!! 11.714597083s]
              ...

Usages

To get the traces printed, one may launch the cluster with env-var RW_ASYNC_STACK_TRACE set and use risedev ctl stream trace [--actor-id 233] to find a recent captured report. You may also print it periodically in the logs by commenting out this line.

https://github.com/singularity-data/risingwave/blob/bc44aa526fced1ddfab523f6d133b05c4ff48a7c/src/compute/src/server.rs#L199-L205

The tags must be attached manually (unless we hack the future compiling in rustc). Luckily, it's easy enough by adding a chained call of .stack_trace(..). &'static str and String are both acceptable here, while the former one may perform better.

https://github.com/singularity-data/risingwave/blob/bc44aa526fced1ddfab523f6d133b05c4ff48a7c/src/storage/src/monitor/monitored_store.rs#L232-L236

Implementations

Writing a double-linked tree in Rust is hard and ugly, not to mention that we must take care of future pending, ready, aborting, and sending to other tasks. I finally decided to implement this with the arena tree, which can be both cleaner and more efficient. For the tracing part, if you're interested, check src/utils/stack_trace.

Limitations

Since we need to manipulate the tree every time we polled a future, this seems significantly hurt the performance with dev profile. That's why we disable it by default. May further investigate it later.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Refer to a related PR or issue link (optional)

Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
@codecov

codecov Bot commented Aug 3, 2022

Copy link
Copy Markdown

Codecov Report

Merging #4383 (ac46e48) into main (51b5b96) will decrease coverage by 0.00%.
The diff coverage is 63.50%.

@@            Coverage Diff             @@
##             main    #4383      +/-   ##
==========================================
- Coverage   74.34%   74.33%   -0.01%     
==========================================
  Files         847      853       +6     
  Lines      125199   126753    +1554     
==========================================
+ Hits        93084    94228    +1144     
- Misses      32115    32525     +410     
Flag Coverage Δ
rust 74.33% <63.50%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/batch/src/executor/test_utils.rs 63.20% <0.00%> (-0.30%) ⬇️
src/batch/src/rpc/service/task_service.rs 0.00% <0.00%> (ø)
src/batch/src/task/task_manager.rs 69.95% <0.00%> (-2.17%) ⬇️
src/cmd/src/bin/ctl.rs 100.00% <ø> (ø)
src/common/src/lib.rs 100.00% <ø> (ø)
src/common/src/session_config/mod.rs 31.06% <ø> (-0.89%) ⬇️
src/common/src/util/debug/context.rs 96.42% <ø> (ø)
src/common/src/util/mod.rs 0.00% <ø> (ø)
src/compute/src/rpc/service/stream_service.rs 0.00% <0.00%> (ø)
src/compute/src/server.rs 0.00% <0.00%> (ø)
... and 130 more

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>

@skyzh skyzh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM, good work!

Comment thread src/compute/src/server.rs
// periodically.
//
// Comment out the following line to enable.
#[cfg(any())]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to make it a feature lol

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. I've left a TODO here for now. We may introduce a more complete and unified configuration for all kinds of tracings.

Comment thread src/ctl/src/cmd_impl/stream/trace.rs Outdated

let mut all_traces = BTreeMap::new();

// FIXME: the compute node may not be accessible from the our network with their listen

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have client address and listen address separately for compute node. The address you get here should be a client address, and should be accessible any machine within the cluster.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we assume the streaming service of compute nodes is also accessible with risectl. However, it's possible that the cluster is running in an internal network and only the meta service is exposed, so we may fail to connect to the compute nodes. Left a more detailed explanation here.🥵

Ok((block, len))
}
})
.stack_trace("block_cache_lookup")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not a "stack_trace". It's like a tracing span instead. I can do the renaming later when I introduce minitrace to the system. At that time, we might have something like .rw_span which creates both minitrace span and async stack trace info,

Comment thread src/stream/src/executor/debug/trace.rs Outdated
This reverts commit b8927ad.
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Comment on lines +37 to +41
#[derive(Debug, Clone)]
pub enum SpanValue {
Slice(&'static str),
Shared(Arc<String>),
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems equivalent to Cow?

Suggested change
#[derive(Debug, Clone)]
pub enum SpanValue {
Slice(&'static str),
Shared(Arc<String>),
}
pub type SpanValue = Cow<'static, str>;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanna make this struct cheap to clone, so Arc<String> is used. For example, we need to trace the Future actor.next() multiple times with the same span Actor 233, and they'd better share a span instance.

BTW, I've just realized that constructing an Arc<str> from &'static str or String both leads to allocation. 🥵

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is it Cow<Arc> ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry... I used to think that cloning Cow::Owned only borrows a reference, actually it clones the data! 😰

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is it Cow ?

Cow only accepts a single borrowed type, the owned type must be <B as ToOwned>::Owned. 🥵

@wangrunji0408 wangrunji0408 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Next step we can add some unit tests to make sure the reported traces are equal to our expectations. I'm also looking forward to seeing it on crates.io! 🥰

@mergify mergify Bot merged commit 79b1b24 into main Aug 8, 2022
@mergify mergify Bot deleted the bz/debug-stack branch August 8, 2022 08:45
nasnoisaac pushed a commit to nasnoisaac/risingwave that referenced this pull request Aug 9, 2022
* move context to debug module

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* add trace context

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* add trace context for streaming actor

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* handle cancel

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* add span for storage

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* add todos

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* use rc refcell

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* simplify interfaces

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* print in stream manager

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* fix cancel stream next

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* add more traces

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* use set and add time

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* add stream test

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* record actor

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* add rpc to risectl

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* assertions and refactor

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* support stack trace for release

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* add more traces to hummock

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* add trace report type

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* yield if ns = 0

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* add traces for outputs and source

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* print !!! on long span

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* use current thread runtime for ctl

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* do not check when drop

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* move to separated crate

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* add more docs

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* refine docs

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* bump tokio

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* minor refine

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* fix licenses

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* refactor with arena

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* print detached

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* use arc for span

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* use coarsetime

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* rename to async_stack_trace

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* split to multiple modules

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* allow printing detached

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* turn off by default, add release ci step

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* add hints to ctl

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* do not collect test analytics on e2e

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* use Arc<String>

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* try with context

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* try enable pr ci

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* fmt

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* Revert "try enable pr ci"

This reverts commit b8927ad.

* minor fixes

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* do not use offset arc

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type/feature Type: New feature.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants