broken error handling in message_handler_loop #91

@samtkaplan

Description

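This problem was introduced by fdf56f4. With that change, the error path in message_handler_loop no longer handles the case where wpid is invalid. In the trace below, wpid is 0 because process_hdr rejected the connection credentials before the remote process was identified; the error path then calls worker_from_id, which throws a second error and results in an "Unhandled Task ERROR":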

      From worker 7:    Unhandled Task ERROR: no process with id 0 exists
      From worker 7:    Stacktrace:
      From worker 7:     [1] error(s::String)
      From worker 7:       @ Base ./error.jl:35
      From worker 7:     [2] worker_from_id(pg::Distributed.ProcessGroup, i::Int64)
      From worker 7:       @ Distributed /opt/julia/share/julia/stdlib/v1.10/Distributed/src/cluster.jl:1098
      From worker 7:     [3] worker_from_id(pg::Distributed.ProcessGroup, i::Int64)
      From worker 7:       @ Distributed /opt/julia/share/julia/stdlib/v1.10/Distributed/src/cluster.jl:1090 [inlined]
      From worker 7:     [4] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
      From worker 7:       @ Distributed /opt/julia/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:213
      From worker 7:     [5] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
      From worker 7:       @ Distributed /opt/julia/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:133
      From worker 7:     [6] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
      From worker 7:       @ Distributed /opt/julia/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:121
      From worker 7:
      From worker 7:    caused by: Process(1) - Invalid connection credentials sent by remote.
      From worker 7:    Stacktrace:
      From worker 7:     [1] error(s::String)
      From worker 7:       @ Base ./error.jl:35
      From worker 7:     [2] process_hdr(s::Sockets.TCPSocket, validate_cookie::Bool)
      From worker 7:       @ Distributed /opt/julia/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:265
      From worker 7:     [3] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
      From worker 7:       @ Distributed /opt/julia/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:158
      From worker 7:     [4] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
      From worker 7:       @ Distributed /opt/julia/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:133
      From worker 7:     [5] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
      From worker 7:       @ Distributed /opt/julia/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:121
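
The failure pattern in the trace above can be sketched in isolation. This is a toy model and not the Distributed.jl source: `WORKERS`, `lookup_worker`, `handle_unguarded`, and `handle_guarded` are hypothetical names, and the guard shown is only one possible shape for a fix, assuming an id below 1 means the peer was never identified.

```julia
# Toy stand-ins; none of these names are Distributed.jl internals.
const WORKERS = Dict{Int,String}(1 => "master", 7 => "worker-7")

# Throws for an unknown id, like the worker_from_id frame in the trace.
function lookup_worker(id::Int)
    haskey(WORKERS, id) || error("no process with id $id exists")
    return WORKERS[id]
end

# Unguarded error path: when the handshake fails before the peer id is known
# (wpid == 0), the lookup inside the catch block throws a second error.
function handle_unguarded(wpid::Int)
    try
        error("Invalid connection credentials sent by remote.")
    catch
        return lookup_worker(wpid)
    end
end

# Guarded error path: only look up the worker when wpid is valid.
function handle_guarded(wpid::Int)
    try
        error("Invalid connection credentials sent by remote.")
    catch err
        wpid < 1 && return "dropped connection from unidentified peer: " * sprint(showerror, err)
        return lookup_worker(wpid)
    end
end
```

Calling `handle_unguarded(0)` raises "no process with id 0 exists" from inside the catch block, which mirrors the nested "caused by" trace, while `handle_guarded(0)` returns a message instead of throwing.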
