Skip to content

slipdev: simplify and solidify byte-unstuffing#8268

Merged
bergzand merged 1 commit intoRIOT-OS:masterfrom
miri64:slipdev/enh/simplify-byte-unstuffing
Jan 9, 2018
Merged

slipdev: simplify and solidify byte-unstuffing#8268
bergzand merged 1 commit intoRIOT-OS:masterfrom
miri64:slipdev/enh/simplify-byte-unstuffing

Conversation

@miri64
Copy link
Copy Markdown
Member

@miri64 miri64 commented Dec 14, 2017

Contribution description

This simplifies and solidifies the reversal of SLIP's byte-stuffing
(aka byte-unstuffing ;-)) by

  1. Using tsrb instead of ringbuffer: there are two actors. The ISR
    and the device's event handler.
  2. Moving the byte-unstuffing from the UART RX-handler (i.e. the ISR)
    to the device's receive function (potentially not the ISR)
  3. Removing the pktfifo member. The current number of bytes in the
    ringbuffer is returned for recv(data = NULL, len = 0). If that is
    more than the packet contains (due to the byte stuffing it most
    likely will be) the packet is reallocated in GNRC anyway.

Issues/PRs references

Alternative to #8258, fixes #8003.

@miri64 miri64 added Area: drivers Area: Device drivers Area: network Area: Networking Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation labels Dec 14, 2017
@miri64 miri64 added the State: WIP State: The PR is still work-in-progress and its code is not in its final presentable form yet label Dec 14, 2017
@miri64
Copy link
Copy Markdown
Member Author

miri64 commented Dec 14, 2017

I noticed some weird behavior, when I don't assign a GUA to the SLIP interface (might be unrelated, since the link-local address is fe80::). I'd like to investigate this first, before merging.

@miri64
Copy link
Copy Markdown
Member Author

miri64 commented Dec 14, 2017

The problem was, that either the ID or the sequence number of the pong message was wrong, so I got an "error: unexpected parameter" message. Not sure what caused this, but I'm unable to reproduce it now...

@miri64 miri64 removed the State: WIP State: The PR is still work-in-progress and its code is not in its final presentable form yet label Dec 14, 2017
Copy link
Copy Markdown
Member

@jnohlgard jnohlgard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The theory behind the PR sounds reasonable. I have not had time to review the implementation yet.

@kb2ma
Copy link
Copy Markdown
Member

kb2ma commented Dec 14, 2017

Test results after a some experiments in the shell on samr21 Xplained Pro.

  • I can repeatedly run ping6 from mote to workstation, and it works. Great!
  • I send a gcoap GET from the mote to my workstation. Times out.
  • If I run ping6 after gcoap, it shows one or two timeouts and then a successful ping or two.
  • If I run ping6 again, I see 'error: unexpected parameters'.

I can try to dig deeper later on to see how the gcoap request fails.

@bergzand
Copy link
Copy Markdown
Member

I can confirm the findings of @kb2ma. Furthermore I'd like to add that ping from the workstation to the node (nucleo-f446 with a cheap usb to serial converter) can have response times between 4 to 5 seconds. It looks like something is causing a lot of delay in the line somewhere.

Measuring this with wireshark I observe that when I start a ping sequence (after having some traffic between the nodes), there is first some leftover replies from the previous sequence and then the current responses start coming through. This stops as soon as the ping sequence stops.

ping

@miri64
Copy link
Copy Markdown
Member Author

miri64 commented Dec 18, 2017

@bergzand which host is which in your wireshark dump?

@bergzand
Copy link
Copy Markdown
Member

@bergzand which host is which in your wireshark dump?

  • [AAAA::1]: My workstation
  • [AAAA::2]: The node (gcoap example with slipdev)

@bergzand
Copy link
Copy Markdown
Member

Header should be:
num, timestamp, src, dst, proto, length, info.

I'm suspecting that somewhere the tsrb desyncs, causing an offset in the packet passed to the upper layer. For example in my pcap, packet number 13 arriving in the buffer causing packet number 1 to be passed to gnrc (and generating the delayed response packet).

@jnohlgard
Copy link
Copy Markdown
Member

Could this be caused by bugs in the ethos implementation?

@jnohlgard
Copy link
Copy Markdown
Member

Nm, I just reread the PR title

@miri64
Copy link
Copy Markdown
Member Author

miri64 commented Dec 18, 2017

Could this be caused by bugs in the ethos implementation?

As you noticed yourself: ethos has nothing to do with this PR. However, it might be worth investigating what ethos is doing to prevent this issue.

}
}
else if (len < dev->pktfifo[idx]) {
else if (len < tsrb_avail(&dev->inbuf)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is theoretically possible to have multiple packets in the tsrb at when this is checked? If a byte arrives in the tsrb between the _recv() call to get the expected buffer size and the _recv() call to copy the buffer to the pktsnip, this would evaluate to true and return an error.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional problem clarification: If the above is possible, gnrc_netif_raw would return, but the current packet in the tsrb would never get removed, and would instead be returned as a received packet when the next packet is added to the tsrb (, as that one would trigger the next SLIP_END; NETDEV_EVENT_RX_COMPLETE etc sequence).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have fixed that now. See my latest commit.

@miri64
Copy link
Copy Markdown
Member Author

miri64 commented Dec 19, 2017

With the latest commit (a9c6648), I am able to force ping (sudo ping6 -f ...) from the host workstation without any significant loss:

[mlenders@sarajevo RIOT]<3 sudo ping6 -f aaaa::2
PING aaaa::2(aaaa::2) 56 data bytes
.^C
--- aaaa::2 ping statistics ---
2041 packets transmitted, 2040 received, 0% packet loss, time 33626ms
rtt min/avg/max/mdev = 19.455/19.629/20.534/0.206 ms, pipe 2, ipg/ewma 16.483/19.655 ms

There were multiple problems:

  1. The problem @bergzand pointed out: ENOBUFS is now returned, when res > len inside the fetch loop.
  2. On data remove in receive, len was used but it wasn't checked for SLIP_END. This way data of a packet that might have started to come in between the interrupt and the recv(NULL, 0) call might have been removed
  3. The un-escaping in the receive function wasn't handled properly: If an escaped END byte came in the loop would also stop (the byte variable was the stop-condition, but also was overwritten on un-escape).

@bergzand
Copy link
Copy Markdown
Member

@miri64 Works for me here too. I can't get it to break anymore.

@miri64
Copy link
Copy Markdown
Member Author

miri64 commented Dec 19, 2017

Then, since I'm going on vacation in 3 min, I guess I squash? :D

@miri64 miri64 force-pushed the slipdev/enh/simplify-byte-unstuffing branch from 4026535 to 72652ce Compare December 19, 2017 15:02
@miri64
Copy link
Copy Markdown
Member Author

miri64 commented Dec 19, 2017

(didn't wait for a reply, just squashed)

@miri64 miri64 added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Dec 19, 2017
@kb2ma
Copy link
Copy Markdown
Member

kb2ma commented Dec 20, 2017

Works for me, too! Many thanks. :-)

@miri64
Copy link
Copy Markdown
Member Author

miri64 commented Dec 27, 2017

Fixed Murdock issue. Can I get an ACK now?

dev->inesc = 0;
break;
}
/* falls through intentionally to default */
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These (this one and the case below) dont fall through anymore.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes they do, if dev->inesc is not set.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(but I can extent the comment to make that clear).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, completely missed that, better this way!

#define SLIP_END (0xC0U)
#define SLIP_ESC (0xDBU)
#define SLIP_END_ESC (0xDCU)
#define SLIP_ESC_ESC (0xDDU)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for the switch to upper case here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular, I guess. Shall I revert?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if it is not too much work :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C-v3jlgu isn't too much work IMHO ;)

@bergzand
Copy link
Copy Markdown
Member

bergzand commented Jan 5, 2018

ACK, Ready for squash.

This simplifies and solidifies the reversal of SLIP's byte-stuffing
(aka byte-unstuffing ;-)) by

1. Using `tsrb` instead of `ringbuffer`: there are two actors. The ISR
   and the device's event handler.
2. Moving the byte-unstuffing from the UART RX-handler (i.e. the ISR)
   to the device's receive function (potentially not the ISR)
3. Removing the `pktfifo` member. The current number of bytes in the
   ringbuffer is returned for `recv(data = NULL, len = 0)`. If that is
   more than the packet contains (due to the byte stuffing it most
   likely will be) the packet is reallocated in GNRC anyway.
@miri64 miri64 force-pushed the slipdev/enh/simplify-byte-unstuffing branch from c882206 to c61a343 Compare January 8, 2018 10:51
@miri64
Copy link
Copy Markdown
Member Author

miri64 commented Jan 8, 2018

Squashed

@bergzand bergzand merged commit 2f4a7e2 into RIOT-OS:master Jan 9, 2018
@miri64 miri64 deleted the slipdev/enh/simplify-byte-unstuffing branch January 9, 2018 10:05
@aabadie aabadie added this to the Release 2018.01 milestone Jan 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: drivers Area: Device drivers Area: network Area: Networking CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Ping fails from border router with slip

5 participants