socket_zep: port to radio HAL#16932
Conversation
Is this what's missing? If so, the driver only needs to send ACK when requested, because the SubMAC will handle the ACK. |
cpu/native/socket_zep/socket_zep.c
Outdated
| { | ||
| (void) dev; | ||
|
|
||
| /* TODO: handle ACK */ |
There was a problem hiding this comment.
You don't require to handle ACK here, since the SubMAC will do it automatically.
There was a problem hiding this comment.
So I only need to send ACKs, I don't need to process them at all?
There was a problem hiding this comment.
yes, you only need to send an ACK frame "TXRXTurnaroundTime" usecs after receiving a valid frame.
You should forward the ACKs to the upper layer, without requiring to process them.
Alternatively if ACKs are not relevant for the scope of ZEP, you can also "omit" them with the ACK timeout CAP. Then, the TX_DONE callback will be interpreted as "the radio is not triggering an ACK timeout so it received an ACK"
There was a problem hiding this comment.
TXRXTurnaroundTime is symbol_time * 12. For O-QPSK this is 192 us.
| break; | ||
| } | ||
| case ZEP_V2_TYPE_ACK: { | ||
| break; |
There was a problem hiding this comment.
you should pass this ACK to the upper layer if you want the SubMAC to take care of retransmissions. Alternatively, you can implement this here and use caps to indicate that the ACK's are managed, but probably the former will be easier
There was a problem hiding this comment.
Ah so I can just pass on the L2 ACK payload and don't have to distinguish between Data frames and ACK frames in the driver at all?
There was a problem hiding this comment.
I don't think that these are supposed to be used in tandem to L2 ACKs. I believe more (since they were only introduced with V2 of ZEP) that they are a ZEP internal (as always: ZEP is rather sparsely documented). So rather handle them separately, in case we need them later for their actual purpose.
There was a problem hiding this comment.
Ah so I can just pass on the L2 ACK payload and don't have to distinguish between Data frames and ACK frames in the driver at all?
This is correct. The SubMAC will take care of it
I don't think that these are supposed to be used in tandem to L2 ACKs. I believe more (since they were only introduced with V2 of ZEP) that they are a ZEP internal (as always: ZEP is rather sparsely documented). So rather handle them separately, in case we need them later for their actual purpose.
I don't know how ZEP works. In fact, do ZEP retransmissions exists? Can these frame get lost somehow?
There was a problem hiding this comment.
ZEP retransmissions exists?
ZEP just wraps a IEEE 802.15.4 frame inside a UDP packet, so it should be able to do everything a real IEEE 802.15.4 network can do.
Can these frame get lost somehow?
If we simulate a network with ZEP dispatcher they can get lost, just like in a real network. 😉
There was a problem hiding this comment.
I don't know how ZEP works. In fact, do ZEP retransmissions exists? Can these frame get lost somehow?
Nobody knows, I think... the only doc that can be found on this is the packet format in the Wireshark code and Wiki. Since ZEP could, in theory, be sent in normal Internet traffic, I think having retransmissions in ZEP can make sense, if you want to keep the ZEP layer somewhat transparent from 802.15.4.
If we simulate a network with ZEP dispatcher they can get lost, just like in a real network. wink
I think in that case, the IEEE 802.15.4 frame should rather be dropped, not the ZEP message. Yes this implies also the ZEP message to be dropped, so semantics, I know, but a crucial difference when it comes to the layer retransmissions are happening on ;-).
cpu/native/include/socket_zep.h
Outdated
| typedef struct { | ||
| netdev_ieee802154_t netdev; /**< netdev internal member */ | ||
| const socket_zep_params_t *params; | ||
| ieee802154_dev_t *hal; /**< radio HAL */ |
There was a problem hiding this comment.
in the SPI radios it's possible to pass this pointer as an ISR argument, so you don't need to store it here. Would it be possible here?
There was a problem hiding this comment.
For the ISR this works, but there are other places where I need to access sock_fd
There was a problem hiding this comment.
but why does the private descriptor needs access to the hal? can't this be accessed directly from the private descriptor?
|
What I'm trying to understand is why I'm getting so many retransmissions: |
This means either the SubMAC didn't receive an ACK on time or didn't receive it at all. Did you implement the ACK mechanism? |
|
I added ACK sending in 6b117a4, without a delay so far as I assume it's not needed on |
Hmmmm does the send function handle FCS? I think I saw in the other send function that the FCS was calculated manually. This should be the same for the ACK |
|
Ah I wasn't sure if the sub-MAC would check it, added it now. (still no difference though, there might be something else wrong) |
|
do you see the ACK frames in wireshark? |
|
Yes the ACKs are there in Wireshark, but instead of them I read 3 bytes (always edit ah now I get it! The sub-mac wants to read a 3 byte ACK and doesn't use With this it appears to be working :D |
Glad to hear it works! But this I didn't understand. Why do you need to store the full frame? ...
uint8_t ack[3];
switch (ev) {
case IEEE802154_FSM_EV_RX_DONE:
assert(!ieee802154_radio_has_irq_ack_timeout(&submac->dev));
if (ieee802154_radio_read(&submac->dev, ack, 3, NULL) &&
ack[0] & IEEE802154_FCF_TYPE_ACK) {
ieee802154_submac_ack_timer_cancel(submac);
ieee802154_tx_info_t tx_info;
tx_info.retrans = submac->retrans;
bool fp = (ack[0] & IEEE802154_FCF_FRAME_PEND);
ieee802154_radio_set_frame_filter_mode(&submac->dev, IEEE802154_FILTER_ACCEPT);
return _tx_end(submac, fp ? TX_STATUS_FRAME_PENDING : TX_STATUS_SUCCESS,
&tx_info);
}
return IEEE802154_FSM_STATE_WAIT_FOR_ACK;
...It just calls |
because before the ACK frame there comes the ZEP header which I also need to read.I have to store that somewhere. |
ah! I see. Yes, that seems reasonable. |
Ah that doesn't work, when I do a partial read on a datagram, the remaining bytes will simply be discarded. Btw.: I still quite often get Is this due to me not waiting before sending the ACK? |
Hmmm if there's a header before, I think the RX buffer would be the only option.
Could you try this patch? diff --git a/sys/include/net/ieee802154/submac.h b/sys/include/net/ieee802154/submac.h
index 044903e4e3..9dff7e047e 100644
--- a/sys/include/net/ieee802154/submac.h
+++ b/sys/include/net/ieee802154/submac.h
@@ -549,7 +549,9 @@ ieee802154_fsm_state_t ieee802154_submac_process_ev(ieee802154_submac_t *submac,
*/
static inline void ieee802154_submac_ack_timeout_fired(ieee802154_submac_t *submac)
{
- ieee802154_submac_process_ev(submac, IEEE802154_FSM_EV_ACK_TIMEOUT);
+ if (submac->fsm_state == IEEE802154_FSM_STATE_WAIT_FOR_ACK) {
+ ieee802154_submac_process_ev(submac, IEEE802154_FSM_EV_ACK_TIMEOUT);
+ }
}
/**I think the issue is a race condition between the RX_DONE event ACK_TIMEOUT event of the SubMAC. The solution would be to either PR that patch or simply ignore the ACK_TIMEOUT state on IDLE. Since I don't see other sources of interrupts there, I think it's safe to assume the only case where an ACK_TIMEOUT is triggered in IDLE is because of this race condition. Therefore probably the latter would be cleaner. The most elegant solution would be to extend the state machne to support better state transitions (let's say "enter" and "exit" events for each state). We would only need to modify one function ( |
|
With that patch things are running a lot more smoothly! Does it also make sense for 'real' radios? |
|
Great to hear that!
The mechanism yes, but I would probably choose to write is as a transition. There are some transitions that are simply ignored, but some others are intentionally asserted because it might be caused by a wrong driver implementation or a bug in the SubMAC. What those lines are trying to prevent there is a race condition between ztimer and the RX_DONE event of the radio, due to the fact they are posted in a single queue. When reading the ACK frame takes longer, the timer fires before calling We could simply ignore ACK_TIMEOUT events during IDLE. I think it's unlikely this would happen because of a faulty driver or a bug in the SubMAC. pre-print EDIT: I just noticed there's an diff --git a/drivers/netdev_ieee802154_submac/netdev_ieee802154_submac.c b/drivers/netdev_ieee802154_submac/netdev_ieee802154_submac.c
index baa1efd509..d5692a5f57 100644
--- a/drivers/netdev_ieee802154_submac/netdev_ieee802154_submac.c
+++ b/drivers/netdev_ieee802154_submac/netdev_ieee802154_submac.c
@@ -151,6 +151,9 @@ void ieee802154_submac_ack_timer_cancel(ieee802154_submac_t *submac)
submac);
xtimer_remove(&netdev_submac->ack_timer);
+ /* Prevent a race condition between the RX_DONE event and the ACK timeout */
+ netdev_submac->isr_flags &= ~NETDEV_SUBMAC_FLAGS_ACK_TIMEOUT;
+
}
static int _send(netdev_t *netdev, const iolist_t *pkt)
|
|
That patch works too. I get |
bcf03a1 to
765a273
Compare
|
diff --git a/cpu/native/socket_zep/Kconfig b/cpu/native/socket_zep/Kconfig
new file mode 100644
index 0000000000..4b3d70d305
--- /dev/null
+++ b/cpu/native/socket_zep/Kconfig
@@ -0,0 +1,17 @@
+menuconfig MODULE_SOCKET_ZEP
+ bool "Socket-based ZEP"
+ depends on CPU_MODEL_NATIVE
+ depends on TEST_KCONFIG
+ select MODULE_IOLIST
+ select MODULE_CHECKSUM
+ select MODULE_RANDOM
+ select MODULE_IEEE802154
+ help
+ UDP socket-based IEEE 802.15.4 device over ZEP
+
+config MODULE_SOCKET_ZEP_HELLO
+ bool "Send a dummy HELLO packet on startup"
+ depends on MODULE_SOCKET_ZEP
+ help
+ Say y to send a dummy HELLO packet on startup. This is used to make
+ dispatchers aware of the node.
diff --git a/drivers/Kconfig.net b/drivers/Kconfig.net
index 37738a53c5..0634122267 100644
--- a/drivers/Kconfig.net
+++ b/drivers/Kconfig.net
@@ -18,6 +18,7 @@ rsource "ncv7356/Kconfig"
rsource "pn532/Kconfig"
rsource "rn2xx3/Kconfig"
rsource "slipdev/Kconfig"
+rsource "$(RIOTCPU)/native/socket_zep/Kconfig"
rsource "sx126x/Kconfig"
rsource "sx127x/Kconfig"
rsource "tja1042/Kconfig"
diff --git a/tests/ieee802154_hal/Makefile b/tests/ieee802154_hal/Makefile
index e2b7e95b3d..24fa7d1e5e 100644
--- a/tests/ieee802154_hal/Makefile
+++ b/tests/ieee802154_hal/Makefile
@@ -24,6 +24,10 @@ ifeq ($(BOARD), native)
ZEP_PORT_BASE ?= 17754
TERMFLAGS += -z [::1]:$(ZEP_PORT_BASE)
USEMODULE += socket_zep
+ # the same for Kconfig
+ ifeq (1,$(TEST_KCONFIG))
+ KCONFIG_ADD_CONFIG += $(APPDIR)/app.config.test.native
+ endif
endif
USEMODULE += od
diff --git a/tests/ieee802154_hal/app.config.test.native b/tests/ieee802154_hal/app.config.test.native
new file mode 100644
index 0000000000..3ffd15ce5f
--- /dev/null
+++ b/tests/ieee802154_hal/app.config.test.native
@@ -0,0 +1,7 @@
+CONFIG_MODULE_SOCKET_ZEP=y
+
+# Should be autoselecting the MODULE_PRNG_HWRNG if possible
+# Since the makefile cannot we have to override until end of migration
+# Remove when TEST_KCONFIG is complete
+CONFIG_MODULE_PRNG_TINYMT32=y
+ |
|
ping @jia200x @leandrolanzieri |
The Kconfig changes LGTM |
|
@jia200x anything missing to squash? |
|
please squash! |
tests/socket_zep tests for this condition, so we better handle it gracefully instead of crashing.
0f5aca5 to
4a9ceb5
Compare
4a9ceb5 to
c5f9d50
Compare
|
all green @jia200x |
|
it looks fine to me! @benpicco could you post the test results again, after all the requested changes were addressed? Otherwise, let's get this one soon! |
|
I ran the test with (25% packet loss between A and B) masterthis PRand with a perfect link (flat topology) this PR |
jia200x
left a comment
There was a problem hiding this comment.
ACK. Code-wise looks good and I trust the test results.
|
Thank you for the review! |
Contribution description
This ports
socket_zepto the radio HAL so it can make use of automatic retransmissions.implement ACK handlingTesting procedure
flat topology
This should work the same as in
master:start the ZEP dispatcher
start two
nativeinstances that communicate with the ZEP interfaceBroadcast ping should still work:
As well as unicast
advanced topology
(this requires #16957)
start the ZEP dispatcher for two nodes on a symmetric link with 75% probability of successful transmisson
Unicast ping should still work reliably
Compared to master (no retransmissions):
Issues/PRs references
depends on #16957 for ACKs to be dispatched in non-flat topologies