Conversation
- Added `netdev_driver_t::drop` to drop the received frame - Added `netdev_driver_t::size` to get the expected size of the received frame without dropping it - Added module `netdev_driver_glue` that implements those two functions by calling `netdev_driver_t::recv` for dropping the received frame and to get its size, which before this commit was the canonical API - Added a `LOG_WARNING` to auto_init when this glue is used to motivate programmers to move to the new API - Added this glue to all drivers - Added comments to indicate where the dropping frame functionality is not implemented
|
Btw: I will pause working on this from tomorrow till the last week of September. However, edits from maintainers are enabled :-) |
|
While I think it is a good idea to make the code more readable, this also is a major API change, so its needs to be handled with care. Only argument I personally see against it is a minor (so I won't block because of that) increase in RAM and ROM. |
|
Hmm, an increase in RAM is unexpected to me :-( Maybe there is something wrong in my code, as this change should only increase ROM size by a few bytes. |
The size of the driver struct increases due to this, that's mainly what I was getting it with the RAM size increase... But I could be wrong with that. |
|
I think the initial approach here is good. From my memory, most radio drivers implement the separate usage cases of |
I think the driver struct is marked as |
Except for netdev_tap unfortunately, see my fix in #9834 |
|
+76b code for gnrc_networking on samr21-xpro.
I'm not so sure about this. Most drivers do something like:
Seperating size() from recv() will add another function, with duplicated device locking, and probably a third internal function that gets the actual size from the device. all additional code, as I think this is one of the cases where a slight name mismatch (
How about we just add "drop()" to the interface? |
|
If I remember correctly, the cc110x does not handle the drop and the size case. Apparently newlib's memcpy checks for NULL pointers and ignores that calls. So even for the more common size case bugs can go unnoticed, if the received frames remains in memory until the next arrives. Changing the name would also increase awareness. But it would be interesting to see by how much ROM size increases with the driver being properly adapted to the new API, instead of introducing the extra glue functions. Maybe than it is no longer ~100 Bytes. Also the LOG_WARNING about legacy code in auto_init.c contributes significantly to the ~100B increase when glue code is used. |
No, cc110x (apart from being undermaintained) uses different higher level code, which handles the "size" and / or allocation problem differently. (Which will obviously be a kick in the balls for the next guy trying to use cc110x with something else than gnrc).
I keep hearing that ~N bytes don't matter. Each individual case doesn't, but we have hundreds of these cases. They add up, and in the and, there's a bloated binary. Should drop() be factored out, would recv() be fine as is, with updated documentation? |
|
I don't like the approach of adding a new module here, it will not detect any existing issues with missing functionality. IMO the only way to do this correctly is to refactor this directly inside the device drivers. A first step would be to fix the existing device drivers so that they all conform to the current API (add missing drop functionality). After the drivers are conformant, we can continue with doing whatever API changes we agree on. @kaspar030 I see your point about a large overlap between size and recv, but I think the benefit from having a clear API outweighs the cost of two more function pointers with associated function calls. If the split is done correctly, there should be only the cost of a few function calls added to the ROM size: put the common recv/size prologue inside a separate (private) function, call it from recv and size, remove some of the conditionals in the original combined implementation. |
|
(I'm arguing for keeping size() in recv())
There would not only be more function pointers, it would lead to more complex implementations or duplicated code. In recv() and size() case, I don't see the benefit. From a netdev user perspective, the expectation would be that recv() called with a buffer returns the next packet, and calling recv() with "NULL" as buffer is a mistake. Should the question arise "how do I find out the next packet size before allocating a buffer?", both existing code and the documention point to the "unclear" method of using From a device driver developer perspective, having to return the size when the supplied buf is NULL might not be obvious, but will fail very fast, and reading the doxygen or any other driver should clear things up. The price of getting this clear is quite a lot more boilerplate code in every driver, as both size() and recv() will need to get the actual size, and both lock the device. Using the glue method as in this PR makes things look way more complex with no benefit at all, but correctly implementing the proposed API will lead to more lines of either boilerplate or redundant code. Personally, I'd implement size() as recv(..., NULL, ...) anyways. At that point, we're paying extra pointers, extra functions, extra indirection and an API change for this "clarity". I estimate at least two more functions and +~20 lines of code per driver. If we change perspective and size and recv were distinct functions already, and someone would propose a PR integrating "size()" into "recv()", removing the function calls (both ROM and execution overhead), removing ~250 lines of redundant boilerplate code (12 drivers times 20 lines), would we dismiss the PR because it messes up API clarity? IMO we have two actual issues ("drop" not being implemented by drivers and thus bad behaviour, and "recv()" overloaded so this could stay unnoticed). We should fix those and add "drop()", implementing it for all drivers. Whether having |
|
@kaspar030 good arguments. I am in favour of your proposal: fork drop into a separate function, keep size and recv in the same function. |
|
Maybe a good compromise in case anyone is still arguing for splitting size and recv would be to have an inline function in static inline int netdev_recv_buf_size(netdev_t *dev)
{
return dev->driver->recv(dev, NULL, 0, NULL);
} |
|
As reply to @kaspar030's arguments:
This is to simplified. int main(void) {return 0;}Above program is pretty much the least bloated C program. Adding any code to it would increase RAM & ROM requirements. So obviously, a change that increases ROM/RAM requirements is not necessary bad. You have to check if the benefits outweigh the increase in code size. So careful consideration is better than just dismiss any change increasing ROM/RAM right away. An investigation by how much ROM requirements actually increase by porting a driver (after the drop has been implemented for fair comparison) to the new API candidates would clearly help this careful consideration.
This is only true, if the upper layer do use this feature. In case of the cc110x, it is neither implemented nor used. The same is true for the nrfble and possibly others. In case of separate function pointers, upper layer code could simply check for the presence of the required implementation by a simple
Basically it would change from int recv(netdev_t *dev, void *buf, size_t len, void *info)
{
header_t hdr;
lock(dev);
parse_header(dev, &hdr);
if (!buf) {
unlock(dev);
return hdr.size;
}
[...]
}to int size(netdev_t *dev)
{
header_t hdr;
lock(dev);
parse_header(dev, &hdr);
unlock(dev);
return hdr.size;
}
int recv(netdev_t *dev, void *buf, size_t len, void *info)
{
header_t hdr;
lock(dev);
parse_header(dev, &hdr);
[...]
}The latter removes 4 lines of code and adds 8, so its 4 lines longer. I bet in non-pseudocode it will be at most 8 lines more, when having static helper functions for locking and header parsing. And those helper functions need to be added for the drop function in many drivers anyway, in case the hardware buffers more than one frame (like the en28j60): The frame is dropped by advancing the position in a ring buffer by the frame length.
The current API may have led to the drop functionally being unimplemented in many cases, which let to possible denial of service attacks against the en28j60 driver: Sending two frames fast enough and the network stack got stuck in a loop trying to drop the second frame. Lets assume an upper layer would simply allocate maximum frame size buffers when there is no memory pressure. Im that case a driver not implementing the size feature would just work. But under memory pressure the upper layer may adapt using the size feature and allocate only as much memory as actually required. With an overloaded Summing up: A clean API regarding the drop feature could very well have prevented a DoS vulnerability in the en28j60 driver. It is easy to imagine the same for a clean API regarding the size feature. I believe security and maintainability to be bigger challenges in the IoT than the resource requirements. And a clean API can help to increase both. So in case a clean API can be implemented without to much RAM/ROM overhead, it is IMO worth the trouble. |
|
To add some numbers to the discussion: How the numbers have been obtaineddiff --git a/examples/gnrc_networking/Makefile b/examples/gnrc_networking/Makefile
index 86b7a94c1..5fb504c23 100644
--- a/examples/gnrc_networking/Makefile
+++ b/examples/gnrc_networking/Makefile
@@ -34,6 +34,7 @@ USEMODULE += ps
USEMODULE += netstats_l2
USEMODULE += netstats_ipv6
USEMODULE += netstats_rpl
+USEMODULE += enc28j60
# Comment this out to disable code in RIOT that does safety checking
# which is not needed in a production environment but helps in theCompiled with Differences in ROM/RAM requirementsSo zero RAM overhead in any case, 108 Bytes overhead with separate drop only, 146 Bytes overhead with both drop and size separated. Differences in lines of code:Summary
So trading in the clean separation between the |
This is a text-book argument. Rarely "maintainability" and "security" are not number one points when it comes to networking software developed in a distributed way. But can you provide numbers for "cleanlyness", security or maintainability benefits? Touching all those drivers (which are perfectly working apart from the unimplemented "drop()"), refactoring them, testing them, reviewing the changes, changing documentation, (changing API), ... is a shot in the foot, maintanance wise. If there was an identified security problem (e.g., a bug), it would be worth the trouble. But there is not. On the contrary, there are valid arguments that the size()-in-recv() cannot cause much harm. Seperating size() from recv(), I see more internal functions, more code, more ROM requirement, more runtime overhead and developer time better invested somewhere else. |
I agree, in literature the opinion that security and maintainability is more valuable than size is vastly shared. That would be a point for a separate size function.
There are no metrics that allow to quantity security or maintainability that have a sound foundation. But concluding to give up on both until adequate metrics come available and focus on size instead is just dangerous!
Please share them. This far, you only shared an argument that it does cause harm:
|
The point here is that "security" and "maintainability" are often used in a technical discussion as joker arguments. Any number of changes could be justified with them. Rewriting everything in Rust would kill a whole class of (security sensitive) bugs and through it's more modern language features probably provides quite some maintainability advantages. Why don't we do that? Software development is all about tradeoffs. RIOT is an embedded system, low ressource usage is key. If we'd go with "clean" on every occasion (there are hundreds of APIs), the price would add up to hundreds of bytes, maybe kilobytes. For some use cases, that doesn't matter at all. For some, it does. Maybe take a look at code usage over time here. RIOT is getting heavier. If that weight is due to necessary features, that is acceptable. If it is through bloat that doesn't make a compiled system any better, it is not. Again, we're developing embedded software. For desktop computers, technological advances make low-level optimization a little less important every day. For embedded systems, technological advances pack the same amount of RAM and flash into smaller packages or make them use less power. The more I think about this, the more I'm convinced that even seperating drop() is not worth the extra bytes. netdev is more of an internal API. We'd expect maybe a couple of dozen users, compared to more user facing APIs like sock, which will have much higher user counts. It is entirely possible to go through all drivers and make sure they expose dropping functionality as expected. IMO we can fix the missing drop without adding ROM usage. "misleading" API can be dealt with differently. Documentation goes a long way and is for free in terms of ROM usage. There are more options for improving netdev that don't add bloat. |
Because POSIX mandates a C-API, which is more complex to provide on top of a Rust implementation. Because Rust is not available for a lot of embedded devices, while C is available for virtually all platforms. Because Rust is relatively young, which means fewer people have experience and thus less contributors and uses, which means non-backward-compatible language changes are more likely to happen. See? Not a joker argument ;-)
I saw this "its ugly, but it is internal, so users are not affected"-argument quite often. Every time an internal and unnecessary bug results of this, people start realising that internal functions do have quite a lot of (indirect) users, that are all affected by the bug. Also, first-time contributers will likely add device drivers, when they own a particular yet unsupported board and what to use RIOT on it. So even though this API is internal, non-core developers will sooner or later get directly in touch with it. The documentation already is quite clear, it is the API that is not clear yet. So the API is right now the weakest link in the chain, not the documentation. Reinforcing a strong link in a chain without changing the weakest often doesn't help at all.
No. But just adding the missing functionality without changing the API would increase the ROM size significantly less. However, this is not a one-time effort: Every added driver will needed to be reviewed with additional care. The fact that even a lot of capable programmer and reviewers overlooked this issue proves how easily this issue is missed. Also: What about drivers using upper layer not requiring the size() implementation? With the clean API it could be left out without having to worry much: Adding an
In the upper layers the code is already quite readable, e.g. in So, it looks like most arguments for the different approaches on how to react to the reoccurring bug of the missing drop implementation have been exchanged. I updated the issue #9805 and added the two additional proposals and the major pros/cons for each. (This could also be moved to the wiki, so everyone can add pros and cons, to get a more complete list, if there is demand for that.) |
|
Closed in favor of #10410 |
Contribution description
netdev_driver_t::dropto drop the received framenetdev_driver_t::sizeto get the expected size of the received frame without dropping itnetdev_driver_gluethat implements those two functions by callingnetdev_driver_t::recvfor dropping the received frame and to get its size, which before this commit was the canonical APILOG_WARNINGto auto_init when this glue is used to motivate programmers to move to the new APIIssues/PRs references
#9805