Conversation
Codecov Report
@@ Coverage Diff @@
## dev #1367 +/- ##
==========================================
+ Coverage 54.13% 54.15% +0.02%
==========================================
Files 138 137 -1
Lines 20636 20564 -72
Branches 5199 5183 -16
==========================================
- Hits 11171 11137 -34
+ Misses 6553 6517 -36
+ Partials 2912 2910 -2
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
|
Sorry for spamming reviewer (un)assignment - I'm being wishy washy about whether to change to a Vec, which I suspect would be faster and use less memory in the common case. Going to sketch it out a bit more. By the way - do you happen to know whether it's important to assign the lowest free descriptor first? Or is arbitrary order ok? e.g. if we create fd's 0, 1, 2, 3, 4, 5, then destroy 2, then destroy 3, is it ok if the next one created is 3? The C version sorts the free list to ensure the lowest free # is always returned, but the tests pass with arbitrary order. |
I believe the Linux behavior is to always return the lowest free descriptor. I'm not sure that it's important to mirror this exact behavior, though I suppose some app could build logic that's dependent on this behavior. |
No problem. I think a vector is a good idea, it's probably faster in all cases, but I'm not sure at what point the memory usage of a spares vector becomes greater than a hash table. In a large tor experiment, what would the upper bound be on the number of file descriptors per process? Maybe ~2,000 sockets for a guard/exit?
The order of the descriptors is useful for i/o redirection. It's not as important now that there are close(0);
open("filename");And you could rely on the kernel giving the new file a fd of 0. I think we should keep this behaviour as it is the expected behaviour and is part of posix, unless there's a strong reason not to. It also makes debugging error cases easier in tests when the program uses the same fd numbers both inside and outside shadow. I don't think we need to keep a sorted list though, we could just iterate over them until we find an unused one. |
| /// Remove the descriptor at the given index and return it. Panics if there | ||
| /// is no descriptor at `idx`. | ||
| pub fn remove(&mut self, idx: u32) -> CompatDescriptor { | ||
| let mut descriptor = self.descriptors.remove(&idx).unwrap(); | ||
| self.available_indices.push(idx); | ||
| descriptor.set_handle(0); | ||
| descriptor | ||
| } |
There was a problem hiding this comment.
descriptortable_remove() is called from process_deregisterCompatDescriptor(), and I don't think we want shadow to panic if a plugin calls close() on a fd that doesn't exist. Maybe return an Option instead?
| pub fn set(&mut self, index: u32, descriptor: CompatDescriptor) -> Option<CompatDescriptor> { | ||
| let mut descriptor = descriptor; |
There was a problem hiding this comment.
Shortened as just pub fn set(... , mut descriptor: CompatDescriptor)?
There was a problem hiding this comment.
I thought it'd be a little weird to have mut in the interface, since the caller shouldn't care. I checked cargo doc though, and it looks like the generated docs do ignore it, so it's mostly moot :). Done.
| super::export::compatdescriptor_setHandle( | ||
| &mut descriptor as *mut CompatDescriptor, | ||
| index.try_into().unwrap(), | ||
| ); |
There was a problem hiding this comment.
compatdescriptor_setHandle() is only called from descriptor_table.c, so this compatdescriptor_setHandle() function could be removed entirely from shadow and replaced with just an inline:
if let CompatDescriptor::Legacy(d) = &mut descriptor {
unsafe { c::descriptor_setHandle(d.ptr(), index) }
}There was a problem hiding this comment.
I'd already created CompatDescriptor::set_handle (i.e. a Rust native version), but had missed a couple call sites. Fixed.
| table: *mut DescriptorTable, | ||
| descriptor: *mut CompatDescriptor, | ||
| ) -> c_int { | ||
| let table = unsafe { table.as_mut().unwrap() }; |
There was a problem hiding this comment.
table.as_mut() -> notnull_mut_debug(table).as_mut()? (And same with the other descriptortable_ functions below).
There was a problem hiding this comment.
as_mut and as_ref are safe to call on null pointers: https://doc.rust-lang.org/std/primitive.pointer.html#method.as_mut
| /// Store a descriptor object for later reference at the next available index | ||
| /// in the table. The chosen table index is stored in the descriptor object and | ||
| /// returned. The descriptor is guaranteed to be stored successfully. | ||
| /// | ||
| /// NOTE: that this consumes a reference to the descriptor, so if you are storing | ||
| /// it outside of the descriptor table you will need to ref it after calling | ||
| /// this function. |
There was a problem hiding this comment.
These comments from descriptor_table.h are out of date. The descriptor object doesn't generally store the table index anymore (only the legacy descriptors do).
Also, the descriptor table consumes the descriptor, so it should not be stored or used after this function call.
| /// Stop storing the descriptor so that it can no longer be referenced. The table | ||
| /// index that was used to store the descriptor is cleared from the descriptor | ||
| /// and may be assigned to new descriptors that are later added to the table. | ||
| /// Returns an owned pointer to the CompatDescriptor if the descriptor was found | ||
| /// in the table and removed, and NULL otherwise. | ||
| /// | ||
| /// NOTE: this will not unref the descriptor and you should do so manually. | ||
| #[no_mangle] | ||
| pub unsafe extern "C" fn descriptortable_remove( |
There was a problem hiding this comment.
The returned type CompatDescriptor doesn't have a reference count. Maybe the comment should say it should be freed with compatdescriptor_free()?
Also this function doesn't ever return NULL, but probably should following the earlier comment about process_deregisterCompatDescriptor().
| pub struct DescriptorTable { | ||
| descriptors: std::collections::HashMap<u32, CompatDescriptor>, | ||
| available_indices: Vec<u32>, | ||
| index_counter: u32, | ||
| } |
There was a problem hiding this comment.
If we want to keep the available indices sorted so that we can always take the smallest, a BTreeSet might work well here.
Ok, yeah that clinches it for me :). I see
Yeah, that's the main drawback. While I still think a Vec is probably going to better in most cases, it won't be by that much, and could have unexpected poor behavior if it is sparse. Leaving it as a HashMap for this PR. |
| let idx = if let Some(idx) = self.available_indices.iter().next() { | ||
| // Un-borrow from `available_indices`. | ||
| let idx = *idx; | ||
| // Take from `available_indices` | ||
| trace!("Reusing available index {}", idx); | ||
| self.available_indices.remove(&idx); | ||
| idx |
There was a problem hiding this comment.
Ah, it's too bad BTreeSet::pop_first() isn't stable yet.
There was a problem hiding this comment.
Yeah :/. I suppose we could backport it like I did for stream_len, but doesn't seem worth it for just this.
7faa630 to
4412027
Compare
4412027 to
da9022f
Compare
Wanted to start nailing down some more of how the object graph will work out in Rust; started to port Process to Rust, but looks like DescriptorTable needed to be ported first. (Though in hindsight I suppose the Rust Process could have just stored a
Box<DescriptorTable>)