-
Notifications
You must be signed in to change notification settings - Fork 14
Unifying threads, functions and thread pooling #83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| int32 id = 1; | ||
| int32 appId = 2; | ||
| int32 appIndex = 3; | ||
| string masterHost = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I introduced these fields, but rearranged and renumbered the rest as they didn't make sense. According to the protobuf docs it's also most efficient to use numbers 1-15 for regularly used fields, so I made sure that's the case.
| @@ -0,0 +1,22 @@ | |||
| #pragma once | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DummyExecutor and the associated factory are needed to set a default in tests
|
|
||
| int main(int argc, char** argv) | ||
| { | ||
| auto logger = faabric::util::getLogger(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This chunk of code seemed to be duplicated over all MPI native examples so I factored it out
csegarragonz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. We can go over the details offline.
| FaabricMain.cpp | ||
| ) | ||
|
|
||
| faabric_lib(runner "${LIB_FILES}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we could remove the use of LIB_FILES? I wouldn't point it out but given that the file is fully new, we can already start applying faasm/faasm#413
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure whether it works with the faabric_lib function, if it was add_library it would be fine. I'll see if it's a straight swap, if not we can update all uses of faabric_lib as part of that issue.
| executeWithTestExecutor(req, false); | ||
|
|
||
| auto& sch = faabric::scheduler::getScheduler(); | ||
| faabric::Message result = sch.getFunctionResult(msgId, 1000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use conf.boundTimeout instead of the numbers? (Seen it elsewhere as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes having magic numbers like this isn't great. They're there to make sure the tests time out if they go wrong, but to give the system time to perform the action we're testing in the background. The bound timeout is usually longer than we'd want a test to hang before failing, but perhaps I can find a nicer way to set these values. (In this case boundTimeout has already been set back to its default value which is too long)
| bindQueue = std::make_shared<InMemoryMessageQueue>(); | ||
|
|
||
| // Set up the initial resources | ||
| int cores = faabric::util::getUsableCores(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to keep using the word cores? Feels weird to initialize a int cores variable and then set_slots with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I agree, this number is the number of cores, so calling it cores makes sense. We're then setting the number of slots equal to the number of cores, but the two are different things (i.e. we could have more slots than cores if we found that overloading didn't make much of a performance difference). They just happen to be 1:1 at the moment.
| throw std::runtime_error("Message without master host"); | ||
| std::string funcStr = faabric::util::funcToString(msg, false); | ||
|
|
||
| // Remove from warm executors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this operations could be simplified/made more efficient with std::remove_if. Namely, we can do something like:
warmExecutors[funcStr].erase(std::remove_if(
warmExecutors[funcStr].begin(), warmExecutors[funcStr].end(),
[](const auto& execPtr) {
execPtr.id == exec->id;
}), v.end());It's also worth having this other bad boy in mind: std::remove_copy_if, which copies matching elements from one container to another.
| return scheduler; | ||
| } | ||
|
|
||
| std::vector<std::string> Scheduler::callFunctions( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edited to point to the right line number.
src/scheduler/Scheduler.cpp
Outdated
| std::unordered_set<std::string> allHosts = getAvailableHosts(); | ||
| if (offset < nMessages) { | ||
| // At this point we know we need to enlist unregistered hosts | ||
| std::unordered_set<std::string> allHosts = getAvailableHosts(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we were to use an ordered container like std::set instead of std::unordered_set we could use std::set_difference, which feels a bit like what we are doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that might make sense. In general I avoid the ordered collections unless absolutely necessary as I've seen the performance of std::map to be noticeably different to std::unordered_map when doing inserts (i.e. enough to actually care about in something like the scheduler). The verbosity of std::set_difference also means in this case it would actually be more lines of code than what's there now. We will also still have to have the loop to iterate through the resulting diff, so I'm not sure it's an easy decision here, but I'll have a go.
The bigger worry in this bit of code is that we're querying Redis to get the list of available hosts, which will dwarf any performance benefit from using set_difference, so I'll actually have a look and see if I can remove that too...
|
|
||
| void Scheduler::callFunction(faabric::Message& msg, bool forceLocal) | ||
| { | ||
| // TODO - avoid this copy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT we are not copying anymore? Could we remove the TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure *req->add_messages() = msg; is doing a copy. If this function weren't so widely used I'd get rid of it altogether; all scheduling of functions should be done through callFunctions. Fortunately it's mostly just used in tests, and I'll port the important bits of Faasm (like the function chaining calls) to use callFunctions.
The purpose of this change is to unify the treatment of threads and functions into a single simple executor interface. This is necessary to support remote threading and scheduling of distributed threads, making them first class citizens alongside functions (in terms of scheduling decisions).
The key changes are:
Executorsubclasses is much simpler, with each only needing to implementexecuteTask, which takes a bulk request and a set of indexes saying which messages from that bulk request it needs to execute. For a simple function this request will hold one message and the list of indexes will be{0}.Executorsubclasses, including error handling, how they execute functions and threads etc.