| Authors | Project | Build Status | Code Quality |
|---|---|---|---|
| N. Curti E. Giampieri |
easyDAG | Linux/MacOS : Windows : |
Codacy : |
Modern C++ parallel scheduler. This project is the optimization and extension of the original easyDAG project in Python.
- Prerequisites
- Installation
- Usage
- Testing
- Contribution
- References
- Authors
- License
- Acknowledgments
- Citation
The easyDAG project is written in C++ using a large amount of c++17 features.
The package installation can be performed via CMake.
-
Follow your system prerequisites here.
-
Clone the
easyDAGpackage from this repository, or download a stable release
git clone https://github.com/Nico-Curti/easyDAG.git
cd easyDAGeasyDAGcould be built with CMake and Make or with the build scripts in the project. Example:
Unix OS:
./build.shWindows OS:
PS \> ./build.ps1We recommend the use of CMake for the installation since it is the most automated way to reach your needs.
First of all make sure you have a sufficient version of CMake installed (3.9 minimum version required).
If you are working on a machine without root privileges and you need to upgrade your CMake version a valid solution to overcome your problems is provided here.
With a valid CMake version installed first of all clone the project as:
git clone https://github.com/Nico-Curti/easyDAG
cd easyDAGThe you can build the easyDAG package with
mkdir -p build
cd build && cmake .. && cmake --build . --target installor more easily
./build.shif you are working on a Windows machine the right script to call is the build.ps1.
The following algebra.cpp example shows the basic APIs of easyDAG
#include <step.hpp>
#include <iostream>
int main ()
{
float x1 = 1.f;
double x2 = 2.;
float y1 = 2.f;
float y2 = 4.f;
// ____ a
InputVariable a(x1); // |
InputVariable b(x2); // ___ mul_1 *
InputVariable c(y1); // | |____ b
InputVariable d(y2); // |
// sum __ + ____ c
auto mul_1 = a * d; // | |
auto mul_2 = b * c; // |___ mul_2 *
auto sum = mul_1 + mul_2; // |____ d
std :: cout << "(a * d) + (b * c) = " << sum() << std :: endl;
// (a * d) + (b * c) = 8.f
auto equation = (a * d) + (b * c);
assert (equation() == sum());
return 0;
}The DAG is automatically evaluated using the template recursion of the Step object.
The result evaluation is performed asynchronously using future variables (in this case is useless this kind of optimization!).
A full list of default lambda math-functions is provided in the math.hpp file.
In the graphviz.cpp example we show how you can use easyDAG to visualize your DAG scheme using DOT.
const int N = 10;
const std :: string pipeline_name = "pipeline";
auto init = [](const int & N)
{
std :: vector < float > x(N);
return x;
};
auto fill = [](std :: vector < float > x)
{
std :: fill(x.begin(), x.end(), 1.f);
return x;
};
auto concat = [](std :: vector < float > x, std :: vector < float > y)
{
std :: vector < float > res(x.size() + y.size());
std :: copy_n(x.begin(), x.size(), res.begin());
std :: copy_n(y.begin(), y.size(), res.begin() + x.size());
return res;
};
auto reduce = [](std :: vector < float > res)
{
return std :: accumulate(res.begin(), res.end(), 0.f);
};
auto size = InputVariable(N);
size.set_name(size);
Task init_x(init, size);
Task init_y(init, size);
init_x.set_name(init_x);
init_y.set_name(init_y);
Task fill_x(fill, init_x);
Task fill_y(fill, init_y);
fill_x.set_name(fill_x);
fill_y.set_name(fill_y);
Task concatenate(concat, fill_x, fill_y);
concatenate.set_name(concatenate);
Task reduction(reduce, concatenate);
reduction.set_name(reduction);
reduction.graphviz(std :: cout, pipeline_name);We set a name to each step (the default value is just Task).
The work-flow can be summarized as:
- we create two vectors starting from just a dimension size;
- we fill the two vectors with some values;
- we concatenate them;
- we apply a sum reduction.
The DAG is written in the stdout in the DOT format so if you want to visualize the png you can re-direct the stdout to a file and use the command:
dot -Tpng graph.dot > reduction.pngNote: The default settings highlight computational step as boxes while variables as circle nodes.
In the cached.cpp example we proof the usage of cached values into the DAG evaluation.
const int N = 10;
auto init = [&](const int & N)
{
std :: cout << "I'm the initializer" << std :: endl;
float * x = new float[N];
return x;
};
auto fill = [&](float * x)
{
std :: cout << "I'm the filler" << std :: endl;
std :: fill_n(x, N, 1.f);
return x;
};
auto sum = [&](float * x)
{
std :: cout << "I'm the sum reduction" << std :: endl;
return std :: accumulate(x, x + N, 0.f);
};
auto prod = [&](float * x)
{
std :: cout << "I'm the prod reduction" << std :: endl;
return std :: accumulate(x, x + N, 1.f, std :: multiplies < float >());
};
auto a = InputVariable(N);
Task init_step(init, a);
Task fill_step(fill, init_step);
Task sum_step(sum, fill_step);
Task prod_step(prod, fill_step);
auto result = sum_step();
std :: cout << "Summation result: " << result << std :: endl;
// Re-calling the same evaluation the lambdas are not recomputed (aka no cout)!
std :: cout << "Re-called sum value: " << sum_step () << std :: endl;
// But also if we use common steps they do not need to be re-computed!
std :: cout << "Product result: " << prod_step() << std :: endl;If the DAG has been already evaluated its dependencies are not re-computed.
In this way long computations but also common steps do not need to be evaluated several times!
Starting from the scorer project we can rewrite the computation of the Matthews Correlation Coefficient using a DAG scheme.
You can find the full list of lambdas in the mcc.cpp example.
auto yt = InputVariable(y_true);
yt.set_name(y_true);
auto yp = InputVariable(y_pred);
yp.set_name(y_pred);
auto n_labels = InputVariable(Nlabels);
n_labels.set_name(Nlabels);
auto n_class = InputVariable < int >();
n_class.set_name(Nclass);
// Compute the unique classes
Task classes(get_classes, yt, yp, n_labels, n_labels);
classes.set_name(classes);
// set the Nclass as step for printing
n_class.set(Nclass);
// Compute the confusion matrix
Task confusion_matrix(get_confusion_matrix, yt, yp, n_labels,
classes, n_class);
confusion_matrix.set_name(confusion_matrix);
// Compute the True Positive
Task TP(get_TP, confusion_matrix, n_class);
TP.set_name(TP);
// Compute the False Positive
Task FP(get_FP, confusion_matrix, n_class);
FP.set_name(FP);
// Compute the False Negative
Task FN(get_FN, confusion_matrix, n_class);
FN.set_name(FN);
// Compute the Test outcome positive
Task TOP(get_TOP, TP, FP, n_class);
TOP.set_name(TOP);
// Compute the Condition positive or support
Task P(get_P, TP, FN, n_class);
P.set_name(P);
// Compute the Matthews Correlation Coefficient
Task MCC(get_overall_MCC, confusion_matrix, TOP, P, n_class);
MCC.set_name(MCC);
auto res = MCC();
std :: cout << "Matthews Correlation Coefficient: " << res << std :: endl;The helper.h and utils.h scripts include some useful APIs to manage Task variables.
First of all you can check the type of the current variable with
float x = 2.f;
int y = 2;
auto a = InputVariable(x);
auto b = InputVariable < decltype(y) > ();
auto sum1 = a + b;
static_assert ( utils :: is_variable < decltype(x) > :: value == true, "It is a variable");
static_assert ( utils :: is_variable < decltype(a) > :: value == true, "It is a variable");
static_assert ( utils :: is_variable < decltype(b) > :: value == true, "It is a variable");
static_assert ( utils :: is_variable < decltype(sum1) > :: value == false, "It is not a variable");
static_assert ( utils :: is_step < decltype(x) > :: value == false, "It is not a step");
static_assert ( utils :: is_step < decltype(a) > :: value == false, "It is not a step");
static_assert ( utils :: is_step < decltype(b) > :: value == false, "It is not a step");
static_assert ( utils :: is_step < decltype(sum1) > :: value == true, "It is a step");In this way you can check at compile time the variable types.
If you have already build a DAG you can be interested about the number of operations involved in the computation.
To this purpose the utils :: OperationCount can summarize you these kind of information. For example:
float x1 = 10.f;
double x2 = 2.;
float y1 = 3.f;
float y2 = 4.f;
auto a = InputVariable(x1);
auto b = InputVariable(x2);
auto c = InputVariable(y1);
auto d = InputVariable(y2);
auto sum = (a * b) + (c + d) + x1;
auto cnt = utils :: OperationCount < decltype(sum) >();
std :: cout << "The current DAG has " << cnt.num_variables << " variables" << std :: endl;
std :: cout << "The current DAG has " << cnt.num_operations << " operations" << std :: endl;
std :: cout << "The current DAG has " << cnt.num_nodes << " nodes" << std :: endl;
// The current DAG has 5 variables
// The current DAG has 4 operations
// The current DAG has 9 nodeseasyDAG uses CMake to build a full list of tests. You can disable tests setting the -DBUILD_TEST=OFF during the building.
All the test are performed using the Catch2 (v2.11.0) library.
The test scripts can be found here.
Any contribution is more than welcome ❤️. Just fill an issue or a pull request and we will check ASAP!
See here for further informations about how to contribute with this project.
See also the list of contributors who participated in this project.
The easyDAG package is licensed under the MIT "Expat" License.
Thanks goes to all contributors of this project.
If you have found easyDAG helpful in your research, please consider citing this project repository
@misc{easyDAG,
author = {Curti, Nico and Giampieri, Enrico},
title = {Easy DAG in C++},
year = {2019},
publisher = {GitHub},
howpublished = {\url{https://github.com/Nico-Curti/easyDAG}},
}


