This project contains documents and diagrams for the design and implementation of a conversational agent named “Data Acquirer”, which produces code or data in response to data acquisition commands.
The project's goal is to make a general component design and a more or less complete grammar for the envisioned dialogs.
Here is a mind-map that outlines the scope of the project:
Here is a mind-map that shows the components of the conversational agent:
See the corresponding Raku packages [AAr1, AAr2].
The following diagram shows one way of using DataAcquirer:
We can have a Command Line Interface (CLI) with which we can specify a chain (UNIX-like pipeline) of commands:
```shell
dsl translate -c "get a sample of 3000 JSON commodity files from AWS;
parse them into long form data frame;
make a data package for them;
open a notebook with that data package loaded" |
dsl data-acquire |
dsl data-wrangle |
dsl make-notebook
```

The chain of four commands above:
- Parses and interprets the natural language commands
  - Produces, say, JSON or XML records that contain executable code
- Gets the commodity files from some default location
  - Might ask for credentials
- Uses a JSON parsing package in some programming language to make the long form data frame
- Makes a data package with the long form data frame
  - Python and R data packages are regular packages
  - A WL data package can be a resource function or a resource object
- Uploads the data package to some repository
  - A local packages installation folder or a private cloud
- Creates a notebook and populates it with command(s) that load the data package
- Opens the notebook in some notebook interpreter
  - Jupyter notebooks can be opened in a Web browser or in VSCode
  - R notebooks, in IntelliJ or RStudio
  - WL/Mathematica notebooks, in Mathematica
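To illustrate the "long form data frame" step above, here is a minimal Python sketch. The records, their field names, and the flattening scheme are assumptions for illustration only; the actual parsing and wrangling code is generated by the DSL packages for the user-chosen target language.

```python
import json

# Hypothetical JSON commodity records, standing in for files
# fetched by "dsl data-acquire"; the schema is an assumption.
raw_records = [
    '{"id": "c1", "commodity": "gold",   "price": 1800.5, "unit": "oz"}',
    '{"id": "c2", "commodity": "silver", "price": 23.1,   "unit": "oz"}',
]

def to_long_form(json_strings):
    """Flatten JSON records into long-form (id, variable, value) rows."""
    rows = []
    for s in json_strings:
        rec = json.loads(s)
        rec_id = rec.pop("id")
        for variable, value in rec.items():
            rows.append({"id": rec_id, "variable": variable, "value": value})
    return rows

long_form = to_long_form(raw_records)
for row in long_form:
    print(row)
```

Each record becomes one row per (variable, value) pair, which is the shape the subsequent data-package step would consume.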
Of course, the command `data-acquire` might (or should) ask for credentials.
Similarly, all commands can take a user specification, e.g. `-u joedoe32`.
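For illustration, an interpretation record passed between the commands might look like the following Python sketch. Every field name here (`user`, `dsl-target`, `command`, `code`) is a hypothetical example, not the actual DataAcquirer schema.

```python
import json

# Hypothetical interpretation record that a command like "dsl translate"
# might emit as JSON; the field names are assumptions for illustration.
record = {
    "user": "joedoe32",
    "dsl-target": "Python",
    "command": "get a sample of 3000 JSON commodity files from AWS",
    "code": "files = acquire_commodity_files(source='AWS', sample=3000)",
}

# Round-trip through JSON, as the records would travel through the pipe.
serialized = json.dumps(record)
decoded = json.loads(serialized)
print(decoded["user"])
```

Carrying the user specification inside the record is one way the downstream commands could pick up credentials or preferences without repeating the `-u` flag.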
An alternative representation of the workflow above:
```shell
dsl interpret -c "get a sample of 3000 JSON commodity files from AWS" |
dsl interpret -c "parse JSON files collection" |
dsl interpret -c "transform into long form data frame" |
dsl interpret -c "make a data package" |
dsl interpret -c "open a notebook with that data package loaded"
```

Here is a diagram that shows the interaction between workflow steps and DAW components:
(The NLP Template Engine has its own repository, [AAr4].)
DataAcquirer is very similar to the conversational agent Sous Chef Susana, both in component design and in grammar design and elements.
See also the Raku package: DSL::English::FoodPreparationWorkflows, [AAr3].
[AAr1] Anton Antonov, DSL::Entity::Metadata Raku package, (2021), GitHub/antononcube.
[AAr2] Anton Antonov, DSL::English::DataAcquisitionWorkflows Raku package, (2021), GitHub/antononcube.
[AAr3] Anton Antonov, DSL::English::FoodPreparationWorkflows Raku package, (2021), GitHub/antononcube.
[AAr4] Anton Antonov, NLP Template Engine, (2021), GitHub/antononcube.



