Anton Antonov
MathematicaForPrediction at GitHub
MathematicaVsR project at GitHub
November, 2016
This project has multiple sub-projects for the different data wrangling tasks needed for statistics (machine learning and data mining).
Data wrangling in R is heavily influenced by the creation (publication and description) of the packages "plyr", [1,2], and "reshape2", [3].
The need in R for a package like "plyr" arises from R's central data structures (vectors, lists, data frames) and its complicated system of data structure transformation functions. (See, for example, Circle 4 of the book "The R Inferno", [4].) In Mathematica the functionalities of "plyr" are easily programmed with common, base Mathematica functions.
Nevertheless, the know-how of data wrangling in R is much more streamlined, both in base functions and in packages, and there are multiple easy-to-find resources on the Internet for doing particular data wrangling tasks with R.
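To illustrate the remark that "plyr"-style operations can be written with base Mathematica functions, here is a minimal sketch of the split-apply-combine strategy described in [2]. The toy data and the variable names (`data`, `groupMeans`, `groups`) are invented for this illustration and do not come from the project's code.

```mathematica
(* Hypothetical toy data: {group, value} pairs, invented for illustration *)
data = {{"a", 1}, {"a", 3}, {"b", 2}, {"b", 6}, {"c", 5}};

(* Split by the first column, apply Mean to the values, combine into an Association *)
groupMeans = GroupBy[data, First -> Last, Mean]
(* <|"a" -> 2, "b" -> 4, "c" -> 5|> *)

(* The same three steps spelled out with GatherBy and Map *)
groups = GatherBy[data, First];
Map[{#[[1, 1]], Mean[#[[All, 2]]]} &, groups]
(* {{"a", 2}, {"b", 4}, {"c", 5}} *)
```

The first form returns an Association keyed by group, while the second returns a plain list of pairs; which one is more convenient depends on what the downstream code expects.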
A list of some basic comparison documents and codes.
- Mathematica
    - "Automatically generated data ingestion report"
- R
    - "Simple data reading and analysis functionalities", (RMarkdown file)
    - "Automatically generated data ingestion report"
[1] Hadley Wickham, "plyr: Tools for Splitting, Applying and Combining Data", CRAN. Also see http://had.co.nz/plyr/.
[2] Hadley Wickham, "The Split-Apply-Combine Strategy for Data Analysis", (2011), Journal of Statistical Software, Volume 40, Issue 1.
[3] Hadley Wickham, "reshape2: Flexibly Reshape Data: A Reboot of the Reshape Package", CRAN.
[4] Patrick Burns, "The R Inferno", (2012), free PDF link.