@github-actions crossbow submit preview-docs |
|
Revision: 787c130
Submitted crossbow builds: ursacomputing/crossbow @ actions-ae307f872e
|
amoeba
left a comment
Hi @dgreiss, thanks for the follow-up PR. A few high-level points:
- Is it possible at this point to update your PR without all the whitespace changes? It takes extra work to figure out what text actually changed.
- Could the description of Apache Arrow be kept but just put lower down? I don't feel like the README would be too long with it included
|
Sounds good @thisisnic, I'll do that today. |
|
@thisisnic are you able to edit the PR? I don't think I can. Here's my patch that addresses my comments. This is also available on https://github.com/amoeba/arrow/tree/gh-35875-dgreiss-update-r-readme.
diff --git a/r/README.md b/r/README.md
index 8b6a050055..1f953a8e6f 100644
--- a/r/README.md
+++ b/r/README.md
@@ -12,7 +12,7 @@
The R `{arrow}` package provides access to many of the features of the [Apache Arrow C++ library](https://arrow.apache.org/docs/cpp/index.html) for R users. The goal of arrow is to provide an Arrow C++ backend to `{dplyr}`, and access to the Arrow C++ library through familiar base R and tidyverse functions, or `{R6}` classes.
-To learn more about the Apache Arrow project, see the parent documentation of the [Arrow Project](https://arrow.apache.org/). The Arrow project provides functionality for a wide range of data analysis tasks to store, process and move data fast. See the [read/write article](articles/read_write.html) to learn about reading and writing data files, [data wrangling](article/data_wrangling.html) to learn how to use dplyr syntax with arrow objects, and the [function documentation](reference/acero.html) for a full list of supported functions within dplyr queries.
+To learn more about the Apache Arrow project, see the parent documentation of the [Arrow Project](https://arrow.apache.org/). The Arrow project provides functionality for a wide range of data analysis tasks to store, process and move data fast. See the [read/write article](articles/read_write.html) to learn about reading and writing data files, [data wrangling](articles/data_wrangling.html) to learn how to use dplyr syntax with arrow objects, and the [function documentation](reference/acero.html) for a full list of supported functions within dplyr queries.
## Installation
@@ -65,6 +65,18 @@ Additional features include:
- Fine control over column types to work seamlessly with databases and data warehouses
- Toolkit for building connectors to other applications and services that use Arrow
+## What is Apache Arrow?
+
+Apache Arrow is a cross-language development platform for in-memory and
+larger-than-memory data. It specifies a standardized language-independent
+columnar memory format for flat and hierarchical data, organized for efficient
+analytic operations on modern hardware. It also provides computational libraries
+and zero-copy streaming, messaging, and interprocess communication.
+
+This package exposes an interface to the Arrow C++ library, enabling access to
+many of its features in R. It provides low-level access to the Arrow C++ library
+API and higher-level access through a dplyr backend and familiar R functions.
+
## Arrow resources
There are a few additional resources that you may find useful for getting started with arrow:
@@ -85,7 +97,10 @@ the [Apache Arrow Community](https://arrow.apache.org/community/) page.
If you encounter a bug, please file an issue with a minimal reproducible
example on [GitHub issues](https://github.com/apache/arrow/issues).
Log in to your GitHub account, click on **New issue** and select the type of
-@@ -104,11 +92,8 @@ features\*\* section of the [Contributing to Apache
+issue you want to create. Add a meaningful title prefixed with **`[R]`**
+followed by a space, the issue summary and select component **R** from the
+dropdown list. For more information, see the **Report bugs and propose
+features** section of the [Contributing to Apache
Arrow](https://arrow.apache.org/docs/developers/#contributing) page
in the Arrow developer documentation. |
Co-authored-by: Bryce Mecum <petridish@gmail.com>
|
Thanks @amoeba! FYI another alternative route that I take when I have had a long day and don't want to have to remind myself how to push to someone else's branch is also to just apply the changes as "suggestions", and then if you're a committer, you should be able to accept those suggestions on the PR. |
|
This is ready for merging once CI passes. Thanks very much @dgreiss, the readme is looking a lot more straightforward now! :) |
|
Ah, that's a nice trick, thanks @thisisnic. Still working on getting that commit bit :) |
|
After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit 54ff758. There was 1 benchmark result indicating a performance regression:
The full Conbench report has more details. It also includes information about 21 possible false positives for unstable benchmarks that are known to sometimes produce them. |
Rationale for this change
#35875, #35082, and #32895 make a number of recommendations to update the Readme.
What changes are included in this PR?
Rewording and reorganizing the Readme and sidebar.
Are these changes tested?
n/a
Are there any user-facing changes?
Yes