-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Follow-up on arrow-avro Documentation #8402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7aa5764 to
e5cd3d8
Compare
e5cd3d8 to
93b6e0a
Compare
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jecsand838 -- this is looking great. I had a few suggestions, but otherwise it looks like a great improvement to me
arrow-avro/src/lib.rs
Outdated
| //! 3) Decode it back to Arrow using a `Decoder` configured with a `SchemaStore` that | ||
| //! maps the schema ID to the Avro schema used by the writer. | ||
| //! | ||
| //! ```ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we enable this one now that we have merged
arrow-avro/src/lib.rs
Outdated
| //! This example registers the writer schema (computing a Rabin fingerprint), writes a | ||
| //! single‑row Avro body, constructs the SOE frame, and decodes it back to Arrow. | ||
| //! | ||
| //! ```ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
likewise here, this should now work, right and we can use the Avro writer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely, and I just pushed up those changes.
arrow-avro/src/lib.rs
Outdated
| //! while (n & !0x7F) != 0 { out.push(((n as u8) & 0x7F) | 0x80); n >>= 7; } | ||
| //! out.push(n as u8); | ||
| //! } | ||
| //! fn enc_len(l: usize, out: &mut Vec<u8>) { enc_long(l as i64, out); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example doesn't' actually use an AvroStreamWriter but the text says it does.
Can we update it to use the writer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good callout! I cleaned this up.
arrow-avro/src/reader/mod.rs
Outdated
| //! {"name":"id","type":"long"}]}"#.to_string()); | ||
| //! let fp = store.register(avro_schema)?; | ||
| //! | ||
| //! // Minimal Avro long encoder (zig-zag + varint). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use the Avro writer here too? If the idea is to show how to use the decoder, maybe we could just hide the code that creates the input body (aka prefix all the lines related to creating the file with # )
The same thing applies to the examples below too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely. I made sure to hide the input body code as well.
arrow-avro/src/writer/mod.rs
Outdated
| //! - **Confluent wire format**: magic `0x00` + **big‑endian** 4‑byte schema ID and Avro body. | ||
| //! <https://docs.confluent.io/platform/current/schema-registry/fundamentals/serdes-develop/index.html#wire-format> | ||
| //! | ||
| //! ## Quickstart: Write an OCF in memory (runnable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally suggest moving this example to the AvroWriter struct itself as I think it will be easier to find
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked the docs before pushing and agree with you. 100% good callout!
arrow-avro/src/writer/mod.rs
Outdated
| //! use std::sync::Arc; | ||
| //! use arrow_array::{ArrayRef, Int64Array, RecordBatch}; | ||
| //! use arrow_schema::{DataType, Field, Schema}; | ||
| //! use arrow_avro::writer::AvroStreamWriter; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
likewise, i suggest moving this to the docs on AvroStreamWriter as I think it will be easier to find
arrow-avro/src/writer/mod.rs
Outdated
| //! If you plan to interoperate with a schema registry, add the appropriate prefix yourself | ||
| //! (see links above). | ||
| //! | ||
| //! ```ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can run too?
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
034b6e7 to
2cc0264
Compare
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2cc0264 to
9fe5fe4
Compare
f8a11b4 to
76827d7
Compare
@alamb Really appreciate the solid review. I went ahead and pushed up changes that address your comments. Let me know what you think when you get a chance! |
|
CI failure unrelated to this PR |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me -- thanks @jecsand838
# Which issue does this PR close? Follow on to - #8402 # Rationale for this change Let's tell people about arrow-avro # What changes are included in this PR? Touchup the main crate docs to mention arrow-avro # Are these changes tested? CI and I tested the docs manually # Are there any user-facing changes? Docs only cc @jecsand838
Which issue does this PR close?
Rationale for this change
@alamb had some recommendations for improving the
arrow-avrodocumentation in #8316. This is a follow-up to address those suggestions.What changes are included in this PR?
lib.rsdocumentationreader/mod.rsimproved documentation and inlined exampleswriter/mod.rsimproved documentation and inlined examplesNOTE: Some doc tests are temporarily ignored until #8371 is merged in.
Are these changes tested?
Yes, doc tests have been included which all run (with the exception of 3 ignored ones that will work soon)
Are there any user-facing changes?
N/A