Added Dataframe output format #6

Merged
auxten merged 1 commit into chdb-io:pybind from nmreadelf:pybind on Apr 16, 2023
Conversation

nmreadelf commented Apr 14, 2023

Collaborator

Added Dataframe output format

nmreadelf force-pushed the pybind branch 6 times, most recently from f6de615 to 4681d7a on April 15, 2023 13:50
nmreadelf changed the title from "Added to_df method" to "Added Dataframe output_format" on Apr 15, 2023
nmreadelf changed the title from "Added Dataframe output_format" to "Added Dataframe output format" on Apr 15, 2023
Comment thread chdb/__init__.py (outdated)
Comment thread chdb/__main__.py (outdated)
nmreadelf force-pushed the pybind branch 4 times, most recently from d65027d to 7b9fa38 on April 16, 2023 00:32
auxten merged commit 5232555 into chdb-io:pybind on Apr 16, 2023
auxten linked an issue on Apr 16, 2023 that may be closed by this pull request
auxten linked an issue on Apr 17, 2023 that may be closed by this pull request
auxten pushed a commit that referenced this pull request Jun 27, 2023
auxten pushed a commit that referenced this pull request Jun 28, 2023
auxten commented Jun 30, 2023

Member

@all-contributors please add @nmreadelf for code

@allcontributors

Contributor

@auxten

I've put up a pull request to add @nmreadelf! 🎉

auxten pushed a commit that referenced this pull request Aug 15, 2023
auxten pushed a commit that referenced this pull request Nov 9, 2023
auxten pushed a commit that referenced this pull request Jun 7, 2024
wudidapaopao added a commit to wudidapaopao/chdb-core-fork that referenced this pull request Feb 26, 2026
auxten added a commit that referenced this pull request May 21, 2026
The previous empty-result short-circuit in
``test_random_chain_matches_pandas`` checked
``isinstance(ds_result, pd.DataFrame)``, but ``apply_chain`` returns
a lazy ``DataStore``, not a materialized ``pd.DataFrame`` - so the
isinstance branch was always False and the short-circuit never
fired. CI on commit 64f38b8 surfaced this when Hypothesis generated
the chain ``filter(v>0) -> filter(v>99) -> sort(w desc) ->
groupby(cat).agg({'v':'sum'})`` (empty intermediate after the
second filter; column drift between pandas' agg-projected output
and DataStore's source-bound short-circuit output).
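
The failure mode described above can be reproduced in miniature. The sketch below is illustrative only: ``MaterializedFrame`` and ``LazyStore`` are hypothetical stand-ins for ``pd.DataFrame`` and ``DataStore`` (not the real classes), and ``old_short_circuit`` only mimics the shape of the check the message describes.

```python
class MaterializedFrame:
    """Hypothetical stand-in for a materialized pd.DataFrame."""

    def __init__(self, rows):
        self.rows = list(rows)

    def __len__(self):
        return len(self.rows)


class LazyStore:
    """Hypothetical stand-in for a lazy DataStore.

    It supports len() but is deliberately NOT a MaterializedFrame subclass.
    """

    def __init__(self, rows):
        self._rows = list(rows)

    def __len__(self):
        return len(self._rows)


def old_short_circuit(result):
    # Type-gated check: only fires for the materialized type, so an
    # empty lazy result slips past it.
    return isinstance(result, MaterializedFrame) and len(result) == 0


# An empty lazy result is not recognized as empty by the type-gated check.
print(old_short_circuit(LazyStore([])))          # False
print(old_short_circuit(MaterializedFrame([])))  # True
```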

Two changes:

- ``tests/test_property_based_chains.py``: replace the isinstance
  check with a duck-typed ``len(r) == 0`` helper that works on
  ``DataStore`` and ``pd.DataFrame`` alike. Reference the new
  journey test in the comment so future readers can find the
  tracked bug.

- ``tests/journeys/test_property_chain_followups.py`` (new): verbatim
  regression for the falsifying chain, marked
  ``@unittest.expectedFailure``. Follows the
  ``.cursor/rules/chdb-ds.mdc`` rule #6 contract: every Hypothesis
  falsifier becomes a permanent journey test; when the underlying
  empty-intermediate column-projection bug in the executor is
  fixed, flip the xfail and drop the property-test skip in the
  same commit.
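
A minimal sketch of the two changes, under stated assumptions: ``is_empty`` is a hypothetical name for the duck-typed helper, ``LazyStore`` is a stand-in for ``DataStore`` rather than the real class, and the column lists are placeholder values chosen only to make the pinned test fail the way an open bug would. The actual test files may look quite different.

```python
import unittest


def is_empty(result):
    """Duck-typed emptiness check: works for any object that supports len()."""
    return len(result) == 0


class LazyStore:
    """Hypothetical stand-in for a lazy DataStore (not the real class)."""

    def __init__(self, rows, columns):
        self._rows = list(rows)
        self.columns = list(columns)

    def __len__(self):
        return len(self._rows)


class EmptyIntermediateFollowup(unittest.TestCase):
    @unittest.expectedFailure
    def test_columns_after_empty_intermediate(self):
        # Placeholder values (assumptions): the reference output keeps only
        # the aggregated column, the stand-in keeps its source columns.
        reference_columns = ["v"]
        store = LazyStore(rows=[], columns=["cat", "v", "w"])
        self.assertTrue(is_empty(store))
        # Fails while the column mismatch exists; expectedFailure records it
        # as a known, tracked failure instead of breaking the run.
        self.assertEqual(store.columns, reference_columns)
```

Run it with ``python -m unittest``. Once the underlying mismatch is fixed, the decorated test starts passing, which ``unittest`` reports as an unexpected success; that is the signal to remove the decorator.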
Development

Successfully merging this pull request may close these issues.

Pandas dataframe output introduced pyarrow and pandas dependency
Convert chdb result to pandas/DataFrame format

2 participants