Skip to content

fix: handle string datetime in interpreter#3913

Closed
djbarnwal wants to merge 4 commits intovega:mainfrom
djbarnwal:fix/string-datetime-interpreter
Closed

fix: handle string datetime in interpreter#3913
djbarnwal wants to merge 4 commits intovega:mainfrom
djbarnwal:fix/string-datetime-interpreter

Conversation

@djbarnwal
Copy link
Copy Markdown
Member

Fixes #3632

string datetimes do not work with the expression interpreter. Didn't use the isString util as the package has no dependencies. Thought of keeping it that way.

@hydrosquall
Copy link
Copy Markdown
Member

hydrosquall commented Jan 19, 2025

I haven't tested this locally yet, but from reading this looks like a reasonable approach, thanks for contributing a fix.

If you rebase your branch against the latest main, you will be able to get a deploy preview for this PR posted automatically for easier testing.

Would you be able to help with adding a unit test and/or a sample spec to ensure the fix persists?

@djbarnwal
Copy link
Copy Markdown
Member Author

djbarnwal commented Jan 20, 2025

Hey @hydrosquall thanks for following up on this.

If you rebase your branch against the latest main, you will be able to get a deploy preview for this PR posted automatically for easier testing.

I have now merged main into this PR.

Would you be able to help with adding a unit test and/or a sample spec to ensure the fix persists?

I don't see any function specific tests inside the vega-interpreter package. Can you point me to where I need to add a sample spec or unit test?

Copy link
Copy Markdown
Member

@hydrosquall hydrosquall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @djbarnwal, to run tests locally, see

yarn run lerna run test --scope=vega-interpreter

Currently this PR did not pass the basemap scene test (I'm surprised to see that since it does not seem to parse dates... )

image

The input is here, and we validate against the generated scenegraphs here

I believe we would want to make sure that there is a spec where a datetime is provided as a string when using interpreter mode so that when this suite runs, your new JSON would be picked up.

On a side note, I agree with you that having some locally scoped tests could be beneficial to look into adding. I'll discuss it with the team and see if that can be added to the project in a followup.

@djbarnwal djbarnwal requested a review from hydrosquall February 3, 2025 05:28
@djbarnwal
Copy link
Copy Markdown
Member Author

djbarnwal commented Feb 3, 2025

Currently this PR did not pass the basemap scene test

I ran the test suite locally multiple times but couldn't recreate this. Seems to pass now in CI

I believe we would want to make sure that there is a spec where a datetime is provided as a string when using interpreter mode so that when this suite runs, your new JSON would be picked up

I have a added a simple Vega spec which was converted from a Vega Lite spec mentioned in the original issue #3632

Copy link
Copy Markdown
Member

@hydrosquall hydrosquall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the test! One question for you about implementation + an optional idea to simplify the test, and I think we'd be good to go.


const datetime = (y, m, d, H, M, S, ms) =>
new Date(y, m || 0, d != null ? d : 1, H || 0, M || 0, S || 0, ms || 0);
const datetime = (...args) => {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is called many times, so we would benefit from not needing to create 2 intermediary arrays per call (one with the spread operator, the other with the args destructure).

What do you think of something like this?

const datetime = (yearOrTimestring, m = 0, d = 1, H = 0, M = 0, S = 0, ms = 0) =>
  typeof yearOrTimestring === 'string' 
    ? new Date(yearOrTimestring)
    : new Date(yearOrTimestring, m, d, H, M, S, ms);

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, I will adopt this change.

"stacked-area",
"stacked-bar",
"stocks-index",
"stocks-single-line",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this realistic test spec 👍 I didn't realize this test output scenegraphs were so big (1700 lines!). I see that the others nearby are already similar to this size ( https://github.com/vega/vega/blob/da55a528a4569522b225e652e7106a1afa05d10d/packages/vega/test/scenegraphs/stocks-index.json ), so this one isn't an outlier.

I think we could show this example working with a simpler scenegraph (i.e. just the points for 2004 2006) as we don't have to plot multiple years to see whether this change works. What do you think?

{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "description": "Google's stock price over in the second half of 2004.",
  "background": "white",
  "padding": 5,
  "width": 200,
  "height": 200,
  "style": "cell",
  "data": [
    {
      "name": "stocks",
      "url": "data/stocks.csv",
      "format": {"type": "csv", "parse": {"date": "date"}, "delimiter": ","},
      "transform": [
        {"type": "filter", "expr": "datum.symbol==='GOOG'"},
        {"type": "filter", "expr": "year(datum.date) === 2004"}
      ]
    }
  ],
  "marks": [
    {
      "name": "marks",
      "type": "line",
      "style": ["line"],
      "sort": {"field": "x"},
      "from": {"data": "stocks"},
      "encode": {
        "update": {
          "stroke": {"value": "#4c78a8"},
          "description": {
            "signal": "\"date: \" + (timeFormat(datum[\"date\"], '%b %d, %Y')) + \"; price: \" + (format(datum[\"price\"], \"\"))"
          },
          "x": {"scale": "x", "field": "date"},
          "y": {"scale": "y", "field": "price"},
          "defined": {
            "signal": "isValid(datum[\"date\"]) && isFinite(+datum[\"date\"]) && isValid(datum[\"price\"]) && isFinite(+datum[\"price\"])"
          }
        }
      }
    }
  ],
  "scales": [
    {
      "name": "x",
      "type": "time",
      "domain": {
        "fields": [
          {"signal": "{data: datetime(\"2004-08-01T00:00:00\")}"},
          {"signal": "{data: datetime(\"2004-12-01T00:00:00\")}"}
        ]
      },
      "range": [0, {"signal": "width"}]
    },
    {
      "name": "y",
      "type": "linear",
      "domain": {"data": "stocks", "field": "price"},
      "range": [{"signal": "height"}, 0],
      "nice": true,
      "zero": true
    }
  ],
  "axes": [
    {
      "scale": "x",
      "orient": "bottom",
      "gridScale": "y",
      "grid": true,
      "tickCount": {"signal": "ceil(width/40)"},
      "domain": false,
      "labels": false,
      "aria": false,
      "maxExtent": 0,
      "minExtent": 0,
      "ticks": false,
      "zindex": 0
    },
    {
      "scale": "y",
      "orient": "left",
      "gridScale": "x",
      "grid": true,
      "tickCount": {"signal": "ceil(height/40)"},
      "domain": false,
      "labels": false,
      "aria": false,
      "maxExtent": 0,
      "minExtent": 0,
      "ticks": false,
      "zindex": 0
    },
    {
      "scale": "x",
      "orient": "bottom",
      "grid": false,
      "title": "date",
      "labelFlush": true,
      "labelOverlap": true,
      "tickCount": {"signal": "ceil(width/40)"},
      "zindex": 0
    },
    {
      "scale": "y",
      "orient": "left",
      "grid": false,
      "title": "price",
      "labelOverlap": true,
      "tickCount": {"signal": "ceil(height/40)"},
      "zindex": 0
    }
  ]
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I have pushed a commit with the updated spec where the source data is filtered on the year 2005 so that we have a full year of data. I have also added clip: true so that the mark doesn't extend beyond the domain and we have limited number of points in the scenegraph.

Copy link
Copy Markdown
Member

@hydrosquall hydrosquall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey thanks for making the simpler spec and the code change. I'd like to get your opinion about how we document this change going forward, and flag a possible bug

I see that our docs only advertise this "list of options" form, and was initially just going to suggest we advertise that both string and "sequence of numbers" forms were OK.

https://vega.github.io/vega/docs/expressions/#datetime

However, I looked into how our non-interpreter constructor works

and found that since we use the bare Date constructor, we support 5 kinds of date construction. One of these, passing an integer timestamp as the sole argument, will not be accurately represented with the new var name (y can actually be dateString, year, or timestamp).

What do you think about us changing the behavior and documentation "datetime takes the same arguments as Javascript's new Date" and linking out to MDN, rather than enumerating what each positional argument means in a custom way?

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/Date#parameters

From the look of the docs, it looked like support for these other types of datetime constructor (string, # of milliseconds, another date object) may not have been intentional, so one could argue that the bug here is that interpreter is closer to spec, and "regular" mode could be made stricter.

However, at this point people are already using those other date formats in their date constructor, so it may be too late to restrict the non-interpreter mode to only support these individual date and time component values formats.

@djbarnwal
Copy link
Copy Markdown
Member Author

What do you think about us changing the behavior and documentation "datetime takes the same arguments as Javascript's new Date" and linking out to MDN, rather than enumerating what each positional argument means in a custom way?

I agree with the proposed change. As you mentioned, many users—including myself—have been taking advantage of the broader range of date formats available. This flexibility has been especially useful for interoperability between JavaScript-based runtimes and Vega-Lite.

Instead of maintaining Vega's own custom standard for date construction, it makes sense to align with the established behavior of JavaScript’s new Date

Copy link
Copy Markdown
Member

@domoritz domoritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pull request. I think it's okay to depend on vega-utils to get the isString method. Yes, it's the same but this way we have a single place where this logic sits and it can be deduped for smaller bundles.

@hydrosquall
Copy link
Copy Markdown
Member

@domoritz as a next step, what do you think about merging this PR as-is, vs making adjustments for the string util and/or changing the function signature to match the JS date constructor?

I think both the follow-ups could come in a separate PR, just wanted to check if we are on the same page. The first PR would allow us to make a release that solves the streamlit case ( streamlit/streamlit#5733 (comment) )

domoritz added a commit that referenced this pull request Mar 26, 2025
@domoritz
Copy link
Copy Markdown
Member

moved to #4042 with utils import and a simpler test

domoritz added a commit that referenced this pull request Mar 26, 2025
@domoritz
Copy link
Copy Markdown
Member

done in #4042

@domoritz domoritz closed this Mar 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

vega-interpreter does not handle datetimes consistently

3 participants