Timezone information dropped from polars dataframe #3921

@kjgoodrick

Description

What happened?

When plotting from a polars dataframe that has a datetime column with a timezone, the timezone is not passed on to vega-lite. This is a noticeable problem when plotting data across a time change (here, the end of daylight saving time on 2023-11-05 in US/Mountain). Simple example below:

import polars as pl
import altair as alt

from polars import col as c

start_datetime = pl.datetime(2023, 11, 5, time_zone='US/Mountain')

datetimes = pl.datetime_range(
    start_datetime,
    start_datetime.dt.offset_by("3h"),
    "1h",
    closed="both",
    eager=True
)

# Create a Polars DataFrame with a datetime column (with timezone)
df = pl.DataFrame({
    'datetime': datetimes,
    'value': [10, 20, 30, 40]
})

# Plot with Altair using the datetime column (with timezone)
color = alt.Color("value:N", title="Value")
chart_dt = alt.Chart(df).mark_point().encode(
    alt.X('datetime:T', title='Datetime (with timezone)'),
    alt.Y('value:Q', title='Value'),
    color
).properties(title='Altair plot: Datetime with timezone')

Looking at the data supplied to vega-lite in the editor, the timezone information has been dropped, so vega-lite does not know that the second 1 AM data point is at the later 1 AM.

"datasets": {
    "zzz": [
      {"datetime": "2023-11-05T00:00:00", "value": 10},
      {"datetime": "2023-11-05T01:00:00", "value": 20},
      {"datetime": "2023-11-05T01:00:00", "value": 30},
      {"datetime": "2023-11-05T02:00:00", "value": 40}
    ]
  }
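
To make the ambiguity concrete, here is a minimal sketch that parses the four values from the dataset above using only the Python standard library. It does not involve Altair or polars, and the variable names are illustrative only.

```python
from collections import Counter
from datetime import datetime

# Timestamps exactly as they appear in the serialized dataset above (no UTC offset).
naive_strings = [
    "2023-11-05T00:00:00",
    "2023-11-05T01:00:00",
    "2023-11-05T01:00:00",
    "2023-11-05T02:00:00",
]

parsed = [datetime.fromisoformat(s) for s in naive_strings]

# None of the parsed values carries timezone information.
all_naive = all(dt.tzinfo is None for dt in parsed)

# Count how often each wall-clock time occurs; anything above 1 is ambiguous.
counts = Counter(parsed)
ambiguous = [dt.isoformat() for dt, n in counts.items() if n > 1]

print(all_naive)   # True
print(ambiguous)   # ['2023-11-05T01:00:00']
```

Four rows collapse to three distinct timestamps, so a consumer of this data cannot tell the two 1 AM points apart.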

If the datetime column is manually converted to a string the chart appears as expected:

# Plot with Altair using the string column
chart_str = alt.Chart(df.with_columns(c.datetime.dt.to_string())).mark_point(point=True).encode(
    alt.X('datetime:T', title='Datetime (as string)'),
    alt.Y('value:Q', title='Value'),
    color
).properties(title='Altair plot: Datetime as string')

chart_str
"datasets": {
    "zzz": [
      {"datetime": "2023-11-05 00:00:00.000000-06:00", "value": 10},
      {"datetime": "2023-11-05 01:00:00.000000-06:00", "value": 20},
      {"datetime": "2023-11-05 01:00:00.000000-07:00", "value": 30},
      {"datetime": "2023-11-05 02:00:00.000000-07:00", "value": 40}
    ]
  }
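
As a quick check that the offset-bearing strings are unambiguous, the sketch below parses the four values above with the standard library and converts them to UTC. It only illustrates the string contents; it does not show how Altair or vega-lite parse them.

```python
from datetime import datetime, timedelta, timezone

# Timestamps exactly as produced by the manual string conversion above.
offset_strings = [
    "2023-11-05 00:00:00.000000-06:00",
    "2023-11-05 01:00:00.000000-06:00",
    "2023-11-05 01:00:00.000000-07:00",
    "2023-11-05 02:00:00.000000-07:00",
]

# Each string carries its own UTC offset, so it maps to exactly one instant.
as_utc = [datetime.fromisoformat(s).astimezone(timezone.utc) for s in offset_strings]

# Differences between consecutive instants.
gaps = [later - earlier for earlier, later in zip(as_utc, as_utc[1:])]
evenly_spaced = all(gap == timedelta(hours=1) for gap in gaps)

print(len(set(as_utc)))  # 4
print(evenly_spaced)     # True
```

All four points are distinct and exactly one hour apart, which matches the `"1h"` interval used to build the dataframe.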

A molab replication of this is available.

What would you like to happen instead?

Altair should preserve timezone information if it exists in the dataframe, and the example plot should display as it does when the column is converted to a string manually.
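
A rough sketch of the requested behavior, using only the standard library: `serialize_datetime` is a hypothetical helper, not Altair's actual serialization code, and the fixed offsets `MDT` and `MST` are assumed stand-ins for US/Mountain before and after the time change.

```python
from datetime import datetime, timedelta, timezone


def serialize_datetime(dt: datetime) -> str:
    """Hypothetical helper: ISO 8601 string that keeps the UTC offset if present."""
    # datetime.isoformat() appends the offset only for timezone-aware values,
    # so naive values still serialize without one.
    return dt.isoformat()


# Assumed fixed offsets standing in for US/Mountain on 2023-11-05.
MDT = timezone(timedelta(hours=-6))  # before the time change
MST = timezone(timedelta(hours=-7))  # after the time change

first_1am = datetime(2023, 11, 5, 1, 0, tzinfo=MDT)
second_1am = datetime(2023, 11, 5, 1, 0, tzinfo=MST)

print(serialize_datetime(first_1am))   # 2023-11-05T01:00:00-06:00
print(serialize_datetime(second_1am))  # 2023-11-05T01:00:00-07:00
```

With the offset kept, the two 1 AM rows serialize to different strings, as in the manual string conversion shown earlier.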

Which version of Altair are you using?

6.0.0

Labels: bug, needs-triage (Bug report needs maintainer response)