Merge branch 'main' into blog/ibis-udf-rewriting
hussainsultan authored Feb 7, 2025
2 parents c372323 + 4b92c3b commit 0713806
Showing 126 changed files with 2,722 additions and 1,748 deletions.
2 changes: 1 addition & 1 deletion .devcontainer/Dockerfile
@@ -1,5 +1,5 @@
FROM mcr.microsoft.com/vscode/devcontainers/python:3.13
-COPY --from=ghcr.io/astral-sh/uv:0.5.26 /uv /uvx /bin/
+COPY --from=ghcr.io/astral-sh/uv:0.5.29 /uv /uvx /bin/
ARG USERNAME=vscode

RUN apt-get update && \
2 changes: 1 addition & 1 deletion .github/workflows/assign.yml
@@ -10,6 +10,6 @@ jobs:
runs-on: ubuntu-latest
if: github.event.comment.body == '/take'
steps:
-- uses: pozil/auto-assign-issue@v2.1.2
+- uses: pozil/auto-assign-issue@v2.2.0
with:
assignees: ${{ github.event.comment.user.login }}
2 changes: 1 addition & 1 deletion .github/workflows/create-rotate-key-issue.yml
@@ -11,7 +11,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Generate a GitHub token
-uses: actions/create-github-app-token@v1.11.2
+uses: actions/create-github-app-token@v1.11.3
id: generate_token
with:
app-id: ${{ secrets.SQUAWK_BOT_APP_ID }}
2 changes: 1 addition & 1 deletion .github/workflows/docs-preview.yml
@@ -13,7 +13,7 @@ jobs:
cancel-in-progress: true
if: github.event.label.name == 'docs-preview'
steps:
-- uses: actions/create-github-app-token@v1.11.2
+- uses: actions/create-github-app-token@v1.11.3
id: generate_token
with:
app-id: ${{ secrets.DOCS_BOT_APP_ID }}
36 changes: 25 additions & 11 deletions .github/workflows/ibis-backends-cloud.yml
@@ -22,18 +22,20 @@ env:

jobs:
test_backends:
-name: ${{ matrix.backend.title }} python-${{ matrix.python-version }}
+name: ${{ matrix.backend.title }} python-${{ matrix.python-version }}-${{ matrix.os }}
# only a single bigquery or snowflake run at a time, otherwise test data is
# clobbered by concurrent runs
concurrency:
group: ${{ matrix.backend.title }}-${{ matrix.python-version }}-${{ github.event.label.name || 'ci-run-cloud' }}
cancel-in-progress: false

-runs-on: ubuntu-latest
+runs-on: ${{ matrix.os }}
if: github.event_name == 'push' || github.event.label.name == 'ci-run-cloud'
strategy:
fail-fast: false
matrix:
+os:
+- ubuntu-latest
python-version:
- "3.10"
- "3.13"
@@ -51,33 +53,44 @@ jobs:
extras:
- --extra athena
include:
- python-version: "3.10"
- os: ubuntu-latest
python-version: "3.10"
backend:
name: bigquery
title: BigQuery
extras:
- --extra bigquery
- python-version: "3.13"
- os: ubuntu-latest
python-version: "3.13"
backend:
name: bigquery
title: BigQuery
extras:
- --extra bigquery
- --extra geospatial
- python-version: "3.10"
- os: ubuntu-latest
python-version: "3.10"
backend:
name: snowflake
title: Snowflake + Snowpark
key: snowpark
extras:
- --extra snowflake
- python-version: "3.11"
- os: ubuntu-latest
python-version: "3.11"
backend:
name: snowflake
title: Snowflake + Snowpark
key: snowpark
extras:
- --extra snowflake
+- os: windows-latest
+python-version: "3.12"
+backend:
+name: snowflake
+title: Snowflake
+extras:
+- --extra snowflake
# this allows extractions/setup-just to list releases for `just` at a higher
# rate limit while restricting GITHUB_TOKEN permissions elsewhere
permissions:
@@ -97,7 +110,7 @@ jobs:
fetch-depth: 0
ref: ${{ github.event.pull_request.head.sha }}

-- uses: actions/create-github-app-token@v1.11.2
+- uses: actions/create-github-app-token@v1.11.3
id: generate_token
with:
app-id: ${{ secrets.DOCS_BOT_APP_ID }}
@@ -138,6 +151,7 @@ jobs:

- name: setup databricks credentials
if: matrix.backend.name == 'databricks'
+shell: bash
run: |
{
echo "DATABRICKS_HTTP_PATH=${DATABRICKS_HTTP_PATH}"
@@ -151,6 +165,7 @@

- name: setup snowflake credentials
if: matrix.backend.name == 'snowflake'
+shell: bash
run: |
pyversion="${{ matrix.python-version }}"
{
@@ -160,6 +175,9 @@
echo "SNOWFLAKE_DATABASE=${SNOWFLAKE_DATABASE}"
echo "SNOWFLAKE_SCHEMA=${SNOWFLAKE_SCHEMA}_python${pyversion//./}_${{ matrix.backend.key }}"
echo "SNOWFLAKE_WAREHOUSE=${SNOWFLAKE_WAREHOUSE}"
+if ${{ matrix.backend.key == 'snowpark' }}; then
+echo "SNOWFLAKE_SNOWPARK=1" >> "$GITHUB_ENV"
+fi
} >> "$GITHUB_ENV"
env:
SNOWFLAKE_USER: ${{ secrets.SNOWFLAKE_USER }}
@@ -176,10 +194,6 @@
aws-region: us-east-2
role-to-assume: arn:aws:iam::070284473168:role/ibis-project-athena

-- name: enable snowpark testing
-if: matrix.backend.key == 'snowpark'
-run: echo "SNOWFLAKE_SNOWPARK=1" >> "$GITHUB_ENV"
-
- name: "run parallel tests: ${{ matrix.backend.name }}"
run: just ci-check "${{ join(matrix.backend.extras, ' ') }} --extra examples" -m ${{ matrix.backend.name }} --numprocesses auto --dist loadgroup

2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -14,7 +14,7 @@ jobs:
release:
runs-on: ubuntu-latest
steps:
-- uses: actions/create-github-app-token@v1.11.2
+- uses: actions/create-github-app-token@v1.11.3
id: generate_token
with:
app-id: ${{ secrets.APP_ID }}
2 changes: 1 addition & 1 deletion .github/workflows/update-nix-flakes.yml
@@ -13,7 +13,7 @@ jobs:
- name: install nix
uses: DeterminateSystems/nix-installer-action@v16

-- uses: actions/create-github-app-token@v1.11.2
+- uses: actions/create-github-app-token@v1.11.3
id: generate-token
with:
app-id: ${{ secrets.SQUAWK_BOT_APP_ID }}
6 changes: 3 additions & 3 deletions ci/release/dry_run.sh
@@ -35,14 +35,14 @@ nix develop '.#release' -c git commit -m 'test: semantic-release dry run' --no-v
unset GITHUB_ACTIONS

nix develop '.#release' -c npx --yes \
-p "semantic-release@24.0.0" \
-p "semantic-release" \
-p "@semantic-release/commit-analyzer" \
-p "@semantic-release/release-notes-generator" \
-p "@semantic-release/changelog" \
-p "@semantic-release/exec" \
-p "@semantic-release/git" \
-p "semantic-release-replace-plugin@1.2.0" \
-p "conventional-changelog-conventionalcommits@8.0.0" \
-p "semantic-release-replace-plugin" \
-p "conventional-changelog-conventionalcommits" \
semantic-release \
--ci \
--dry-run \
6 changes: 3 additions & 3 deletions ci/release/run.sh
@@ -3,13 +3,13 @@
set -euo pipefail

nix develop '.#release' -c npx --yes \
-p "semantic-release@24.0.0" \
-p "semantic-release" \
-p "@semantic-release/commit-analyzer" \
-p "@semantic-release/release-notes-generator" \
-p "@semantic-release/changelog" \
-p "@semantic-release/github" \
-p "@semantic-release/exec" \
-p "@semantic-release/git" \
-p "semantic-release-replace-plugin@1.2.0" \
-p "conventional-changelog-conventionalcommits@8.0.0" \
-p "semantic-release-replace-plugin" \
-p "conventional-changelog-conventionalcommits" \
semantic-release --ci
11 changes: 7 additions & 4 deletions compose.yaml
@@ -1,15 +1,18 @@
services:
clickhouse:
-image: clickhouse/clickhouse-server:25.1.2.3-alpine
+image: clickhouse/clickhouse-server:25.1.3.23-alpine
ports:
- 8123:8123 # http port
- 9000:9000 # native protocol port
+environment:
+CLICKHOUSE_USER: ibis
+CLICKHOUSE_PASSWORD: ""
healthcheck:
interval: 1s
retries: 10
test:
- CMD-SHELL
-- wget -qO- 'http://127.0.0.1:8123/?query=SELECT%201' # SELECT 1
+- wget -qO- 'http://ibis@clickhouse:8123/?query=SELECT%201' # SELECT 1
volumes:
- clickhouse:/var/lib/clickhouse/user_files/ibis
networks:
@@ -98,7 +101,7 @@ services:
- trino

minio:
-image: bitnami/minio:2025.1.20
+image: bitnami/minio:2025.2.3
environment:
MINIO_ROOT_USER: accesskey
MINIO_ROOT_PASSWORD: secretkey
@@ -554,7 +557,7 @@ services:
- impala

risingwave:
-image: ghcr.io/risingwavelabs/risingwave:v2.1.2
+image: ghcr.io/risingwavelabs/risingwave:v2.1.3
command: "standalone --meta-opts=\" \
--advertise-addr 0.0.0.0:5690 \
--backend mem \
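The compose.yaml change above pairs the ClickHouse image bump with a non-default `ibis` user and an authenticated healthcheck URL. A minimal sketch of connecting to that service from Ibis — the host, port, and credentials here are assumptions read off the compose file, not something this commit adds:

```python
import ibis

# Assumed connection details: the compose file publishes HTTP port 8123 and
# sets CLICKHOUSE_USER=ibis with an empty password; "localhost" is a guess for
# where the container is reachable from the host.
con = ibis.clickhouse.connect(
    host="localhost",
    port=8123,
    user="ibis",
    password="",
)
print(con.list_tables())
```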

Large diffs are not rendered by default.

8 changes: 4 additions & 4 deletions docs/posts/campaign-finance/index.qmd
@@ -73,7 +73,7 @@ raw
```

```{python}
-# For a more comprehesive description of the columns and their meaning, see
+# For a more comprehensive description of the columns and their meaning, see
# https://www.fec.gov/campaign-finance-data/contributions-individuals-file-description/
columns = {
"CMTE_ID": "keep", # Committee ID
@@ -101,9 +101,9 @@ columns = {
"SUB_ID": "drop", # Submission ID. Unique number assigned to each transaction.
}
-renaming = {old: new for old, new in zip(raw.columns, columns.keys())}
+renaming = dict(zip(columns.keys(), raw.columns))
to_keep = [k for k, v in columns.items() if v == "keep"]
-kept = raw.relabel(renaming)[to_keep]
+kept = raw.rename(renaming)[to_keep]
kept
```
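One detail worth noting in the hunk above: `Table.rename` takes its mapping in the opposite direction from the older `relabel` call it replaces — `{new_name: old_name}` rather than `{old_name: new_name}` — which is why the dict construction is flipped as well. A minimal sketch with made-up column values (not the post's data):

```python
import ibis

# Hypothetical two-column table, just to illustrate the mapping direction.
t = ibis.memtable({"CMTE_ID": ["C00401224"], "TRANSACTION_AMT": [250]})

# rename expects {new_name: old_name}.
renamed = t.rename({"committee_id": "CMTE_ID", "amount": "TRANSACTION_AMT"})
print(renamed.columns)  # committee_id, amount
```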

Expand Down Expand Up @@ -214,7 +214,7 @@ from ibis.expr.types import StringValue, DateValue
def mmddyyyy_to_date(val: StringValue) -> DateValue:
-return val.cast(str).lpad(8, "0").to_timestamp("%m%d%Y").date()
+return val.cast(str).lpad(8, "0").nullif("").to_timestamp("%m%d%Y").date()
cleaned = cleaned.mutate(date=mmddyyyy_to_date(_.TRANSACTION_DT)).drop("TRANSACTION_DT")
6 changes: 4 additions & 2 deletions docs/posts/ibis-duckdb-geospatial/index.qmd
@@ -140,7 +140,7 @@ streets
Using the deferred API, we can check which streets are within `d=10` meters of distance.

```{python}
-sts_near_broad = streets.filter(_.geom.d_within(broad_station_subquery, 10))
+sts_near_broad = streets.filter(_.geom.d_within(broad_station_subquery, distance=10))
sts_near_broad
```

@@ -227,7 +227,9 @@ data we can't tell the street near which it happened. However, we can check if t
distance of a street.

```{python}
-h_street = streets.filter(_.geom.d_within(h_near_broad.select(_.geom).as_scalar(), 2))
+h_street = streets.filter(
+_.geom.d_within(h_near_broad.select(_.geom).as_scalar(), distance=2)
+)
h_street
```

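Both geospatial hunks make the same adjustment: the distance passed to `d_within` is now an explicit `distance=` keyword. A backend-free sketch of the pattern, using hypothetical unbound tables in place of the post's `streets` and `h_near_broad` data:

```python
import ibis
from ibis import _

# Hypothetical schemas; only the column names and types matter for building
# the expression.
streets = ibis.table({"street_name": "string", "geom": "geometry"}, name="streets")
h_near_broad = ibis.table({"geom": "geometry"}, name="h_near_broad")

# Same shape as the updated post: the comparison geometry comes in as a scalar
# subquery, and the search radius is passed as an explicit keyword.
h_street = streets.filter(
    _.geom.d_within(h_near_broad.select(_.geom).as_scalar(), distance=2)
)
print(h_street)
```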
