Fix serverless compatibility by replacing cache() with conditional persistence #2218
base: main
Conversation
Use Unity Catalog volumes instead of .cache() for serverless. Auto-detects compute type. Fixes: [NOT_SUPPORTED_WITH_SERVERLESS]
✅ 51/51 passed, 5 flaky, 4m27s total
Running from acceptance #3364
Codecov Report ❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2218 +/- ##
==========================================
- Coverage 64.05% 63.93% -0.12%
==========================================
Files 100 100
Lines 8624 8640 +16
Branches 893 894 +1
==========================================
Hits 5524 5524
- Misses 2928 2944 +16
Partials 172 172
…on-serverless-compute
Use specific exception types instead of broad Exception catch to satisfy CI linter rules. Add cluster ID check for improved detection.
Avoids CONFIG_NOT_AVAILABLE exceptions by fetching all configs at once. Passes all linter checks.
| """Detect if running on serverless compute""" | ||
| try: | ||
| # Try to get compute type from Spark conf | ||
| compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterType", "") |
can you link the documentation of this property?
spark.databricks.clusterUsageTags.clusterType is an internal Databricks metadata tag used to identify the compute type. It's not officially documented in public Databricks docs.
Suggested change:
-    compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterType", "")
+    compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType", "")
It is possible to find out what is there by running spark.conf.getAll in a Databricks notebook.
For serverless:
Total configs: 3
spark.databricks.execution.timeout = 9000
spark.sql.ansi.enabled = true
spark.sql.shuffle.partitions = auto
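For reference, a minimal probe sketch (not part of this PR) that checks a handful of confs one by one with spark.conf.get, so a missing key does not abort the whole check; the probe_confs helper and the key list are illustrative only:

```python
# Hypothetical helper for investigating which configs a given compute type exposes.
# Probes keys individually, since serverless only exposes a few and raises on the rest.
def probe_confs(spark, keys):
    for key in keys:
        try:
            print(f"{key} = {spark.conf.get(key)}")
        except Exception as exc:  # serverless may raise e.g. [CONFIG_NOT_AVAILABLE]
            print(f"{key} -> unavailable ({type(exc).__name__})")

probe_confs(
    spark,
    [
        "spark.databricks.clusterUsageTags.clusterType",
        "spark.databricks.clusterUsageTags.clusterNodeType",
        "spark.sql.ansi.enabled",
        "spark.sql.shuffle.partitions",
    ],
)
```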
| """Detect if running on serverless compute""" | ||
| try: | ||
| # Try to get compute type from Spark conf | ||
| compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterType", "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterType", "") | |
| compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType", "") |
It is possible to find out what is there by running spark.conf.getAll in a databricks notebook
try:
    # Try to get compute type from Spark conf
    compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterType", "")
    if compute_type is None:
        compute_type = ""
    return "serverless" in compute_type.lower()
except (AnalysisException, AttributeError, KeyError, RuntimeError):
    # If detection fails (Spark config unavailable or invalid), assume serverless for safety
    logger.warning("Unable to detect compute type, assuming serverless mode")
    return True
Suggested change:
- try:
-     # Try to get compute type from Spark conf
-     compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterType", "")
-     if compute_type is None:
-         compute_type = ""
-     return "serverless" in compute_type.lower()
- except (AnalysisException, AttributeError, KeyError, RuntimeError):
-     # If detection fails (Spark config unavailable or invalid), assume serverless for safety
-     logger.warning("Unable to detect compute type, assuming serverless mode")
-     return True
+ try:
+     compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType")
+ except (AnalysisException, SparkNoSuchElementException):
+     compute_type = "unknown"
+ return "standard" not in compute_type.lower()
I investigated this a bit on a standard cluster:
spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType") returns 'Standard_D8ads_v6'
on a serverless cluster it does not work, which is why on errors we can assume serverless: the exception distinguishes serverless (CONFIG_NOT_AVAILABLE) from classic clusters.
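A quick notebook probe illustrating that behavior; the example values come from the comments in this thread and are not guaranteed for every workspace:

```python
# Classic cluster: the key is exposed and returns the node type string.
spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType")
# e.g. 'Standard_D8ads_v6' (Azure) or 'i3.2xlarge' (AWS)

# Serverless: the key is not exposed, so the call raises instead, roughly:
# AnalysisException: [CONFIG_NOT_AVAILABLE] spark.databricks.clusterUsageTags.clusterNodeType
```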
…on-serverless-compute
…on-serverless-compute
…ss-compute' of github.com:databrickslabs/lakebridge into 1438-feature-remorph-reconcile-fails-to-run-on-serverless-compute
🐛 Problem
Reconciliation fails on Databricks serverless compute with:
[NOT_SUPPORTED_WITH_SERVERLESS] PERSIST TABLE is not supported on serverless compute
🔍 Root Cause
The reconciliation process uses .cache() for performance optimization, but serverless compute does not support DataFrame caching operations.
✅ Solution
Implemented serverless detection and conditional caching strategy:
Changes Made
New _is_serverless() method checks for clusterNodeType config
Classic clusters: config exists → returns False
Serverless: config throws CONFIG_NOT_AVAILABLE → returns True
Classic clusters: Uses .cache() for performance (existing behavior)
Serverless: Skips caching to avoid runtime errors (see the sketch below)
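A minimal sketch of how the conditional persistence could look at the call site, assuming the new _is_serverless() helper lives on the same class; the _maybe_cache name and the usage lines are illustrative, not the exact PR diff:

```python
def _maybe_cache(self, df):
    """Cache only on classic compute; serverless rejects persist/cache operations."""
    if self._is_serverless():
        # .cache() would raise [NOT_SUPPORTED_WITH_SERVERLESS], so return the DataFrame as-is.
        return df
    return df.cache()

# Usage inside the reconciliation flow (hypothetical call sites):
# source_df = self._maybe_cache(source_df)
# target_df = self._maybe_cache(target_df)
```

Returning the DataFrame unchanged on serverless keeps the call sites identical on both compute types.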
Technical Details
Detection Method:
node_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType")
✅ Classic: Returns node type (e.g., i3.2xlarge)
❌ Serverless: Throws AnalysisException with CONFIG_NOT_AVAILABLE
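For completeness, a sketch of the exception-based detection described above, assuming the exceptions from pyspark.errors are available in the runtime; the exact exception types to catch may differ:

```python
from pyspark.errors import AnalysisException, SparkNoSuchElementException

def _is_serverless(self) -> bool:
    """Assume serverless when the cluster node type config is not exposed."""
    try:
        self._spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType")
    except (AnalysisException, SparkNoSuchElementException):
        # Serverless raises [CONFIG_NOT_AVAILABLE] for this key; classic clusters return a value.
        return True
    return False
```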
Fixed issue: #1438
Tests