
Conversation

BesikiML (Contributor) commented Jan 7, 2026

🐛 Problem
Reconciliation fails on Databricks serverless compute with:
[NOT_SUPPORTED_WITH_SERVERLESS] PERSIST TABLE is not supported on serverless compute
🔍 Root Cause
The reconciliation process uses .cache() for performance optimization, but serverless compute does not support DataFrame caching operations.
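
For context, a minimal sketch of the failing pattern (the table name is hypothetical; on serverless, the cache request is rejected with the error above):

```python
# Hypothetical reconciliation DataFrame; any cached DataFrame hits the same error.
df = spark.table("catalog.schema.recon_source")

df.cache()        # on serverless this surfaces [NOT_SUPPORTED_WITH_SERVERLESS] PERSIST TABLE ...
print(df.count())
```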
✅ Solution
Implemented serverless detection and conditional caching strategy:
Changes Made

  1. Added Serverless Detection Method
    • New _is_serverless() method checks for the clusterNodeType config
    • Classic clusters: config exists → returns False
    • Serverless: config lookup throws CONFIG_NOT_AVAILABLE → returns True
  2. Conditional Caching Logic
    • Classic clusters: use .cache() for performance (existing behavior)
    • Serverless: skip caching to avoid runtime errors

Technical Details

Detection method (see the sketch below):
  node_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType")
  ✅ Classic: returns the node type (e.g., i3.2xlarge)
  ❌ Serverless: throws AnalysisException with CONFIG_NOT_AVAILABLE
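
A minimal sketch of the approach described above, as methods on the reconciliation class (only _is_serverless is named in this PR; the helper _maybe_cache and other details are illustrative and may differ from the merged code):

```python
from pyspark.errors import AnalysisException  # assumes PySpark >= 3.4


def _is_serverless(self) -> bool:
    """True when the cluster node type config is unavailable (serverless)."""
    try:
        # Classic clusters expose the node type (e.g. "i3.2xlarge").
        self._spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType")
        return False
    except AnalysisException:
        # Serverless raises CONFIG_NOT_AVAILABLE for cluster usage tags.
        return True


def _maybe_cache(self, df):
    """Cache only on classic clusters; serverless rejects cache()/persist()."""
    return df if self._is_serverless() else df.cache()
```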

Fixed issue: #1438

Tests

  • manually tested
  • added unit tests
  • added integration tests

Use Unity Catalog volumes instead of .cache() for serverless. Auto-detects compute type.
Fixes: [NOT_SUPPORTED_WITH_SERVERLESS]
@BesikiML BesikiML requested a review from m-abulazm January 7, 2026 03:16
@BesikiML BesikiML requested a review from a team as a code owner January 7, 2026 03:16
@BesikiML BesikiML linked an issue Jan 7, 2026 that may be closed by this pull request
@BesikiML BesikiML self-assigned this Jan 7, 2026
github-actions bot commented Jan 7, 2026

✅ 51/51 passed, 5 flaky, 4m27s total

Flaky tests:

  • 🤪 test_transpiles_informatica_to_sparksql (23.854s)
  • 🤪 test_transpile_teradata_sql_non_interactive[True] (22.807s)
  • 🤪 test_transpiles_informatica_to_sparksql_non_interactive[False] (4.164s)
  • 🤪 test_transpile_teradata_sql (23.798s)
  • 🤪 test_transpile_teradata_sql_non_interactive[False] (5.898s)

Running from acceptance #3364

codecov bot commented Jan 7, 2026

Codecov Report

❌ Patch coverage is 0% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.93%. Comparing base (25312ad) to head (8525599).

Files with missing lines | Patch % | Lines
...bricks/labs/lakebridge/reconcile/reconciliation.py | 0.00% | 17 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2218      +/-   ##
==========================================
- Coverage   64.05%   63.93%   -0.12%     
==========================================
  Files         100      100              
  Lines        8624     8640      +16     
  Branches      893      894       +1     
==========================================
  Hits         5524     5524              
- Misses       2928     2944      +16     
  Partials      172      172              

☔ View full report in Codecov by Sentry.

Use specific exception types instead of broad Exception catch to satisfy CI linter rules. Add cluster ID check for improved detection.
Avoids CONFIG_NOT_AVAILABLE exceptions by fetching all configs at once.
Passes all linter checks.
"""Detect if running on serverless compute"""
try:
# Try to get compute type from Spark conf
compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterType", "")
Contributor:

can you link the documentation of this property?

Contributor Author:

spark.databricks.clusterUsageTags.clusterType is an internal Databricks metadata tag used to identify the compute type. It's not officially documented in public Databricks docs.

Contributor:

Suggested change:
-    compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterType", "")
+    compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType", "")

You can find out which configs are available by running spark.conf.getAll in a Databricks notebook.

Contributor Author:

For serverless:

Total configs: 3

spark.databricks.execution.timeout = 9000
spark.sql.ansi.enabled = true
spark.sql.shuffle.partitions = auto
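
A quick way to probe which of these keys a given compute type exposes, as a rough sketch (the key list is illustrative; spark.conf.get is used here because its availability is not in question, while spark.conf.getAll depends on the runtime):

```python
# Run in a Databricks notebook; serverless raises for the clusterUsageTags keys,
# while classic clusters return a node type string.
keys = [
    "spark.databricks.clusterUsageTags.clusterNodeType",
    "spark.databricks.clusterUsageTags.clusterType",
    "spark.sql.shuffle.partitions",
]
for key in keys:
    try:
        print(f"{key} = {spark.conf.get(key)}")
    except Exception as exc:  # e.g. CONFIG_NOT_AVAILABLE on serverless
        print(f"{key} -> not available ({type(exc).__name__})")
```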

"""Detect if running on serverless compute"""
try:
# Try to get compute type from Spark conf
compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterType", "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterType", "")
compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType", "")

It is possible to find out what is there by running spark.conf.getAll in a databricks notebook

Comment on lines 78 to 87
try:
    # Try to get compute type from Spark conf
    compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterType", "")
    if compute_type is None:
        compute_type = ""
    return "serverless" in compute_type.lower()
except (AnalysisException, AttributeError, KeyError, RuntimeError):
    # If detection fails (Spark config unavailable or invalid), assume serverless for safety
    logger.warning("Unable to detect compute type, assuming serverless mode")
    return True
Contributor:

Suggested change:
-try:
-    # Try to get compute type from Spark conf
-    compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterType", "")
-    if compute_type is None:
-        compute_type = ""
-    return "serverless" in compute_type.lower()
-except (AnalysisException, AttributeError, KeyError, RuntimeError):
-    # If detection fails (Spark config unavailable or invalid), assume serverless for safety
-    logger.warning("Unable to detect compute type, assuming serverless mode")
-    return True
+try:
+    compute_type = self._spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType")
+except (AnalysisException, SparkNoSuchElementException):
+    compute_type = "unknown"
+return "standard" not in compute_type.lower()

Contributor:

I investigated this a bit on a standard cluster:
spark.conf.get("spark.databricks.clusterUsageTags.clusterNodeType") returns 'Standard_D8ads_v6'

On a serverless cluster it does not work, which is why we can assume serverless when the lookup errors.
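
Putting the suggestion and this observation together, a self-contained sketch could look like the following (assuming PySpark >= 3.4, where AnalysisException and SparkNoSuchElementException are importable from pyspark.errors; note that the "standard" substring matches Azure node types like Standard_D8ads_v6 but not AWS names like i3.2xlarge, so the final check may need broadening):

```python
from pyspark.errors import AnalysisException, SparkNoSuchElementException


def _is_serverless(self) -> bool:
    """Assume serverless whenever the node-type config cannot be read."""
    try:
        compute_type = self._spark.conf.get(
            "spark.databricks.clusterUsageTags.clusterNodeType"
        )
    except (AnalysisException, SparkNoSuchElementException):
        # Serverless compute raises CONFIG_NOT_AVAILABLE for this key.
        compute_type = "unknown"
    return "standard" not in compute_type.lower()
```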

distinguishes serverless (CONFIG_NOT_AVAILABLE) from classic clusters.
…ss-compute' of github.com:databrickslabs/lakebridge into 1438-feature-remorph-reconcile-fails-to-run-on-serverless-compute
@BesikiML BesikiML changed the title from "Add serverless compute support with Unity Catalog volume persistence" to "Fix serverless compatibility by replacing cache() with conditional persistence" on Jan 12, 2026
@BesikiML BesikiML requested a review from m-abulazm January 12, 2026 16:44

Development

Successfully merging this pull request may close these issues.

[Feature]: Remorph Reconcile fails to run on serverless compute
