added support for YAML config of queries #56

Merged

Conversation

richbenmintz

The goal of this PR is to add support for YAML-based query definition files, for ease of source control and maintenance.

I have added support for YAML-based query configuration and provided a sample YAML config file in the media section.
I have also made the file location a parameter and added the default lakehouse mount path to it, so that both pandas and a plain file open (for YAML) can read the file.
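
The sample YAML file itself is not reproduced in this conversation. Assuming it mirrors the Excel layout (a `queryId` field first, then one field per DAX query variant) and parses to a list of mappings, a pre-flight check on the parsed data might look like the sketch below. The function name `variant_columns` and the field names `baseline` and `optimized` are hypothetical, not taken from the PR.

```python
from typing import Any, Dict, List


def variant_columns(records: List[Dict[str, Any]]) -> List[str]:
    """Check parsed query records and return the variant field names in first-seen order."""
    if not records:
        raise ValueError("The query file contains no queries")

    columns: List[str] = []
    for index, record in enumerate(records):
        keys = list(record)
        # Assumed layout: 'queryId' comes first, every other field is a query variant.
        if not keys or keys[0] != "queryId":
            raise ValueError(f"Record {index}: the first field must be 'queryId'")
        for key in keys[1:]:
            if key not in columns:
                columns.append(key)

    if not columns:
        raise ValueError("No query variants found besides 'queryId'")
    return columns


# Hypothetical data, shaped like the result of parsing a YAML list of mappings.
sample_records = [
    {"queryId": "q1", "baseline": "EVALUATE ROW(\"x\", 1)", "optimized": "EVALUATE ROW(\"x\", 1)"},
    {"queryId": "q2", "baseline": "EVALUATE ROW(\"y\", 2)", "optimized": "EVALUATE ROW(\"y\", 2)"},
]
print(variant_columns(sample_records))
```

A list of mappings like this converts to one DataFrame row per query, with `queryId` as the first column.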

I have tested the code locally and it seems to work.

The results of my test are shown in the following screenshots:

image

image

@richbenmintz
Author

@microsoft-github-policy-service agree

@DAXNoobJustin
Contributor

Hey @richbenmintz,

This is a great idea! Love it.

Could you change the loading of the DAX queries to a function outside of the config cell? Hoping to keep that as simple as possible for the users.

Something like:

Config cell:

# Read DAX queries from the Excel or YAML file uploaded to the attached lakehouse
# The first column must be 'queryId' and additional columns should contain variants of the DAX query.
query_file_path = "Files/DAXQueries.xlsx"  # Path to the query file relative to the mount
query_file_mount_path = "/default"         # Mount location where the file is stored
query_worksheet_name = "DAXQueries"        # Worksheet name (for Excel files)

Helper function cell:

from typing import Optional

import pandas as pd
import yaml


@log_function_calls
def load_dax_queries(file_path: str, mount_path: str, worksheet_name: Optional[str] = None) -> pd.DataFrame:
    """
    Loads the DAX queries from the given file. Supports Excel and YAML formats.

    Args:
        file_path (str): Relative path to the query file.
        mount_path (str): The mount path where the file is stored.
        worksheet_name (str, optional): Worksheet name for Excel files.

    Returns:
        pd.DataFrame: DataFrame containing the DAX queries.

    Raises:
        ValueError: If the file extension is not xlsx, yml, or yaml.
    """
    file_type = file_path.split('.')[-1].lower()
    # notebookutils is provided by the notebook runtime, so it is not imported here.
    full_path = f"{notebookutils.fs.getMountPath(mount_path)}/{file_path}"

    if file_type == 'xlsx':
        # sheet_name=None would return a dict of every sheet, so fall back to
        # index 0 (the first sheet) when no worksheet name is given.
        sheet = worksheet_name if worksheet_name is not None else 0
        return pd.read_excel(full_path, sheet_name=sheet)
    elif file_type in ['yml', 'yaml']:
        with open(full_path, 'r', encoding='utf-8') as f:
            # safe_load only builds plain Python objects, which is all a query file needs.
            data = yaml.safe_load(f)
        return pd.DataFrame(data)
    else:
        raise ValueError(f"Unsupported file type: {file_type}")

Main run_dax_queries function:

@log_function_calls
def run_dax_queries() -> None:
    """
    Main entry point for running all DAX queries from the Excel or YAML query file.
    Manages the log table, capacity checks, and iterates over all queries and their combinations.
    """
    print("🚀 Starting all DAX queries")

    # Load the DAX queries using the configuration parameters.
    dax_queries = load_dax_queries(query_file_path, query_file_mount_path, query_worksheet_name)
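
The extension dispatch in `load_dax_queries` can also be isolated into a pure function, which makes it testable without a lakehouse or a real file. The sketch below is illustrative only: `plan_query_load` is a hypothetical name, not part of the PR. It relies on the documented pandas behaviour that `sheet_name=0` selects the first worksheet, whereas `sheet_name=None` returns a dict of every sheet.

```python
from pathlib import PurePosixPath
from typing import Any, Dict, Optional, Tuple

EXCEL_SUFFIXES = {".xlsx"}
YAML_SUFFIXES = {".yml", ".yaml"}


def plan_query_load(file_path: str, worksheet_name: Optional[str] = None) -> Tuple[str, Dict[str, Any]]:
    """Return the loader kind and the keyword arguments it should receive."""
    suffix = PurePosixPath(file_path).suffix.lower()

    if suffix in EXCEL_SUFFIXES:
        # Fall back to the first sheet (index 0) when no name is configured.
        sheet = worksheet_name if worksheet_name is not None else 0
        return "excel", {"sheet_name": sheet}
    if suffix in YAML_SUFFIXES:
        # The worksheet name has no meaning for YAML files and is ignored.
        return "yaml", {}
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")


print(plan_query_load("Files/DAXQueries.xlsx", "DAXQueries"))
print(plan_query_load("Files/DAXQueries.yaml", "DAXQueries"))
```

Keeping this decision separate from the file I/O means the config cell can stay unchanged while the supported formats are extended in one place.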

@richbenmintz
Author

richbenmintz commented Mar 13, 2025 via email

@richbenmintz
Author

I have updated the code as suggested.

Contributor

@DAXNoobJustin DAXNoobJustin left a comment

Looks great!

@itsnotaboutthecell itsnotaboutthecell merged commit 2b24402 into microsoft:main Mar 13, 2025
1 check passed

4 participants