Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Allow dbt show to output json data without extra logs (by updating --quiet) #9840

Open
3 tasks done
b-per opened this issue Apr 2, 2024 · 10 comments · May be fixed by #9958
Open
3 tasks done

[Feature] Allow dbt show to output json data without extra logs (by updating --quiet) #9840

b-per opened this issue Apr 2, 2024 · 10 comments · May be fixed by #9958
Labels
enhancement New feature or request show related to the dbt show command

Comments

@b-per
Copy link
Contributor

b-per commented Apr 2, 2024

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

I would like the output of dbt show to be valid JSON, containing just the data and no logs.

Some potential solutions:

  • dbt show could change its behaviour for --quiet, printing the results.
  • or we could have a new parameter to dbt show that performs this behaviour if we don't want to change the --quiet behaviour

What I want to achieve is to pipe a dbt show ... --output json into jq or jless. Today dbt --quiet show ... doesn't print anything to stdout, which feels counterintuitive (is it a bug actually?)

Describe alternatives you've considered

As suggested in the Community Slack I tried dbt --log-format json show ... --output=json ; but while the data is valid JSON, it doesn't show the results in a nice way as the real data results are encoded in a single string with escaped characters.

Who will this benefit?

People using the CLI and wanting a better/usable way to explorer the results of dbt show.

Are you interested in contributing this feature?

Yes

Anything else?

No response

@b-per b-per added enhancement New feature or request triage labels Apr 2, 2024
@b-per
Copy link
Contributor Author

b-per commented Apr 2, 2024

For posterity, I got this function working with zsh but it looks quite convoluted.

dbtjson(){

    zparseopts -D -E t:=opt_t -target:=opt_t l:=opt_limit -limit:=opt_limit

    if [[ -n $opt_t ]]; then
    target=("--target" "$opt_t[2]")
    else
    target=""
    fi

    if [[ -n $opt_limit ]]; then
    limit=("--limit" "${opt_limit[2]}")
    else
    limit=""
    fi

    echo "Param: use -t pr --target for target" >&2
    echo "Param: use -l or --limit for limit" >&2
    echo "dbt --log-format json show "$target[@]" --inline \"$@\" --output json "$limit[@] " | jq '.data | select(.preview) | .preview' | jq 'fromjson'" >&2
    dbt --log-format json show $target[@] --inline "$@" --output json $limit[@] | jq '.data | select(.preview) | .preview' | jq 'fromjson'
}

With this function, we can do dbtjson -t duckdb "select * from stg_nodes" and then pipe the output to:

  • jq or jless to explore the data
  • dasel -r json -w csv | pspg --csv to show the data in the pspg Postgres pager (using dasel)
  • ...

@jtcohen6
Copy link
Contributor

jtcohen6 commented Apr 2, 2024

I think I buy what you're suggesting as the behavior of dbt -q show!

Questions:

@b-per
Copy link
Contributor Author

b-per commented Apr 2, 2024

To me, the answer to both questions is Yes

@dbeatty10
Copy link
Contributor

To me, the answer to both questions is Yes

Agreed! 🚀

I've run into this same thing before (as recently as this past week), and the proposed behavior would make the output MUCH more ergonomic to work with.

@dbeatty10
Copy link
Contributor

dbeatty10 commented Apr 2, 2024

Show me

Let's assume a model named show_me.sql and that jq is already installed.

models/show_me.sql

{%- set noun = "money" -%}
{%- set noun = "JSON" -%}

select 'Show me the {{ noun }}!' as jerry_maguire
image

Examples for --select

Here's commands that mimic the desired behavior of dbt show and dbt compile when the --output flag is json vs. text:

dbt --log-format json show    --select show_me --output json | jq -r 'select(.data.preview != null)  | .info.msg'
dbt --log-format json compile --select show_me --output json | jq -r 'select(.data.compiled != null) | .info.msg'
dbt --log-format json show    --select show_me --output text | jq -r 'select(.data.preview != null)  | .data.preview'
dbt --log-format json compile --select show_me --output text | jq -r 'select(.data.compiled != null) | .data.compiled'

Here's the output of each (along with the associated desired dbt syntax to produce it):

dbt show -s show_me --output json
{
  "node": "show_me",
  "show": [
    {
      "jerry_maguire": "Show me the JSON!"
    }
  ]
}
dbt compile -s show_me --output json
{
  "node": "show_me",
  "compiled": "select 'Show me the JSON!' as jerry_maguire"
}
dbt show -s show_me
| jerry_maguire     |
| ----------------- |
| Show me the JSON! |
dbt compile -s show_me
select 'Show me the JSON!' as jerry_maguire

Examples for --inline

To see the behavior of --inline, just swap --select show_me with --inline "$(cat models/show_me.sql)" in the commands above. It will yield identical output.

@dbeatty10
Copy link
Contributor

Avoiding regressions

We'd just want to make sure that --log-format and --log-format-file still behave as desired. Namely, we don't want to introduce a regression related to the --log-format* flags on accident. See #8358 for a separate issue related to structured logs when --output json for dbt list and show (we should also consider compile).

For dbt show, there's 4 combinations to consider between the following:

  • --output (default) vs. json
  • --select vs. --inline

--log-format / --log-format-file each have (4) possibilities to consider as well: text (default), json, debug, and default.

@dbeatty10
Copy link
Contributor

I think we can separate these into two different issues, both of which we're interested in (in priority order):

  1. There is stand-alone output (without other logs) when the --quiet flag is passed to show or compile
  2. --quiet is the default setting for show (and potentially compile as well)

The 2nd one needs more refinement, and I'll create a separate issue for it.

Proposed acceptance criteria (for the 1st item above)

Main scenario

When:

  • The -q / --quiet flag is included
  • The subcommand is show or compile
  • The --inline flag is used or a single node is selected via -s / --select

Then:

  • There is some output:
    • There is only valid JSON when --output json
    • There is only the relevant text when --output text (default)
    • There is not any additional log output (that begins with timestamps, etc.)
    • See #9840 (comment) for examples

Additional scenarios

When:

  • --log-format json

Then:

  • The JSON logs are unchanged from the current behavior

When:

  • --log-format text (default)

Then:

  • The text logs are unchanged from the current behavior

@dbeatty10 dbeatty10 removed the triage label Apr 2, 2024
@dbeatty10 dbeatty10 changed the title [Feature] Allow dbt show to output json data without extra logs (maybe just fix/update --quiet ?) [Feature] Allow dbt show to output json data without extra logs (by updating --quiet) Apr 2, 2024
@dbeatty10
Copy link
Contributor

For the 2nd item above, created #9843 and labeled it for further refinement.

@dbeatty10 dbeatty10 added the show related to the dbt show command label Apr 16, 2024
@ChenyuLInx
Copy link
Contributor

real data results are encoded in a single string with escaped characters.

This is really hard to accomplish via log_format=json, since log-format controls all of the log of dbt-core, where it would convert all events fired by dbt-core to a JSON.

I think we can do this by making sure some show result itself gets to stdout when --quiet is specified. And what this show result if formated in would be controlled by another flag --show-result-format or something like that.
Do you think that would work?

I have some PR that's adding a PrintEvent that always going to get to stdout even when --quiet is specified here.

@dbeatty10
Copy link
Contributor

@ChenyuLInx Thanks for that info 🧠 I'll try using the approach used in #10131 within my proposed solution in #9958.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request show related to the dbt show command
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants