Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DBD::Sponge corrected and lazy default PRECISION #12

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

pilcrow
Copy link

@pilcrow pilcrow commented Jun 15, 2014

DBD::Sponge defaults statement handle PRECISION to the length of column NAMEs, which means that a result set's PRECISION can be nonsensically smaller than would actually be required for the data:

$sponge->prepare("sponges",
                 { NAME =>   ['phylum',   'class'],                # PRECISION defaults to [6,5]
                   rows => [ ['Porifera', 'Hexactinellida'], ] }); # but data would need at least [8,14]

This request changes that default to the columnar maximum lengths of the data, computed only if needed. Test cases and small POD update. (Noticed this playing with yet another DBI::Shell alternative.)

@pilcrow
Copy link
Author

pilcrow commented Jun 15, 2014

sub _max_columnar_lengths {
  my ($numFields, $rows) = @_;
  my @precision = map { max map { length ($_||"") } @$_ } @$rows;
  return wantarray ? @precision : \@precision;
}

Nope. That gives us the max data length in each row, one per row. We want the column-wise max data length. (Also see second commit — I renamed sub, and just want to make sure you are looking at final form of this PR.)

@rehsack
Copy link
Member

rehsack commented Dec 9, 2014

Hi @pilcrow
would you please be so kind and re-submit a rebased PR with tests and commented changes?
For me it's difficult to guess where the patch stands as I'm not that familiar with meta-data in DBD's (I usually transfer those jobs to @Tux or @mjegh)

@pilcrow
Copy link
Author

pilcrow commented Dec 9, 2014

Wilco.

On Tuesday, December 9, 2014, Jens Rehsack [email protected] wrote:

Hi @pilcrow https://github.com/pilcrow
would you please be so kind and re-submit a rebased PR with tests and
commented changes?
For me it's difficult to guess where the patch stands as I'm not that
familiar with meta-data in DBD's (I usually transfer those jobs to @Tux
https://github.com/Tux or @mjegh https://github.com/mjegh)


Reply to this email directly or view it on GitHub
#12 (comment).

@rehsack
Copy link
Member

rehsack commented May 5, 2015

@pilcrow - I still miss the comments in changes and tests. @Tux or @mjegh - is the PR clear or do you need more information? Can you please place feedback requests here?

@pilcrow
Copy link
Author

pilcrow commented May 10, 2015

@rehsack, I updated the branch against current DBI and added some explanatory comments to the tests and changes. (I also documented DBD-Sponge's default TYPE.)

@Tux, @mjegh, it's been a while on this PR, but the basic issue is that a Spongey data set incorrectly defaults PRECISION to the length of the name of fields rather than the length of the data. To correct this, if the user does not specify PRECISION but later requests that attribute, we loop through the rows and remember the longest length for each column. The .t file covers this, user-supplied PRECISION, and a few other DBD-Sponge basics.

@timbunce
Copy link
Member

I agree the current behaviour is broken.
I'm unsure about this fix though.

The DBI docs for PRECISION say:

For numeric columns, the value is the maximum number of digits (without considering a sign character or decimal point). Note that the "display size" for floating point types (REAL, FLOAT, DOUBLE) can be up to 7 characters greater than the precision (for the sign + decimal point + the letter E + a sign + 2 or 3 digits).

For any character type column the value is the OCTET_LENGTH, in other words the number of bytes, not characters.

(More recent standards refer to this as COLUMN_SIZE but we stick with PRECISION for backwards compatibility.)

So I see two problems:

  • This fix uses the length in characters not bytes. That's easy to fix.
  • This fix applies the character type semantics to all columns and that would be wrong for any numeric columns. This could be fixed by checking the TYPE of the column and using different logic if it's numeric.

@pilcrow
Copy link
Author

pilcrow commented Jul 21, 2015

OK I'll revisit.

-Mike

On Sat, Jul 18, 2015 at 7:04 AM, Tim Bunce [email protected] wrote:

I agree the current behaviour is broken.
I'm unsure about this fix though.

The DBI docs for PRECISION https://metacpan.org/pod/DBI#PRECISION say:

For numeric columns, the value is the maximum number of digits (without
considering a sign character or decimal point). Note that the "display
size" for floating point types (REAL, FLOAT, DOUBLE) can be up to 7
characters greater than the precision (for the sign + decimal point + the
letter E + a sign + 2 or 3 digits).

For any character type column the value is the OCTET_LENGTH, in other
words the number of bytes, not characters.

(More recent standards refer to this as COLUMN_SIZE but we stick with
PRECISION for backwards compatibility.)

So I see two problems:

  • This fix uses the length in characters not bytes. That's easy to fix.
  • This fix applies the character type semantics to all columns and
    that would be wrong for any numeric columns. This could be fixed by
    checking the TYPE of the column and using different logic if it's
    numeric.


Reply to this email directly or view it on GitHub
#12 (comment).

@timbunce
Copy link
Member

Thanks @pilcrow!

@timbunce
Copy link
Member

Any news on this @pilcrow ?

@Tux
Copy link
Member

Tux commented Aug 11, 2024

@pilcrow ?

@pilcrow
Copy link
Author

pilcrow commented Aug 11, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants