Unexpected, Misleading mypy Failures After pandas-stubs Update: DataFrame.from_dict() Behavior #928

Open
pmaier-bhs opened this issue May 24, 2024 · 5 comments

@pmaier-bhs

pmaier-bhs commented May 24, 2024

Describe the bug
The DataFrame.from_dict() method allows parsing lists of dictionaries, where each dictionary is interpreted as a single row. However, this behavior is not reflected in the typed method signatures coded in pandas-stubs. We use that behavior and add # type: ignore comments to suppress mypy errors.

This worked without issue up to pandas-stubs version 2.2.1.240316, but after updating to version 2.2.2.240514 it leads to unexpected mypy failures (see below). The specific change responsible might be this commit.

To Reproduce

  1. Provide a minimal runnable pandas example that is not properly checked by the stubs.
import pandas as pd

# %% mypy error Returning Any from function declared to return "int"  [no-any-return]


def f() -> int:
    b = [
        {"key1": "value1", "key2": 42},
        {"key1": "value2", "key2": 123},
    ]
    df = pd.DataFrame.from_dict(b)  # type: ignore
    return df.shape[0]
  2. Indicate which type checker you are using (mypy or pyright).

mypy

  3. Show the error message received from that type checker while checking your example.
✕ mypy failed.
17:19:33.84 [ERROR] Completed: Typecheck using MyPy - mypy - mypy failed (exit code 1).
src/python/mdl-data-insertion/mdl_data_insertion/util/typing_problem.py:12: error: Returning Any from function declared to return "int"  [no-any-return]
Found 1 error in 1 file (checked 1 source file)

Please complete the following information:

  • OS: macOS
  • OS Version 14.5 (23F79)
  • python version 3.10.13
  • version of type checker: 1.9.0
  • version of installed pandas-stubs: 2.2.2.240514
@pmaier-bhs
Author

pmaier-bhs commented May 24, 2024

By the way, the following example passes, i.e. mypy reports no error:

b = [
    {"key1": "value1", "key2": 42},
    {"key1": "value2", "key2": 123},
]
df = pd.DataFrame.from_dict(b)  # type: ignore
i: int = df.shape[0]
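
One possible reading of the difference between the two snippets, not confirmed anywhere in this thread: if mypy ends up treating `df` as `Any` after the ignored call, then `return df.shape[0]` returns `Any` and trips `[no-any-return]` (the `warn_return_any` check), while `i: int = df.shape[0]` is accepted because assigning `Any` to an annotated variable is allowed. The sketch below reproduces that pattern without pandas. `FakeFrame` and `build` are hypothetical stand-ins, not part of pandas or pandas-stubs.

```python
from typing import Any


class FakeFrame:
    """Hypothetical stand-in for a DataFrame; only `shape` is modelled."""

    def __init__(self, rows: list[dict[str, object]]) -> None:
        n_cols = len(rows[0]) if rows else 0
        self.shape: tuple[int, int] = (len(rows), n_cols)


def build(rows: list[dict[str, object]]) -> Any:
    # Returns Any on purpose, to mimic a call whose result type mypy gave up on.
    return FakeFrame(rows)


def count_rows_direct(rows: list[dict[str, object]]) -> int:
    df = build(rows)
    # df is Any, so df.shape[0] is Any; with warn_return_any enabled,
    # mypy flags this return as [no-any-return].
    return df.shape[0]


def count_rows_annotated(rows: list[dict[str, object]]) -> int:
    df = build(rows)
    # Assigning Any to an annotated local is accepted, and the returned
    # name is then an int as far as mypy is concerned.
    n: int = df.shape[0]
    return n


rows = [
    {"key1": "value1", "key2": 42},
    {"key1": "value2", "key2": 123},
]
print(count_rows_direct(rows), count_rows_annotated(rows))
```

Both functions behave identically at runtime; only the type checker distinguishes them. Under this reading, binding the row count to an annotated local before returning would be one way to keep the original function passing while the `from_dict()` question is open.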

@Dr-Irv
Collaborator

Dr-Irv commented May 28, 2024

You wrote:

The DataFrame.from_dict() method allows parsing lists of dictionaries

If I look at the docs https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.from_dict.html , they don't say that a list of dicts is allowed. So while that may work, it may be a pandas bug or a documentation bug. Can you create an issue in pandas about the docs of from_dict() and see what the pandas core team has to say?
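
For comparison, the pandas documentation does mention list-of-dict input for the plain `DataFrame` constructor and for `DataFrame.from_records()`. A minimal sketch, assuming pandas is installed, that builds the same two-row frame both ways; whether a given pandas-stubs release accepts these calls without complaint is not verified here and should be checked against the installed version.

```python
import pandas as pd

rows = [
    {"key1": "value1", "key2": 42},
    {"key1": "value2", "key2": 123},
]

# Constructor: a list of dicts is treated as one row per dict.
df_ctor = pd.DataFrame(rows)

# from_records: documented to accept a sequence of dicts.
df_records = pd.DataFrame.from_records(rows)

print(df_ctor.shape, df_records.shape)
print(list(df_ctor.columns), list(df_records.columns))
```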

@Dr-Irv
Collaborator

Dr-Irv commented May 28, 2024

Related to #929 .

@pmaier-bhs
Author

You wrote:

The DataFrame.from_dict() method allows parsing lists of dictionaries

If I look at the docs https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.from_dict.html , they don't say that a list of dicts is allowed. So while that may work, it may be a pandas bug or a documentation bug. Can you create an issue in pandas about the docs of from_dict() and see what the pandas core team has to say?

Sure!

@pmaier-bhs
Author

Created a pandas issue, see pandas-dev/pandas#58862.
