Replace numpy-Python comparison with dtype #210

Merged
merged 2 commits into from
May 13, 2024
4 changes: 4 additions & 0 deletions changelog_entry.yaml
@@ -0,0 +1,4 @@
- bump: patch
  changes:
    changed:
    - Replaced unsafe numpy-Python comparison with use of numpy dtype to convert byte-string arrays to Unicode ones within enums
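
To illustrate the change described in the changelog entry above, here is a minimal sketch contrasting the two conditions. The helper names `old_check` and `new_check` are hypothetical and are not part of policyengine_core. The old condition depends on what `array == 0` returns for a string array, which appears to depend on the NumPy version; the new condition reads the dtype directly.

```python
import warnings

import numpy as np


def old_check(array: np.ndarray) -> bool:
    """Hypothetical helper: the condition this PR removes."""
    with warnings.catch_warnings():
        # Some NumPy versions warn when comparing a string array to an int.
        warnings.simplefilter("ignore")
        # True only if the comparison collapses to a plain Python bool.
        return isinstance(array == 0, bool)


def new_check(array: np.ndarray) -> bool:
    """Hypothetical helper: the condition this PR introduces."""
    # "S" is NumPy's dtype kind for byte strings; "U" is Unicode.
    return array.dtype.kind == "S"


byte_array = np.array(["MAXWELL", "DWORKIN"], "S")
unicode_array = np.array(["MAXWELL", "DWORKIN"])
int_array = np.array([0, 1])

for name, arr in [("bytes", byte_array), ("unicode", unicode_array), ("int", int_array)]:
    # old_check's result for string arrays may vary by NumPy version.
    print(name, arr.dtype.kind, old_check(arr), new_check(arr))
```

The dtype check gives the same answer regardless of how the installed NumPy handles string-to-integer comparisons, which is the point of the patch.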
7 changes: 5 additions & 2 deletions policyengine_core/enums/enum.py
@@ -49,8 +49,11 @@ def encode(cls, array: Union[EnumArray, np.ndarray]) -> EnumArray:
         if isinstance(array, EnumArray):
             return array

-        # if array.dtype.kind == "b":
-        if isinstance(array == 0, bool):
+        # First, convert byte-string arrays to Unicode-string arrays
+        # Confusingly, Numpy uses "S" to refer to byte-string arrays
+        # and "U" to refer to Unicode-string arrays, which are also
+        # referred to as the "str" type
+        if array.dtype.kind == "S":
             # Convert boolean array to string array
             array = array.astype(str)

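
The conversion performed by the new branch can be tried in isolation. This is a minimal sketch: `to_unicode` is a hypothetical helper that mirrors the branch in `encode`, not a function in policyengine_core.

```python
import numpy as np


def to_unicode(array: np.ndarray) -> np.ndarray:
    """Hypothetical helper mirroring the byte-string branch in encode()."""
    if array.dtype.kind == "S":
        # astype(str) decodes byte strings into a Unicode ("U") array.
        return array.astype(str)
    # Arrays of any other kind are returned unchanged.
    return array


byte_array = np.array(["MAXWELL", "DWORKIN", "MAXWELL"], "S")
converted = to_unicode(byte_array)

print(byte_array.dtype.kind)  # S
print(converted.dtype.kind)   # U
print(converted.tolist())     # ['MAXWELL', 'DWORKIN', 'MAXWELL']
```

Arrays that are already Unicode, integer, or object-typed skip the branch, so only byte-string input is affected.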
39 changes: 39 additions & 0 deletions tests/core/enums/test_enum.py
@@ -0,0 +1,39 @@
import pytest
import numpy as np
from policyengine_core.enums.enum import Enum
from policyengine_core.enums.enum_array import EnumArray


def test_enum_creation():
    """
    Test to make sure that various types of numpy arrays
    are correctly encoded to int-typed EnumArray instances;
    check enum_array.py to see why int-typed
    """

    test_simple_array = ["MAXWELL", "DWORKIN", "MAXWELL"]

    class Sample(Enum):
        MAXWELL = "maxwell"
        DWORKIN = "dworkin"

    sample_string_array = np.array(test_simple_array)
    sample_item_array = np.array(
        [Sample.MAXWELL, Sample.DWORKIN, Sample.MAXWELL]
    )
    explicit_s_array = np.array(test_simple_array, "S")

    encoded_array = Sample.encode(sample_string_array)
    assert len(encoded_array) == 3
    assert isinstance(encoded_array, EnumArray)
    assert encoded_array.dtype.kind == "i"

    encoded_array = Sample.encode(sample_item_array)
    assert len(encoded_array) == 3
    assert isinstance(encoded_array, EnumArray)
    assert encoded_array.dtype.kind == "i"

    encoded_array = Sample.encode(explicit_s_array)
    assert len(encoded_array) == 3
    assert isinstance(encoded_array, EnumArray)
    assert encoded_array.dtype.kind == "i"
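
For readers without policyengine_core installed, the property the test checks for string input can be approximated with a stand-in. This is only a sketch under stated assumptions: `Sample` here uses the standard library's `enum.Enum`, `encode_names` is a hypothetical function, and mapping each name to its definition-order index is an assumption made for illustration, not a description of how `Enum.encode` or `EnumArray` work internally.

```python
import enum

import numpy as np


class Sample(enum.Enum):
    # Stand-in built on the standard library, not policyengine_core's Enum.
    MAXWELL = "maxwell"
    DWORKIN = "dworkin"


def encode_names(names: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: map member names to integer indices."""
    if names.dtype.kind == "S":
        # Same conversion as the patched branch: bytes to Unicode.
        names = names.astype(str)
    # Assumption: indices follow the order in which members are defined.
    index_of = {member.name: i for i, member in enumerate(Sample)}
    return np.array([index_of[str(n)] for n in names], dtype=np.int64)


names = ["MAXWELL", "DWORKIN", "MAXWELL"]
from_unicode = encode_names(np.array(names))
from_bytes = encode_names(np.array(names, "S"))

print(from_unicode.tolist())  # [0, 1, 0]
print(from_bytes.tolist())    # [0, 1, 0]
```

Byte-string and Unicode inputs yield the same integer-typed result once the byte-string case is converted first, which is what the third group of assertions in the test exercises against the real `Sample.encode`.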