Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash in OrderedDict similiar to #83771 #119004

Open
chibinz opened this issue May 13, 2024 · 6 comments
Open

Crash in OrderedDict similiar to #83771 #119004

chibinz opened this issue May 13, 2024 · 6 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@chibinz
Copy link

chibinz commented May 13, 2024

Bug report

Bug description:

import collections

global count
count = 0

class Evil():
    def __eq__(self, other):
        global count
        print(count)

        if count == 1:
            l.clear()
            print("cleared l")

        count += 1
        return True

    def __hash__(self):
        return 3

l = collections.OrderedDict({Evil(): 4, 5: 6})
r = collections.OrderedDict({Evil(): 4, 5: 6})

print(l == r)

CPython versions tested on:

3.10, 3.13

Operating systems tested on:

Linux

@chibinz chibinz added the type-bug An unexpected behavior, bug, or error label May 13, 2024
@graingert
Copy link
Contributor

graingert commented May 13, 2024

this seems to work correctly in 3.12.3+ and Python 3.13.0a6+ for me (ubuntu Jammy)

 graingert@conscientious  ~/projects  python3.11 crash.py
0
1
cleared l
[1]    742219 segmentation fault (core dumped)  python3.11 crash.py
 ✘  graingert@conscientious  ~/projects  python3.12 crash.py
0
1
cleared l
False
 graingert@conscientious  ~/projects  python3.13 crash.py
0
1
cleared l
False
 graingert@conscientious  ~/projects  

it also works correctly on main: Python 3.14.0a0 (heads/main:7d7eec595a, May 13 2024, 17:38:47)

@sobolevn
Copy link
Member

I can reproduce locally on 7d7eec5 with --with-pydebug

Output on macos:

» PYTHONFAULTHANDLER=1 ./python.exe ex.py
Fatal Python error: Segmentation fault

Current thread 0x00000001fafabac0 (most recent call first):
  File "/Users/sobolev/Desktop/cpython2/ex.py", line 24 in <module>
[2]    61910 segmentation fault  PYTHONFAULTHANDLER=1 ./python.exe ex.py

The problem happens here:

int res = PyObject_RichCompareBool(
(PyObject *)_odictnode_KEY(node_a),
(PyObject *)_odictnode_KEY(node_b),
Py_EQ);

@sobolevn
Copy link
Member

I think we can add something like

    if (di->di_odict->od_state != di->di_state) {
        PyErr_SetString(PyExc_RuntimeError,
                        "OrderedDict mutated during iteration");
        goto done;
    }

🤔

@sobolevn
Copy link
Member

I came up with something like

» git patch
diff --git Objects/odictobject.c Objects/odictobject.c
index 53f64fc81e7..df5fe259d39 100644
--- Objects/odictobject.c
+++ Objects/odictobject.c
@@ -806,6 +806,11 @@ _odict_keys_equal(PyODictObject *a, PyODictObject *b)
 {
     _ODictNode *node_a, *node_b;
 
+    size_t state_a = a->od_state;
+    size_t state_b = b->od_state;
+    Py_ssize_t size_a = PyODict_SIZE(a);
+    Py_ssize_t size_b = PyODict_SIZE(b);
+
     node_a = _odict_FIRST(a);
     node_b = _odict_FIRST(b);
     while (1) {
@@ -820,10 +825,22 @@ _odict_keys_equal(PyODictObject *a, PyODictObject *b)
                                             (PyObject *)_odictnode_KEY(node_a),
                                             (PyObject *)_odictnode_KEY(node_b),
                                             Py_EQ);
-            if (res < 0)
+            if (res < 0) {
                 return res;
-            else if (res == 0)
+            }
+            else if (a->od_state != state_a || b->od_state != state_b) {
+                PyErr_SetString(PyExc_RuntimeError,
+                                "OrderedDict mutated during iteration");
+                return -1;
+            }
+            else if (size_a != PyODict_SIZE(a) || size_b != PyODict_SIZE(b)) {
+                PyErr_SetString(PyExc_RuntimeError,
+                                "OrderedDict changed size during iteration");
+                return -1;
+            }
+            else if (res == 0) {
                 return 0;
+            }
 
             /* otherwise it must match, so move on to the next one */
             node_a = _odictnode_NEXT(node_a);
                                                 

I am not sure that this is a correct and/or optimal solution though.

It raises an error like so:

» PYTHONFAULTHANDLER=1 ./python.exe ex.py
0
1
cleared l
Traceback (most recent call last):
  File "/Users/sobolev/Desktop/cpython2/ex.py", line 24, in <module>
    print(l == r)
          ^^^^^^
RuntimeError: OrderedDict changed size during iteration

@sobolevn sobolevn added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump and removed type-bug An unexpected behavior, bug, or error labels May 13, 2024
@Eclips4
Copy link
Member

Eclips4 commented May 13, 2024

@sobolevn I think your solution is the only one that is correct.
There's a known issues with clearing during the comparison: #82769
Unfortunately, the nature of OrderedDict does not allow us to apply the same fix as for #82769 (Just incref the reference counter of values to compare before PyObject_RichCompareBool).
The root cause is in how is clear method is implemented for OrderedDict. It's decref's keys in nodes and then frees the node.
So, we cannot "incref" keys in the _odict_keys_equal.
Let's imagine we implemented this:

    while (1) {
        if (node_a == NULL && node_b == NULL)
            /* success: hit the end of each at the same time */
            return 1;
        else if (node_a == NULL || node_b == NULL)
            /* unequal length */
            return 0;
        else {
            Py_INCREF(_odictnode_KEY(node_a));
            Py_INCREF(_odictnode_KEY(node_b));
            int res = PyObject_RichCompareBool(
                                            (PyObject *)_odictnode_KEY(node_a),
                                            (PyObject *)_odictnode_KEY(node_b),
                                            Py_EQ);

            if (res < 0)
                return res;
            else if (res == 0)
                return 0;

            /* otherwise it must match, so move on to the next one */
            node_a = _odictnode_NEXT(node_a);
            node_b = _odictnode_NEXT(node_b);
        }
    }

The first iteration of this loop will complete without any errors (actually clearing our OrderedDict). So, after this iteration our node_a and node_b become the next nodes (see the last lines of the cycle above).
But... some of these nodes actually freed! (in our example it's the node_a but it's not hard to clear both of our OrderedDicts :))
Now node_a points to something like 0xdddddddddddddddd and accessing its field key leads to a segfault.

@sobolevn
Copy link
Member

@Eclips4 yeap, this was the first thing I tried. node_a->key caused the crash on the second loop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

5 participants