Add some missing test coverage

seanh · seanh · commit ee61ba1467ba · 2024-11-01T13:19:36.000Z
`TestLimitedWorker` is missing unit test coverage for a couple of branches in `LimitedWorker`. The reason why coverage isn't failing on `main` is that `TestUserPurger` (the unit tests for `UserPurger`) calls through to the real `LimitedWorker` code (doesn't mock it) and those `UserPurger` unit tests were adding coverage to `LimitedWorker`. In a future PR (#9049) I want to make some changes to `TestUserPurger` that would cause them to no longer cover certain branches of `LimitedWorker` and would cause a coverage failure. It isn't `TestUserPurger`'s job to cover every line and branch of `LimitedWorker`, that's `TestLimitedWorker`'s job, so I'm adding the missing coverage to `TestLimitedWorker` now to prevent those coverage failures in future. This illustrates some of the risks of integrated tests: * You make changes to the tests for one unit and that causes coverage failures in *other* units, creating extra work to fix them. If we used integrated tests frequently this could happen often. * You can have missing unit tests and not know it because the lines and branches are accidentally covered by the tests for *other* units. * You might have a unit test that you think is covering a branch but actually isn't going down that branch. Even though the test is wrong it could still be passing. Missing coverage could save you but won't if tests for *other* units are covering the branch that your broken test is supposed to be covering. In this case I wouldn't want `TestUserPurger` to mock `LimitedWorker`: `UserPurger` contains lots of complex SQL queries that it executes through `LimitedWorker` rather than directly. We want these queries to be executed against the real DB so we can test that various objects are deleted or not-deleted correctly. I think the ideal solution might be for `TestUserPurger` to mock `LimitedWorker`, intercept the SQLAlchemy queries that `UserPurger` passes to the mock `LimitedWorker`, and for the _tests_ to then execute those queries against the DB and assert that they return the right results. But I'm not going to bite off that relatively major change right now.
diff --git a/tests/unit/h/services/user_delete_test.py b/tests/unit/h/services/user_delete_test.py
@@ -468,6 +468,35 @@ def user(self, factories, db_session):
 
 
 class TestLimitedWorker:
+    def test_update(self, caplog, db_session, factories):
+        annotations = factories.Annotation.create_batch(size=2, text="ORIGINAL")
+        worker = LimitedWorker(db_session, limit=3)
+
+        updated_annotation_ids = worker.update(
+            Annotation, select(Annotation.id), {"text": "UPDATED"}
+        )
+
+        assert sorted(updated_annotation_ids) == sorted(
+            [annotation.id for annotation in annotations]
+        )
+        assert worker.limit == 1
+        for annotation in annotations:
+            assert annotation.text == "UPDATED"
+        assert caplog.record_tuples == [
+            ("h.services.user_delete", logging.INFO, "Updated 2 rows from annotation")
+        ]
+
+    def test_update_when_no_matching_rows(self, caplog, db_session):
+        worker = LimitedWorker(db_session, limit=3)
+
+        updated_annotation_ids = worker.update(
+            Annotation, select(Annotation.id), {"text": "UPDATED"}
+        )
+
+        assert updated_annotation_ids == []
+        assert worker.limit == 3
+        assert caplog.record_tuples == []
+
     def test_update_when_limit_exceeded(self, db_session, factories):
         annotation = factories.Annotation()
         original_text = annotation.text
@@ -525,6 +554,30 @@ def test_update_when_limit_reached(self, caplog, db_session, factories):
             ("h.services.user_delete", logging.INFO, "Updated 1 rows from annotation")
         ]
 
+    def test_delete(self, caplog, db_session, factories):
+        annotations = factories.Annotation.create_batch(size=2)
+        worker = LimitedWorker(db_session, limit=3)
+
+        deleted_annotation_ids = worker.delete(Annotation, select(Annotation.id))
+
+        assert worker.limit == 1
+        assert sorted(deleted_annotation_ids) == sorted(
+            [annotation.id for annotation in annotations]
+        )
+        assert caplog.record_tuples == [
+            ("h.services.user_delete", logging.INFO, "Deleted 2 rows from annotation")
+        ]
+        assert db_session.scalars(select(Annotation)).all() == []
+
+    def test_delete_when_no_matching_rows(self, caplog, db_session):
+        worker = LimitedWorker(db_session, limit=3)
+
+        deleted_annotation_ids = worker.delete(Annotation, select(Annotation.id))
+
+        assert worker.limit == 3
+        assert deleted_annotation_ids == []
+        assert caplog.record_tuples == []
+
     def test_delete_when_limit_exceeded(self, db_session, factories):
         annotation = factories.Annotation()
         worker = LimitedWorker(db_session, limit=0)