Commit 7bbb18a

Add support for marking tests as expected failures

1 parent cb4a2ac

20 files changed: +738 −207 lines changed

docs/regression_test_api.rst

Lines changed: 3 additions & 0 deletions

@@ -27,6 +27,8 @@ Test Decorators
 
 .. autodecorator:: reframe.core.decorators.simple_test
 
+.. autodecorator:: reframe.core.decorators.xfail
+
 
 .. _builtins:
 
@@ -86,6 +88,7 @@ The use of this module is required only when creating new tests programmatically
 
 .. autofunction:: reframe.core.builtins.variable
 
+.. autofunction:: reframe.core.builtins.xfail
 
 .. _pipeline-hooks:
 
docs/tutorial.rst

Lines changed: 142 additions & 0 deletions

@@ -1423,6 +1423,148 @@ Note that we are using another configuration file, which defines an MPI-enabled
 You should adapt them if running on an actual parallel cluster.
 
 
+Marking expected failures
+=========================
+
+.. versionadded:: 4.9
+
+In ReFrame you can mark a test failure as expected and attach a message explaining the failure.
+There are two types of failures, sanity and performance failures, and you can mark either type.
+
+Expected sanity failures
+------------------------
+
+To mark an expected sanity failure, decorate the test with the :func:`@xfail <reframe.core.decorators.xfail>` decorator as follows:
+
+.. literalinclude:: ../examples/tutorial/stream/stream_runonly_xfail.py
+   :caption:
+   :lines: 5-
+
+The :func:`@xfail <reframe.core.decorators.xfail>` decorator takes a message to issue when the test fails, explaining the failure, and, optionally, a predicate that controls when the test is expected to fail (we discuss this later in this section).
+In the example above, we have introduced a typo in the sanity checking function to cause the test to fail, but we have marked the test as an expected failure.
+Here is the output:
+
+.. code-block:: bash
+   :caption: Run in the single-node container
+
+   reframe -C tutorial/config/baseline.py -c tutorial/stream/stream_runonly_xfail.py -r
+
+.. code-block:: console
+
+   [==========] Running 1 check(s)
+   [==========] Started on Thu May 15 09:10:06 2025+0000
+
+   [----------] start processing checks
+   [ RUN ] stream_test /2e15a047 @tutorialsys:default+baseline
+   [ XFAIL ] stream_test /2e15a047 @tutorialsys:default+baseline [demo failure]
+   [----------] all spawned checks have finished
+
+   [ PASSED ] Ran 1/1 test case(s) from 1 check(s) (0 failure(s), 1 expected failure(s), 0 skipped, 0 aborted)
+   [==========] Finished on Thu May 15 09:10:09 2025+0000
+
+Notice that the test is marked as ``XFAIL`` and no ``FAILURE INFO`` is generated.
+
+As mentioned previously, you can control whether a sanity failure should be considered expected by passing the ``predicate`` argument to :func:`@xfail <reframe.core.decorators.xfail>`.
+In the following example, a failure is expected only if ``x <= 2``:
+
+.. literalinclude:: ../examples/tutorial/stream/stream_runonly_xfail_cond.py
+   :caption:
+   :lines: 5-
+
+If we run with ``x=3``, the test fails normally:
+
+.. code-block:: bash
+   :caption: Run in the single-node container
+
+   reframe -C tutorial/config/baseline.py -c tutorial/stream/stream_runonly_xfail_cond.py -r -S x=3
+
+.. code-block:: console
+
+   [----------] start processing checks
+   [ RUN ] stream_test /2e15a047 @tutorialsys:default+baseline
+   [ FAIL ] (1/1) stream_test /2e15a047 @tutorialsys:default+baseline
+   ==> test failed during 'sanity': test staged in '/home/user/reframe-examples/stage/tutorialsys/default/baseline/stream_test'
+   [----------] all spawned checks have finished
+
+If a test passes unexpectedly, its status will be set to ``XPASS`` and it will be counted as a failure.
+In the previous example, the typo in the assertion is fixed for ``x=2``, so that the test passes instead of failing as expected.
+Here is the outcome:
+
+.. code-block:: bash
+   :caption: Run in the single-node container
+
+   reframe -C tutorial/config/baseline.py -c tutorial/stream/stream_runonly_xfail_cond.py -r -S x=2
+
+.. code-block:: console
+
+   [----------] start processing checks
+   [ RUN ] stream_test /2e15a047 @tutorialsys:default+baseline
+   [ XPASS ] stream_test /2e15a047 @tutorialsys:default+baseline
+   [----------] all spawned checks have finished
+
+   [ FAILED ] Ran 1/1 test case(s) from 1 check(s) (1 failure(s), 0 expected failure(s), 0 skipped, 0 aborted)
+
+Note that the test is counted as a failure and the overall run result is ``FAIL``.
+The reason in the test's ``FAILURE INFO`` mentions that the test passed unexpectedly and also states the original reason why it was expected to fail:
+
+.. code-block:: console
+
+   * Reason: unexpected success error: demo failure
+
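The status accounting shown in the run reports above (``XFAIL`` counted among the expected failures, ``XPASS`` counted towards the failures) boils down to a small truth table. The following standalone sketch models it; the function names `sanity_status` and `counts` are invented for illustration and this is not ReFrame's internal code:

```python
def sanity_status(sanity_passed: bool, failure_expected: bool) -> str:
    """Toy model: map a sanity-check outcome to a test status string."""
    if sanity_passed:
        # An expected failure that passes is an unexpected success.
        return 'XPASS' if failure_expected else 'PASS'
    # A failure that was expected becomes XFAIL instead of FAIL.
    return 'XFAIL' if failure_expected else 'FAIL'


def counts(statuses):
    """Tally failures and expected failures as the run summary does:
    XPASS counts as a failure, XFAIL as an expected failure."""
    failures = sum(s in ('FAIL', 'XPASS') for s in statuses)
    expected = sum(s == 'XFAIL' for s in statuses)
    return failures, expected
```

This reproduces the two summary lines above: a single ``XFAIL`` yields "0 failure(s), 1 expected failure(s)", while a single ``XPASS`` yields "1 failure(s), 0 expected failure(s)".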
+Expected performance failures
+-----------------------------
+
+A test fails a performance check when it cannot meet its reference performance within certain user-defined bounds.
+As discussed :ref:`earlier <writing-your-first-test>`, a test may define multiple performance variables, each associated with a different reference.
+To mark an expected performance failure, wrap the :attr:`~reframe.core.pipeline.RegressionTest.reference` tuple with the :func:`~reframe.core.builtins.xfail` builtin.
+Here is an example:
+
+.. code-block:: python
+
+   reference = {
+       'tutorialsys': {
+           'copy_bw': xfail('demo fail', (100_000, -0.1, 0.1, 'MB/s')),
+           'triad_bw': (30_000, -0.1, 0.1, 'MB/s')
+       }
+   }
+
+For demonstration purposes, we have increased the ``copy_bw`` reference significantly so as to cause the test to fail, and we have marked this as an expected failure.
+Note that the :func:`~reframe.core.builtins.xfail` builtin is *different* from the :func:`@xfail <reframe.core.decorators.xfail>` decorator.
+Although the first argument of both is the message to be printed, the builtin takes the reference tuple as its second argument, whereas the decorator optionally takes a predicate.
+
+.. note::
+
+   To try out this example, you need to set the :attr:`reference` in the ``stream_run_only.py`` test based on your system's performance.
+
+Since a test may define multiple performance variables, some of which may be marked as expected failures, the overall final state of the test is determined in a more elaborate way.
+Assuming that the notation ``A > B`` means that ``A`` takes precedence over ``B``, the types of performance failures follow this hierarchy:
+
+.. code-block:: console
+
+   FAIL > XPASS > XFAIL > PASS
+
+In other words, if at least one performance variable fails, the test is a ``FAIL``.
+If none of the performance variables fails, but at least one passes unexpectedly, the test is an ``XPASS``.
+If none of the performance variables fails or passes unexpectedly and all the expected failures do fail, then the test is an ``XFAIL``.
+In all other cases, the test is a ``PASS``.
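The precedence hierarchy can be implemented as a single scan over the per-variable statuses, picking the highest-precedence status present. This is an illustrative sketch with invented names, not ReFrame's implementation:

```python
# Highest precedence first, per the rule FAIL > XPASS > XFAIL > PASS.
_PRECEDENCE = ['FAIL', 'XPASS', 'XFAIL', 'PASS']


def overall_status(var_statuses):
    """Return the overall test status from per-variable statuses."""
    for status in _PRECEDENCE:
        if status in var_statuses:
            return status

    # No performance variables at all: nothing failed, so the test passes.
    return 'PASS'
```

For example, a test whose variables resolve to ``['PASS', 'XFAIL', 'XPASS']`` is an ``XPASS`` overall, because nothing outright failed but one expected failure passed unexpectedly.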
+
+In case of failures or unexpected passes, the status of every performance variable is printed in the ``FAILURE INFO``.
+Also, if colors are enabled (see :option:`--nocolor`), each variable's status is color-coded in the ``P:`` lines printed just after the test finishes.
+Here is an example of a combined failure (``XPASS`` and ``FAIL``):
+
+.. code-block:: console
+
+   * Reason: performance error:
+   reference(s) met unexpectedly: copy_bw=40299.1 MB/s, expected 40000 (l=36000.0, u=44000.0)
+   failed to meet reference(s): triad_bw=30561.6 MB/s, expected 100000 (l=90000.0, u=110000.00000000001)
+
+You can try different combinations of :func:`~reframe.core.builtins.xfail` markings and reference values to explore the behavior.
+
 
 Managing the run session
 ========================
 
examples/tutorial/stream/stream_runonly.py

Lines changed: 6 additions & 0 deletions

@@ -11,6 +11,12 @@ class stream_test(rfm.RunOnlyRegressionTest):
     valid_systems = ['*']
     valid_prog_environs = ['*']
     executable = 'stream.x'
+    reference = {
+        'tutorialsys': {
+            'copy_bw': xfail('bug 123', (100_000, -0.1, 0.1, 'MB/s')),
+            'triad_bw': (100_000, -0.1, 0.1, 'MB/s')
+        }
+    }
 
     @sanity_function
     def validate(self):
examples/tutorial/stream/stream_runonly_xfail.py

Lines changed: 26 additions & 0 deletions

@@ -0,0 +1,26 @@
+# Copyright 2016-2025 Swiss National Supercomputing Centre (CSCS/ETH Zurich)
+# ReFrame Project Developers. See the top-level LICENSE file for details.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+import reframe as rfm
+import reframe.utility.sanity as sn
+
+
+@rfm.simple_test
+@rfm.xfail('demo failure')
+class stream_test(rfm.RunOnlyRegressionTest):
+    valid_systems = ['*']
+    valid_prog_environs = ['*']
+    executable = 'stream.x'
+
+    @sanity_function
+    def validate(self):
+        return sn.assert_found(r'Slution Validates', self.stdout)
+
+    @performance_function('MB/s')
+    def copy_bw(self):
+        return sn.extractsingle(r'Copy:\s+(\S+)', self.stdout, 1, float)
+
+    @performance_function('MB/s')
+    def triad_bw(self):
+        return sn.extractsingle(r'Triad:\s+(\S+)', self.stdout, 1, float)
examples/tutorial/stream/stream_runonly_xfail_cond.py

Lines changed: 30 additions & 0 deletions

@@ -0,0 +1,30 @@
+# Copyright 2016-2025 Swiss National Supercomputing Centre (CSCS/ETH Zurich)
+# ReFrame Project Developers. See the top-level LICENSE file for details.
+#
+# SPDX-License-Identifier: BSD-3-Clause
+import reframe as rfm
+import reframe.utility.sanity as sn
+
+
+@rfm.simple_test
+@rfm.xfail('demo failure', lambda test: test.x <= 2)
+class stream_test(rfm.RunOnlyRegressionTest):
+    valid_systems = ['*']
+    valid_prog_environs = ['*']
+    executable = 'stream.x'
+    x = variable(int, value=1)
+
+    @sanity_function
+    def validate(self):
+        if self.x < 2 or self.x > 2:
+            return sn.assert_found(r'Slution Validates', self.stdout)
+        elif self.x == 2:
+            return sn.assert_found(r'Solution Validates', self.stdout)
+
+    @performance_function('MB/s')
+    def copy_bw(self):
+        return sn.extractsingle(r'Copy:\s+(\S+)', self.stdout, 1, float)
+
+    @performance_function('MB/s')
+    def triad_bw(self):
+        return sn.extractsingle(r'Triad:\s+(\S+)', self.stdout, 1, float)

examples/tutorial/stream/stream_variables_fixtures.py

Lines changed: 6 additions & 0 deletions

@@ -27,6 +27,12 @@ class stream_test(rfm.RunOnlyRegressionTest):
     valid_prog_environs = ['+openmp']
     stream_binary = fixture(build_stream, scope='environment')
    num_threads = variable(int, value=0)
+    reference = {
+        '*': {
+            'copy_bw': xfail('bug 123', (100_000, -0.1, 0.1, 'MB/s')),
+            'triad_bw': (40_000, -0.1, 0.1, 'MB/s')
+        }
+    }
 
     @run_after('setup')
     def set_executable(self):

reframe/core/builtins.py

Lines changed: 18 additions & 1 deletion

@@ -8,6 +8,8 @@
 #
 
 import functools
+from collections import namedtuple
+
 import reframe.core.parameters as parameters
 import reframe.core.variables as variables
 import reframe.core.fixtures as fixtures
@@ -19,7 +21,7 @@
 __all__ = ['deferrable', 'deprecate', 'final', 'fixture', 'loggable',
            'loggable_as', 'parameter', 'performance_function', 'required',
            'require_deps', 'run_before', 'run_after', 'sanity_function',
-           'variable']
+           'variable', 'xfail']
 
 parameter = parameters.TestParam
 variable = variables.TestVar
@@ -221,3 +223,18 @@ def _loggable(fn):
 
 loggable = loggable_as(None)
 loggable.__doc__ = '''Equivalent to :func:`loggable_as(None) <loggable_as>`.'''
+
+_XFailReference = namedtuple('XFailReference', ['message', 'data'])
+
+
+def xfail(message, reference):
+    '''Mark a test :attr:`~reframe.core.pipeline.RegressionTest.reference` as
+    an expected failure.
+
+    :arg message: The message to issue when this expected failure is
+        encountered.
+    :arg reference: The original reference tuple.
+
+    .. versionadded:: 4.9
+    '''
+    return _XFailReference(message=message, data=reference)
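A lightweight namedtuple wrapper like the one above is enough because it lets the framework tell xfail-marked references apart from plain tuples while keeping the original data accessible. The standalone toy below mirrors that pattern; the `unwrap` helper is invented for this sketch and is not part of ReFrame:

```python
from collections import namedtuple

# Wrapper type: tags a reference tuple with its expected-failure message.
XFailReference = namedtuple('XFailReference', ['message', 'data'])


def xfail(message, reference):
    """Wrap a reference tuple as an expected failure."""
    return XFailReference(message=message, data=reference)


def unwrap(ref):
    """Return (reference_tuple, message_or_None) for any reference entry,
    so code walking the reference dict can treat both kinds uniformly."""
    if isinstance(ref, XFailReference):
        return ref.data, ref.message
    return ref, None


reference = {
    'copy_bw': xfail('bug 123', (100_000, -0.1, 0.1, 'MB/s')),
    'triad_bw': (40_000, -0.1, 0.1, 'MB/s'),
}
bounds, expected_msg = unwrap(reference['copy_bw'])
```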

reframe/core/decorators.py

Lines changed: 44 additions & 1 deletion

@@ -7,7 +7,7 @@
 # Decorators used for the definition of tests
 #
 
-__all__ = ['simple_test']
+__all__ = ['simple_test', 'xfail']
 
 import collections
 import inspect
@@ -202,3 +202,46 @@ def simple_test(cls):
         _register_test(cls, variant_num=n)
 
     return cls
+
+
+def xfail(message, predicate=None):
+    '''Mark a test as an expected failure in sanity checking.
+
+    If you want to mark an expected performance failure, take a look at the
+    :func:`~reframe.core.builtins.xfail` builtin.
+
+    If a marked test passes sanity checking, it will be marked as a
+    failure.
+
+    :arg message: The message to be printed when this test fails expectedly.
+    :arg predicate: A callable taking the test instance as its sole argument
+        and returning :obj:`True` or :obj:`False`. If it returns :obj:`True`,
+        the test is marked as an expected failure, otherwise not. For
+        example, the following test will be marked as an expected failure
+        only if the variable ``x`` equals 1.
+
+        .. code-block:: python
+
+           @rfm.xfail('bug 123', lambda test: test.x == 1)
+           @rfm.simple_test
+           class MyTest(...):
+               x = variable(int, value=0)
+
+        If ``predicate=None``, the test is marked as an expected failure
+        unconditionally; this is equivalent to ``predicate=lambda _: True``.
+
+    .. versionadded:: 4.9
+    '''
+    def _default_predicate(_):
+        return True
+
+    predicate = predicate or _default_predicate
+
+    def _xfail_fn(obj):
+        return predicate(obj), message
+
+    def _deco(cls):
+        cls.__rfm_xfail_sanity__ = _xfail_fn
+        return cls
+
+    return _deco
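The mechanics of this decorator can be exercised in isolation: it stores a callable on the class under `__rfm_xfail_sanity__`, which can later be called with the test instance to learn whether a failure is currently expected and which message to report. The sketch below reproduces the pattern on a plain class; the runner-side lookup at the end is an assumption about how such a hook would be consumed, not ReFrame's actual code:

```python
def xfail(message, predicate=None):
    """Toy reimplementation of the decorator pattern from the diff."""
    predicate = predicate or (lambda _: True)

    def _xfail_fn(obj):
        # Returns (is_failure_expected, message) for a given test instance.
        return predicate(obj), message

    def _deco(cls):
        cls.__rfm_xfail_sanity__ = _xfail_fn
        return cls

    return _deco


@xfail('bug 123', lambda test: test.x == 1)
class MyTest:
    x = 0


t = MyTest()
# Look the hook up on the class to avoid implicit method binding.
expected, msg = type(t).__rfm_xfail_sanity__(t)
```

With ``t.x == 0`` the predicate returns :obj:`False`, so the failure is not expected; setting ``t.x = 1`` flips the result.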

reframe/core/exceptions.py

Lines changed: 8 additions & 0 deletions

@@ -285,6 +285,14 @@ class SkipTestError(ReframeError):
     '''Raised when a test needs to be skipped.'''
 
 
+class ExpectedFailureError(ReframeError):
+    '''Raised when a test failure is expected.'''
+
+
+class UnexpectedSuccessError(ReframeError):
+    '''Raised when a test passes unexpectedly.'''
+
+
 def user_frame(exc_type, exc_value, tb):
     '''Return a user frame from the exception's traceback.