Skip to content

BillHallahan/nofib-buggy-symbolic

Folders and files

NameName
Last commit message
Last commit date

Latest commit

author
Bill Hallahan
Mar 21, 2025
cfd1090 · Mar 21, 2025

History

9 Commits
Mar 21, 2025
Nov 11, 2024
Nov 14, 2024
Nov 11, 2024
Nov 11, 2024
Nov 11, 2024
Nov 14, 2024
Nov 11, 2024

Repository files navigation

This is an adaption of the Buggy Nofib Haskell benchmark suite to evaluate symbolic execution tools.

Most bugs of type 1 and type 3 (see below) are considered.  We exclude type 3 bugs that correspond to
stack overflows, as symbolic executors are not (generally) able to detect stack overflows.

Type 1 bugs produce an incorrect results.  For the purpose of evaluating symbolic execution,
we equate the value returned by the buggy code and the correct code, and task the symbolic
executor with finding an inputting that violates the equality.

Type 3 bugs produce an exception (such as division by 0.)  Type 3 benchmarks are directly
symbolically executed, to look for an error.

We make use of a special mkSymbolic function:
  mkSymbolic :: IO a
which we assume the symbolic executor will recognize as introducing a new symbolic (unknown) value.
For nofib buggy benchmarks which accept input as command line arguments or command line input,
we substitute mkSymbolic to provide symbolic arguments/input.

Some nofib benchmarks do not accept input, and simply compute a constant.  For these benchmarks,
we adjust the first function with interesting (used) arguments and pass symbolic values in place
of the constant arguments in the original benchmark.  In some cases, we have to add arguments, or
use an (unsafe) IORef to introduce a shared symbolic "constant" (see cichelli for an example.)

We add Eq typeclass as needed (via deriving) to assert equality of returned values.  For Type 1 benchmarks,
we move datatypes that are returned into a module shared by both the Buggy and Real programs.

Notes on specific benchmarks:

- imaginary/gen_regexps
    - This benchmark has a bug of type 1, and uses the interact function from the standard library:
        interact :: (String -> String) -> IO ()
    Calling `interact f` (1) reads from the command line (2) passes what was read to f (3) prints the string
    returned by the command line.

    We define an interact2 function:
      interact2 :: (String -> String) -> (String -> String) -> IO ()
    to pass inputs to both the buggy and real programs, and compare their outputs.
- spectral/cichelli
    - The nofib-buggy cichelli benchmark has a bug of type 1, and no inputs.  Thus, we need to add an input
    to have an interesting benchmark for symbolic execution.  We do this by making the `keys` variable symbolic.
    Because this variable is used at multiple places in the code, and passing it to each of those points would be
    a pain, we instead use an unsafely-accessed IORef to hold the symbolic value.
- spectral/minimax
    - TBD
- spectral/sorting
    - The benchmark includes a "mangle" function which allows choosing between executing one of a number of sorting algorithms.
    The bug is in the heap sort implementation.  However, the main function in nofib-buggy included the line:
      putStr (mangle "quicksort" cs)
    which restricted the program to only running the quicksort algorithm.  We have made the algorithm selection symbolic, to allow
    running any of the sort algorithms.

------------------------------------------------------------------------------

This is the root directory of the "Buggy NoFib Haskell benchmark suite".

It is a 'buggy' version of the nofib collection of Haskell
programs that can be used to test debugging tools. 

All programs contain one of these bugs:
- a bug that produces an incorrect result
- a bug that produces non-termination
- a bug that produces an exception (e.g., div by zero)

Those files which contain a bug have a header as follows:

---------------------------------------------------------
--       WARNING: THIS PROGRAM CONTAINS A BUG!!!       --
--                                                     --
--  This program belongs to the faulty nofib library   --
--      and contains a bug to benchmark debuggers      --
--                                                     --
---------------------------------------------------------
--                                                     --
--  "The faulty nofib library" is a  collection of     --
--  Haskell programs from the 'nofib' benchmark suite  --
--                                                     --
--  Faults are always marked with a comment: "BUG"     --
--  The commented correct line appears after the       --
--  faulty line marked with "CORRECT"                  --
--                                                     --
--  We welcome any comment or improvement about        --
--  bugs. You can send them to:                        --
--        Josep Silva (jsilva@dsic.upv.es)             --
--                                                     --
---------------------------------------------------------
--                                                     --
--  There are three kinds of bugs depending on their   --
--  consequences:                                      --
--  1) Bugs that produce an incorrect result           --
--  2) Bugs that produce non-termination               --
--  3) Bugs that produce an exception (e.g. div by 0)  --
--                                                     --
--  This program contains a bug of tipe 1              --
---------------------------------------------------------

Those lines which contain a bug are preceded by:
"-- BUG: The following line contains a bug:"
and they are followed by the corrected statement 

As an example:

-- BUG: The following line contains a bug:
  map (zipWith (\n x -> if n then -x else x) (iterate not True)) powers
-- CORRECT --   map (zipWith (\n x -> if n then x else -x) (iterate not True)) powers

------------------------------------------------------------------------------


This repo should be part of a GHC source tree, that is the 'nofib' directory
should be at the same level in the tree as 'compiler' and 'libraries'.

To run the tests:

  $ make clean
  $ make boot
  $ make 2>&1 | tee nofib-log

will put the results in the file 'nofib-log'.

To compare the results of multiple runs, use the program in
../utils/nofib-analyse.  Something like this:

  $ nofib-analyse nofib-log-6.4.2 nofib-log-6.6

to generate a comparison of the runs in captured in 'nofib-log-6.4.2'
and 'nofib-log-6.6'.  When making comparisons, be careful to ensure
that the things that changed between the builds are only the things
that you *wanted* to change.  There are lots of variables: machine,
GHC version, GCC version, C libraries, static vs. dynamic GMP library,
build options, run options, and probably lots more.  To be on the safe
side, make both runs on the same unloaded machine.

To get instruction counts, memory reads/writes, and "cache misses",
you'll need to get hold of Cachegrind, which is part of Valgrind
(http://valgrind.org).

There are some options you might want to tweak; search for nofib in
../mk/config.mk, and override settings in ../mk/build.mk as usual.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published