chronos: Pause compile just before compiling the fuzz target so that we can reuse it later. #11937

Merged
DonggeLiu merged 16 commits into master from chronos
Aug 15, 2024

Conversation

DonggeLiu
Contributor

@DonggeLiu DonggeLiu commented May 10, 2024

@jonathanmetzman proposed a great idea about saving the machine state just before compiling the fuzz target so that we can compile different fuzz targets from that state later without having to go through the earlier commands.
This is particularly beneficial for OSS-Fuzz-Gen.

This PR is an (incomplete) PoC of that idea.
Ideally, we:

  1. Replace the fuzz target compilation command and all commands after it with no-ops,
  2. Save them into a script (e.g., $SRC/re-run.sh), and
  3. Push the resulting image for later reuse.

In this way, we can reuse the image later by swapping the fuzz target source code and executing $SRC/re-run.sh.

The script in this PR can do step 2, but not step 1.
This might already be OK, because the commands in step 1 normally come at the end of the build and there is unlikely to be any check that blocks them, but ideally we should do step 1 too.
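
The record-and-skip idea can be sketched with bash's `DEBUG` trap and `extdebug`, the same two mechanisms that appear in the script under review. This is a minimal illustration under stated assumptions, not the PR's actual script: `fuzz_target.c`, `RECORD_FILE`, and the `echo` stand-ins for build steps are made up for the demo. With `extdebug` enabled, a non-zero return from the trap handler makes bash skip the pending command, so every command from the first mention of the fuzz target onward is recorded instead of executed.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: all names below are illustrative assumptions.
FUZZ_TARGET="fuzz_target.c"
RECORD_FILE="$(mktemp)"
recording=false

execute_or_record_command() {
  # BASH_COMMAND holds the text of the command that is about to run.
  local cmd="$BASH_COMMAND"
  # Always let the trap removal through, so the demo can stop recording.
  if [[ "$cmd" == "trap - DEBUG" ]]; then
    return 0
  fi
  # Start recording at the first command that mentions the fuzz target.
  if [[ "$recording" == false && "$cmd" == *"$FUZZ_TARGET"* ]]; then
    recording=true
  fi
  if [[ "$recording" == true ]]; then
    printf '%s\n' "$cmd" >> "$RECORD_FILE"
    return 1  # with extdebug, a non-zero status skips the command
  fi
  return 0
}

trap 'execute_or_record_command' DEBUG
shopt -s extdebug

# Stand-ins for a project's build steps.
echo "configure and build the library"
echo "compile fuzz_target.c"
echo "copy fuzzer binary"

trap - DEBUG
shopt -u extdebug

# The recorded commands could later be replayed as a re-run script.
cat "$RECORD_FILE"
```

Only the first `echo` executes; the two later build steps end up in `RECORD_FILE`. Note that `BASH_COMMAND` contains the command text before expansion, so the match works here because the file name is written literally in the command.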

To test this locally:

python infra/helper.py build_image libiec61850
docker run -ti --entrypoint=/bin/bash gcr.io/oss-fuzz/libiec61850
(in container) compile
cat /src/re-run.sh

@DonggeLiu DonggeLiu requested a review from jonathanmetzman May 10, 2024 07:21

DonggeLiu has previously contributed to projects/libiec61850. The previous PR was #10109

@DonggeLiu
Contributor Author

DonggeLiu commented Aug 3, 2024

# Usage
Under `OSS-Fuzz` root directory:
```bash
export PROJECT=libiec61850
export FUZZ_TARGET=fuzz_mms_decode.c
export FUZZING_LANGUAGE=c
infra/experimental/chronos/prepare-recompile "$PROJECT" "$FUZZ_TARGET" "$FUZZING_LANGUAGE"
python infra/helper.py build_image "$PROJECT"
docker run -ti --entrypoint="compile" --name "${PROJECT}-origin" "gcr.io/oss-fuzz/${PROJECT}"
docker commit "${PROJECT}-origin" "gcr.io/oss-fuzz/${PROJECT}-ofg-cached"
docker run -ti --entrypoint="recompile" "gcr.io/oss-fuzz/${PROJECT}-ofg-cached"
```

@DonggeLiu DonggeLiu changed the title from "[DO NOT MERGE] Pause compile just before compiling the fuzz target so that we can reuse it later." to "chronos: Pause compile just before compiling the fuzz target so that we can reuse it later." on Aug 5, 2024
@DonggeLiu
Contributor Author

DonggeLiu commented Aug 5, 2024

@oliverchang, shall we integrate this into the daily automatic build for C/C++ projects?
Being able to pull cached images (e.g., gcr.io/oss-fuzz/libiec61850-ofg-cached) would be helpful; showing their status on the webpage is low priority.

@DonggeLiu
Contributor Author

/gcbrun skip

@DonggeLiu
Contributor Author

The Infra tests / build (pull_request) CI failure seems unrelated.

@oliverchang
Collaborator

@oliverchang, shall we integrate this into the daily automatic build for C/C++ projects? Being able to pull cached images (e.g., gcr.io/oss-fuzz/libiec61850-ofg-cached) would be helpful; showing their status on the webpage is low priority.

Unfortunately we can't trust images pushed by the OSS-Fuzz build process to gcr.io/oss-fuzz. Any project being built can push to gcr.io/oss-fuzz/ANYTHING, which is why we don't use this in our infra anywhere today.

trap 'execute_or_record_command' DEBUG

# Enable extended debugging mode
shopt -s extdebug
Collaborator

neat!

#
################################################################################

# Usage:
Collaborator

Can you please add a general high level few sentences as a comment to describe how this script works?

e.g.

# This script intercepts commands in the build process of a project, and records all bash commands (and env variable values) from the point that the fuzz target source is encountered....

Contributor Author

Done, thanks

Collaborator

@DavidKorczynski DavidKorczynski left a comment

There's no reference to sanitizers and it might be nice to include this. Perhaps both in the README.md and also in prepare-recompile, as it's necessary to set the sanitizer environment variable to ensure recompile builds in the respective sanitizer. Maybe also clarify this is only libfuzzer atm?

@DonggeLiu
Contributor Author

There's no reference to sanitizers and it might be nice to include this.

Could you elaborate a bit more on what to include? Thanks

in prepare-recompile, as it's necessary to set the sanitizer environment variable to ensure recompile builds in the respective sanitizer.

Please correct me if I misunderstood you, but I thought all env vars (including -fsanitize=* in CFLAGS/CXXFLAGS) are stored and restored by:

declare -p | grep -Ev 'declare -[^ ]*r[^ ]*' > "$RECOMPILE_ENV"

echo "source $RECOMPILE_ENV" >> "$RECOMPILE_SCRIPT"
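
To illustrate the save-and-restore step above, here is a small self-contained sketch. It assumes a temporary file for `RECOMPILE_ENV` and two demo variables (`CFLAGS` and a hypothetical readonly `BUILD_ID`); only the `declare -p | grep -Ev ...` line is taken from the thread. Readonly variables are filtered out because re-declaring them when the file is sourced would fail.

```bash
#!/usr/bin/env bash
# Demo values; RECOMPILE_ENV points at a temp file for this sketch only.
RECOMPILE_ENV="$(mktemp)"
CFLAGS="-O1 -fsanitize=address"
readonly BUILD_ID="demo"  # hypothetical readonly variable that must be dropped

# Dump every variable as a re-runnable `declare` statement,
# dropping any line whose attribute list contains `r` (readonly).
declare -p | grep -Ev 'declare -[^ ]*r[^ ]*' > "$RECOMPILE_ENV"

# Restore in a fresh shell, as a recompile script would do via `source`.
restored="$(bash -c 'source "$1" 2>/dev/null; printf "%s" "$CFLAGS"' _ "$RECOMPILE_ENV")"
echo "restored CFLAGS: $restored"
```

Because the sanitizer flags live in `CFLAGS`/`CXXFLAGS` at the time of the dump, they are restored together with everything else; what is restored is whatever was in effect when `compile` originally ran.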

Maybe also clarify this is only libfuzzer atm?

Yep, good point. I will add this.

Thanks!

export FUZZING_LANGUAGE=c
infra/experimental/chronos/prepare-recompile "$PROJECT" "$FUZZ_TARGET" "$FUZZING_LANGUAGE"
python infra/helper.py build_image "$PROJECT"
docker run -ti --entrypoint="compile" --name "${PROJECT}-origin" "gcr.io/oss-fuzz/${PROJECT}"
Collaborator

@DavidKorczynski DavidKorczynski Aug 7, 2024

When compile is run here, SANITIZER won't be set, and consequently the following code won't set any sanitizer flags (since SANITIZER is ''):

if [ -z "${SANITIZER_FLAGS-}" ]; then
FLAGS_VAR="SANITIZER_FLAGS_${SANITIZER}"
export SANITIZER_FLAGS=${!FLAGS_VAR-}
fi

We want the relevant variables from here:
ENV SANITIZER_FLAGS_address "-fsanitize=address -fsanitize-address-use-after-scope"
ENV SANITIZER_FLAGS_hwaddress "-fsanitize=hwaddress -fuse-ld=lld -Wno-unused-command-line-argument"
# Set of '-fsanitize' flags matches '-fno-sanitize-recover' + 'unsigned-integer-overflow'.
ENV SANITIZER_FLAGS_undefined "-fsanitize=array-bounds,bool,builtin,enum,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unsigned-integer-overflow,unreachable,vla-bound,vptr -fno-sanitize-recover=array-bounds,bool,builtin,enum,function,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr"
# Don't include "function" since it is unsupported on aarch64.
ENV SANITIZER_FLAGS_undefined_aarch64 "-fsanitize=array-bounds,bool,builtin,enum,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unsigned-integer-overflow,unreachable,vla-bound,vptr -fno-sanitize-recover=array-bounds,bool,builtin,enum,integer-divide-by-zero,null,object-size,return,returns-nonnull-attribute,shift,signed-integer-overflow,unreachable,vla-bound,vptr"
ENV SANITIZER_FLAGS_memory "-fsanitize=memory -fsanitize-memory-track-origins"
ENV SANITIZER_FLAGS_thread "-fsanitize=thread"
ENV SANITIZER_FLAGS_introspector "-O0 -flto -fno-inline-functions -fuse-ld=gold -Wno-unused-command-line-argument"
# Do not use any sanitizers in the coverage build.
ENV SANITIZER_FLAGS_coverage ""
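
The lookup in the quoted `if` block uses bash indirect expansion. The sketch below reproduces it in isolation; plain shell variables stand in for the Docker `ENV` lines, and `resolve_sanitizer_flags` is a hypothetical helper written for this example, not part of `compile`.

```bash
#!/usr/bin/env bash
# Values copied from the ENV lines above; shell variables stand in for Docker ENV.
SANITIZER_FLAGS_address="-fsanitize=address -fsanitize-address-use-after-scope"
SANITIZER_FLAGS_coverage=""

# Hypothetical helper mirroring the quoted lookup.
resolve_sanitizer_flags() {
  local SANITIZER="$1"
  local FLAGS_VAR="SANITIZER_FLAGS_${SANITIZER}"
  # ${!FLAGS_VAR-} expands the variable *named by* FLAGS_VAR,
  # or to the empty string if that variable is unset.
  printf '%s' "${!FLAGS_VAR-}"
}

echo "address: '$(resolve_sanitizer_flags address)'"
echo "empty:   '$(resolve_sanitizer_flags "")'"
```

With an empty `SANITIZER`, the name resolves to `SANITIZER_FLAGS_`, which is unset, so the result is an empty flag string.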

For OFG we need to have two different builds:
SANITIZER == 'coverage'
SANITIZER == 'address'
but I assume in general it's intended to use this with sanitizers, which might make it nice to show how to do this in the README. I think just adding -e SANITIZER="address" to the docker run command would do the job
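
A dry-run sketch of that suggestion, covering the two builds mentioned for OFG. It only prints the commands rather than running Docker. The per-sanitizer container and image suffixes are hypothetical naming choices for this example; the `-e SANITIZER=...` flag is the part taken from the comment above.

```bash
#!/usr/bin/env bash
# Dry run: prints the docker commands instead of executing them.
PROJECT="libiec61850"

print_cached_build_commands() {
  local sanitizer="$1"
  # The "-${sanitizer}" suffixes are illustrative, not an OSS-Fuzz convention.
  local container="${PROJECT}-origin-${sanitizer}"
  local image="gcr.io/oss-fuzz/${PROJECT}-ofg-cached-${sanitizer}"
  echo "docker run -ti -e SANITIZER=${sanitizer} --entrypoint=compile --name ${container} gcr.io/oss-fuzz/${PROJECT}"
  echo "docker commit ${container} ${image}"
}

for sanitizer in address coverage; do
  print_cached_build_commands "$sanitizer"
done
```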

Collaborator

@DonggeLiu re #11937 (comment) -- I think my comment only applies to the README for now

Contributor Author

OK Done

@DavidKorczynski
Collaborator

Maybe also clarify this is only libfuzzer atm?

Yep, good point. I will add this.

I think this might work with multiple engines? As with sanitizers, though, one just needs to control the fuzzing engine variable when running compile.

@jonathanmetzman
Contributor

I'll try to take a look before the end of the week. Is this a totally different implementation of the idea than jcc2? https://github.com/google/oss-fuzz/blob/master/infra/base-images/base-builder/jcc/jcc2.go

@DonggeLiu
Contributor Author

I'll try to take a look before the end of the week. Is this a totally different implementation of the idea than jcc2? https://github.com/google/oss-fuzz/blob/master/infra/base-images/base-builder/jcc/jcc2.go

Thanks!
Yes, sorry. I did not follow that because I was hoping this could save more resources (e.g., building the project once for all its benchmarks). Later, we will run OFG on all functions suggested by FI on each project, and this can be handy.

@jonathanmetzman
Contributor

What are OFG and FI?

@DonggeLiu
Contributor Author

What are OFG and FI?

OSS-Fuzz-Gen and FuzzIntrospector

@DonggeLiu
Contributor Author

2821e62 modifies /src/build.sh in container instead of in OSS-Fuzz, because some projects do not maintain their build.sh in OSS-Fuzz. /src/build.sh should be the most universal and final version used by compile.

DavidKorczynski added a commit to google/oss-fuzz-gen that referenced this pull request Aug 10, 2024
First touch on #499

Depends on: google/oss-fuzz#12284

The way this works is by saving a cached version of `build_fuzzers` after
`compile` has run and then modifying the Dockerfile of a project to
use this cached build image plus an adjusted build script.

For example, for brotli the Dockerfile is originally:

```sh
FROM gcr.io/oss-fuzz-base/base-builder
RUN apt-get update && apt-get install -y cmake libtool make

RUN git clone --depth 1 https://github.com/google/brotli.git
WORKDIR brotli
COPY build.sh $SRC/

COPY 01.c /src/brotli/c/fuzz/decode_fuzzer.c
```

A Dockerfile is then created which relies on the cached version, and it
looks like:

```sh
FROM cached_image_brotli
# RUN apt-get update && apt-get install -y cmake libtool make
#
# RUN git clone --depth 1 https://github.com/google/brotli.git
# WORKDIR brotli
# COPY build.sh $SRC/
#
COPY 01.c /src/brotli/c/fuzz/decode_fuzzer.c
#
COPY adjusted_build.sh $SRC/build.sh
```

`adjusted_build.sh` is then the script that only builds fuzzers. This
means we can also use the `build_fuzzers`/`compile` workflows as we know them.
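
The Dockerfile transformation described above could be sketched as follows. This is an illustrative guess at the rewrite rule, not OSS-Fuzz-Gen's actual code: `rewrite_dockerfile` is a hypothetical helper that swaps the `FROM` line for the cached image, keeps the line that copies the harness, comments out every other line, and appends the `COPY` of the adjusted build script.

```bash
#!/usr/bin/env bash
# Hypothetical helper; reads a Dockerfile on stdin, writes the rewritten one to stdout.
rewrite_dockerfile() {
  local cached_image="$1" harness="$2"
  awk -v image="$cached_image" -v harness="$harness" '
    /^FROM /               { print "FROM " image; next }   # use the cached image
    index($0, harness) > 0 { print; next }                 # keep the harness COPY
                           { print "# " $0 }               # disable everything else
    END                    { print "COPY adjusted_build.sh $SRC/build.sh" }
  '
}

original='FROM gcr.io/oss-fuzz-base/base-builder
RUN apt-get update && apt-get install -y cmake libtool make
RUN git clone --depth 1 https://github.com/google/brotli.git
WORKDIR brotli
COPY build.sh $SRC/
COPY 01.c /src/brotli/c/fuzz/decode_fuzzer.c'

rewritten="$(printf '%s\n' "$original" | rewrite_dockerfile cached_image_brotli 01.c)"
printf '%s\n' "$rewritten"
```

Matching the harness by substring is a simplification; a real implementation would need a more robust way to identify which `COPY` line carries the generated fuzz target.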

More specifically, this PR:

- Makes it possible to build Docker images of fuzzer build containers.
It does this by running `build_fuzzers`, saving the Docker container, and
then committing the container to an image. This image will have a
project's build set up as it is after `compile` has run. This is then used when
building fuzzers by OFG.
- Supports only ASAN mode for now. Should be easy to extend to coverage
too.
- Currently builds images first and then uses them locally. We could
extend, probably on another step of this, to use containers pushed by
OSS-Fuzz itself.
- Only does the caching if a "cache-build-script" exists (added a few
for some projects) which contains the build instructions post-build
process. It should be easy to extend such that we can rely on some DB of
auto-generated build scripts as well (ref:
google/oss-fuzz#11937) but I think it's nice to
have both the option of us creating the scripts ourselves + an
auto-generated DB.

---------

Signed-off-by: David Korczynski <[email protected]>
@DonggeLiu DonggeLiu merged commit dee1595 into master Aug 15, 2024
17 of 19 checks passed
@DonggeLiu DonggeLiu deleted the chronos branch August 15, 2024 22:04
4 participants