Commit a11e08a

committed
release
0 parents  commit a11e08a

219 files changed

+43050
-0
lines changed

.ci/ci.yaml

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
version: v2.0
2+
3+
on:
4+
push: ["*"]
5+
mr: ["*"]
6+
7+
stages:
8+
- name: build and test stage
9+
jobs:
10+
job1:
11+
name: build and test job
12+
runs-on:
13+
pool-name: docker
14+
container:
15+
image: mirrors.tencent.com/rust-ci/rust:latest
16+
steps:
17+
- checkout: self
18+
- run: |
19+
cargo build
20+
cargo test
21+
name: cargo build and test
22+
- run: |
23+
rustup component add clippy
24+
name: install clippy
25+
- run: |
26+
cargo clippy --all-targets -- -D warnings
27+
name: run clippy
28+

.dockerignore

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
/install
2+
/target
3+
/.vscode

.gitignore

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
/target
2+
Cargo.lock
3+
.vscode/
4+
*.so
5+
*.dSYM
6+
*.dylib
7+
output/
8+
output_cov/
9+
install/
10+
core.*
11+
*.data
12+
*.log

Cargo.toml

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
[workspace]
2+
resolver = "2"
3+
members = [
4+
'hopper-core',
5+
'hopper-derive-impl',
6+
'hopper-derive',
7+
'hopper-compiler',
8+
'hopper-harness',
9+
]
10+
11+
# [patch.crates-io]
12+
# bindgen = { path = "../rust-bindgen" }

Dockerfile

Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
1+
FROM ubuntu:20.04
2+
3+
ENV HOPPER_BIN=/hopper/hopper \
4+
RUSTUP_HOME=/usr/local/rustup \
5+
CARGO_HOME=/usr/local/cargo \
6+
PATH=/hopper:/usr/local/cargo/bin:/root/.cargo/bin:$PATH \
7+
DEBIAN_FRONTEND=noninteractive
8+
9+
# RUN sed -i 's/archive.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list
10+
# RUN sed -i 's/security.ubuntu.com/mirrors.ustc.edu.cn/g' /etc/apt/sources.list
11+
12+
RUN apt-get update \
13+
&& apt-get -y upgrade \
14+
&& apt-get -y install build-essential wget curl cmake git unzip xxd protobuf-compiler libprotobuf-dev \
15+
&& apt-get clean
16+
17+
# ENV RUSTUP_DIST_SERVER="https://mirrors.ustc.edu.cn/rust-static"
18+
# ENV RUSTUP_UPDATE_ROOT="https://mirrors.ustc.edu.cn/rust-static/rustup"
19+
20+
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable
21+
22+
# RUN echo '[source.crates-io]' > ${CARGO_HOME}/config && \
23+
# echo "replace-with = 'tencent'" >> ${CARGO_HOME}/config && \
24+
# echo '[source.tencent]' >> ${CARGO_HOME}/config && \
25+
# echo 'registry = "http://mirrors.tencent.com/rust/index"' >> ${CARGO_HOME}/config
26+
27+
RUN mkdir -p /hopper
28+
COPY . /hopper
29+
WORKDIR /hopper
30+
31+
RUN ./build.sh
32+
33+
RUN mkdir /llvm
34+
ENV PATH=/llvm/bin:$PATH
35+
ENV LD_LIBRARY_PATH=/llvm/lib:$LD_LIBRARY_PATH
36+
37+
RUN mkdir /fuzz_lib
38+
RUN mkdir /fuzz
39+
WORKDIR /fuzz

LICENSE

Lines changed: 828 additions & 0 deletions
Large diffs are not rendered by default.

README.md

Lines changed: 277 additions & 0 deletions
@@ -0,0 +1,277 @@
1+
# Hopper
2+
3+
Hopper is a tool for automatically generating fuzzing test cases for libraries using **interpretative fuzzing**. It transforms the problem of library fuzzing into the problem of interpreter fuzzing, enabling exploration of a vast range of API usages out of the box.
4+
Some key features of Hopper include:
5+
- Interpretative API invoking without any fuzz driver.
6+
- Type-aware mutation for arguments.
7+
- Automatic intra- and inter-API constraint learning.
8+
- Binary instrumentation support.
9+
10+
To learn more about Hopper, check out our [paper](https://arxiv.org/pdf/2309.03496) at CCS '23.
11+
12+
## Build Hopper
13+
### Build Requirements
14+
- Linux-amd64 (Tested on Ubuntu 20.04 and Debian Buster)
15+
- Rust stable (>= 1.60), can be obtained using [rustup](https://rustup.rs/)
16+
- Clang (>= 5.0; see [Install Clang](https://rust-lang.github.io/rust-bindgen/requirements.html)). [rust-bindgen](https://rust-lang.github.io/rust-bindgen/) uses libclang to preprocess, parse, and type-check C and C++ header files.
17+
18+
### Build Hopper itself
19+
```sh
20+
./build.sh
21+
```
22+
23+
The script creates an `install` directory in Hopper's root directory; after that you can use `hopper`.
24+
25+
### Using Docker
26+
Alternatively, you can use the Dockerfile, which builds the requirements and Hopper.
27+
```
28+
docker build -t hopper ./
29+
docker run --name hopper_dev --privileged -v /path-to-lib:/fuzz -it --rm hopper /bin/bash
30+
```
31+
32+
## Compile library with Hopper
33+
Take `cjson` as an example ([More examples](./examples/)).
34+
```sh
35+
hopper compile --header ./cJSON.h --library ./libcjson.so --output output
36+
```
37+
38+
Use `hopper compile --help` to see detailed usage. If compilation reports errors about the header file, refer to the documentation of [rust-bindgen](https://rust-lang.github.io/rust-bindgen/), which Hopper uses to parse header files.
39+
You may wrap the header file with the missing definitions.
40+
Hopper uses [E9Patch](https://github.com/GJDuck/e9patch) to instrument binaries by default.
41+
42+
After running `compile`, you will find that it generates the following files in the output directory:
43+
- `bin/hopper-fuzzer`: generates inputs, maintains states, and uses `harness` to execute the inputs.
44+
- `bin/hopper-harness`: executes the inputs.
45+
- `bin/hopper-translate`: translates inputs to C source code.
46+
- `bin/hopper-generator`: replays the generation process.
47+
- `bin/hopper-sanitizer`: sanitizes and minimizes crashes.
48+
49+
#### Header files
50+
- If there are multiple header files, you can create a new header file and *include* all of them.
51+
- If the header files depend on specific environment settings at compile time, you can pass them to clang via `BINDGEN_EXTRA_CLANG_ARGS`.
52+
- If the header file includes API functions that you do not want to test, use `--func-pattern` to filter them while running the fuzzer.
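
As a rough sketch of the first two tips, the commands below create a wrapper header and pass extra clang arguments to bindgen. The header name `cJSON_Utils.h`, the macro `ENABLE_FEATURE_X`, and the `./include` path are placeholders chosen for illustration, not names Hopper requires.

```sh
# Wrapper header that pulls in every header to be tested.
# The header names are placeholders; list the headers your library ships.
cat > wrapper.h <<'EOF'
#include "cJSON.h"
#include "cJSON_Utils.h"
EOF

# Extra clang arguments that bindgen reads from the environment.
# The macro and include path are illustrative assumptions.
export BINDGEN_EXTRA_CLANG_ARGS="-DENABLE_FEATURE_X=1 -I./include"

# Run the compile step only when Hopper and the library are available.
if command -v hopper >/dev/null 2>&1 && [ -f ./libcjson.so ]; then
    hopper compile --header ./wrapper.h --library ./libcjson.so --output output
fi
```

The wrapper is then passed to `--header` in place of the individual header files.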
53+
54+
#### Environment variable for compiling
55+
- `HOPPER_MAP_SIZE_POW2`: controls the size of the coverage map. The default value is 16, and it should be in the range [16, 20], e.g. `HOPPER_MAP_SIZE_POW2=18`.
56+
- `HOPPER_INST_RATIO`: controls how likely a block will be chosen for instrumentation. The default value is 100, and it should be in the range of (0, 100]. e.g. `HOPPER_INST_RATIO=75`.
57+
- `HOPPER_INCLUDE_SEARCH_PATH`: adds a search path for files included by the header files, e.g. `HOPPER_INCLUDE_SEARCH_PATH=../`.
58+
- `HOPPER_FUNC_BLACKLIST`: lists functions that Hopper won't compile. `bindgen` will not generate code for these functions, e.g. `HOPPER_FUNC_BLACKLIST=f1,f2`.
59+
- `HOPPER_TYPE_BLACKLIST`: lists types that Hopper won't compile. `bindgen` will not generate code for these types, e.g. `HOPPER_TYPE_BLACKLIST=type1,type2`.
60+
- `HOPPER_ITEM_BLACKLIST`: lists items (constants/variables) that Hopper won't compile. `bindgen` will not generate code for these items, e.g. `HOPPER_ITEM_BLACKLIST=IPPORT_RESERVED`.
61+
- `HOPPER_CUSTOM_OPAQUE_LIST`: lists user-defined opaque types, e.g. `HOPPER_CUSTOM_OPAQUE_LIST=type1`.
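
The snippet below is a minimal sketch of setting these variables before `hopper compile`, with a range check taken from the limits listed above. The helper `in_range` is hypothetical, and the check assumes `HOPPER_INST_RATIO` is given as an integer, so the range (0, 100] becomes [1, 100].

```sh
# Succeed when $1 is an integer in the closed range [$2, $3].
in_range() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Example values taken from the list above.
map_size_pow2=18
inst_ratio=75

if in_range "$map_size_pow2" 16 20 && in_range "$inst_ratio" 1 100; then
    export HOPPER_MAP_SIZE_POW2="$map_size_pow2"
    export HOPPER_INST_RATIO="$inst_ratio"
    export HOPPER_FUNC_BLACKLIST="f1,f2"
else
    echo "compile settings are out of range" >&2
fi

# Run the compile step only when Hopper and the library are available.
if command -v hopper >/dev/null 2>&1 && [ -f ./libcjson.so ]; then
    hopper compile --header ./cJSON.h --library ./libcjson.so --output output
fi
```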
62+
63+
#### Tips
64+
- You can set the arguments and environment variables for compiling and running in a configuration file named `hopper.config`, see `examples/*` for details.
65+
66+
- Reduce density: if the density is larger than 20%, edge IDs are likely to suffer hash collisions. You can either a) increase `HOPPER_MAP_SIZE_POW2` or b) reduce `HOPPER_INST_RATIO`.
67+
68+
- Multiple libraries: either (1) merge the archives into one shared library, e.g. `gcc -shared -o c.so -Wl,--whole-archive a.a b.a -Wl,--no-whole-archive`, or (2) pass all of them to the Hopper compiler with `--library a.so b.so`.
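
To make the density tip concrete, the sketch below estimates map density for a given number of distinct edges and picks the smallest `HOPPER_MAP_SIZE_POW2` that keeps it at or below 20%. It assumes the coverage map holds 2^N entries and that density is the share of entries in use, as in AFL; the helper names and the edge count of 20000 are made up for illustration.

```sh
# Print the integer density (percent) for $1 edges in a map of 2^$2 entries.
density_percent() {
    echo $(( $1 * 100 / (1 << $2) ))
}

# Print the smallest power in [16, 20] whose density is at most 20%.
# Falls back to 20 when even the largest map is too dense.
pick_map_size_pow2() {
    pow2=16
    while [ "$pow2" -lt 20 ] && [ "$(density_percent "$1" "$pow2")" -gt 20 ]; do
        pow2=$((pow2 + 1))
    done
    echo "$pow2"
}

edges=20000   # hypothetical number of distinct edges
echo "density at 2^16: $(density_percent "$edges" 16)%"
echo "suggested HOPPER_MAP_SIZE_POW2: $(pick_map_size_pow2 "$edges")"
```

With 20000 edges the density is about 30% at the default size, and doubling the map once (a value of 17) brings it to about 15%.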
69+
70+
## Fuzz Library with Hopper
71+
72+
```
73+
hopper fuzz output --func-pattern 'cJSON_*'
74+
```
75+
76+
Use `hopper fuzz output --help` to see detailed usage.
77+
78+
After running `fuzz`, it will generate the following directories.
79+
- `queue`: generated normal inputs.
80+
- `hangs`: generated timeout inputs.
81+
- `crashes`: generated crash inputs.
82+
- `misc`: stores temporary files and statistics.
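
A quick way to watch progress is to count the files in each of these directories. The helper below is a hypothetical sketch; it assumes inputs are stored as regular files directly inside each directory and takes the output directory as an optional first argument.

```sh
# Print the number of regular files directly inside directory $1 (0 if missing).
count_inputs() {
    if [ -d "$1" ]; then
        find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# Hopper output directory; defaults to ./output.
out_dir="${1:-output}"
for kind in queue hangs crashes; do
    echo "$kind: $(count_inputs "$out_dir/$kind")"
done
```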
83+
84+
#### Environment variable for running
85+
- `DISABLE_CALL_DET`: disables deterministic mutation of calls.
86+
- `DISABLE_GEN_FAIL`: disables generating programs for functions that failed to be invoked.
87+
- `HOPPER_SEED_DIR`: provides seeds for byte-like arguments (default: `output/seeds` if it exists).
88+
- `HOPPER_DICT`: provides a dictionary for byte-like arguments. The syntax is the same as AFL's.
89+
- `HOPPER_API_INSENSITIVE_COV`: disables API-sensitive branch counting.
90+
- `HOPPER_FAST_EXECUTE_LOOP`: number of programs executed (in a loop) for each fork; set it to 0 or 1 to disable the loop, e.g. `HOPPER_FAST_EXECUTE_LOOP=10`.
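
As an illustration, the commands below write a small AFL-style dictionary and set two of the variables above before starting the fuzzer. The file name `json.dict` and the tokens are arbitrary examples for a JSON library, not files shipped with Hopper.

```sh
# AFL dictionary syntax: one name="token" entry per line, '#' starts a comment.
cat > json.dict <<'EOF'
# tokens that commonly appear in JSON documents
kw_true="true"
kw_null="null"
obj_open="{"
EOF

export HOPPER_DICT=./json.dict
export HOPPER_FAST_EXECUTE_LOOP=10

# Start fuzzing only when Hopper and a compiled output directory exist.
if command -v hopper >/dev/null 2>&1 && [ -d output ]; then
    hopper fuzz output --func-pattern 'cJSON_*'
fi
```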
91+
92+
#### System configuration
93+
Configure system core dumps as for AFL (on the host if you run Hopper in a Docker container).
94+
```
95+
echo core | sudo tee /proc/sys/kernel/core_pattern
96+
```
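
Before changing the setting, you can check whether it needs changing. AFL complains when `core_pattern` starts with `|`, because crashes are then piped to an external handler; the sketch below assumes Hopper has the same requirement. The helper name is hypothetical.

```sh
# Succeed when the given core pattern pipes dumps to an external program.
core_pattern_is_piped() {
    case "$1" in
        '|'*) return 0 ;;
        *)    return 1 ;;
    esac
}

pattern_file=/proc/sys/kernel/core_pattern
if [ -r "$pattern_file" ]; then
    current=$(cat "$pattern_file")
    if core_pattern_is_piped "$current"; then
        echo "core_pattern is piped to a handler; run: echo core | sudo tee $pattern_file" >&2
    else
        echo "core_pattern looks fine: $current"
    fi
fi
```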
97+
98+
### Function pattern
99+
Hopper generates inputs for all functions in the libraries by default. However, there are two ways to filter functions in Hopper: excluding functions or including functions. This lets the fuzzer focus on the interesting functions.
100+
101+
#### `--func-pattern`
102+
```
103+
hopper fuzz output --func-pattern '@cJSON_parse,!cJSON_InitHook,cJSON_*'
104+
```
105+
- The pattern can be a function name, e.g. `cJSON_parse`, or a simple pattern, e.g. `cJSON_*`.
106+
- If you have multiple patterns, use `,` to join them, e.g. `cJSON_*,HTTP_*`.
107+
- You can use the `@` prefix to restrict fuzzing to specific functions, while the other functions remain candidates that provide values for fields or arguments, e.g. `@cJSON_parse,cJSON_*`.
108+
- Use the `!` prefix to exclude specific functions, e.g. `!cJSON_InitHook,cJSON_*`.
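
The rules above can be pictured with a small shell function that classifies a function name against a pattern list. This is only an illustration of the documented semantics, not Hopper's implementation; in particular, the precedence (exclusion wins, then `@` targets, then plain patterns) is an assumption, and `classify_function` is a hypothetical helper.

```sh
# Print how function $1 is treated by the comma-separated pattern list $2:
# "excluded", "target", "candidate", or "unmatched".
classify_function() {
    fn=$1
    verdict=unmatched
    saved_ifs=$IFS
    IFS=,
    set -f   # keep patterns such as cJSON_* from expanding to file names
    for pat in $2; do
        case $pat in
            '!'*) kind=excluded;  glob=${pat#?} ;;
            '@'*) kind=target;    glob=${pat#?} ;;
            *)    kind=candidate; glob=$pat ;;
        esac
        case $fn in
            $glob)
                # An exclusion is final; a target is never downgraded.
                if [ "$kind" = excluded ]; then verdict=excluded; break; fi
                if [ "$verdict" != target ]; then verdict=$kind; fi
                ;;
        esac
    done
    set +f
    IFS=$saved_ifs
    echo "$verdict"
}

patterns='@cJSON_parse,!cJSON_InitHook,cJSON_*'
for name in cJSON_parse cJSON_InitHook cJSON_Print HTTP_get; do
    echo "$name: $(classify_function "$name" "$patterns")"
done
```

Note that the pattern list is single-quoted so the shell does not expand `*` or treat `!` as history expansion.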
109+
110+
#### `--custom-rules`
111+
The patterns can also be defined in a file passed via `--custom-rules`.
112+
113+
```rust
114+
// hopper fuzz output --custom-rules path-to-file
115+
func_target cJSON_parse
116+
func_exclude cJSON_InitHook
117+
func_include cJSON_*,HTTP_*
118+
```
119+
120+
### Constraints
121+
Hopper infers both intra- and inter-API constraints to invoke the APIs correctly.
122+
The constraints are written in `output/misc/constraint.config`. You can remove the file to reset the constraints.
123+
Additionally, users can define a file that describes custom constraints for API invocations and pass it via `--custom-rules`. The custom constraints override the inferred ones.
124+
```java
125+
// hopper fuzz output --custom-rules path-to-file
126+
// Grammar:
127+
// func, type : prefix for adding a rule for function or type
128+
// $[0-9]+ : function's i-th argument, or index in array
129+
// [a-zA-Z_]+ : object field
130+
// 0, 128 .. : integer constants
131+
// "xxxx" : string constants
132+
// methods : $len, $range, $null, $non_null, $need_init, $read_file, $write_file, $ret_from, $cast_from, $use, $arr_len, $opaque, $len_factors
133+
// others : pointer(&) , option(?), e.g &.$0.len, `len` field in the pointer's first element
134+
//
135+
// Set one argument in a function to be specific constant
136+
func test_add[$0] = 128
137+
// One argument must be the length of another one
138+
func test_arr[$1] = $len($0)
139+
// Or one field must be the length of another field
140+
func test_arr[$0][len] = $len([$0][name])
141+
// One argument must be in a certain range
142+
func test_arr[$1] = $range(0, $len($0))
143+
// Argument should be non-null
144+
func test_non_null[$0] = $non_null
145+
// Argument should be null
146+
func test_null[$0] = $null
147+
// Argument should be specific string
148+
func test_magic[$0] = "magic"
149+
// Argument should be a file and the file will be read
150+
func test_path[$0] = $read_file
151+
// Argument should use the return value of a specific function
152+
func test_use[$0] = $ret_from(test_create)
153+
// Argument should be cast from a specific type for a void pointer. The type should start with *mut or *const.
154+
func test_void[$0] = $cast_from(*mut u8)
155+
// The array is supposed to have a minimal length
156+
func test_void[$0][&] = $arr_len(256)
157+
// The array's length is formed by the factors
158+
func fread[$0][&] = $len_factors(1, $2)
159+
// Or
160+
func gzfread[$0][&] = $len_factors($1, $2)
161+
// Field in argument should be specific constant
162+
func test_field[$0][len] = 128
163+
// Deeper fields
164+
func test_field[$0][&.elements.$0] = 128
165+
166+
// One field `len` in a type must be the length of another field `p`
167+
type ArrayWrap[len] = $len(p)
168+
// One nested union `inner_union` in a type must be set to `member2`
169+
type ComplicatedStruct[inner_union] = $use(member2)
170+
// The type is opaque and is only used through an opaque pointer
171+
type Partial = $opaque
172+
// A type should be initialized with a specific function
173+
type Partial = $init_with(test_init, 0)
174+
175+
// ctx: set context for specific function
176+
// Add a context for function
177+
ctx test_use[$0] <- test_init
178+
// Add implicit context
179+
ctx test_use[*] <- test_init
180+
// Add an optional context that is preferred
181+
ctx test_use[$0] <- test_init ?
182+
// Add forbidden context
183+
ctx test_use[$0] <- ! test_init
184+
185+
// alias: alias types across different functions
186+
alias handleA <- useA($0),createA($ret),freeA($0)
187+
188+
// assert: adding specific assertions for calls
189+
assert test_one == 1
190+
assert test_non_zero != 0
191+
192+
```
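
When creating a rules file from the shell, quote the heredoc delimiter so that `$0`, `$1`, `$len` and similar tokens are written literally instead of being expanded. The sketch below copies a few rules from the examples above into a file; the file name `custom.rule` is arbitrary, and whether pattern and constraint rules may share one file is an assumption based on both being passed via `--custom-rules`.

```sh
# The quoted delimiter ('EOF') stops the shell from expanding $0, $1, $len, ...
cat > custom.rule <<'EOF'
func_target cJSON_parse
func_exclude cJSON_InitHook
func test_arr[$1] = $len($0)
func test_non_null[$0] = $non_null
type ArrayWrap[len] = $len(p)
EOF

# Start fuzzing only when Hopper and a compiled output directory exist.
if command -v hopper >/dev/null 2>&1 && [ -d output ]; then
    hopper fuzz output --custom-rules ./custom.rule
fi
```

Replace the `test_*` names with functions and types from your own library.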
193+
194+
### Seeds for bytes arguments
195+
If there is a `seeds` directory (set by `HOPPER_SEED_DIR`), Hopper will try to read the files inside it and use them as seeds for bytes arguments (e.g. `char*`). You can also provide seeds for a specific argument via its parameter name, e.g. name the subdirectory `@buf` for a parameter whose name is `buf`.
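
A possible layout, as a sketch: general seeds sit directly in the seed directory, and seeds for a parameter named `buf` go into an `@buf` subdirectory. The file names and JSON contents below are invented examples.

```sh
seed_dir=./output/seeds

# Seeds usable for any bytes argument.
mkdir -p "$seed_dir"
printf '%s' '{"key": [1, 2, 3]}' > "$seed_dir/generic.json"

# Seeds used only for parameters named `buf`.
mkdir -p "$seed_dir/@buf"
printf '%s' '{"name": "hopper"}' > "$seed_dir/@buf/named.json"

export HOPPER_SEED_DIR="$seed_dir"
```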
196+
197+
### Logging
198+
Hopper uses Rust's log crate to print log information. The default log level is `INFO`. To print all logging information (including `DEBUG` and `TRACE`), set the environment variable `LOG_TYPE` when running Hopper, e.g. `LOG_TYPE=trace ./hopper`.
199+
The detailed logging will be written at `output/fuzzer_r*.log` and `output/harness_r*.log`.
200+
201+
### Reproduce execution
202+
Hopper can reproduce the execution of programs stored in the output directory.
203+
204+
- `hopper-harness` can parse and explain the inputs using Hopper's runtime. It will print the internal states during execution in detail.
205+
```
206+
./bin/hopper-harness ./queue/id_000000
207+
```
208+
209+
- `hopper-translate` can translate the input to C source code. The C files can be a witness for reporting issues.
210+
```
211+
./bin/hopper-translate --input ./queue/id_000000 --header path-to/xx.h --output test.c
212+
# then compile it with specific library
213+
gcc -I/path-to-head -L/path-to-lib test.c -l:libcjson.so -o test
214+
```
215+
216+
- `hopper-generator` can replay input generation without executing the input. You can use it to analyse how the input was generated or mutated.
217+
```
218+
./bin/hopper-generator ./queue/id_000000
219+
```
220+
221+
- `hopper-sanitizer` can minimize and verify the crashes generated by Hopper. It excludes crashes that violate constraints and de-duplicates crashes according to call stacks.
222+
```
223+
./bin/hopper-sanitizer
224+
```
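
These tools can be combined in a loop, for example to translate every crash input into a C witness. The sketch below assumes it is started from the directory that contains the Hopper output directory, and that crash inputs are regular files inside `crashes`; `witness_path`, the `witnesses` directory, and the header path are hypothetical.

```sh
# Print the C file path for input $1 inside directory $2: <dir>/<input name>.c
witness_path() {
    echo "$2/$(basename "$1").c"
}

out_dir=output                # Hopper output directory
header=/path-to/cJSON.h       # placeholder: header used for `hopper compile`
witness_dir=./witnesses       # hypothetical directory for generated C files

if [ -d "$out_dir" ]; then
    (
        cd "$out_dir" || exit 1
        for input in ./crashes/*; do
            # Skip the unexpanded glob when there are no crash inputs.
            [ -f "$input" ] || continue
            mkdir -p "$witness_dir"
            ./bin/hopper-translate --input "$input" --header "$header" \
                --output "$(witness_path "$input" "$witness_dir")"
        done
    )
fi
```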
225+
226+
## Test
227+
### Test Rust code
228+
- Run all testcases
229+
```
230+
RUST_BACKTRACE=1 cargo test -- --nocapture
231+
```
232+
233+
### Testsuite (test libraries)
234+
- [How to run and write the testsuite](./testsuite/README.md)
235+
236+
## Evaluating results via source-based code coverage
237+
- Compile the libraries' source code with LLVM [source-based code coverage](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html) instrumentation. You should set the compiler flags, e.g.
238+
239+
```
240+
export CFLAGS="${CFLAGS:-} -fprofile-instr-generate -fcoverage-mapping -gline-tables-only -g"
241+
make
242+
```
243+
244+
- Compile the libraries with `cov` instrumentation mode. e.g.
245+
```
246+
hopper compile --instrument cov --header ./cJSON.h --library ./libcjson_cov.so --output output_cov
247+
```
248+
249+
- Run the interpreter with all generated seed inputs (SEED_DIR).
250+
```
251+
# run hopper and use llvm-cov to compute the coverage.
252+
SEED_DIR=./output/queue hopper cov output_cov
253+
```
254+
255+
## Contributing guidelines
256+
We have listed some tasks in the [Roadmap](https://github.com/FuzzAnything/hopper/discussions/2).
257+
If you are interested, please feel free to discuss with us and contribute your code.
258+
259+
### Coding
260+
- *Zero* `cargo check` warnings
261+
- *Zero* `cargo clippy` warnings
262+
- *Zero* `FAILED` in `cargo test`
263+
- *Try* to write tests for your code
264+
265+
### Profiling
266+
- [Profiling Rust Applications](https://gist.github.com/KodrAus/97c92c07a90b1fdd6853654357fd557a)
267+
- [Inferno](https://github.com/jonhoo/inferno)
268+
269+
```bash
270+
perf record --call-graph=dwarf ./bin/hopper-fuzzer
271+
# use flamegraph directly
272+
perf script | stackcollapse-perf.pl | rust-unmangle | flamegraph.pl > flame.svg
273+
# use inferno
274+
perf script | inferno-collapse-perf | inferno-flamegraph > flamegraph.svg
275+
```
276+
277+
perf produces a huge amount of intermediate data for analysis, so *do not* run the fuzzer for more than 2 minutes.
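
One way to enforce the two-minute limit is to wrap the recording in coreutils `timeout`. This is a sketch, assuming `perf` and `timeout` are installed and that the command is run from the Hopper output directory.

```sh
max_seconds=120               # two minutes, per the note above
fuzzer=./bin/hopper-fuzzer

if command -v perf >/dev/null 2>&1 && command -v timeout >/dev/null 2>&1 \
    && [ -x "$fuzzer" ]; then
    # SIGINT lets perf stop recording and write perf.data before exiting.
    timeout -s INT "$max_seconds" perf record --call-graph=dwarf "$fuzzer"
fi
```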
