-
Notifications
You must be signed in to change notification settings - Fork 72
Improve project profile, adds more information and visual cleanup. #2162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
6b5f952
to
5a2fdaf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great going through the visuals. Give me a few days -- I'm looking to go through it on Sunday.
I'll go through them now. In principle they are nice additions. Started a trial run. |
Regarding the statefulness detector, I'm happy to complete the PR. I would just like some confirmation that there is interest. |
|
||
def get_fuzzer_corpus_size(project_name, datestr, fuzzer, introspector_report): | ||
"""Go through coverage reports to find the LLVMFuzzerTestOneInput function. The first hit count equals the number inputs found.""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would be better to name it "LLVMFuzzerTestOneInput hit count" in the report or something like that instead of "corpus size/entries"? It there was a tooltip showing what it is it would be nice too I think. It's just that I'd interpret "corpus size/entries" differently if I saw it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you interpret "corpus size/entries"? This count is the number of inputs that are used for coverage measurement after the fuzzer completes, so the final number of inputs the fuzzers ends up with, which to me is the same as the number of corpus entries.
I agree that "LLVMFuzzerTestOneInput hit count" is closer to what is measured but I find it also more confusing. Maybe just "Hit Count" and have a description? While, I still like corpus size, I'm happy to change it if we can come to some decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the final number of inputs the fuzzers ends up with, which to me is the same as the number of corpus entries
I think that happens when fuzz targets are fine. When they start to OOM/timeout the number of inputs go down while the corpus size that can be measured by downloading the corpus stays the same (at first at least). I think "the number of inputs processed by the fuzz target" (or something like that) would be more clear. And when it goes down it can be combined with the data from google/oss-fuzz#13103 for example to make it easier to figure out where those drops can come from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, now I see what you mean. However, I don't think that the downloadable corpus count will usually match this number, it is de-duplicated, while this number is the number of inputs the fuzzer deemed interesting. So they are seed corpus vs corpus.
I feel like "the number of inputs processed by the fuzz target" is also confusing, as this can be understood as the number of executions by the fuzzer during normal execution and not during coverage.
What do you think of "Executed Coverage Inputs" or just "Coverage Inputs"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like "the number of inputs processed by the fuzz target" is also confusing
Agreed.
What do you think of "Executed Coverage Inputs" or just "Coverage Inputs"?
It sounds good to me.
FWIW "corpus_size" is shown by OSS-Fuzz elsewhere:
so "coverage inputs" would maybe make it easier to tell those things apart in different places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the name and added a description.
fuzzer_program_coverage_urls = get_introspector_report_url_fuzzer_coverage_urls( | ||
project_name, datestr, metadata_files["coverage"]) | ||
|
||
for url in fuzzer_program_coverage_urls: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we should be able to avoid this loop, do we not have the data already available at this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm probably missing something but are you thinking that this is to get the normal per-target coverage report (https://storage.googleapis.com/oss-fuzz-coverage/sudoers/reports-by-target/20250603/fuzz_iolog_legacy/linux/report.html)? Because this function is getting the meta coverage file(s) (https://storage.googleapis.com/oss-fuzz-introspector/sudoers/inspector-report/20250603/fuzz_iolog_json.covreport). While I can find mentions of "covreport", those seem to be during the analysis stage but I don't see it used or available as part of the html report generation.
I'm using the covreport file to easily get the execution count for LLVMFuzzOneInput, which admittedly is a bit hacky but should be much faster than getting the correct file/line and parsing the per-target coverage report again, which would be needed as per line counts are not available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On another note, do you know what the result of this code is for Non C/CPP projects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point, this file list seems to be empty for all other languages. I'll add a check to only do this for C/CPP projects in any case, I would expect the format to be different if they were ever created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, let me just go through this and see if we can get the number from some of our existing data. Otherwise this lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the check for the language, I think only checking for "c-cpp" should be fine at this point but I'm not completely sure.
Do not display recent_data if there is none, this caused misleading tables and empty plots for non C/C++ projects. Collect and display corpus size. Visual cleanup adds more space for table and scroll box for plots. Add quick links to quickly access plots from per target table. Add links for day to day degradation. Signed-off-by: phi-go <[email protected]>
@DavidKorczynski sorry to hijack in a PR but to would it be possible for someone review these two PRs for OSS-Fuzz as they are blockers for #2054:
This PR focuses a few related things to improve the project profile view.
Commit info:
Do not display recent_data if there is no real data, this caused misleading tables and empty plots for non C/C++ projects.
Collect and display corpus size, this addresses a point in: #2054
Visual cleanup adds more space for table and scroll box for plots.
Add quick links to quickly access plots from per target table.
Add links for day to day degradation, this adresses: #2155
Update to table:

Update to per target plots:
