Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update fuzzing rig, make component creation work #7028

Merged
merged 3 commits into from
Feb 8, 2023

Conversation

andyross
Copy link
Contributor

@andyross andyross commented Feb 1, 2023

Come back to the fuzzing rig, dust off my brain (been a rough January here for me, sorry), and update to hopefully finish off IPC coverage.

As mentioned earlier, TPLG_COMP_NEW commands weren't being covered (and if we can't create components, we aren't able to get coverage on anything else!) because of the way they work. The input wants to match a 128 bit UUID value in the command vs. a table of drivers, and that haystack is too deep for even fuzzing to find the needle. This plays a trick by using an otherwise-ignored bit of fuzz input to encode a "create a whiteboxed command" metacommand. When triggered, this adds Just Enough Correct Input to get the code through the driver selection stage of the IPC parser, but leaves the other random bits random. There were a bunch of ancestor variants here which were terrible hacks, but I'm actually kinda proud of how this finally turned out.

Anyway, now we get all the way to the individual component creation paths, essentially completing coverage of all the IPC3 input paths (IPC4 isn't default in the tree yet, but AFAICT the protocol bits the fuzzing rig touches are all identical and it "should" work fine, modulo fuzz-detected-bugs, when we cut over; it's on my list to figure out ahead of time).

Interestingly I didn't stumble on any new bugs doing this, which makes me a little worried. But I've manually traced out every creation path I can find and know they're all being hit.

We're also not hitting as much of the runtime behavior as I would have liked. The fuzzing rig is successfully creating components, but after just a handful it's out of memory and the resulting random collection of junk doesn't actually try to do anything. Maybe that's just asking too much of fuzzing as a technique; it's really only intended for input validation code.

The TPLG_COMP_NEW message type works by matching a 128 bit UUID to a
runtime list of drivers.  That's too much depth for even directed
fuzzing to find randomly.

So whitebox that command: leverage the fact that the first four bytes
of the message are being ignored anyway (that field is the message
length, which we get externally from the fuzzing rig and overwrite).
Treat the first byte as a magic number indicating the command type
(0xff in this case being "comp_new") and the second a parameter
interpreted as a driver index.  Then fix up the fields manually so it
gets through the logic in helper.c:get_drv() successfully and tries to
create a random component.

Signed-off-by: Andy Ross <[email protected]>
Transitive header order no longer declares struct sof for us here.  No
need to pull in a full header as we only use pointers to it.

Signed-off-by: Andy Ross <[email protected]>
@@ -0,0 +1,37 @@
#!/bin/bash
set -e
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like in most other scripts:

SOF_TOP=$(cd "$(dirname "$0")/.." && pwd)

# but can add more at the end of this script's command line to
# duplicate configurations as needed.

export ZEPHYR_BASE=$SOF_TOP/../zephyr
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't used ZEPHYR_BASE for a long time, are you sure it's required in this case? Can this script be run outside a west workspace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not needed in the default setup. But I have like a thousand zephyr trees floating around and do dumb stuff like symlinking module directories across them. If you have ZEPHYR_BASE set to the wrong tree, you get very funny results (like, the build works, but uses the wrong SOF tree!). So I guess I'd see this as a best practices kind of thing to ensure that the tree this script is running from is the one being built.

@@ -0,0 +1,37 @@
#!/bin/bash
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you generally disagree but asking anyway: can you please move this to a main "$@" function? Rationale in thesofproject/sof-test#740

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do, but this is your playground so no arguments here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andyross your playground too !

scripts/fuzz.sh Outdated
-DCONFIG_ZEPHYR_NATIVE_DRIVERS=y \
-DCONFIG_ARCH_POSIX_LIBFUZZER=y \
-DCONFIG_ARCH_POSIX_FUZZ_TICKS=100 \
-DCONFIG_ASAN=y "$@"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have all these in some kind of defconfig file? Overkill / too cumbersome?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defer to you guys on that? There's a thousand ways to do kconfig and this is the one that steps on the fewest toes. Another option would be to create a new kconfig like CONFIG_SOF_FUZZER that then selects this stuff.

(Also I'm noticing that heap one, which doesn't belong. I forget the details now, but the SOF "posix" platform creates a heap of the same size as the hardware platforms, but for some reason ends up building a zephyr without support for the "big" heap code and so gets runtime errors. It's just a kconfig mismatch somewhere I worked around and then forgot about.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i any comment on this re our Zephyr configs ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine here. An overlay file like sof/app/perf_overlay.conf (passed via -DOVERLAY_CONFIG= to west) would be perhaps be a good match, but as long as the assumption is the users all use this helper script anyways, I think the definitions are fine here.

Copy link
Collaborator

@marc-hb marc-hb Feb 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An overlay file like sof/app/perf_overlay.conf (passed via -DOVERLAY_CONFIG= to west) would be perhaps be a good match

@andyross can you add this sentence as a one-line comment? Thx!

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No real objections, seems good to me.

scripts/fuzz.sh Outdated
-DCONFIG_ZEPHYR_NATIVE_DRIVERS=y \
-DCONFIG_ARCH_POSIX_LIBFUZZER=y \
-DCONFIG_ARCH_POSIX_FUZZ_TICKS=100 \
-DCONFIG_ASAN=y "$@"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine here. An overlay file like sof/app/perf_overlay.conf (passed via -DOVERLAY_CONFIG= to west) would be perhaps be a good match, but as long as the assumption is the users all use this helper script anyways, I think the definitions are fine here.

@marc-hb
Copy link
Collaborator

marc-hb commented Feb 2, 2023

Some of the checkpatch warnings seem easy to fix: https://github.com/thesofproject/sof/actions/runs/4069355263/jobs/7008988232

@@ -24,6 +24,8 @@

#define PLATFORM_DEFAULT_DELAY 12

struct sof;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extern?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ok as a forward declaration.

Copy link
Collaborator

@lyakh lyakh Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i IIRC there is a weird (IMHO) C feature, that yes, you can use this in a header and the compiler will "guess what you actually mean" and only create one instance of the object, but I'd rather use an explicit extern the same way as we do everywhere, particularly since at least me personally I'm not sure which C standard we settled with and from which version that feature was introduced

Copy link
Collaborator

@marc-hb marc-hb Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a very good description of that portability issue at the end of the chapter "Declarations" in the Harbison & Steele (which every C developer should have IMHO). For maximum portability across various toolchains (which I think matters to SOF), they definitely agree with @lyakh .

Follow-up PR? Could also fix some other checkpatch warnings inhttps://github.com/thesofproject/sof/actions/runs/4118174248/jobs/7110383634

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

humm, the use of 'struct foo' is very very common in Linux headers. I'd be surprised if you saw any 'extern' in any of the ALSA/ASoC headers...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until recently the Linux kernel supported a single compiler. I doubt ALSA is super interested in C portability either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this is a type declaration, not an object definition. All it means is that "struct sof exists as a type and you can legally take a pointer to it". You see this construct in C++ a lot, because it avoids the need to pull in the very expensive headers that define types. The reason it was introduced here is that I had a spot further down where I declare a function that takes a "struct sof *" as an argument, and was too lazy to figure out where that was declared.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this is a type declaration, not an object definition.

Thanks @andyross , I read too fast and stand corrected (and I think you meant: "not an object... declaration" :-)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this is a type declaration, not an object definition.

ouch... /me looking at those "computer glasses" and contemplates if he should wear them more frequently... Then hides away

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, of course it's a perfectly fine forward type declaration. And I myself frequently suggest it when just a pointer to that type is needed - exactly as @andyross explained, and I think that's usually preferred over including a header just for that. And that's also why it's used in Linux, in SOF, etc. I just somehow misread it as struct sof sof;...

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks @andyross !

@@ -24,6 +24,8 @@

#define PLATFORM_DEFAULT_DELAY 12

struct sof;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ok as a forward declaration.

Simple script to explain how fuzz testing works

Signed-off-by: Andy Ross <[email protected]>
@andyross
Copy link
Contributor Author

andyross commented Feb 7, 2023

Cosmetic fixups as requested. I have another series coming up pretty soon I think, but may ping you guys offline depending on results.

Copy link
Collaborator

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just gave this a quick run. It works for me except the totally cryptic, DTS (!) error message when missing clang and llvm. Most likely unrelated to this.

EDIT: indeed totally unrelated, I can reproduce with ZEPHYR_TOOLCHAIN_VARIANT=llvm west build -p -b native_posix samples/hello_world/. Tentative fix in


cmd[i + 4] = fuzz_in[i + 1];
struct sof_ipc_comp *cc = global_ipc->comp_data;
Copy link
Collaborator

@marc-hb marc-hb Feb 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/platform/posix/ipc.c:90:30: warning: unused variable ‘cc’ [-Wunused-variable]
   90 |         struct sof_ipc_comp *cc = global_ipc->comp_data;
      |                              ^~

... in some -DCONFIG... combination which I forgot, sorry.

@kv2019i kv2019i merged commit 3f5c0ba into thesofproject:main Feb 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants