When I mentioned fuzzing a data structure, I was asked some questions like "how does one actually do that?", and "how does fuzzing work?"
This article intends to bring a very brief introduction to explain what is going on and how to use it. Making sophisticated use of it, is up to you, once you have started, but after having read this you should be able to make your first fuzz test.
TL;DR;
Fuzzing is to programmatically exercise your code under test through rules that are controlled by guided randomness. You write the rules with your fuzz test, and the fuzzing library provides the guided randomness.
Basics
In this example, I'm using libFuzzer from LLVM. The entry point for your fuzz test is this function:
1 2 | extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) |
Example API
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | struct size_error : std::out_of_range { using out_of_range::out_of_range; }; template <typename T, size_t Capacity> requires (std::is_nothrow_move_constructible_v<T> && std::is_nothrow_move_assignable_v<T>) class fixed_stack { public: T& push(T t); T& back(); T pop(); [[nodiscard]] bool empty() const; [[nodiscard]] size_t size() const; [[nodiscard]] static size_t capacity(); }; |
Unit tests
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 | struct failure : std::string { using std::string::string; }; #define REQUIRE(...) if (__VA_ARGS__) {;} else throw failure(#__VA_ARGS__) #define FAIL(...) throw failure(__VA_ARGS__) int main() { unsigned fail_count = 0; struct test { const char* name; std::function<void()> f; }; test tests[] { { "default constructed stack is empty", []{ fixed_stack<int, 8> s; REQUIRE(s.size() == 0); REQUIRE(s.empty()); }}, { "Each push grows size by one", [] { fixed_stack<int, 8> s; s.push(3); REQUIRE(s.size() == 1); s.push(2); REQUIRE(s.size() == 2); s.push(8); REQUIRE(s.size() == 3); } }, { "Pop returns the pushed elements in reverse order", []{ fixed_stack<int, 8> s; s.push(3); s.push(2); s.push(8); REQUIRE(s.pop() == 8); REQUIRE(s.pop() == 2); REQUIRE(s.pop() == 3); } }, { "pop on empty throws", []{ fixed_stack<int, 8> s; s.push(3); s.pop(); try { s.pop(); FAIL("didn't throw"); } catch (const size_error&) { // good! } }} }; for (auto& t : tests) { try { std::cout << std::setw(60) << std::left << t.name << "\t"; t.f(); std::cout << "PASS!"; } catch (const failure& f) { std::cout << "FAILED!\nError: " << f << '\n'; ++fail_count; } catch (...) { std::cout << "FAILED!\nUnknown reason!"; ++fail_count; } std::cout << '\n'; } } |
default constructed stack is empty PASS!
Each push grows size by one PASS!
Pop returns the pushed elements in reverse order PASS!
pop on empty throws PASS!
Example implementation
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | template <typename T, size_t Capacity> requires (std::is_nothrow_move_constructible_v<T> && std::is_nothrow_move_assignable_v<T>) class fixed_stack { public: T& push(T t) { if (size() == capacity()) throw size_error("push on full stack"); return data_[++size_] = std::move(t); } T& back() { if (empty()) throw size_error("back on empty stack"); return data_[size_]; } T pop() { if (empty()) throw size_error("pop on empty stack"); return std::move(data_[size_--]); } [[nodiscard]] bool empty() const { return size() == 0; } [[nodiscard]] size_t size() const { return size_; } [[nodiscard]] static size_t capacity() { return Capacity; } private: size_t size_ = 0; std::array<T, Capacity> data_{}; }; |
Let's write a fuzz test and see if it can find the problems.
Fuzz tests
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 | extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) { fixed_stack<int, 8> stack; for (size_t i = 0; i != size; ++i) { try { switch (data[i]) { case 0: stack.push(0); break; case 1: stack.back(); break; case 2: stack.pop(); break; case 3: (void)stack.empty(); break; case 4: (void)stack.size(); break; case 5: (void)stack.capacity(); break; } } catch (const size_error&) { // OK } } return 0; } |
This is super naïve, but when compiled with sanitizers, it may still provide useful information about things like memory access errors or undefined behavior.
When possible, I usually try to mimic the behavior of my API through another implementation, so that I can compare the actual results with expected results. For things like data structures, I find that populating them with std::unique_ptr<> is a good way to catch many errors (the sanitizers are very good at picking them up).
I also try to populate my data structure with data from the input, so that I can distinguish between values. For this a small helper is useful:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | struct exhausted {}; struct source { std::span<const uint8_t> input; template <typename T> requires (std::is_trivial_v<T>) T get() { constexpr auto data_size = sizeof(T); if (input.size() < data_size) throw exhausted{}; alignas (T) uint8_t buff[data_size]; std::copy_n(input.begin(), data_size, buff); input = input.subspan(data_size); return std::bit_cast<T>(buff); } }; |
A source instance is created from the random data, and you can get any trivial type from it, as long as there's sufficient data remaining.
A fuzz test can then look like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 | extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) { source s{{data, size}}; std::vector<int> comparison; std::optional<fixed_stack<std::unique_ptr<int>, 8>> stack; try { for (;;) { if (!stack.has_value()) { stack.emplace(); } const auto action = s.get<uint8_t>(); switch (action) { case 0: // push { const int v = s.get<int>(); const auto size = stack->size(); try { stack->push(std::make_unique<int>(v)); comparison.push_back(v); assert(stack->size() == comparison.size()); assert(stack->back() != nullptr); assert(*stack->back() == v); } catch (size_error) { assert(size == stack->capacity()); assert(stack->size() == size); } } break; case 1: // pop { if (!stack->empty()) { auto v = stack->pop(); assert(v); assert(*v == comparison.back()); comparison.pop_back(); } } break; case 2: // destroy { stack.reset(); comparison = {}; } } } } catch (exhausted) { /* done */ } return 0; // Values other than 0 and -1 are reserved for future use. } |
Here a std::optional<> is used to hold the stack, because that makes it very easy to test destruction.
I've chosen std::vector<> as the reference to compare with, and here I use my knowledge of when it should throw. I've chosen to not test empty(), size() and capacity() as separate operations, but they are all exercised in the tests.
A more sophisticated test should also include copy and move construction/assignment, but this.
Here comes the fuzz!
Building this with -fsanitize=address,undefined,fuzzer --coverage and running the program yields this output:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | INFO: Running with entropic power schedule (0xFF, 100). INFO: Seed: 2488563521 INFO: Loaded 1 modules (1392 inline 8-bit counters): 1392 [0x611ca56471e8, 0x611ca5647758), INFO: Loaded 1 PC tables (1392 PCs): 1392 [0x611ca5647758,0x611ca564ce58), INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes INFO: A corpus is not provided, starting from an empty corpus #2 INITED cov: 245 ft: 246 corp: 1/1b exec/s: 0 rss: 31Mb #8 NEW cov: 245 ft: 309 corp: 2/3b lim: 4 exec/s: 0 rss: 31Mb L: 2/2 MS: 1 InsertByte- ... #10496 NEW cov: 658 ft: 2904 corp: 57/815b lim: 38 exec/s: 0 rss: 37Mb L: 36/38 MS: 1 PersAutoDict- DE: "\001\000P \000\010\030\324"- #11605 NEW cov: 658 ft: 2921 corp: 58/862b lim: 48 exec/s: 0 rss: 38Mb L: 47/47 MS: 4 PersAutoDict-CopyPart-CrossOver-PersAutoDict- DE: "\001\000"-"\001\000"- MS: 2 CrossOver-InsertRepeatedBytes-; base unit: 3ff2147bfcadde9ab222b3e35e51e922693dbb7e 0x2e,0xd9,0xa,0xd9,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0x0,0xa, .\331\012\331\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\012 artifact_prefix='./'; Test unit written to ./crash-8234995591abace9eff323632db0a3b28800dbe5 Base64: LtkK2QAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAK |
So the fuzzer found a problem. Line 16 shows an important bit. The crash data. This file holds the byte sequence that caused the problem.
If you run the fuzzer with that file name as argument, it will run exactly that sequence. You can run it through a debugger. If I run it in a debugger and just run until it crashes, I get this result:
A memory access error in unique_ptr<>::operator= for when doing push.
Looking at the member data when in the push() member shows:
And the source code for push() says:
And here it is revealed, the logic for accessing the data_ member uses 1-based indexing, and the data_ array uses 0-based indexing, so when size_ was 7, it is incremented to 8 and accesses data past the end.
Fixing that error, to ensure 0-based indexing in push(), pop() and back() makes it pass this set of tests.
More!
This was a very simple example to get you going. More complex APIs can be considerably more difficult to fuzz.
For data-structure like types, I almost always try to fuzz them, because they're reasonably easy and it tends to pay off very well.
For APIs that in their turn communicates with other APIs that you need to have test doubles for, it can be very tricky indeed.
One last thing. As shown the fuzzer runs in one thread. Start your fuzzer with -fork=X, where X is the number of threads you want it to spawn if you want to parallelize and accelerate the finding of problems. In my experience, you probably want to start single threaded because the first found bug tends to come embarrassingly early, when it starts to look good, run over night with as many threads as you have CPU cores.