In this post, I would like to introduce bugpoint, a command-line tool for automatic test case reduction that helps us generate minimal test cases.
As a compiler developer, the first step in debugging is to create a minimal test case that still reproduces the bug. Unfortunately, preprocessed C++ source code usually contains more than 10,000 lines, while an understandable test case should have fewer than 100 lines. To be honest, reducing it by hand is a boring task that I don't like to do. Fortunately, bugpoint automates exactly this job.
Convert to LLVM Assembly
Bugpoint is a reduction tool for LLVM assembly. In other words, it takes LLVM assembly as input and generates LLVM bitcode as output. Thus, we first have to convert C/C++ programs to LLVM assembly with clang.
The easiest way is to replace -emit-obj with -emit-llvm in the cc1 invocation command. For example,
$ clang -cc1 -emit-llvm input.cpp # ... other options ...
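If you do not have the cc1 command at hand, running the clang driver with -### prints the commands it would run without executing them. The substitution can then be scripted. This is only a minimal sketch: the helper name emit_llvm_cmd is hypothetical, and it assumes the cc1 command line arrives on standard input.

```shell
# Hypothetical helper: read a cc1 command line on stdin and rewrite it so
# that it emits LLVM assembly instead of an object file.
emit_llvm_cmd() {
  # Match -emit-obj only as a whole argument, either bare or double-quoted,
  # so that other options starting with the same prefix are left alone.
  sed -E 's/(^|[[:space:]"])-emit-obj([[:space:]"]|$)/\1-emit-llvm\2/'
}
```

A possible use (the driver prints to stderr, hence the redirection):
$ clang -### -c input.cpp 2>&1 | grep -e '-cc1' | emit_llvm_cmd
The -o argument in the printed command will still name an object file, so you will probably want to adjust it by hand.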
If clang crashes in this step, then you are probably facing a front-end bug. You may wish to use C-Reduce or other more general tools that work directly on C/C++ source code.
If the output input.ll is generated without any problems, then we can continue with the llc command (which generates either a machine assembly file or a relocatable object file):
$ llc input.ll # ... other options ... (e.g. -O3 -mtriple=...)
The llc command should crash in this step. If it does not crash, then try adding some common optimization flags such as -O3 to the command line.
Reduce the Test Case
Now, we can reduce the test case with the bugpoint command. Since I am cross-compiling the source code in this case, I am using -llc-safe to test the compiler without the interpreter. In addition, the arguments to be passed to llc can be specified with the -safe-tool-args option.
$ bugpoint input.ll -llc-safe -safe-tool-args -mtriple=armv7-linux-gnueabi
If everything goes well, then bugpoint-reduced-simplified.bc will be created. You can disassemble the output file with:
$ llvm-dis bugpoint-reduced-simplified.bc
The output bugpoint-reduced-simplified.ll is the reduced test case.
Reduce the Test Case with Custom Compile Script
You may wish to customize the compiler pipeline to reproduce the bug. To do so, use the -compile-custom option instead and specify the test script with -compile-command. For example,
$ bugpoint input.ll -compile-custom -compile-command ./test.sh
Here's the test script:
#!/bin/bash
# Create a temporary file for the test command
logfile="$(mktemp)"
# Run your test command (and redirect the output messages)
llc "$@" > "${logfile}" 2>&1
ret="$?"
# Print messages when error occurs
if [ "${ret}" != 0 ]; then
echo "test failed" # must print something on failure
cat "${logfile}"
fi
# Cleanup the temporary file
rm "${logfile}"
exit "${ret}"
Note
The test script MUST print some message when the command fails, and it should not print any message when the command succeeds. Otherwise, the bugpoint command won't work properly.
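A plain exit-status check can be too coarse: during reduction, the input may start failing for an unrelated reason, and bugpoint would then chase a different bug. Below is a minimal sketch of a stricter check, assuming the original crash prints a recognizable message. The helper name check_for_message and the pattern you pass to it are hypothetical; it keeps the rule above (print on failure, stay silent otherwise).

```shell
# Hypothetical helper: run a command and report failure only when its
# output contains the expected message.
# Usage: check_for_message PATTERN COMMAND [ARGS...]
# The body is in parentheses, so it runs in a subshell and its variables
# do not leak into the caller.
check_for_message() (
  pattern="$1"
  shift
  # Capture both stdout and stderr of the command under test
  log="$("$@" 2>&1)"
  if printf '%s\n' "$log" | grep -q -e "$pattern"; then
    # Interesting failure: print something and return non-zero
    echo "test failed: output contains '$pattern'"
    exit 1
  fi
  # Either success or an unrelated failure: stay silent and return zero
  exit 0
)
```

In test.sh, a call such as check_for_message 'Cannot select' llc "$@" could stand in for the llc invocation and the if block, with 'Cannot select' replaced by whatever message your crash actually prints.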
Strip the Symbols
Sometimes there will be several long symbol names and dead function declarations left in the LLVM bitcode. We can strip them with:
$ opt -S -strip -strip-dead-prototypes \
bugpoint-reduced-simplified.ll > strip.ll
Conclusion
After these steps, we should be able to obtain a minimal test case which is suitable for debugging. We can find the exact pass causing the problem with:
$ opt -print-before-all -print-after-all -O2 strip.ll > debug.txt 2>&1
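The resulting debug.txt can be long. If the tool crashes in the middle of a pass, the last dump banner in the log usually names the pass that was running. Here is a small sketch for pulling that banner out, assuming the banners contain the text "*** IR Dump" (the exact banner format may differ between LLVM versions); the helper name last_ir_dump_banner is hypothetical.

```shell
# Hypothetical helper: print the last "IR Dump" banner found in a log file
# produced with -print-before-all / -print-after-all.
# Usage: last_ir_dump_banner LOGFILE
last_ir_dump_banner() {
  # -F treats the pattern as a fixed string, so the asterisks are literal
  grep -F -e '*** IR Dump' "$1" | tail -n 1
}
```

For example:
$ last_ir_dump_banner debug.txt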
In this post, I have introduced the basic usage of bugpoint to reduce test cases for code generation bugs. With bugpoint automating the reduction process, we as creative programmers can focus on more challenging tasks. For further information, please refer to How to Submit a LLVM Bug.