Introduce the llvm-cfi-verify tool.

Summary: Introduces the llvm-cfi-verify tool to llvm. Includes the design document (docs/CFIVerify.rst). Current implementation of the tool is simply a disassembler that identifies and prints the indirect control flow instructions.

Reviewers: vlad.tsyrklevich

Reviewed By: vlad.tsyrklevich

Patch by Mitch Phillips

Subscribers: llvm-commits, kcc, pcc, mgorny

Differential Revision: https://reviews.llvm.org/D37937

llvm-svn: 313688
Vlad Tsyrklevich 2017-09-19 22:33:09 +00:00
parent 399a4e9b0b
commit 564060193f
5 changed files with 366 additions and 0 deletions

llvm/docs/CFIVerify.rst

@ -0,0 +1,88 @@
==============================================
Control Flow Verification Tool Design Document
==============================================
Objective
=========
This document provides an overview of an external tool to verify the protection
mechanisms implemented by Clang's *Control Flow Integrity* (CFI) schemes
(``-fsanitize=cfi``). This tool, provided a binary or DSO, should infer whether
indirect control flow operations are protected by CFI, and should output these
results in a human-readable form.
This tool should also be added as part of Clang's continuous integration testing
framework, so that modifications to the compiler can be checked to ensure that
CFI protection schemes are still present in the final binary.
Location
========
This tool will be present as a part of the LLVM toolchain, and will reside in
the "/llvm/tools/llvm-cfi-verify" directory, relative to the LLVM trunk. It will
be tested in two ways:

- Unit tests to validate code sections, present in
  "/llvm/unittests/llvm-cfi-verify".
- Integration tests, present in "/llvm/tools/clang/test/LLVMCFIVerify". These
  integration tests live in clang as part of a continuous integration
  framework, ensuring that compiler updates which reduce CFI coverage on
  indirect control flow instructions are identified.
Background
==========
This tool will continuously validate that CFI directives are properly
implemented around all indirect control flows by analysing the output machine
code. The analysis of machine code is important as it ensures that any bugs
present in linker or compiler do not subvert CFI protections in the final
shipped binary.
Unprotected indirect control flow instructions will be flagged for manual
review. These unexpected control flows may simply not have been accounted for in
the compiler implementation of CFI (e.g. indirect jumps to facilitate switch
statements may not be fully protected).
It may be possible in the future to extend this tool to flag unnecessary CFI
directives (e.g. CFI directives around a static call to a non-polymorphic base
type). This type of directive has no security implications, but may carry a
performance cost.
Design Ideas
============
This tool will disassemble binaries and DSOs from their machine code format and
analyse the disassembled machine code. The tool will inspect virtual calls and
indirect function calls. This tool will also inspect indirect jumps, as inlined
functions and jump tables should also be subject to CFI protections. Non-virtual
calls (``-fsanitize=cfi-nvcall``) and cast checks (``-fsanitize=cfi-*cast*``)
are not implemented because the machine code does not carry enough information
to identify them.
The tool would operate by searching for indirect control flow instructions in
the disassembly. A control flow graph would be generated from a small buffer of
the instructions surrounding the 'target' control flow instruction. If the
target instruction is branched-to, the fallthrough of the branch should be the
CFI trap (on x86, this is a ``ud2`` instruction). If the target instruction is
the fallthrough (i.e. immediately succeeds) of a conditional jump, the
conditional jump target should be the CFI trap. If an indirect control flow
instruction does not conform to one of these formats, the target will be noted
as being CFI-unprotected.
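The two protected shapes described above can be illustrated with a minimal
sketch. It is not the tool's implementation: ``ToyInst``, ``Kind`` and
``isCFIProtected`` are hypothetical names, instructions are modelled as indices
into a flat vector rather than a real control flow graph, and "fallthrough" is
assumed to mean the instruction immediately after the conditional jump.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy instruction model; not the tool's real data structures.
enum class Kind { Other, CondJump, Trap, IndirectCall };

struct ToyInst {
  Kind K;
  std::size_t JumpTarget; // Branch destination index; only used by CondJump.
};

// Returns true if the indirect control flow instruction at index Target
// matches one of the two protected shapes.
bool isCFIProtected(const std::vector<ToyInst> &Code, std::size_t Target) {
  assert(Target < Code.size() && "Target index out of range");
  if (Code[Target].K != Kind::IndirectCall)
    return false;
  for (std::size_t I = 0; I < Code.size(); ++I) {
    if (Code[I].K != Kind::CondJump)
      continue;
    const std::size_t Fallthrough = I + 1;
    const std::size_t Taken = Code[I].JumpTarget;
    // Shape 1: the target is branched to, and the fallthrough is the trap.
    if (Taken == Target && Fallthrough < Code.size() &&
        Code[Fallthrough].K == Kind::Trap)
      return true;
    // Shape 2: the target is the fallthrough, and the branch goes to the trap.
    if (Fallthrough == Target && Taken < Code.size() &&
        Code[Taken].K == Kind::Trap)
      return true;
  }
  return false;
}
```

An indirect call that is neither the destination nor the fallthrough of such a
conditional jump makes the function return false, which corresponds to the
"CFI-unprotected" outcome.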
Note that in the second case outlined above (where the target instruction is the
fallthrough of a conditional jump), if the target represents a vcall that takes
arguments, these arguments may be pushed to the stack after the branch but
before the target instruction. In these cases, a secondary 'spill graph' is
constructed, to ensure the register argument used by the indirect jump/call is
not spilled from the stack at any point in the interim period. If there are no
spills that affect the target register, the target is marked as CFI-protected.
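As a rough illustration of the spill check, the sketch below scans the
instructions between the conditional jump and the target. It rests on
assumptions: ``InterimInst`` and ``registerSurvives`` are hypothetical names,
each instruction is assumed to write at most one register, and a "spill" is
modelled as a reload of that register from the stack. A real spill graph would
also have to follow control flow, which this linear scan does not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model of an instruction in the interim period.
struct InterimInst {
  int DefReg;          // Register written by the instruction, or -1 if none.
  bool LoadsFromStack; // True if the written value is reloaded from the stack.
};

// Returns true if no instruction in [Begin, End) reloads TargetReg from the
// stack, i.e. the value that was checked is the value that is used.
bool registerSurvives(const std::vector<InterimInst> &Code, std::size_t Begin,
                      std::size_t End, int TargetReg) {
  assert(Begin <= End && End <= Code.size() && "Invalid instruction window");
  for (std::size_t I = Begin; I < End; ++I)
    if (Code[I].LoadsFromStack && Code[I].DefReg == TargetReg)
      return false;
  return true;
}
```

In this model, the target would be marked as CFI-protected only when
``registerSurvives`` returns true for the register used by the indirect
jump/call.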
Other Design Notes
~~~~~~~~~~~~~~~~~~
Only machine code sections that are marked as executable will be subject to this
analysis. Non-executable sections do not require analysis as any execution
present in these sections has already violated the control flow integrity.
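For ELF inputs, one way to express this filter is to test each section's
``SHF_EXECINSTR`` flag (value ``0x4`` in the ELF specification). The sketch
below is an assumption-laden illustration: ``ToySection``, ``kShfExecInstr``
and ``executableSections`` are hypothetical names, and the tool itself may
identify executable sections differently.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// ELF section flag marking a section that holds executable instructions.
constexpr std::uint64_t kShfExecInstr = 0x4;

// Hypothetical, simplified view of a section header.
struct ToySection {
  std::string Name;
  std::uint64_t Flags;
};

// Returns the names of the sections that would be subject to analysis.
std::vector<std::string>
executableSections(const std::vector<ToySection> &Sections) {
  std::vector<std::string> Result;
  for (const ToySection &S : Sections)
    if (S.Flags & kShfExecInstr)
      Result.push_back(S.Name);
  return Result;
}
```

With typical flag values, a ``.text`` section (``0x6``) would be kept while
``.data`` (``0x3``) and ``.rodata`` (``0x2``) would be skipped.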
Suitable extensions may be made at a later date to include analysis for indirect
control flow operations across DSO boundaries. Currently, these CFI features are
only experimental with an unstable ABI, making them unsuitable for analysis.


@ -25,6 +25,7 @@ subdirectories =
llvm-as llvm-as
llvm-bcanalyzer llvm-bcanalyzer
llvm-cat llvm-cat
llvm-cfi-verify
llvm-cov llvm-cov
llvm-cvtres llvm-cvtres
llvm-diff llvm-diff


@ -0,0 +1,14 @@
set(LLVM_LINK_COMPONENTS
AllTargetsAsmPrinters
AllTargetsAsmParsers
AllTargetsDescs
AllTargetsDisassemblers
AllTargetsInfos
MC
MCParser
Support
)
add_llvm_tool(llvm-cfi-verify
llvm-cfi-verify.cpp
)


@ -0,0 +1,22 @@
;===- ./tools/llvm-cfi-verify/LLVMBuild.txt --------------------*- Conf -*--===;
;
; The LLVM Compiler Infrastructure
;
; This file is distributed under the University of Illinois Open Source
; License. See LICENSE.TXT for details.
;
;===------------------------------------------------------------------------===;
;
; This is an LLVMBuild description file for the components in this subdirectory.
;
; For more information on the LLVMBuild system, please see:
;
; http://llvm.org/docs/LLVMBuild.html
;
;===------------------------------------------------------------------------===;
[component_0]
type = Tool
name = llvm-cfi-verify
parent = Tools
required_libraries = MC MCDisassembler MCParser Support all-targets


@ -0,0 +1,241 @@
//===-- llvm-cfi-verify.cpp - CFI Verification tool for LLVM --------------===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This tool verifies Control Flow Integrity (CFI) instrumentation by static
// binary analysis. See the design document in /docs/CFIVerify.rst for more
// information.
//
// This tool is currently incomplete. It currently only does disassembly for
// object files, and searches through the code for indirect control flow
// instructions, printing them once found.
//
//===----------------------------------------------------------------------===//
#include "llvm/MC/MCAsmInfo.h"
#include "llvm/MC/MCContext.h"
#include "llvm/MC/MCDisassembler/MCDisassembler.h"
#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCInstPrinter.h"
#include "llvm/MC/MCInstrAnalysis.h"
#include "llvm/MC/MCInstrDesc.h"
#include "llvm/MC/MCInstrInfo.h"
#include "llvm/MC/MCObjectFileInfo.h"
#include "llvm/MC/MCRegisterInfo.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/Object/Binary.h"
#include "llvm/Object/COFF.h"
#include "llvm/Object/ObjectFile.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/TargetRegistry.h"
#include "llvm/Support/TargetSelect.h"
#include "llvm/Support/raw_ostream.h"
#include <cassert>
#include <cstdlib>
using namespace llvm;
using namespace llvm::object;
cl::opt<bool> ArgDumpSymbols("sym", cl::desc("Dump the symbol table."));
cl::opt<std::string> InputFilename(cl::Positional, cl::desc("<input file>"),
cl::Required);
static void printSymbols(const ObjectFile *Object) {
for (const SymbolRef &Symbol : Object->symbols()) {
outs() << "Symbol [" << format_hex_no_prefix(Symbol.getValue(), 2)
<< "] = ";
auto SymbolName = Symbol.getName();
if (SymbolName)
outs() << *SymbolName;
else
outs() << "UNKNOWN";
if (Symbol.getFlags() & SymbolRef::SF_Hidden)
outs() << " .hidden";
outs() << " (Section = ";
auto SymbolSection = Symbol.getSection();
if (SymbolSection) {
StringRef SymbolSectionName;
if ((*SymbolSection)->getName(SymbolSectionName))
outs() << "UNKNOWN)";
else
outs() << SymbolSectionName << ")";
} else {
outs() << "N/A)";
}
outs() << "\n";
}
}
int main(int argc, char **argv) {
cl::ParseCommandLineOptions(argc, argv);
InitializeAllTargetInfos();
InitializeAllTargetMCs();
InitializeAllAsmParsers();
InitializeAllDisassemblers();
Expected<OwningBinary<Binary>> BinaryOrErr = createBinary(InputFilename);
if (!BinaryOrErr) {
errs() << "Failed to open file.\n";
return EXIT_FAILURE;
}
Binary &Binary = *BinaryOrErr.get().getBinary();
ObjectFile *Object = dyn_cast<ObjectFile>(&Binary);
if (!Object) {
errs() << "Disassembling of non-objects not currently supported.\n";
return EXIT_FAILURE;
}
Triple TheTriple = Object->makeTriple();
std::string TripleName = TheTriple.getTriple();
std::string ArchName = "";
std::string ErrorString;
const Target *TheTarget =
TargetRegistry::lookupTarget(ArchName, TheTriple, ErrorString);
if (!TheTarget) {
errs() << "Couldn't find target \"" << TheTriple.getTriple()
<< "\", failed with error: " << ErrorString << ".\n";
return EXIT_FAILURE;
}
SubtargetFeatures Features = Object->getFeatures();
std::unique_ptr<const MCRegisterInfo> RegisterInfo(
TheTarget->createMCRegInfo(TripleName));
if (!RegisterInfo) {
errs() << "Failed to initialise RegisterInfo.\n";
return EXIT_FAILURE;
}
std::unique_ptr<const MCAsmInfo> AsmInfo(
TheTarget->createMCAsmInfo(*RegisterInfo, TripleName));
if (!AsmInfo) {
errs() << "Failed to initialise AsmInfo.\n";
return EXIT_FAILURE;
}
std::string MCPU = "";
std::unique_ptr<MCSubtargetInfo> SubtargetInfo(
TheTarget->createMCSubtargetInfo(TripleName, MCPU, Features.getString()));
if (!SubtargetInfo) {
errs() << "Failed to initialise SubtargetInfo.\n";
return EXIT_FAILURE;
}
std::unique_ptr<const MCInstrInfo> MII(TheTarget->createMCInstrInfo());
if (!MII) {
errs() << "Failed to initialise MII.\n";
return EXIT_FAILURE;
}
MCObjectFileInfo MOFI;
MCContext Context(AsmInfo.get(), RegisterInfo.get(), &MOFI);
std::unique_ptr<const MCDisassembler> Disassembler(
TheTarget->createMCDisassembler(*SubtargetInfo, Context));
if (!Disassembler) {
errs() << "No disassembler available for target.\n";
return EXIT_FAILURE;
}
std::unique_ptr<const MCInstrAnalysis> MIA(
TheTarget->createMCInstrAnalysis(MII.get()));
std::unique_ptr<MCInstPrinter> Printer(
TheTarget->createMCInstPrinter(TheTriple, AsmInfo->getAssemblerDialect(),
*AsmInfo, *MII, *RegisterInfo));
if (ArgDumpSymbols)
printSymbols(Object);
for (const SectionRef &Section : Object->sections()) {
outs() << "Section [" << format_hex_no_prefix(Section.getAddress(), 2)
<< "] = ";
StringRef SectionName;
if (Section.getName(SectionName))
outs() << "UNKNOWN.\n";
else
outs() << SectionName << "\n";
StringRef SectionContents;
if (Section.getContents(SectionContents)) {
errs() << "Failed to retrieve section contents.\n";
return EXIT_FAILURE;
}
MCInst Instruction;
size_t InstructionSize;
ArrayRef<uint8_t> SectionBytes((const uint8_t *)SectionContents.data(),
Section.getSize());
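for (size_t Byte = 0; Byte < Section.getSize();) {
bool BadInstruction = false;
// Disassemble the instruction.
if (Disassembler->getInstruction(
Instruction, InstructionSize, SectionBytes.drop_front(Byte), 0,
nulls(), outs()) != MCDisassembler::Success) {
BadInstruction = true;
}
// A failed decode may report a size of zero. Always advance by at least
// one byte so that the loop is guaranteed to terminate.
if (InstructionSize == 0)
InstructionSize = 1;
Byte += InstructionSize;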
if (BadInstruction)
continue;
// Skip instructions that do not affect the control flow.
const auto &InstrDesc = MII->get(Instruction.getOpcode());
if (!InstrDesc.mayAffectControlFlow(Instruction, *RegisterInfo))
continue;
// Skip instructions that do not operate on register operands.
bool UsesRegisterOperand = false;
for (const auto &Operand : Instruction) {
if (Operand.isReg())
UsesRegisterOperand = true;
}
if (!UsesRegisterOperand)
continue;
// Print the instruction address.
outs() << " "
<< format_hex(Section.getAddress() + Byte - InstructionSize, 2)
<< ": ";
// Print the instruction bytes.
for (size_t i = 0; i < InstructionSize; ++i) {
outs() << format_hex_no_prefix(SectionBytes[Byte - InstructionSize + i],
2)
<< " ";
}
// Print the instruction.
outs() << " | " << MII->getName(Instruction.getOpcode()) << " ";
Instruction.dump_pretty(outs(), Printer.get());
outs() << "\n";
}
}
return EXIT_SUCCESS;
}