hanchenye-llvm-project/lld/ELF/Symbols.h

451 lines
15 KiB
C
Raw Normal View History

//===- Symbols.h ------------------------------------------------*- C++ -*-===//
//
// The LLVM Linker
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
2015-10-14 03:51:57 +08:00
//
// All symbols are handled as SymbolBodies regardless of their types.
// This file defines various types of SymbolBodies.
//
//===----------------------------------------------------------------------===//
#ifndef LLD_ELF_SYMBOLS_H
#define LLD_ELF_SYMBOLS_H
#include "InputSection.h"
#include "Strings.h"
#include "lld/Common/LLVM.h"
#include "llvm/Object/Archive.h"
#include "llvm/Object/ELF.h"
namespace lld {
2016-02-28 08:25:54 +08:00
namespace elf {
class ArchiveFile;
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
class BitcodeFile;
class BssSection;
class InputFile;
class LazyObjFile;
template <class ELFT> class ObjFile;
class OutputSection;
template <class ELFT> class SharedFile;
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
struct Symbol;
// The base class for real symbol classes.
class SymbolBody {
public:
enum Kind {
DefinedFirst,
DefinedRegularKind = DefinedFirst,
SharedKind,
DefinedCommonKind,
DefinedLast = DefinedCommonKind,
UndefinedKind,
LazyArchiveKind,
LazyObjectKind,
};
SymbolBody(Kind K) : SymbolKind(K) {}
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
Symbol *symbol();
const Symbol *symbol() const {
return const_cast<SymbolBody *>(this)->symbol();
}
Kind kind() const { return static_cast<Kind>(SymbolKind); }
bool isUndefined() const { return SymbolKind == UndefinedKind; }
bool isDefined() const { return SymbolKind <= DefinedLast; }
bool isCommon() const { return SymbolKind == DefinedCommonKind; }
bool isShared() const { return SymbolKind == SharedKind; }
bool isLocal() const { return IsLocal; }
bool isLazy() const {
return SymbolKind == LazyArchiveKind || SymbolKind == LazyObjectKind;
}
bool isInCurrentOutput() const {
return SymbolKind == DefinedRegularKind || SymbolKind == DefinedCommonKind;
}
// True is this is an undefined weak symbol. This only works once
// all input files have been added.
bool isUndefWeak() const;
InputFile *getFile() const;
StringRef getName() const { return Name; }
uint8_t getVisibility() const { return StOther & 0x3; }
void parseSymbolVersion();
void copyFrom(SymbolBody *Other);
bool isInGot() const { return GotIndex != -1U; }
bool isInPlt() const { return PltIndex != -1U; }
uint64_t getVA(int64_t Addend = 0) const;
uint64_t getGotOffset() const;
uint64_t getGotVA() const;
uint64_t getGotPltOffset() const;
uint64_t getGotPltVA() const;
uint64_t getPltVA() const;
template <class ELFT> typename ELFT::uint getSize() const;
OutputSection *getOutputSection() const;
uint32_t DynsymIndex = 0;
uint32_t GotIndex = -1;
uint32_t GotPltIndex = -1;
uint32_t PltIndex = -1;
uint32_t GlobalDynIndex = -1;
protected:
SymbolBody(Kind K, StringRefZ Name, bool IsLocal, uint8_t StOther,
uint8_t Type)
: SymbolKind(K), IsLocal(IsLocal), NeedsPltAddr(false),
IsInGlobalMipsGot(false), Is32BitMipsGot(false), IsInIplt(false),
IsInIgot(false), IsPreemptible(false), Type(Type), StOther(StOther),
Name(Name) {}
const unsigned SymbolKind : 8;
2015-12-25 14:55:39 +08:00
// True if this is a local symbol.
unsigned IsLocal : 1;
public:
// True the symbol should point to its PLT entry.
// For SharedSymbol only.
unsigned NeedsPltAddr : 1;
// True if this symbol has an entry in the global part of MIPS GOT.
unsigned IsInGlobalMipsGot : 1;
// True if this symbol is referenced by 32-bit GOT relocations.
unsigned Is32BitMipsGot : 1;
// True if this symbol is in the Iplt sub-section of the Plt.
unsigned IsInIplt : 1;
// True if this symbol is in the Igot sub-section of the .got.plt or .got.
unsigned IsInIgot : 1;
unsigned IsPreemptible : 1;
2016-04-05 02:15:38 +08:00
// The following fields have the same meaning as the ELF symbol attributes.
uint8_t Type; // symbol type
uint8_t StOther; // st_other field value
2016-04-05 02:15:38 +08:00
// The Type field may also have this value. It means that we have not yet seen
// a non-Lazy symbol with this name, so we don't know what its type is. The
// Type field is normally set to this value for Lazy symbols unless we saw a
// weak undefined symbol first, in which case we need to remember the original
// symbol's type in order to check for TLS mismatches.
enum { UnknownType = 255 };
bool isSection() const { return Type == llvm::ELF::STT_SECTION; }
bool isTls() const { return Type == llvm::ELF::STT_TLS; }
bool isFunc() const { return Type == llvm::ELF::STT_FUNC; }
bool isGnuIFunc() const { return Type == llvm::ELF::STT_GNU_IFUNC; }
bool isObject() const { return Type == llvm::ELF::STT_OBJECT; }
bool isFile() const { return Type == llvm::ELF::STT_FILE; }
protected:
StringRefZ Name;
};
// The base class for any defined symbols.
class Defined : public SymbolBody {
public:
Defined(Kind K, StringRefZ Name, bool IsLocal, uint8_t StOther, uint8_t Type)
: SymbolBody(K, Name, IsLocal, StOther, Type) {}
static bool classof(const SymbolBody *S) { return S->isDefined(); }
};
class DefinedCommon : public Defined {
public:
DefinedCommon(StringRef Name, uint64_t Size, uint32_t Alignment,
uint8_t StOther, uint8_t Type)
: Defined(DefinedCommonKind, Name, /*IsLocal=*/false, StOther, Type),
Alignment(Alignment), Size(Size) {}
static bool classof(const SymbolBody *S) {
return S->kind() == DefinedCommonKind;
}
// The maximum alignment we have seen for this symbol.
uint32_t Alignment;
// The output offset of this common symbol in the output bss.
// Computed by the writer.
uint64_t Size;
BssSection *Section = nullptr;
};
// Regular defined symbols read from object file symbol tables.
class DefinedRegular : public Defined {
public:
DefinedRegular(StringRefZ Name, bool IsLocal, uint8_t StOther, uint8_t Type,
uint64_t Value, uint64_t Size, SectionBase *Section)
: Defined(DefinedRegularKind, Name, IsLocal, StOther, Type), Value(Value),
Size(Size), Section(Section) {}
// Return true if the symbol is a PIC function.
template <class ELFT> bool isMipsPIC() const;
static bool classof(const SymbolBody *S) {
return S->kind() == DefinedRegularKind;
}
uint64_t Value;
uint64_t Size;
SectionBase *Section;
};
class Undefined : public SymbolBody {
public:
Undefined(StringRefZ Name, bool IsLocal, uint8_t StOther, uint8_t Type)
: SymbolBody(UndefinedKind, Name, IsLocal, StOther, Type) {}
static bool classof(const SymbolBody *S) {
return S->kind() == UndefinedKind;
}
};
class SharedSymbol : public Defined {
public:
static bool classof(const SymbolBody *S) { return S->kind() == SharedKind; }
SharedSymbol(StringRef Name, uint8_t StOther, uint8_t Type,
const void *ElfSym, const void *Verdef)
: Defined(SharedKind, Name, /*IsLocal=*/false, StOther, Type),
Verdef(Verdef), ElfSym(ElfSym) {
2017-10-13 05:45:53 +08:00
// GNU ifunc is a mechanism to allow user-supplied functions to
// resolve PLT slot values at load-time. This is contrary to the
// regualr symbol resolution scheme in which symbols are resolved just
// by name. Using this hook, you can program how symbols are solved
// for you program. For example, you can make "memcpy" to be resolved
// to a SSE-enabled version of memcpy only when a machine running the
// program supports the SSE instruction set.
//
// Naturally, such symbols should always be called through their PLT
// slots. What GNU ifunc symbols point to are resolver functions, and
// calling them directly doesn't make sense (unless you are writing a
// loader).
//
// For DSO symbols, we always call them through PLT slots anyway.
// So there's no difference between GNU ifunc and regular function
2017-10-13 05:52:14 +08:00
// symbols if they are in DSOs. So we can handle GNU_IFUNC as FUNC.
2017-10-13 05:45:53 +08:00
if (this->Type == llvm::ELF::STT_GNU_IFUNC)
this->Type = llvm::ELF::STT_FUNC;
}
template <class ELFT> SharedFile<ELFT> *getFile() const {
return cast<SharedFile<ELFT>>(SymbolBody::getFile());
}
template <class ELFT> uint64_t getShndx() const {
return getSym<ELFT>().st_shndx;
}
template <class ELFT> uint64_t getValue() const {
return getSym<ELFT>().st_value;
}
template <class ELFT> uint64_t getSize() const {
return getSym<ELFT>().st_size;
}
template <class ELFT> uint32_t getAlignment() const;
// This field is a pointer to the symbol's version definition.
const void *Verdef;
// If not null, there is a copy relocation to this section.
InputSection *CopyRelSec = nullptr;
private:
template <class ELFT> const typename ELFT::Sym &getSym() const {
return *(const typename ELFT::Sym *)ElfSym;
}
const void *ElfSym;
};
// This represents a symbol that is not yet in the link, but we know where to
// find it if needed. If the resolver finds both Undefined and Lazy for the same
// name, it will ask the Lazy to load a file.
//
// A special complication is the handling of weak undefined symbols. They should
// not load a file, but we have to remember we have seen both the weak undefined
// and the lazy. We represent that with a lazy symbol with a weak binding. This
// means that code looking for undefined symbols normally also has to take lazy
// symbols into consideration.
class Lazy : public SymbolBody {
public:
static bool classof(const SymbolBody *S) { return S->isLazy(); }
// Returns an object file for this symbol, or a nullptr if the file
// was already returned.
InputFile *fetch();
protected:
Lazy(Kind K, StringRef Name, uint8_t Type)
: SymbolBody(K, Name, /*IsLocal=*/false, llvm::ELF::STV_DEFAULT, Type) {}
};
// This class represents a symbol defined in an archive file. It is
// created from an archive file header, and it knows how to load an
// object file from an archive to replace itself with a defined
// symbol.
class LazyArchive : public Lazy {
public:
LazyArchive(const llvm::object::Archive::Symbol S, uint8_t Type)
: Lazy(LazyArchiveKind, S.getName(), Type), Sym(S) {}
static bool classof(const SymbolBody *S) {
return S->kind() == LazyArchiveKind;
}
ArchiveFile *getFile();
InputFile *fetch();
private:
const llvm::object::Archive::Symbol Sym;
};
// LazyObject symbols represents symbols in object files between
// --start-lib and --end-lib options.
class LazyObject : public Lazy {
public:
LazyObject(StringRef Name, uint8_t Type) : Lazy(LazyObjectKind, Name, Type) {}
static bool classof(const SymbolBody *S) {
return S->kind() == LazyObjectKind;
}
LazyObjFile *getFile();
InputFile *fetch();
};
// Some linker-generated symbols need to be created as
// DefinedRegular symbols.
struct ElfSym {
// __bss_start
static DefinedRegular *Bss;
// etext and _etext
static DefinedRegular *Etext1;
static DefinedRegular *Etext2;
// edata and _edata
static DefinedRegular *Edata1;
static DefinedRegular *Edata2;
// end and _end
static DefinedRegular *End1;
static DefinedRegular *End2;
// The _GLOBAL_OFFSET_TABLE_ symbol is defined by target convention to
// be at some offset from the base of the .got section, usually 0 or
// the end of the .got.
static DefinedRegular *GlobalOffsetTable;
// _gp, _gp_disp and __gnu_local_gp symbols. Only for MIPS.
static DefinedRegular *MipsGp;
static DefinedRegular *MipsGpDisp;
static DefinedRegular *MipsLocalGp;
};
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
// A real symbol object, SymbolBody, is usually stored within a Symbol. There's
// always one Symbol for each symbol name. The resolver updates the SymbolBody
// stored in the Body field of this object as it resolves symbols. Symbol also
// holds computed properties of symbol names.
struct Symbol {
// Symbol binding. This is on the Symbol to track changes during resolution.
// In particular:
// An undefined weak is still weak when it resolves to a shared library.
// An undefined weak will not fetch archive members, but we have to remember
// it is weak.
uint8_t Binding;
// Version definition index.
uint16_t VersionId;
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
// Symbol visibility. This is the computed minimum visibility of all
// observed non-DSO symbols.
unsigned Visibility : 2;
// True if the symbol was used for linking and thus need to be added to the
// output file's symbol table. This is true for all symbols except for
// unreferenced DSO symbols and bitcode symbols that are unreferenced except
// by other bitcode objects.
unsigned IsUsedInRegularObj : 1;
// If this flag is true and the symbol has protected or default visibility, it
// will appear in .dynsym. This flag is set by interposable DSO symbols in
// executables, by most symbols in DSOs and executables built with
// --export-dynamic, and by dynamic lists.
unsigned ExportDynamic : 1;
// False if LTO shouldn't inline whatever this symbol points to. If a symbol
// is overwritten after LTO, LTO shouldn't inline the symbol because it
// doesn't know the final contents of the symbol.
unsigned CanInline : 1;
// True if this symbol is specified by --trace-symbol option.
unsigned Traced : 1;
// This symbol version was found in a version script.
unsigned InVersionScript : 1;
// The file from which this symbol was created.
InputFile *File = nullptr;
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
bool includeInDynsym() const;
uint8_t computeBinding() const;
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
bool isWeak() const { return Binding == llvm::ELF::STB_WEAK; }
// This field is used to store the Symbol's SymbolBody. This instantiation of
// AlignedCharArrayUnion gives us a struct with a char array field that is
// large and aligned enough to store any derived class of SymbolBody.
llvm::AlignedCharArrayUnion<DefinedCommon, DefinedRegular, Undefined,
SharedSymbol, LazyArchive, LazyObject>
Body;
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
SymbolBody *body() { return reinterpret_cast<SymbolBody *>(Body.buffer); }
const SymbolBody *body() const { return const_cast<Symbol *>(this)->body(); }
};
void printTraceSymbol(Symbol *Sym);
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
template <typename T, typename... ArgT>
void replaceBody(Symbol *S, InputFile *File, ArgT &&... Arg) {
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
static_assert(sizeof(T) <= sizeof(S->Body), "Body too small");
static_assert(alignof(T) <= alignof(decltype(S->Body)),
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
"Body not aligned enough");
assert(static_cast<SymbolBody *>(static_cast<T *>(nullptr)) == nullptr &&
"Not a SymbolBody");
S->File = File;
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
new (S->Body.buffer) T(std::forward<ArgT>(Arg)...);
// Print out a log message if --trace-symbol was specified.
// This is for debugging.
if (S->Traced)
printTraceSymbol(S);
ELF: New symbol table design. This patch implements a new design for the symbol table that stores SymbolBodies within a memory region of the Symbol object. Symbols are mutated by constructing SymbolBodies in place over existing SymbolBodies, rather than by mutating pointers. As mentioned in the initial proposal [1], this memory layout helps reduce the cache miss rate by improving memory locality. Performance numbers: old(s) new(s) Without debug info: chrome 7.178 6.432 (-11.5%) LLVMgold.so 0.505 0.502 (-0.5%) clang 0.954 0.827 (-15.4%) llvm-as 0.052 0.045 (-15.5%) With debug info: scylla 5.695 5.613 (-1.5%) clang 14.396 14.143 (-1.8%) Performance counter results show that the fewer required indirections is indeed the cause of the improved performance. For example, when linking chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and instructions per cycle increases from 0.78 to 0.83. We are also executing many fewer instructions (15,516,401,933 down to 15,002,434,310), probably because we spend less time allocating SymbolBodies. The new mechanism by which symbols are added to the symbol table is by calling add* functions on the SymbolTable. In this patch, I handle local symbols by storing them inside "unparented" SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating these SymbolBodies, we can probably do that separately. I also removed a few members from the SymbolBody class that were only being used to pass information from the input file to the symbol table. This patch implements the new design for the ELF linker only. I intend to prepare a similar patch for the COFF linker. [1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html Differential Revision: http://reviews.llvm.org/D19752 llvm-svn: 268178
2016-05-01 12:55:03 +08:00
}
inline Symbol *SymbolBody::symbol() {
assert(!isLocal());
return reinterpret_cast<Symbol *>(reinterpret_cast<char *>(this) -
offsetof(Symbol, Body));
}
2016-02-28 08:25:54 +08:00
} // namespace elf
std::string toString(const elf::SymbolBody &B);
} // namespace lld
#endif