Commit Graph

241 Commits

Author SHA1 Message Date
Rui Ueyama fe65877c76 Reorganize the cpio archiver as CpioFile class. NFC.
This code separates the code to create cpio archive from the driver.

llvm-svn: 269593
2016-05-15 17:10:23 +00:00
Rui Ueyama 9194db78fb Support --build-id=0x<hexstring>.
If you specify the option in the form of --build-id=0x<hexstring>,
that hexstring is set as a build ID. We observed that the feature
is actually in use in some builds, so we want this feature.

llvm-svn: 269495
2016-05-13 21:55:56 +00:00
George Rimar fa91000290 [ELF] implemented -z defs option
Just do not allow to link shared library if there are
undefined symbols.

This fixes PR27447

Differential revision: http://reviews.llvm.org/D20169

llvm-svn: 269183
2016-05-11 13:48:41 +00:00
George Rimar c191acf097 [ELF] - Implemented -z combrelocs/nocombreloc.
This is the option which sorts relocs to optimize dynamic linker performance.
-z combelocs is the default in gold, also it ignores -z nocombreloc,
this patch do the same.

Patch sorts relocations by symbols only and do not create any
DT_REL[A]COUNT entries. That is different with what gold/bfd do.

More information about option is here:
http://www.airs.com/blog/archives/186
http://people.redhat.com/jakub/prelink.pdf, p.2

Differential revision: http://reviews.llvm.org/D19528

llvm-svn: 269066
2016-05-10 15:47:57 +00:00
Rui Ueyama 0cfed297de Handle errors on file opening as soft error.
Also improves the error message. Previously it would just print out
the cause (e.g. "permission denied"). Now it prints out something like
"--reproduce: failed to open foo.cpio: permission denied".

llvm-svn: 268551
2016-05-04 21:05:11 +00:00
Rafael Espindola 2c51e619d5 Print the cpio trailer after every member.
This is both simpler and safer. If we crash at any point, there is a
valid cpio file to reproduce the crash.

Thanks to Rui for the suggestion.

llvm-svn: 268495
2016-05-04 12:47:56 +00:00
Rafael Espindola c5b8ed241d Implement --build-id=none.
Both bfd and gold have this. It allows disabling build-id when it is the
default with by adding -Wl,--build-id=none no the clang command line.

llvm-svn: 268435
2016-05-03 20:55:47 +00:00
Rafael Espindola 1dd2b3d1d0 Produce cpio files for --reproduce.
We want --reproduce to

* not rewrite scripts and thin archives
* work with absolute paths

Given that, it pretty much has to create a full directory tree. On windows that
is problematic because of the very short maximum path limit. On most cases
users can still work around it with "--repro c:\r", but that is annoying and
not viable for automated testing.

We then need to produce some form of archive with the files. The first option
that comes to mind is .a files since we already have code for writing them.
There are a few problems with them

The format has a dedicated string table, so we cannot start writing it until
all members are known.
Regular implementations don't support creating directories. We could make
llvm-ar support that, but that is probably not a good idea.
The next natural option would be tar. The problem is that to support long path
names (which is how this started) it needs a "pax extended header" making this
an annoying format to write.

The next option I looked at seems a natural fit: cpio files.

They are available on pretty much every unix, support directories and long path
names and are really easy to write. The only slightly annoying part is a
terminator, but at least gnu cpio only prints a warning if it is missing, which
is handy for crashes. This patch still makes an effort to always create it.

llvm-svn: 268404
2016-05-03 17:30:44 +00:00
Rafael Espindola 630ccd6a3f Remove unused includes.
llvm-svn: 268381
2016-05-03 13:57:49 +00:00
Rui Ueyama 5002a67ef2 Remove unnecessary namespace specifiers.
llvm-svn: 268292
2016-05-02 19:59:56 +00:00
Rui Ueyama 4f8d21f387 Do not pass Symtab to markLive/doICF since Symtab is globally accessible.
llvm-svn: 268286
2016-05-02 19:30:42 +00:00
Peter Collingbourne 4f9527065c ELF: New symbol table design.
This patch implements a new design for the symbol table that stores
SymbolBodies within a memory region of the Symbol object. Symbols are mutated
by constructing SymbolBodies in place over existing SymbolBodies, rather
than by mutating pointers. As mentioned in the initial proposal [1], this
memory layout helps reduce the cache miss rate by improving memory locality.

Performance numbers:

           old(s) new(s)
Without debug info:
chrome      7.178  6.432 (-11.5%)
LLVMgold.so 0.505  0.502 (-0.5%)
clang       0.954  0.827 (-15.4%)
llvm-as     0.052  0.045 (-15.5%)
With debug info:
scylla      5.695  5.613 (-1.5%)
clang      14.396 14.143 (-1.8%)

Performance counter results show that the fewer required indirections is
indeed the cause of the improved performance. For example, when linking
chrome, stalled cycles decreases from 14,556,444,002 to 12,959,238,310, and
instructions per cycle increases from 0.78 to 0.83. We are also executing
many fewer instructions (15,516,401,933 down to 15,002,434,310), probably
because we spend less time allocating SymbolBodies.

The new mechanism by which symbols are added to the symbol table is by calling
add* functions on the SymbolTable.

In this patch, I handle local symbols by storing them inside "unparented"
SymbolBodies. This is suboptimal, but if we do want to try to avoid allocating
these SymbolBodies, we can probably do that separately.

I also removed a few members from the SymbolBody class that were only being
used to pass information from the input file to the symbol table.

This patch implements the new design for the ELF linker only. I intend to
prepare a similar patch for the COFF linker.

[1] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098832.html

Differential Revision: http://reviews.llvm.org/D19752

llvm-svn: 268178
2016-05-01 04:55:03 +00:00
Rui Ueyama 9aea957f6d ELF: --reproduce: Copy files referenced by linker scripts.
Previuosly, only files appeared on the command line were copied.

llvm-svn: 268171
2016-04-30 22:23:29 +00:00
Rui Ueyama aa00e96a84 ELF: Make --reproduce to produce a response file.
The aim of this patch is to make it easy to re-run the command without
updating paths in the command line. Here is a use case.

Assume that Alice is having an issue with lld and is reporting the issue
to developer Bob. Alice's current directly is /home/alice/work and her
command line is "ld.lld -o foo foo.o ../bar.o". She adds "--reproduce repro"
to the command line and re-run. Then the following text will be produced as
response.txt (notice that the paths are rewritten so that they are
relative to /home/alice/work/repro.)

  -o home/alice/work/foo home/alice/work/foo.o home/alice/bar.o

The command also produces the following files by copying inputs.

  /home/alice/repro/home/alice/work/foo.o
  /home/alice/repro/home/alice/bar.o

Alice zips the directory and send it to Bob. Bob get an archive from Alice
and extract it to his home directory as /home/bob/repro. Now his directory
have the following files.

  /home/bob/repro/response.txt
  /home/bob/repro/home/alice/work/foo.o
  /home/bob/repro/home/alice/bar.o

Bob then re-run the command with these files by the following commands.

  cd /home/bob/repro
  ld.lld @response.txt

This command will run the linker with the same command line options and
the same input files as Alice's, so it is very likely that Bob will see
the same issue as Alice saw.

Differential Revision: http://reviews.llvm.org/D19737

llvm-svn: 268169
2016-04-30 21:40:04 +00:00
Rui Ueyama fb6d499ed7 ELF: Add -O0 (produce output as fast as possible) mode.
This patch redefines the default optimization level as 1 and adds
new level 0. In the command line, it is -O0. The flag disables
costly but optional features so that the linker produces semantically
correct but larger output quickly. Currently it only disables
section merging.

This flag is not intended to be used for final production linking.
It is intended to be used in compile-link-test cycle.

Time to link clang with debug info is about 2x faster with the flag.

  Head:
  13.24 seconds
  Output size: 1227189664 bytes

  With this patch:
  7.41 seconds
  Output size: 2490281784 bytes

Differential Revision: http://reviews.llvm.org/D19705

llvm-svn: 268056
2016-04-29 16:12:29 +00:00
Simon Atanasyan ae77ab71d8 [ELF][MIPS] Accept MIPS 64-bit binaries
LLD accepts MIPS 64-bit binaries, supports corresponding eulation (-m)
arguments and emits 64-bit specific ELF flags.

llvm-svn: 268024
2016-04-29 10:39:17 +00:00
Rafael Espindola 156f4ee1c0 Use a single context for lto.
Using multiple context used to be a really big memory saving because we
could free memory from each file while the linker proceeded with the
symbol resolution. We are getting lazier about reading data from the
bitcode, so I was curious if this was still a good tradeoff.

One thing that is a bit annoying is that we still have to copy the
symbol names. The problem is that the names are stored in the Module and
get freed when we move the module bits during linking.

Long term I think the solution is to add a symbol table to the bitcode.
That way IRObject file will not need to use a Module or a Context and we
can drop it while still keeping a StringRef to the names.

This patch is still be an interesting medium term improvement.

When linking llvm-as without debug info this patch is a small speedup:

master: 29.861877513 seconds
patch: 29.814533787 seconds

With debug info the numbers are

master: 34.765181469 seconds
patch: 34.563351584 seconds

The peak memory usage when linking llvm-as with debug info was

master: 599.10MB
patch: 600.13MB
llvm-svn: 267921
2016-04-28 19:30:41 +00:00
Rui Ueyama 5d7a3d011d Do not call hasArg and getLastArg for the same option.
llvm-svn: 267839
2016-04-28 02:08:54 +00:00
Peter Collingbourne 892d498017 ELF: Re-implement -u directly and remove CanKeepUndefined flag.
The semantics of the -u flag are to load the lazy symbol named by the flag. We
were previously relying on this behavior falling out of symbol resolution
against a synthetic undefined symbol, but that didn't quite give us the
correct behavior, so we needed a flag to mark symbols created with -u so
we could treat them specially in the writer. However, it's simpler and less
error prone to implement the required behavior directly and remove the flag.

This fixes an issue where symbols loaded with -u would receive hidden
visibility even when the definition in an object file had wider visibility.

Differential Revision: http://reviews.llvm.org/D19560

llvm-svn: 267639
2016-04-27 00:05:03 +00:00
Rui Ueyama cf0dd1ebf2 Move utility functions to DriverUtils.cpp.
llvm-svn: 267602
2016-04-26 20:41:32 +00:00
Rui Ueyama 673b609586 Handle Windows drive letters and ".." for --reproduce.
When --reproduce <path> is given, then we need to concatenate input
file paths to the given path to save input files to the directory.
Previously, path concatenation didn't handle Windows drive letters
so it could generate invalid paths such as "C:\D:\foo". It also didn't
handle ".." path components, so it could produce some bad paths
such as "foo/../../etc/passwd".

In this patch, Windows drive letters and ".." are removed before
concatenating paths.

Differential Revision: http://reviews.llvm.org/D19551

llvm-svn: 267600
2016-04-26 20:36:46 +00:00
Davide Italiano e53e6e9f37 [ELF] Simplify. Pointed out by Rui Ueyama.
llvm-svn: 267523
2016-04-26 05:51:37 +00:00
Davide Italiano 034f58a9bd [ELF] Introduce --reproduce flag.
--reproduce dumps the object files in a directory chosen
(preserving the file system layout) and the linker invocation
so that people can create an archive and upload for debugging.

Differential Revision:  http://reviews.llvm.org/D19494

llvm-svn: 267497
2016-04-26 00:22:24 +00:00
Davide Italiano bfccefd514 [ELF] Reinstate 'else' which was previously removed.
It turns out it's actually needed.

llvm-svn: 267358
2016-04-24 18:23:21 +00:00
Davide Italiano 7d32a4d27f [ELF] Simplify. Remove unneeded else. NFC.
llvm-svn: 267333
2016-04-24 07:19:32 +00:00
Rui Ueyama 1e9e615f92 LTO: Merge -lto-no-discard-value-names with -save-temps.
This patch is to remove -lto-no-discard-value-names flag and
instead to use -save-temps as we discussed in the post-commit
review thread for r267020.

Differential Revision: http://reviews.llvm.org/D19437

llvm-svn: 267230
2016-04-22 21:43:10 +00:00
Peter Collingbourne 66ac1d6152 ELF: Implement basic support for --version-script.
This patch only implements support for version scripts of the form:
  { [ global: symbol1; symbol2; [...]; symbolN; ] local: *; };
No wildcards are supported, other than for the local entry. Symbol versioning
is also not supported.

It works by introducing a new Symbol flag which tracks whether a symbol
appears in the global section of a version script.

This patch also simplifies the logic in SymbolBody::isPreemptible(), and
teaches it to handle the case where symbols with default visibility in DSOs
do not appear in the dynamic symbol table because of a version script.

Fixes PR27482.

Differential Revision: http://reviews.llvm.org/D19430

llvm-svn: 267208
2016-04-22 20:21:26 +00:00
Rui Ueyama 630a382ba1 Reorganize LinkerDriver::link and add comments. NFC.
llvm-svn: 267201
2016-04-22 19:58:47 +00:00
Peter Collingbourne 3a35c4546d ELF: Scan for undefined symbols in shlibs and symbols in dynamic lists before LTO.
This ensures that these features work correctly with bitcode files.

llvm-svn: 267188
2016-04-22 18:47:52 +00:00
Peter Collingbourne 760e583e22 ELF: Implement --export-dynamic-symbol.
llvm-svn: 267184
2016-04-22 18:44:06 +00:00
Peter Collingbourne dadcc17ead ELF: Move Visibility, IsUsedInRegularObj and MustBeInDynSym flags to Symbol.
These are properties of a symbol name, rather than a particular instance
of a symbol in an object file. We can simplify the code by collecting these
properties in Symbol.

The MustBeInDynSym flag has been renamed ExportDynamic, as its semantics
have been changed to be the same as those of --dynamic-list and
--export-dynamic-symbol, which do not cause hidden symbols to be exported.

Differential Revision: http://reviews.llvm.org/D19400

llvm-svn: 267183
2016-04-22 18:42:48 +00:00
Davide Italiano 5abcb3cc5b [LTO] Discard names for values that are not global by default.
Rafael reported on the mailing list that this reduces peak memory
usage while linking llvm-as by 15%. It makes sense to make it
the default, and introduce an inverse knob -lto-no-discard-value-names
for those who want to restore the old behavior.

llvm-svn: 267020
2016-04-21 17:46:38 +00:00
Davide Italiano 3c0410d09d [LTO] Discard names for Values that are not global.
This is not on by default (but it might be in the future).
The knob to enable the optimization is -lto-discard-value-names.

llvm-svn: 266953
2016-04-21 04:46:22 +00:00
Rui Ueyama 07320e4030 ELF: Template LinkerScript class.
Originally, linker scripts were basically an alternative way to specify
options to the command line options. But as we add more features to hanlde
symbols and sections, many member functions needed to be templated.
Now most the members are templated. It is probably time to template the
entire class.

Previously, LinkerScript is an executor of the linker script as well as
a storage of linker script configurations. This is not suitable to template
the class because when we are reading linker script files, we don't know
the ELF type yet, so we can't instantiate ELF-templated classes.

In this patch, I defined a new class, ScriptConfiguration, to store
linker script configurations. ScriptParser writes parse results to it,
and LinkerScript uses them.

Differential Revision: http://reviews.llvm.org/D19302

llvm-svn: 266908
2016-04-20 20:13:41 +00:00
Davide Italiano bc176631cd [LTO] Implement parallel Codegen for LTO using splitCodeGen.
Parallelism level can be chosen using the new --lto-jobs=K option
where K is the number of threads used for CodeGen. It currently
defaults to 1.

llvm-svn: 266484
2016-04-15 22:38:10 +00:00
Rafael Espindola 38c67a27fe Store a Symbol for EntrySym.
This makes it impossible to forget to call repl on the SymbolBody.

llvm-svn: 266432
2016-04-15 14:41:56 +00:00
Rafael Espindola 3fa7352fc7 Fix warning about unused variable.
llvm-svn: 266232
2016-04-13 19:09:48 +00:00
Rafael Espindola ab14c88984 git-clang-format. NFC.
llvm-svn: 266231
2016-04-13 19:07:40 +00:00
Adhemerval Zanella 9df0720766 ELF: Implement --dynamic-list
This patch implements the --dynamic-list option, which adds a list of
global symbol that either should not be bounded by default definition
when creating shared libraries, or add in dynamic symbol table in the
case of creating executables.

The patch modifies the ScriptParserBase class to use a list of Token
instead of StringRef, which contains information if the token is a
quoted or unquoted strings. It is used to use a faster search for
exact match symbol name.

The input file follow a similar format of linker script with some
simplifications (it does not have scope or node names). It leads
to a simplified parser define in DynamicList.{cpp,h}.

Different from ld/gold neither glob pattern nor mangled names
(extern 'C++') are currently supported.

llvm-svn: 266227
2016-04-13 18:51:11 +00:00
George Rimar 2a78fceb2c [ELF] - Change -t implementation to print which archive members are used.
Previously each archive file was reported no matter were it's member used or not,
like:
lib/libLLVMSupport.a

Now lld prints line for each used internal file, like:
lib/libLLVMSupport.a(lib/Support/CMakeFiles/LLVMSupport.dir/StringSaver.cpp.o)
lib/libLLVMSupport.a(lib/Support/CMakeFiles/LLVMSupport.dir/Host.cpp.o)
lib/libLLVMSupport.a(lib/Support/CMakeFiles/LLVMSupport.dir/ConvertUTF.c.o)

That should be consistent with what gold do.

This fixes PR27243.

Differential revision: http://reviews.llvm.org/D19011

llvm-svn: 266220
2016-04-13 18:07:57 +00:00
Rafael Espindola 9b3f99e50f Devide _gp in the same spot as other mips symbols. NFC.
The test changes are just because of the symbol order.

llvm-svn: 266037
2016-04-12 02:24:43 +00:00
Rui Ueyama 6c8b0c1b96 Use EM_NONE instead of 0 to represent an invalid value. NFC.
llvm-svn: 265750
2016-04-07 23:54:33 +00:00
Rui Ueyama d86ec30168 ELF: Add --build-id=sha1 option.
llvm-svn: 265748
2016-04-07 23:51:56 +00:00
Rui Ueyama 3a41be277a ELF: Implement --build-id=md5.
Previously, we supported only one hash function, FNV-1, so
BuildIdSection directly handled hash computation. In this patch,
I made BuildIdSection an abstract class and defined two subclasses,
BuildIdFnv1 and BuildIdMd5.

llvm-svn: 265737
2016-04-07 22:49:21 +00:00
Rui Ueyama fc6a4b045f ELF: Add --strip-debug option.
If --strip-debug option is given, then all sections whose names start
with ".debug" are removed from output.

llvm-svn: 265722
2016-04-07 21:04:51 +00:00
Rui Ueyama 8c76487ee5 ELF: Add --no-gnu-unique option.
When the option is specified, then all STB_GNU_UNIQUE symbols are
converted to STB_GLOBAL symbols.

llvm-svn: 265717
2016-04-07 20:41:41 +00:00
Rui Ueyama f8baa66056 ELF: Implement --start-lib and --end-lib
start-lib and end-lib are options to link object files in the same
semantics as archive files. If an object is in start-lib and end-lib,
the object is linked only when the file is needed to resolve
undefined symbols. That means, if an object is in start-lib and end-lib,
it behaves as if it were in an archive file.

In this patch, I introduced a new notion, LazyObjectFile. That is
analogous to Archive file type, but that works for a single object
file instead of for an archive file.

http://reviews.llvm.org/D18814

llvm-svn: 265710
2016-04-07 19:24:51 +00:00
Rafael Espindola ccfe3cb3d6 Don't store an Elf_Sym for most symbols.
Our symbol representation was redundant, and some times would get out of
sync. It had an Elf_Sym, but some fields were copied to SymbolBody.

Different parts of the code were checking the bits in SymbolBody and
others were checking Elf_Sym.

There are two general approaches to fix this:
* Copy the required information and don't store and Elf_Sym.
* Don't copy the information and always use the Elf_Smy.

The second way sounds tempting, but has a big problem: we would have to
template SymbolBody. I started doing it, but it requires templeting
*everything* and creates a bit chicken and egg problem at the driver
where we have to find ELFT before we can create an ArchiveFile for
example.

As much as possible I compared the test differences with what gold and
bfd produce to make sure they are still valid. In most cases we are just
adding hidden visibility to a local symbol, which is harmless.

In most tests this is a small speedup. The only slowdown was scylla
(1.006X). The largest speedup was clang with no --build-id, -O3 or
--gc-sections (i.e.: focus on the relocations): 1.019X.

llvm-svn: 265293
2016-04-04 14:04:16 +00:00
Davide Italiano 842fa53026 [LTO] Implement -disable-verify, which disables bitcode verification.
So, there are some cases when the IR Linker produces a broken
module (which doesn't pass the verifier) and we end up asserting
inside the verifier. I think it's always a bug producing a module
which does not pass the verifier but there are some cases in which
people can live with the broken module (e.g. if only DebugInfo
metadata are broken). The gold plugin has something similar.

This commit is motivated by a situation I found in the
wild. It seems that somebody else discovered it independently
and reported in PR24923.

llvm-svn: 265258
2016-04-03 03:39:09 +00:00
Davide Italiano 8848d44e46 [LTO] Reject invalid optimization levels.
llvm-svn: 265255
2016-04-03 02:41:15 +00:00