89 Commits

Author SHA1 Message Date
Ted Kremenek 29b8697393 Move PTHStatCache within the anonymous namespace.
llvm-svn: 65348
2009-02-23 23:27:54 +00:00
Ted Kremenek e76eb060c7 Fix another PTH warning that should not be a note.
llvm-svn: 65072
2009-02-19 22:14:49 +00:00
Ted Kremenek 36b005db45 Make PTH warnings actual warnings instead of 'notes'.
llvm-svn: 65071
2009-02-19 22:13:40 +00:00
Ted Kremenek 2fd18ec43a PTH: Cache directory and negative 'stat' calls. This gives us a 1% performance improvement on Cocoa.h (fsyntax-only+PTH).
llvm-svn: 64490
2009-02-13 22:07:44 +00:00
Ted Kremenek 29942a349c Add some boilerplate to the PTH file to prepare for the caching of stats for directories (and negative stats too).
llvm-svn: 64477
2009-02-13 19:13:46 +00:00
Eli Friedman 159a7cbc36 Fix gcc warning: gcc correctly notes that const-qualifying the return
type doesn't do anything.

llvm-svn: 64424
2009-02-13 01:02:29 +00:00
Daniel Dunbar ad027c7781 Fix assertion when input is an empty string.
llvm-svn: 64397
2009-02-12 19:31:53 +00:00
Ted Kremenek b4c85ccaaa Re-enable PTH stat caching. All tests pass now.
llvm-svn: 64356
2009-02-12 03:45:39 +00:00
Ted Kremenek c6a2a37222 Fix bad reading of bytes in ReadUnalignedLE64() (copy-paste error).
llvm-svn: 64355
2009-02-12 03:39:55 +00:00
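For readers unfamiliar with this class of bug, the following is a minimal sketch of a byte-by-byte little-endian 64-bit read. It is not the code from this commit: the function name `readUnalignedLE64` and its signature are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative sketch only (hypothetical name and signature): assembles a
// 64-bit little-endian value from eight bytes at an arbitrarily aligned
// address. Every byte needs its own index and its own shift; reusing an index
// or a shift from a neighboring line is the kind of copy-paste slip that
// silently corrupts the upper bytes.
inline std::uint64_t readUnalignedLE64(const unsigned char *data) {
  assert(data != nullptr);
  std::uint64_t value = 0;
  for (unsigned i = 0; i < 8; ++i)
    value |= static_cast<std::uint64_t>(data[i]) << (8 * i);
  return value;
}
```

Writing the read as a loop rather than eight hand-copied lines removes the opportunity for a mismatched index or shift.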
Ted Kremenek 3280145da4 Temporarily disable PTH stat caching as it appears to be failing on some machines.
llvm-svn: 64354
2009-02-12 03:36:54 +00:00
Ted Kremenek a5c2c27ebd PTH: Cache stat information for files in the PTH file. Hook up FileManager
to use this stat information in the PTH file using a 'StatSysCallCache' object.

Performance impact (Cocoa.h, PTH):
- number of stat calls reduces from 1230 to 425
- fsyntax-only: time improves by 4.2% 

We can reduce the number of stat calls to almost zero by caching negative stat
calls and directory stat calls in the PTH file as well.

llvm-svn: 64353
2009-02-12 03:26:59 +00:00
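The idea of answering stat queries from a cache, including "does not exist" answers, can be sketched as follows. This is a hedged illustration, not the 'StatSysCallCache' interface from the commit: `FileStatus`, `StatCache`, and the callback-based backend are all hypothetical names, and the cache here is filled lazily in memory, whereas the commit describes data stored in the PTH file.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the subset of 'struct stat' a compiler cares about.
struct FileStatus {
  long long size;
  bool isDirectory;
};

// Sketch of a stat cache that remembers both hits and misses. A std::nullopt
// entry is a cached "file does not exist" answer (a negative stat).
class StatCache {
public:
  using Backend = std::function<std::optional<FileStatus>(const std::string &)>;

  explicit StatCache(Backend backend) : backend_(std::move(backend)) {
    assert(backend_ && "a backend is required to resolve cache misses");
  }

  std::optional<FileStatus> stat(const std::string &path) {
    auto it = cache_.find(path);
    if (it != cache_.end())
      return it->second; // served from the cache, no backend call
    ++backendCalls_;
    std::optional<FileStatus> result = backend_(path);
    cache_.emplace(path, result); // positive and negative results alike
    return result;
  }

  unsigned backendCalls() const { return backendCalls_; }

private:
  Backend backend_;
  std::unordered_map<std::string, std::optional<FileStatus>> cache_;
  unsigned backendCalls_ = 0;
};
```

Header search probes many paths that do not exist, so caching only successful results leaves a large share of the calls in place. Storing the negative answers too is what lets the call count approach zero.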
Ted Kremenek 4c1d41f2b1 PTH: Have meta data be at the beginning of the PTH file, not the end.
llvm-svn: 64338
2009-02-11 23:34:32 +00:00
Ted Kremenek e5554deb45 PTH: Replace string identifier to persistent ID lookup with a hashtable. This is
actually *slightly* slower than the binary search. Since this is algorithmically
better, further performance tuning should be able to make this faster.

llvm-svn: 64326
2009-02-11 21:29:16 +00:00
Ted Kremenek 8527e3a727 PTH: Don't emit the PTH offset of the IdentifierInfo string data as that data is
referenced by other tables.

llvm-svn: 64304
2009-02-11 16:06:55 +00:00
Ted Kremenek 86423a9993 PTH: Replace ad hoc 'file name' -> 'PTH data' lookup table in the PTH file with an on-disk chained hash table. This data structure is implemented using templates, and will be used to replace similar data structures. This change leads to no visible performance impact on Cocoa.h, but now we only pay a price for the table on the order of the number of files accessed, not the number of files in the PTH file.
llvm-svn: 64245
2009-02-10 22:16:22 +00:00
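A chained hash table that is searched directly in its serialized form might look like the sketch below. The byte layout, the hash function, and the names `buildTable` and `lookup` are assumptions chosen for illustration; the commit's template-based implementation and its real on-disk format are not shown on this page.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical layout, for illustration only:
//   [u32 numBuckets][u32 bucketOffset * numBuckets][bucket payloads...]
// Each bucket payload: [u32 entryCount], then per entry
//   [u32 keyLength][u32 value][key bytes].
// A bucketOffset of 0 marks an empty bucket. All integers are little-endian.
using Entry = std::pair<std::string, std::uint32_t>;

static void appendLE32(std::vector<unsigned char> &out, std::uint32_t v) {
  for (unsigned i = 0; i < 4; ++i)
    out.push_back(static_cast<unsigned char>(v >> (8 * i)));
}

static std::uint32_t readLE32(const unsigned char *p) {
  std::uint32_t v = 0;
  for (unsigned i = 0; i < 4; ++i)
    v |= static_cast<std::uint32_t>(p[i]) << (8 * i);
  return v;
}

// Simple djb2-style string hash; writer and reader only need to agree on it.
static std::uint32_t hashKey(const std::string &key) {
  std::uint32_t h = 5381;
  for (unsigned char c : key)
    h = h * 33 + c;
  return h;
}

std::vector<unsigned char> buildTable(const std::vector<Entry> &items,
                                      std::uint32_t numBuckets) {
  assert(numBuckets > 0);
  std::vector<std::vector<Entry>> buckets(numBuckets);
  for (const Entry &item : items)
    buckets[hashKey(item.first) % numBuckets].push_back(item);

  std::vector<unsigned char> payload;
  std::vector<std::uint32_t> offsets(numBuckets, 0);
  const std::uint32_t headerSize = 4 + 4 * numBuckets;
  for (std::uint32_t b = 0; b < numBuckets; ++b) {
    if (buckets[b].empty())
      continue;
    offsets[b] = headerSize + static_cast<std::uint32_t>(payload.size());
    appendLE32(payload, static_cast<std::uint32_t>(buckets[b].size()));
    for (const Entry &entry : buckets[b]) {
      appendLE32(payload, static_cast<std::uint32_t>(entry.first.size()));
      appendLE32(payload, entry.second);
      payload.insert(payload.end(), entry.first.begin(), entry.first.end());
    }
  }

  std::vector<unsigned char> out;
  appendLE32(out, numBuckets);
  for (std::uint32_t off : offsets)
    appendLE32(out, off);
  out.insert(out.end(), payload.begin(), payload.end());
  return out;
}

// Looks up 'key' by reading one header slot and walking one bucket chain.
// The sketch trusts its input and does no bounds checking on corrupt data.
bool lookup(const std::vector<unsigned char> &table, const std::string &key,
            std::uint32_t &value) {
  assert(table.size() >= 8);
  const unsigned char *base = table.data();
  const std::uint32_t numBuckets = readLE32(base);
  assert(numBuckets > 0);
  const std::uint32_t offset =
      readLE32(base + 4 + 4 * (hashKey(key) % numBuckets));
  if (offset == 0)
    return false;
  const unsigned char *p = base + offset;
  const std::uint32_t count = readLE32(p);
  p += 4;
  for (std::uint32_t i = 0; i < count; ++i) {
    const std::uint32_t keyLen = readLE32(p);
    const std::uint32_t v = readLE32(p + 4);
    p += 8;
    if (keyLen == key.size() && std::memcmp(p, key.data(), keyLen) == 0) {
      value = v;
      return true;
    }
    p += keyLen;
  }
  return false;
}
```

Because a lookup touches only one bucket, the reader never has to decode the whole table up front. The cost grows with the number of keys actually queried rather than with the number of keys stored.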
Ted Kremenek 62224c1d7f Add more PTH diagnostics for invalid PTH files, etc.
llvm-svn: 63232
2009-01-28 21:02:43 +00:00
Ted Kremenek 3b0589e4b4 Enhance PTHManager::Create() to take an optional Diagnostic* argument that can be used to report issues such as a missing PTH file.
llvm-svn: 63231
2009-01-28 20:49:33 +00:00
Ted Kremenek 8d178f4357 PTH: Use Token::setLiteralData() to directly store a pointer to cached spelling data in the PTH file. This removes a ton of code for looking up spellings using sourcelocations in the PTH file. This simplifies both PTH-generation and reading.
Performance impact for -fsyntax-only on Cocoa.h (with Cocoa.h in the PTH file):
- PTH generation time improves by 5%
- PTH reading improves by 0.3%.

llvm-svn: 63072
2009-01-27 00:01:05 +00:00
Ted Kremenek 327d00cd45 Silence warning.
llvm-svn: 63054
2009-01-26 22:16:12 +00:00
Ted Kremenek 978b5becea Add version number checking to PTH files.
llvm-svn: 63047
2009-01-26 21:50:21 +00:00
Ted Kremenek eb8c8fbd63 Embed the offset of the PTH table inside the prologue of the PTH file. This will help improve gradual versioning of PTH files instead of relying on the PTH table being at a fixed offset.
llvm-svn: 63045
2009-01-26 21:43:14 +00:00
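Storing the table offset in a prologue, next to a version number, can be sketched as below. The magic string "DEMO", the field order, and the name `findTable` are hypothetical; the actual PTH prologue layout is not given on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical prologue layout, for illustration only:
//   bytes 0..3   magic "DEMO"
//   bytes 4..7   format version (little-endian u32)
//   bytes 8..11  absolute offset of the main table (little-endian u32)
constexpr std::uint32_t kSupportedVersion = 2;
constexpr std::size_t kPrologueSize = 12;

static std::uint32_t readLE32(const unsigned char *p) {
  assert(p != nullptr);
  return static_cast<std::uint32_t>(p[0]) |
         static_cast<std::uint32_t>(p[1]) << 8 |
         static_cast<std::uint32_t>(p[2]) << 16 |
         static_cast<std::uint32_t>(p[3]) << 24;
}

// Returns the table offset, or std::nullopt if the file is too short, has the
// wrong magic or version, or points outside the buffer.
std::optional<std::uint32_t> findTable(const std::vector<unsigned char> &file) {
  if (file.size() < kPrologueSize)
    return std::nullopt;
  if (std::memcmp(file.data(), "DEMO", 4) != 0)
    return std::nullopt;
  if (readLE32(file.data() + 4) != kSupportedVersion)
    return std::nullopt;
  const std::uint32_t offset = readLE32(file.data() + 8);
  if (offset < kPrologueSize || offset > file.size())
    return std::nullopt;
  return offset;
}
```

With the offset recorded in the file, a writer can grow or reorder the sections in front of the table without breaking readers, as long as the prologue itself keeps its shape.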
Chris Lattner 4fa23625ab Check in the long promised SourceLocation rewrite. This lays the
groundwork for implementing #line, and fixes the "out of macro ID's"
problem.

There is nothing particularly tricky about the code, other than the
very performance sensitive SourceManager::getFileID() method.

llvm-svn: 62978
2009-01-26 00:43:02 +00:00
Chris Lattner 1f6c7fe6a8 This is a follow-up to r62675:
Refactor how the preprocessor changes a token from being a tok::identifier to a
keyword (e.g. tok::kw_for).  Instead of doing this in HandleIdentifier, hoist this
common case out into the caller, so that every keyword doesn't have to go through
HandleIdentifier.  This drops time in HandleIdentifier from 1.25ms to .62ms, and
speeds up clang -Eonly with PTH by about 1%.

llvm-svn: 62855
2009-01-23 18:35:48 +00:00
Chris Lattner f8ccb4f9e3 Update comment.
llvm-svn: 62819
2009-01-23 00:13:28 +00:00
Chris Lattner 34eab390b9 remove my gross #ifdef's, using portable abstractions now that the 32-bit
load is always aligned.

I verified that the bswap doesn't occur in the assembly code on x86.

llvm-svn: 62815
2009-01-22 23:50:07 +00:00
Chris Lattner fec5470f03 remove Read8/Read24, which are dead. Rename Read16/Read32 to be more
descriptive.

llvm-svn: 62775
2009-01-22 19:48:26 +00:00
Ted Kremenek ae54f2f590 Fix <rdar://problem/6512717> by correctly reading the right offset in the token data in PTHLexer::getSourceLocation().
llvm-svn: 62725
2009-01-21 22:41:38 +00:00
Chris Lattner 3029b35faa merge two checks for identifiers in the pth loop into one.
llvm-svn: 62677
2009-01-21 07:50:06 +00:00
Chris Lattner ad89ec013f Add a bit to IdentifierInfo that acts as a simple predicate which
tells us whether Preprocessor::HandleIdentifier needs to be called.
Because this method is only rarely needed, this saves a call and a
bunch of random checks.  This drops the time in HandleIdentifier 
from 3.52ms to .98ms on cocoa.h on my machine.

llvm-svn: 62675
2009-01-21 07:43:11 +00:00
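The "one bit as a predicate" technique can be sketched as follows. The class `IdentifierRecord` and its three flags are hypothetical stand-ins; the real IdentifierInfo tracks different and more numerous properties, which this sketch does not attempt to reproduce.

```cpp
#include <cassert>

// Hypothetical, stripped-down identifier record. Only the idea of a cached
// "needs the slow path" bit is illustrated here.
class IdentifierRecord {
public:
  void setIsMacro(bool v) { isMacro_ = v; recompute(); }
  void setIsPoisoned(bool v) { isPoisoned_ = v; recompute(); }
  void setIsExtension(bool v) { isExtension_ = v; recompute(); }

  // Single-bit test that a hot lexing loop can inline. Only when it returns
  // true does the caller need to make the out-of-line handler call.
  bool needsHandling() const {
    assert(needsHandling_ == (isMacro_ || isPoisoned_ || isExtension_));
    return needsHandling_;
  }

private:
  // Called by every setter so the summary bit never goes stale.
  void recompute() { needsHandling_ = isMacro_ || isPoisoned_ || isExtension_; }

  bool isMacro_ = false;
  bool isPoisoned_ = false;
  bool isExtension_ = false;
  bool needsHandling_ = false;
};
```

The summary bit moves the cost from every token read to the much rarer moments when a property changes, which is the trade the commit message describes.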
Ted Kremenek 8d6c828728 Don't crash on empty PTH files. This fixes <rdar://problem/6512714>.
llvm-svn: 62673
2009-01-21 07:34:28 +00:00
Chris Lattner c950296006 really we only need one Read24!
llvm-svn: 62672
2009-01-21 07:28:57 +00:00
Chris Lattner 47def9787e revert my previous patch, it assumed endianness.
llvm-svn: 62671
2009-01-21 07:21:56 +00:00
Chris Lattner a74f7cbb9d minor cleanups: now that tokens are 4-byte aligned in a PTH
file, just load them directly as ints.

llvm-svn: 62668
2009-01-21 07:06:08 +00:00
Ted Kremenek 52f73cad4a Fix: <rdar://problem/6510344> [pth] PTH slows down regular lexer considerably (when it has substantial work)
Changes to IdentifierTable:
- High-level summary: StringMap never owns IdentifierInfos.  It just
references them.
- The string map now has StringMapEntry<IdentifierInfo*> instead of
  StringMapEntry<IdentifierInfo>.  The IdentifierInfo object is
  allocated using the same bump pointer allocator as used by the
  StringMap.

Changes to IdentifierInfo:
- Added an extra pointer to point to the
  StringMapEntry<IdentifierInfo*> in the string map.  This pointer
  will be null if the IdentifierInfo* is *only* used by the PTHLexer
  (that is it isn't in the StringMap).

Algorithmic changes:
- Non-PTH case:
   IdentifierInfo::get() will always consult the StringMap first to
   see if we have an IdentifierInfo object.  If that StringMapEntry
   references a null pointer, we allocate a new one from the BumpPtrAllocator
   and update the reference in the StringMapEntry.
- PTH case:
   We do the same lookup as with the non-PTH case, but if we don't get
   a hit in the StringMap we do a secondary lookup in the PTHManager for
   the IdentifierInfo.  If we don't find an IdentifierInfo we create a
   new one as in the non-PTH case.  If we do find an IdentifierInfo
   in the PTHManager, we update the StringMapEntry to refer to it so
   that the IdentifierInfo will be found on the next StringMap lookup.
   This way we only do a binary search in the PTH file at most once
   for a given IdentifierInfo.  This greatly speeds things up for source
   files containing a non-trivial amount of code.

Performance impact:
   While these changes do add some extra indirection in
   IdentifierTable to access an IdentifierInfo*, I saw speedups even
   in the non-PTH case as well.

   Non-PTH: For -fsyntax-only on Cocoa.h, we see a 6% speedup.
   PTH (with Cocoa.h in token cache): 11% speedup.

   I also did an experiment where we did -fsyntax-only on a source file
   including a large header and Cocoa.h, but the token cache did not
   contain the larger header.  For this file, we were seeing a performance
   *regression* when using PTH of 3% over non-PTH.  Now we are seeing
   a performance improvement of 9%!

Tests:
   The serialization tests are now failing.  I looked at this extensively,
   and my belief is that this change is unmasking a bug rather than
   introducing a new one.  I have disabled the serialization tests for now.

llvm-svn: 62636
2009-01-20 23:28:34 +00:00
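The lookup order described under "Algorithmic changes" can be sketched roughly as below. This is an illustration under stated assumptions, not the IdentifierTable code: `IdentInfo`, `ExternalLookup`, `CountingLookup`, and `IdentTable` are hypothetical names, `std::unordered_map` stands in for the StringMap, and `std::unique_ptr` stands in for the bump pointer allocator.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

struct IdentInfo {
  std::string name;
};

// Hypothetical secondary source of identifiers (plays the token cache's role).
class ExternalLookup {
public:
  virtual ~ExternalLookup() = default;
  virtual IdentInfo *find(const std::string &name) = 0;
};

// Test double that owns a fixed set of identifiers and counts its searches.
class CountingLookup : public ExternalLookup {
public:
  explicit CountingLookup(const std::vector<std::string> &names) {
    for (const std::string &n : names)
      infos_.emplace(n, IdentInfo{n});
  }
  IdentInfo *find(const std::string &name) override {
    ++searches_;
    auto it = infos_.find(name);
    return it == infos_.end() ? nullptr : &it->second;
  }
  unsigned searches() const { return searches_; }

private:
  std::map<std::string, IdentInfo> infos_;
  unsigned searches_ = 0;
};

class IdentTable {
public:
  explicit IdentTable(ExternalLookup *external = nullptr)
      : external_(external) {}

  IdentInfo &get(const std::string &name) {
    IdentInfo *&slot = map_[name]; // inserts a null entry on first sight
    if (!slot && external_)
      slot = external_->find(name); // at most one external search per name
    if (!slot) {
      owned_.push_back(std::make_unique<IdentInfo>(IdentInfo{name}));
      slot = owned_.back().get();
    }
    assert(slot != nullptr);
    return *slot; // later calls for 'name' stop at the map lookup
  }

private:
  ExternalLookup *external_;
  std::unordered_map<std::string, IdentInfo *> map_; // references, never owns
  std::vector<std::unique_ptr<IdentInfo>> owned_;
};
```

Because the map entry is updated after the first external hit or miss, each name costs the secondary search at most once, which matches the behavior the message attributes to the PTH case.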
Ted Kremenek 8433f0b400 PTH: Emitted tokens now consist of 12 bytes that are loaded using three 32-bit loads. This reduces user time but increases system time because of the slightly larger PTH file. Although there is no performance win on Cocoa.h and -Eonly, overall this seems like a good step.
llvm-svn: 62542
2009-01-19 23:13:15 +00:00
Chris Lattner 144aacd19e rearrange GetIdentifierInfo so that the fast path can be partially inlined into PTHLexer::Lex. This speeds up the user time of PTH -Eonly by another 2ms (4.4%)
llvm-svn: 62454
2009-01-18 02:57:21 +00:00
Chris Lattner 18fc6ceb56 rename some variables, only set a tokens identifierinfo if non-null.
llvm-svn: 62450
2009-01-18 02:34:01 +00:00
Chris Lattner 9cdd877436 On i386 and x86-64, just do unaligned loads
instead of assembling from bytes.  This speeds up -Eonly PTH reading 
of cocoa.h by about 2ms, which is 4.2%.

llvm-svn: 62447
2009-01-18 02:19:16 +00:00
Chris Lattner 137d6492a8 switch PTHLexer to use Read32 and friends instead of lots of inlined
copies.  I verified that this causes no performance change in PTH.

llvm-svn: 62445
2009-01-18 02:10:31 +00:00
Chris Lattner eb09754a9d switch PTH lexer from using "const char*"s to "const unsigned char*"s
internally.  This is just a cleanup that reduces the need to cast to
unsigned char before assembling a larger integer.

llvm-svn: 62442
2009-01-18 01:57:14 +00:00
Chris Lattner ab1d4b8abd simplify PTHManager::CreateLexer
llvm-svn: 62424
2009-01-17 08:06:50 +00:00
Chris Lattner 3793bba26f suck the call to "getSpellingLoc" that all clients do into
the implementation of PTHManager::getSpelling.

llvm-svn: 62408
2009-01-17 06:29:33 +00:00
Chris Lattner d32480d3db this massive patch introduces a simple new abstraction: it makes
"FileID" a concept that is now enforced by the compiler's type checker
instead of yet-another-random-unsigned floating around.

This is an important distinction from the "FileID" currently tracked by
SourceLocation.  *That* FileID may refer to the start of a file or to a
chunk within it.  The new FileID *only* refers to the file (and its 
#include stack and eventually #line data), it cannot refer to a chunk.

FileID is a completely opaque datatype to all clients, only SourceManager
is allowed to poke and prod it.

llvm-svn: 62407
2009-01-17 06:22:33 +00:00
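An opaque, type-checked handle of the kind described here can be sketched as follows. `FileHandle` and `FileRegistry` are hypothetical names standing in for FileID and SourceManager, and the sketch makes no claim about how the real classes are laid out.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

class FileRegistry; // hypothetical owner, standing in for a source manager

// Opaque handle: clients can copy and compare it, but cannot build one from
// a raw integer or read the integer back.
class FileHandle {
public:
  FileHandle() = default; // a default-constructed handle is invalid
  bool isValid() const { return id_ != 0; }
  friend bool operator==(FileHandle a, FileHandle b) { return a.id_ == b.id_; }
  friend bool operator!=(FileHandle a, FileHandle b) { return a.id_ != b.id_; }

private:
  friend class FileRegistry; // the only code allowed to see the integer
  explicit FileHandle(unsigned id) : id_(id) {}
  unsigned id_ = 0;
};

class FileRegistry {
public:
  FileHandle add(std::string name) {
    names_.push_back(std::move(name));
    return FileHandle(static_cast<unsigned>(names_.size())); // IDs start at 1
  }
  const std::string &name(FileHandle file) const {
    assert(file.isValid() && file.id_ <= names_.size());
    return names_[file.id_ - 1];
  }

private:
  std::vector<std::string> names_;
};

// The compiler, not a convention, now rejects stray integers.
static_assert(!std::is_constructible<FileHandle, unsigned>::value,
              "clients must not be able to build a handle from an integer");
```

Passing a bare `unsigned` where a `FileHandle` is expected becomes a compile error, which is the enforcement the message contrasts with a "yet-another-random-unsigned".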
Chris Lattner 53e384f633 Change some terminology in SourceLocation: instead of referring to
the "physical" location of tokens, refer to the "spelling" location.
This is more concrete and useful, tokens aren't really physical objects!

llvm-svn: 62309
2009-01-16 07:00:02 +00:00
Ted Kremenek 4bbb79a642 PTH: Fix termination condition in binary search.
llvm-svn: 62277
2009-01-15 19:28:38 +00:00
Ted Kremenek a705b04d7f IdentifierInfo:
- IdentifierInfo can now (optionally) have its string data not be
  co-located with itself.  This is for use with PTH.  This aspect is a
  little gross, as getName() and getLength() now make assumptions
  about a possible alternate representation of IdentifierInfo.
  Perhaps we should make IdentifierInfo have virtual methods?

IdentifierTable:
- Added class "IdentifierInfoLookup" that can be used by
  IdentifierTable to perform "string -> IdentifierInfo" lookups using
  an auxiliary data structure.  This is used by PTH.
- Performance tests show that IdentifierTable::get() does not slow down
  because of the extra check for the IdentifierInfoLookup object (the
  regular StringMap lookup does enough work to mitigate the impact of
  an extra null pointer check).
- The upshot is that some IdentifierInfo objects might now be
  owned by the IdentifierInfoLookup object.  This should be reviewed.

PTH:
- Modified PTHManager::GetIdentifierInfo to *not* insert entries in
  IdentifierTable's string map, and instead create IdentifierInfo
  objects on the fly when mapping from persistent IDs to
  IdentifierInfos.  This saves a ton of work with string copies,
  hashing, and StringMap lookup and resizing.  This change was
  motivated because when processing source files in the PTH cache we
  don't need to do any string -> IdentifierInfo lookups.
- PTHManager now subclasses IdentifierInfoLookup, allowing clients of
  IdentifierTable to transparently use IdentifierInfo objects managed
  by the PTH file.  PTHManager resolves "string -> IdentifierInfo"
  queries by doing a binary search over a sorted table of identifier
  strings in the PTH file (the exact algorithm we use can be changed
  as needed).

These changes lead to the following performance changes when using PTH on Cocoa.h:
- fsyntax-only: 10% performance improvement
- Eonly: 30% performance improvement

llvm-svn: 62273
2009-01-15 18:47:46 +00:00
Ted Kremenek bef9fc2240 PTH: Embed a persistentID side-table in the PTH file that is sorted in the
lexical order of the corresponding identifier strings. This will be used for a
forthcoming optimization. This slows down PTH generation time by 7%. We can
revert this change if the optimization proves to not be valuable.

llvm-svn: 62248
2009-01-15 01:26:25 +00:00
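A side table of IDs sorted by the strings they name, searched with binary search, can be sketched as below. The in-memory vectors and the names `buildSortedIDs` and `findID` are assumptions for illustration; the commit stores its table inside the PTH file, in a format not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <string>
#include <vector>

// Builds a side table of IDs ordered by the lexical order of the strings they
// name. 'strings[id]' is the identifier with persistent ID 'id'.
std::vector<std::uint32_t>
buildSortedIDs(const std::vector<std::string> &strings) {
  std::vector<std::uint32_t> ids(strings.size());
  std::iota(ids.begin(), ids.end(), 0u);
  std::sort(ids.begin(), ids.end(), [&](std::uint32_t a, std::uint32_t b) {
    return strings[a] < strings[b];
  });
  return ids;
}

// Binary search over the side table; returns true and sets 'id' on a hit.
bool findID(const std::vector<std::string> &strings,
            const std::vector<std::uint32_t> &sortedIDs,
            const std::string &name, std::uint32_t &id) {
  assert(strings.size() == sortedIDs.size());
  auto it = std::lower_bound(
      sortedIDs.begin(), sortedIDs.end(), name,
      [&](std::uint32_t candidate, const std::string &key) {
        return strings[candidate] < key;
      });
  if (it == sortedIDs.end() || strings[*it] != name)
    return false;
  id = *it;
  return true;
}
```

Sorting happens once, when the table is generated, which is consistent with the message reporting a slowdown in generation time. The payoff is that readers get a logarithmic string-to-ID lookup without rebuilding any index.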
Ted Kremenek e9814186ac PTH:
- Use canonical FileID when using getSpelling() caching.  This
  addresses some cache misses we were seeing with -fsyntax-only on
  Cocoa.h
- Added Preprocessor::getPhysicalCharacterAt() utility method for
  clients to grab the first character at a specified sourcelocation.
  This uses the PTH spelling cache.
- Modified Sema::ActOnNumericConstant() to use
  Preprocessor::getPhysicalCharacterAt() instead of
  SourceManager::getCharacterData() (to get PTH hits).

These changes cause -fsyntax-only to not page in any sources from
Cocoa.h.  We see a speedup of 27%.

llvm-svn: 62193
2009-01-13 23:19:12 +00:00
Ted Kremenek 7cbdcc25d4 Fix corner cases in PTH getSpelling() binary search.
llvm-svn: 62187
2009-01-13 22:16:45 +00:00
Ted Kremenek b0b4f74b6b PTH: Fix remaining cases where the spelling cache in the PTH file was being missed when it shouldn't have been. This shaves another 7% off PTH time for -Eonly on Cocoa.h.
llvm-svn: 62186
2009-01-13 22:05:50 +00:00