Commit Graph

481 Commits

Author SHA1 Message Date
Chris Lattner a91c30fdb0 simplify and refactor a bunch of type definition code in Preprocessor
predefines buffer initialization.

llvm-svn: 63919
2009-02-06 05:04:11 +00:00
Chris Lattner 61898606dc remove some ad-hocery and use DefineTypeSize for more things.
Now you too can have a 47 bit long long!

llvm-svn: 63918
2009-02-06 04:55:18 +00:00
Chris Lattner a3dc5d8423 refactor some code into a DefineTypeSize function.
llvm-svn: 63917
2009-02-06 04:50:25 +00:00
Chris Lattner fafd8d1be9 correct and generalize computation of __INTMAX_MAX__.
llvm-svn: 63848
2009-02-05 07:27:41 +00:00
Chris Lattner 8181312251 fix some differences between apple gcc and clang on darwin/x86-32.
llvm-svn: 63846
2009-02-05 07:19:24 +00:00
Chris Lattner 022923a22a Fix PR3464 by searching for headers from the predefines
buffer as if the #include happened from the main file.

llvm-svn: 63764
2009-02-04 19:45:07 +00:00
Chris Lattner 1c967784f3 Implement handling of file entry/exit notifications from GNU
line markers, including maintenance of the virtual include stack.

For something like this:

# 42 "bar.c" 1
# 142 "bar2.c" 1

#warning zappa
# 92 "bar.c" 2
#warning gonzo
# 102 "foo.c" 2
#warning bonkta


we now produce these three warnings:

#1:
In file included from foo.c:3:
In file included from bar.c:42:
bar2.c:143:2: warning: #warning zappa
#warning zappa
 ^

#2:
In file included from foo.c:3:
bar.c:92:2: warning: #warning gonzo
#warning gonzo
 ^

#3:
foo.c:102:2: warning: #warning bonkta
#warning bonkta
 ^

llvm-svn: 63722
2009-02-04 06:25:26 +00:00
Chris Lattner 0a1a8d8514 propagate linemarker flags down into the the line table, currently
ignoring include stack push/pop info though.

llvm-svn: 63719
2009-02-04 05:21:58 +00:00
Chris Lattner 1eaa70a612 stub out basic #line handling calls.
llvm-svn: 63667
2009-02-03 21:52:55 +00:00
Chris Lattner 60f36223a9 move library-specific diagnostic headers into library private dirs. Reduce
redundant #includes.  Patch by Anders Johnsen!

llvm-svn: 63271
2009-01-29 05:15:15 +00:00
Ted Kremenek 62224c1d7f Add more PTH diagnostics for invalid PTH files, etc.
llvm-svn: 63232
2009-01-28 21:02:43 +00:00
Ted Kremenek 3b0589e4b4 Enhance PTHManager::Create() to take an optional Diagnostic* argument that can be used to report issues such as a missing PTH file.
llvm-svn: 63231
2009-01-28 20:49:33 +00:00
Chris Lattner 7368d581c1 Split the single monolithic DiagnosticKinds.def file into one
.def file for each library.  This means that adding a diagnostic
to sema doesn't require all the other libraries to be rebuilt.

Patch by Anders Johnsen!

llvm-svn: 63111
2009-01-27 18:30:58 +00:00
Chris Lattner f1ca7d3e02 Introduce a new PresumedLoc class to represent the concept of a location
as reported to the user and as manipulated by #line.  This is what __FILE__,
__INCLUDE_LEVEL__, diagnostics and other things should follow (but not 
dependency generation!).  

This patch also includes several cleanups along the way: 

- SourceLocation now has a dump method, and several other places 
  that did similar things now use it.
- I cleaned up some code in AnalysisConsumer, but it should probably be
  simplified further now that NamedDecl is better.
- TextDiagnosticPrinter is now simplified and cleaned up a bit.

This patch is a prerequisite for #line, but does not actually provide 
any #line functionality.

llvm-svn: 63098
2009-01-27 07:57:44 +00:00
Chris Lattner bf648a3a63 Fix a bug that I noticed by inspection.
llvm-svn: 63094
2009-01-27 05:34:03 +00:00
Ted Kremenek 8d178f4357 PTH: Use Token::setLiteralData() to directly store a pointer to cached spelling data in the PTH file. This removes a ton of code for looking up spellings using sourcelocations in the PTH file. This simplifies both PTH-generation and reading.
Performance impact for -fsyntax-only on Cocoa.h (with Cocoa.h in the PTH file):
- PTH generation time improves by 5%
- PTH reading improves by 0.3%.

llvm-svn: 63072
2009-01-27 00:01:05 +00:00
Chris Lattner d381721810 Fix a bug I introduced in my changes, which caused MeasureTokenLength
to crash when given an instantiation location.  Thanks to Fariborz for
the testcase.

llvm-svn: 63057
2009-01-26 22:24:27 +00:00
Ted Kremenek 327d00cd45 Silence warning.
llvm-svn: 63054
2009-01-26 22:16:12 +00:00
Ted Kremenek 978b5becea Add version number checking to PTH files.
llvm-svn: 63047
2009-01-26 21:50:21 +00:00
Ted Kremenek eb8c8fbd63 Embed the offset of the PTH table inside the prologue of the PTH file. This will help improve gradual versioning of PTH files instead of relying that the PTH table is at a fixed offset.
llvm-svn: 63045
2009-01-26 21:43:14 +00:00
Chris Lattner 357b57d749 remove my hacks that aggressively threw away multiple
instantiation history in an effort to speed up c99-intconst-1.c.
Now that multiple nested instantiations are allowed, we just
make them and don't pay the cost of lookups.  With the other
changes that went in before this, reverting this is actually
a speedup for c99-intconst-1.c, speeding it up from 1.96s to 1.80s,
and preserves much better loc info.

llvm-svn: 63036
2009-01-26 20:24:53 +00:00
Chris Lattner 7e20927756 allow _Pragmas formed from #defines to keep their full instantiation
history

llvm-svn: 63035
2009-01-26 20:15:46 +00:00
Chris Lattner 5a7971e0c3 This change refactors some of the low-level lexer interfaces a bit.
Token now has a class of kinds for "literals", which include 
numeric constants, strings, etc.  These tokens can optionally have
a pointer to the start of the token in the lexer buffer.  This 
makes it faster to get spelling and do other gymnastics, because we
don't have to go through source locations.

This change is performance neutral, but will make other changes
more feasible down the road.

llvm-svn: 63028
2009-01-26 19:29:26 +00:00
Chris Lattner b5fba6f8d8 start plumbing together the line table information. So far we just
unique the Filenames in #line directives, assigning them UIDs.

llvm-svn: 63010
2009-01-26 07:57:50 +00:00
Chris Lattner 76e689636b add parsing and constraint enforcement for GNU line marker directives.
llvm-svn: 63003
2009-01-26 06:19:46 +00:00
Chris Lattner 38d7fd252a a few minor cleanups
llvm-svn: 63000
2009-01-26 05:30:54 +00:00
Chris Lattner 100c65e810 parse and enforce required constraints on #line directives. Right now
we just discard them.

llvm-svn: 62999
2009-01-26 05:29:08 +00:00
Chris Lattner ad13cf4e7a eagerly resolve the spelling locations of macro argument preexpansions.
This reduces fsyntax-only time on c99-intconst-1.c from 2.43s down to 
2.01s (20%), reducing the number of fileid lookups from 2529040 linear 
and 64771121 binary to 5625902 linear and 4151182 binary.

This knocks getFileID down to only 4.6% of compile time on this testcase.
At this point, malloc/free is over 35% of compile time, primarily allocating
MacroArgs objects and their argument preexpansion vectors.

I don't feel like malloc avoiding right now, so I'm just going to call
this good.

llvm-svn: 62994
2009-01-26 04:33:10 +00:00
Chris Lattner 5a5d67101b Eagerly resolve the spelling location of the tokens in a definition
of a macro.  Since these tokens may themselves be from macro 
expansions, we need to resolve down to the spelling loc when the
macro ends up being instantiated.  Instead of resolving this for
each token expanded from the macro definition, just do it once when
the macro is defined.  This speeds up clang on c99-intconst-1.c from
2.66s to 2.43s (9.5%), reducing the FileID lookups from 407244 linear and
114175649 binary to 2529040 linear and 64771121 binary.

llvm-svn: 62993
2009-01-26 04:06:48 +00:00
Chris Lattner dd9babc79a Only resolve a macro's instantiation loc once per macro, instead of once
per token lexed from it.  This speeds up clang on c99-intconst-1.c from
the GCC testsuite from 3.64s to 2.66s (36%).  This reduces the number of
binary search FileID lookups from 251570522 to 114175649 on this testcase.

llvm-svn: 62992
2009-01-26 03:46:22 +00:00
Chris Lattner 4fa23625ab Check in the long promised SourceLocation rewrite. This lays the
ground work for implementing #line, and fixes the "out of macro ID's" 
problem.

There is nothing particularly tricky about the code, other than the
very performance sensitive SourceManager::getFileID() method.

llvm-svn: 62978
2009-01-26 00:43:02 +00:00
Chris Lattner 1f6c7fe6a8 This is a follow-up to r62675:
Refactor how the preprocessor changes a token from being an tok::identifier to a 
keyword (e.g. tok::kw_for).  Instead of doing this in HandleIdentifier, hoist this
common case out into the caller, so that every keyword doesn't have to go through
HandleIdentifier.  This drops time in HandleIdentifier from 1.25ms to .62ms, and
speeds up clang -Eonly with PTH by about 1%.

llvm-svn: 62855
2009-01-23 18:35:48 +00:00
Chris Lattner f8ccb4f9e3 Update comment.
llvm-svn: 62819
2009-01-23 00:13:28 +00:00
Chris Lattner 34eab390b9 remove my gross #ifdef's, using portable abstractions now that the 32-bit
load is always aligned.

I verified that the bswap doesn't occur in the assembly code on x86.

llvm-svn: 62815
2009-01-22 23:50:07 +00:00
Chris Lattner fec5470f03 remove Read8/Read24, which are dead. Rename Read16/Read32 to be more
descriptive.

llvm-svn: 62775
2009-01-22 19:48:26 +00:00
Ted Kremenek ae54f2f590 Fix <rdar://problem/6512717> by correctly reading the right offset in the token data in PTHLexer::getSourceLocation().
llvm-svn: 62725
2009-01-21 22:41:38 +00:00
Chris Lattner 3029b35faa merge two checks for identifiers in the pth loop into one.
llvm-svn: 62677
2009-01-21 07:50:06 +00:00
Chris Lattner 8256b970a3 a trivial micro optimization to save a load.
llvm-svn: 62676
2009-01-21 07:45:14 +00:00
Chris Lattner ad89ec013f Add a bit to IdentifierInfo that acts as a simple predicate which
tells us whether Preprocessor::HandleIdentifier needs to be called.
Because this method is only rarely needed, this saves a call and a
bunch of random checks.  This drops the time in HandleIdentifier 
from 3.52ms to .98ms on cocoa.h on my machine.

llvm-svn: 62675
2009-01-21 07:43:11 +00:00
Ted Kremenek 8d6c828728 Don't crash on empty PTH files. This fixes <rdar://problem/6512714>.
llvm-svn: 62673
2009-01-21 07:34:28 +00:00
Chris Lattner c950296006 really we only need on Read24!
llvm-svn: 62672
2009-01-21 07:28:57 +00:00
Chris Lattner 47def9787e revert my previous patch, it assumed endianness.
llvm-svn: 62671
2009-01-21 07:21:56 +00:00
Chris Lattner a74f7cbb9d minor cleanups: now that tokens are 4-byte aligned in a PTH
file, just load them directly as ints.

llvm-svn: 62668
2009-01-21 07:06:08 +00:00
Ted Kremenek 52f73cad4a Fix: <rdar://problem/6510344> [pth] PTH slows down regular lexer considerably (when it has substantial work)
Changes to IdentifierTable:
- High-level summary: StringMap never owns IdentifierInfos.  It just
references them.
- The string map now has StringMapEntry<IdentifierInfo*> instead of
  StringMapEntry<IdentifierInfo>.  The IdentifierInfo object is
  allocated using the same bump pointer allocator as used by the
  StringMap.

Changes to IdentifierInfo:
- Added an extra pointer to point to the
  StringMapEntry<IdentifierInfo*> in the string map.  This pointer
  will be null if the IdentifierInfo* is *only* used by the PTHLexer
  (that is it isn't in the StringMap).

Algorithmic changes:
- Non-PTH case:
   IdentifierInfo::get() will always consult the StringMap first to
   see if we have an IdentifierInfo object.  If that StringMapEntry
   references a null pointer, we allocate a new one from the BumpPtrAllocator
   and update the reference in the StringMapEntry.
- PTH case:
   We do the same lookup as with the non-PTH case, but if we don't get
   a hit in the StringMap we do a secondary lookup in the PTHManager for
   the IdentifierInfo.  If we don't find an IdentifierInfo we create a
   new one as in the non-PTH case.  If we do find and IdentifierInfo
   in the PTHManager, we update the StringMapEntry to refer to it so
   that the IdentifierInfo will be found on the next StringMap lookup.
   This way we only do a binary search in the PTH file at most once
   for a given IdentifierInfo.  This greatly speeds things up for source
   files containing a non-trivial amount of code.

Performance impact:
   While these changes do add some extra indirection in
   IdentifierTable to access an IdentifierInfo*, I saw speedups even
   in the non-PTH case as well.

   Non-PTH: For -fsyntax-only on Cocoa.h, we see a 6% speedup.
   PTH (with Cocoa.h in token cache): 11% speedup.

   I also did an experiment where we did -fsyntax-only on a source file
   including a large header and Cocoa.h, but the token cache did not
   contain the larger header.  For this file, we were seeing a performance
   *regression* when using PTH of 3% over non-PTH.  Now we are seeing
   a performance improvement of 9%!

Tests:
   The serialization tests are now failing.  I looked at this extensively,
   and I my belief is that this change is unmasking a bug rather than
   introducing a new one.  I have disabled the serialization tests for now.

llvm-svn: 62636
2009-01-20 23:28:34 +00:00
Ted Kremenek 8433f0b400 PTH: Emitted tokens now consist of 12 bytes that are loaded used 3 32-bit loads. This reduces user time but increases system time because of the slightly larger PTH file. Although there is no performance win on Cocoa.h and -Eonly, overall this seems like a good step.
llvm-svn: 62542
2009-01-19 23:13:15 +00:00
Chris Lattner 4fd8b958be do not use SourceManager::getFileCharacteristic(FileID), it is not
safe because a #line can change the file characteristic on a per-loc
basis.

llvm-svn: 62502
2009-01-19 08:01:53 +00:00
Chris Lattner c033416639 do not use SourceManager::getFileCharacteristic(FileID), it is not
safe because a #line can change the file characteristic on a per-loc
basis.

llvm-svn: 62501
2009-01-19 07:59:15 +00:00
Chris Lattner cbc35ecb04 Rename SourceManager::getCanonicalFileID -> getFileID. There is
no longer such thing as a non-canonical FileID.

llvm-svn: 62499
2009-01-19 07:46:45 +00:00
Ted Kremenek 8c3b812148 Run destructors of MacroInfo objects to free memory they allocate. This addresses <rdar://problem/6506035>.
llvm-svn: 62498
2009-01-19 07:45:44 +00:00
Chris Lattner 02495d80ef Make some enums in SourceLocation private, remove a useless assertion from ScratchBuffer.
llvm-svn: 62492
2009-01-19 06:57:37 +00:00
Chris Lattner 29a2a191f2 Make SourceLocation::getFileLoc private to reduce the API exposure of
SourceLocation.  This requires making some cleanups to token pasting
and _Pragma expansion.

llvm-svn: 62490
2009-01-19 06:46:35 +00:00
Chris Lattner fc014f80e5 fix rdar://6505352 - Bogus warning with -WUndef, a case
Anders noticed.

llvm-svn: 62472
2009-01-18 21:18:58 +00:00
Chris Lattner 144aacd19e rearrange GetIdentifierInfo so that the fast path can be partially inlined into PTHLexer::Lex. This speeds up the user time of PTH -Eonly by another 2ms (4.4%)
llvm-svn: 62454
2009-01-18 02:57:21 +00:00
Chris Lattner 18fc6ceb56 rename some variables, only set a tokens identifierinfo if non-null.
llvm-svn: 62450
2009-01-18 02:34:01 +00:00
Chris Lattner 9cdd877436 On i386 and x86-64, just do unaligned loads
instead of assembling from bytes.  This speeds up -Eonly PTH reading 
of cocoa.h by about 2ms, which is 4.2%.

llvm-svn: 62447
2009-01-18 02:19:16 +00:00
Chris Lattner 137d6492a8 switch PTHLexer to use Read32 and friends instead of lots of inlined
copies.  I verified that this causes no performance change in PTH.

llvm-svn: 62445
2009-01-18 02:10:31 +00:00
Chris Lattner eb09754a9d switch PTH lexer from using "const char*"s to "const unsigned char*"s
internally.  This is just a cleanup that reduces the need to cast to
unsigned char before assembling a larger integer.

llvm-svn: 62442
2009-01-18 01:57:14 +00:00
Chris Lattner 71dc14b9f0 Rename SourceLocation::getFileID to getChunkID, because it returns
the chunk ID not the file ID.  This exposes problems in 
TextDiagnosticPrinter where it should have been using the canonical
file ID but wasn't.  Fix these along the way.

llvm-svn: 62427
2009-01-17 08:45:21 +00:00
Chris Lattner 5509d533f6 simplify some lookups.
llvm-svn: 62426
2009-01-17 08:30:10 +00:00
Chris Lattner 757169b60f Change the Lexer ctor used to lex _Pragma directives into a static factory
method.  This lets us clean up the interface and make it more obvious that
this method is *really really* _Pragma specific.

Note that _Pragma handling uglifies the Lexer in the critical path.  It would
be very interesting to consider making _Pragma remapping be a new special
lexer class of its own.

llvm-svn: 62425
2009-01-17 08:27:52 +00:00
Chris Lattner ab1d4b8abd simplify PTHManager::CreateLexer
llvm-svn: 62424
2009-01-17 08:06:50 +00:00
Chris Lattner c809089b26 Change the Lexer ctor used in the non _Pragma case to take a FileID instead
of a SourceLocation.  This should speed it up and definitely simplifies it.

llvm-svn: 62422
2009-01-17 08:03:42 +00:00
Chris Lattner 8ddb5cf0cf in Preprocessor::AdvanceToTokenCharacter, don't actually bother
creating a whole lexer when we just want one static method.

llvm-svn: 62420
2009-01-17 07:57:25 +00:00
Chris Lattner 5965a28a4b More simplifications to the lexer ctors.
llvm-svn: 62419
2009-01-17 07:56:59 +00:00
Chris Lattner fcf6452eb4 make the verbose raw-lexer ctor fully explicit instead of having
embedded magic.

llvm-svn: 62417
2009-01-17 07:42:27 +00:00
Chris Lattner 08354fef13 add a simplified lexer ctor that sets up the lexer to raw-lex an
entire file.

llvm-svn: 62414
2009-01-17 07:35:14 +00:00
Chris Lattner f76b92092e refactor some common initialization code out of the two lexer ctors.
llvm-svn: 62411
2009-01-17 06:55:17 +00:00
Chris Lattner 3793bba26f suck the call to "getSpellingLoc" that all clients do into
the implementation of PTHManager::getSpelling.

llvm-svn: 62408
2009-01-17 06:29:33 +00:00
Chris Lattner d32480d3db this massive patch introduces a simple new abstraction: it makes
"FileID" a concept that is now enforced by the compiler's type checker
instead of yet-another-random-unsigned floating around.

This is an important distinction from the "FileID" currently tracked by
SourceLocation.  *That* FileID may refer to the start of a file or to a
chunk within it.  The new FileID *only* refers to the file (and its 
#include stack and eventually #line data), it cannot refer to a chunk.

FileID is a completely opaque datatype to all clients, only SourceManager
is allowed to poke and prod it.

llvm-svn: 62407
2009-01-17 06:22:33 +00:00
Chris Lattner 1abd20901b Instead of iterating over FileID's, have PTH generation iterate over the
content cache directly.  Content cache has a 1-1 mapping with fileentries,
whereas multiple FileIDs can be the same FileEntry.

llvm-svn: 62401
2009-01-17 03:48:08 +00:00
Chris Lattner 5882771102 Fix PR2477 - clang misparses "//*" in C89 mode
llvm-svn: 62368
2009-01-16 22:39:25 +00:00
Chris Lattner 5244f34e75 As a performance optimization, don't bother calling MacroInfo::isIdenticalTo
if warnings in system headers are disabled.  isIdenticalTo can end up 
calling the expensive getSpelling method, and other bad stuff and is 
completely unneeded if the warning will be discarded anyway. rdar://6502956

llvm-svn: 62347
2009-01-16 19:50:11 +00:00
Chris Lattner f49775dc81 only notify callbacks if they exist.
llvm-svn: 62334
2009-01-16 19:01:46 +00:00
Chris Lattner 262d4e31b9 Improve #pragma comment support by building the string argument and
notifying PPCallbacks about it.

llvm-svn: 62333
2009-01-16 18:59:23 +00:00
Chris Lattner 8a24e588d7 minor cleanups to StringLiteralParser: no need to pass target info
into its ctor.  Also, make it handle validity checking of pascal
strings instead of making clients do it.

llvm-svn: 62332
2009-01-16 18:51:42 +00:00
Chris Lattner 2ff698df60 Implement basic support for parsing #pragma comment, a microsoft extension
documented here:
http://msdn.microsoft.com/en-us/library/7f0aews7(VS.80).aspx

This is according to my understanding reading the docs, I don't know if it
really agrees fully with what VC++ allows.

llvm-svn: 62317
2009-01-16 08:21:25 +00:00
Chris Lattner 8a42586c54 more SourceLocation lexicon change: instead of referring to the
"logical" location, refer to the "instantiation" location.

llvm-svn: 62316
2009-01-16 07:36:28 +00:00
Chris Lattner 7c8556e7bc remove obsolete comment which happened to go over 80 cols.
llvm-svn: 62313
2009-01-16 07:04:11 +00:00
Chris Lattner 15af77f679 remove an unneeded const_cast.
llvm-svn: 62311
2009-01-16 07:02:14 +00:00
Chris Lattner 53e384f633 Change some terminology in SourceLocation: instead of referring to
the "physical" location of tokens, refer to the "spelling" location.
This is more concrete and useful, tokens aren't really physical objects!

llvm-svn: 62309
2009-01-16 07:00:02 +00:00
Ted Kremenek 4bbb79a642 PTH: Fix termination condition in binary search.
llvm-svn: 62277
2009-01-15 19:28:38 +00:00
Ted Kremenek a705b04d7f IdentifierInfo:
- IdentifierInfo can now (optionally) have its string data not be
  co-located with itself.  This is for use with PTH.  This aspect is a
  little gross, as getName() and getLength() now make assumptions
  about a possible alternate representation of IdentifierInfo.
  Perhaps we should make IdentifierInfo have virtual methods?

IdentifierTable:
- Added class "IdentifierInfoLookup" that can be used by
  IdentifierTable to perform "string -> IdentifierInfo" lookups using
  an auxilliary data structure.  This is used by PTH.
- Perform tests show that IdentifierTable::get() does not slow down
  because of the extra check for the IdentiferInfoLookup object (the
  regular StringMap lookup does enough work to mitigate the impact of
  an extra null pointer check).
- The upshot is that now that some IdentifierInfo objects might be
  owned by the IdentiferInfoLookup object.  This should be reviewed.

PTH:
- Modified PTHManager::GetIdentifierInfo to *not* insert entries in
  IdentifierTable's string map, and instead create IdentifierInfo
  objects on the fly when mapping from persistent IDs to
  IdentifierInfos.  This saves a ton of work with string copies,
  hashing, and StringMap lookup and resizing.  This change was
  motivated because when processing source files in the PTH cache we
  don't need to do any string -> IdentifierInfo lookups.
- PTHManager now subclasses IdentifierInfoLookup, allowing clients of
  IdentifierTable to transparently use IdentifierInfo objects managed
  by the PTH file.  PTHManager resolves "string -> IdentifierInfo"
  queries by doing a binary search over a sorted table of identifier
  strings in the PTH file (the exact algorithm we use can be changed
  as needed).

These changes lead to the following performance changes when using PTH on Cocoa.h:
- fsyntax-only: 10% performance improvement
- Eonly: 30% performance improvement

llvm-svn: 62273
2009-01-15 18:47:46 +00:00
Ted Kremenek bef9fc2240 PTH: Embed a persistentID side-table in the PTH file that is sorted in the
lexical order of the corresponding identifier strings. This will be used for a
forthcoming optimization. This slows down PTH generation time by 7%. We can
revert this change if the optimization proves to not be valuable.

llvm-svn: 62248
2009-01-15 01:26:25 +00:00
Ted Kremenek e9814186ac PTH:
- Use canonical FileID when using getSpelling() caching.  This
  addresses some cache misses we were seeing with -fsyntax-only on
  Cocoa.h
- Added Preprocessor::getPhysicalCharacterAt() utility method for
  clients to grab the first character at a specified sourcelocation.
  This uses the PTH spelling cache.
- Modified Sema::ActOnNumericConstant() to use
  Preprocessor::getPhysicalCharacterAt() instead of
  SourceManager::getCharacterData() (to get PTH hits).

These changes cause -fsyntax-only to not page in any sources from
Cocoa.h.  We see a speedup of 27%.

llvm-svn: 62193
2009-01-13 23:19:12 +00:00
Ted Kremenek 7cbdcc25d4 Fix corner cases in PTH getSpelling() binary search.
llvm-svn: 62187
2009-01-13 22:16:45 +00:00
Ted Kremenek b0b4f74b6b PTH: Fix remaining cases where the spelling cache in the PTH file was being missed when it shouldn't. This shaves another 7% off PTH time for -Eonly on Cocoa.h
llvm-svn: 62186
2009-01-13 22:05:50 +00:00
Ted Kremenek 47b8cf6deb Enhance PTH 'getSpelling' caching:
- Refactor caching logic into a helper class PTHSpellingSearch
- Allow "random accesses" in the spelling cache, thus catching the remaining
  cases where 'getSpelling' wasn't hitting the PTH cache
  
For -Eonly, PTH, Cocoa.h:
- This reduces wall time by 3% (user time unchanged, sys time reduced)
- This reduces the amount of paged source by 1112K.
  The remaining 1112K still being paged in is from somewhere else
  (investigating).

llvm-svn: 62009
2009-01-09 22:05:30 +00:00
Ted Kremenek 8ae06625b5 Invert assertion condition.
llvm-svn: 61961
2009-01-09 00:36:11 +00:00
Ted Kremenek d5e6e16d0d PTH: Hook up getSpelling() caching in PTHLexer. This results in a nice
performance gain. Here's what we see for -Eonly on Cocoa.h (using PTH):

- wall time decreases by 21% (26% speedup overall)
- system time decreases by 35%
- user time decreases by 6%

These reductions are due to not paging source files just to get spellings for
literals. The solution in place doesn't appear to be 100% yet, as we still see
some of the pages for source files getting mapped in. Using -print-stats, we see
that SourceManager maps in 7179K less bytes of source text (reduction of 75%).
Will investigate why the remaining 25% are getting paged in.

With these changes, here's how PTH compares to non-PTH on Cocoa.h:
  -Eonly: PTH takes 64% of the time as non-PTH (54% speedup)
  -fsyntax-only: PTH takes 89% of the time as non-PTH (11% speedup)

llvm-svn: 61913
2009-01-08 04:30:32 +00:00
Ted Kremenek 884a558441 PTH:
- Added stub PTHLexer::getSpelling() that will be used for fetching cached
  spellings from the PTH file.  This doesn't do anything yet.
- Added a hook in Preprocessor::getSpelling() to call PTHLexer::getSpelling()
  when using a PTHLexer.
- Updated PTHLexer to read the offsets of spelling tables in the PTH file.

llvm-svn: 61911
2009-01-08 02:47:16 +00:00
Chris Lattner 933d5ffc11 Optimize stringification a bit to avoid std::string thrashing and
avoid the version of Preprocessor::getSpelling that returns an 
std::string.

llvm-svn: 61769
2009-01-05 23:04:18 +00:00
Chris Lattner 07ebf302e5 simplify Preprocessor::getSpelling now that identifiers carry around
their length.

llvm-svn: 61734
2009-01-05 19:44:41 +00:00
Steve Naroff f9c29d4200 Add parser support for __forceinline, __w64, __ptr64.
llvm-svn: 61431
2008-12-25 14:41:26 +00:00
Steve Naroff 44ac777741 Add parser support for __cdecl, __stdcall, and __fastcall.
Change preprocessor implementation of _cdecl to reference __cdecl.

llvm-svn: 61430
2008-12-25 14:16:32 +00:00
Steve Naroff 3a9b7e0cff Add explicit "fuzzy" parse support for Microsoft declspec.
Remove previous __declspec macro that would effectively erase the construct prior to parsing.

llvm-svn: 61422
2008-12-24 20:59:21 +00:00
Ted Kremenek b0051a9955 Remove old PTH token-generation test harness.
llvm-svn: 61382
2008-12-23 19:25:33 +00:00
Ted Kremenek 78cc24730e PTH: Remove some methods and simplify some conditions in PTHLexer::Lex(). No big functionality change.
llvm-svn: 61381
2008-12-23 19:24:24 +00:00
Ted Kremenek a754c40390 PTH: Use 3 bytes instead of 4 bytes to encode the persistent ID for a token.
- This reduces the PTH size for Cocoa.h by 7%.
- The increases PTH -Eonly speed for Cocoa.h by 0.8%.

llvm-svn: 61377
2008-12-23 18:41:34 +00:00
Ted Kremenek 3f94706e57 Cosmetics: rename a variable and tighten spacing. No functionality change.
llvm-svn: 61375
2008-12-23 18:27:26 +00:00
Ted Kremenek 1bd0a550d0 PTH:
- Encode the token length with 2 bytes instead of 4.
- This reduces the size of the .pth file for Cocoa.h by 12%.
- This speeds up PTH time (-Eonly) on Cocoa.h by 1.6%.

llvm-svn: 61364
2008-12-23 02:52:12 +00:00
Ted Kremenek 66076a964b PTH:
- In PTHLexer::Lex read all of the token data from PTH file before
  constructing the token.  The idea is to enhance locality.
- Do not use Read8/Read32 in PTHLexer::Lex.  Inline these operations manually.
- Change PTHManager::ReadIdentifierInfo() to PTHManager::GetIdentifierInfo().
  They are functionally the same except that PTHLexer::Lex() reads the
  persistent id.

These changes result in a 3.3% speedup for PTH on Cocoa.h (-Eonly).

llvm-svn: 61363
2008-12-23 02:30:15 +00:00
Ted Kremenek 1b18ad240c PTH:
- Embed 'eom' tokens in PTH file.
- Use embedded 'eom' tokens to not lazily generate them in the PTHLexer.
  This means that PTHLexer can always advance to the next token after
  reading a token (instead of buffering tokens using a copy).
- Moved logic of 'ReadToken' into Lex.  GetToken & ReadToken no longer exist.
- These changes result in a 3.3% speedup (-Eonly) on Cocoa.h.
- The code is a little gross.  Many cleanups are possible and should be done.

llvm-svn: 61360
2008-12-23 01:30:52 +00:00
Steve Naroff 1ef21279b6 Don't define __STDC__ when compiling with -fms-extensions
llvm-svn: 61223
2008-12-18 22:37:25 +00:00
Ted Kremenek 9443f0ea5e Use '&' to test StartOfLine flag.
llvm-svn: 61205
2008-12-18 18:15:29 +00:00
Ted Kremenek aceeb25660 Rewrite PTHLexer::DiscardToEndOfLine() to not use GetToken and instead only read the bytes needed to determine if a token is not at the start of the line.
llvm-svn: 61172
2008-12-17 23:52:11 +00:00
Ted Kremenek 63ff81c4e1 Change PTHLexer::getSourceLocation() to not call GetToken() and instead just read the file offset in the token data buffer directly.
llvm-svn: 61170
2008-12-17 23:36:32 +00:00
Chris Lattner d88c933970 add a dropped word back
llvm-svn: 61152
2008-12-17 21:38:44 +00:00
Ted Kremenek a7d73b1fd4 Shadow CurPtr with a local variable in ReadToken.
llvm-svn: 61145
2008-12-17 18:38:19 +00:00
Ted Kremenek 6c7ea11300 Preprocessor: Allocate MacroInfo objects using a BumpPtrAllocator instead using new/delete. This speeds up -Eonly on Cocoa.h using the regular lexer by 1.8% and the PTHLexer by 3%.
llvm-svn: 61042
2008-12-15 19:56:42 +00:00
Chris Lattner 77c76ae3de eliminate the isCXXNamedOperator function and some string compares and
use identifierinfo instead.  Patch by Chris Goller!

llvm-svn: 60992
2008-12-13 20:12:40 +00:00
Ted Kremenek 877556f4b9 PTH: Added minor 'sibling jumping' optimization for iterating over the side table used for fast preprocessor block skipping. This has a minor performance improvement when preprocessing Cocoa.h, but can have some wins in pathologic cases.
llvm-svn: 60966
2008-12-12 22:05:38 +00:00
Ted Kremenek 56572ab9e9 Added PTH optimization to not process entire blocks of tokens that appear in skipped preprocessor blocks. This improves PTH speed by 6%. The code for this optimization itself is not very optimized, and will get cleaned up.
llvm-svn: 60956
2008-12-12 18:34:08 +00:00
Chris Lattner e141a9e225 rdar://6060752 - don't warn about trigraphs in bcpl-style comments
llvm-svn: 60942
2008-12-12 07:34:39 +00:00
Chris Lattner 89770575cd fix thought-o
llvm-svn: 60937
2008-12-12 07:14:34 +00:00
Ted Kremenek 864eb39233 PTH:
- Added a side-table per each token-cached file with the preprocessor conditional stack.  This tracks what #if's are matched with what #endifs and where their respective tokens are in the PTH file.  This will allow for quick skipping of excluded conditional branches in the Preprocessor.
- Performance testing shows the addition of this information (without actually utilizing it) leads to no performance regressions.

llvm-svn: 60911
2008-12-11 23:36:38 +00:00
Ted Kremenek ca153f7349 PTHLexer: Keep track of the location of the last '#' token and provide the means to jump ahead in the token stream.
llvm-svn: 60905
2008-12-11 22:41:47 +00:00
Ted Kremenek 67ab296d5c Remove unused ivar CurTokenIdx.
llvm-svn: 60896
2008-12-11 20:39:48 +00:00
Ted Kremenek 8e1f05fc26 PreprocessorLexer (and subclasses):
- Added virtual method 'getSourceLocation()' (no arguments) that gets the location of the next "observable" location (e.g., next character, next token).
PPLexerChange.cpp:
- Implemented FIXME by using PreprocessorLexer::getSourceLocation() to get the location in the file we are returning to after lexing a #included file.  This appears to be slightly faster than having the branch (i.e., 'if(CurLexer)').  It's also not a really hot part of the Preprocessor.

llvm-svn: 60860
2008-12-10 23:20:59 +00:00
Ted Kremenek d40ab7b72a Declare PerIDCache as IdentifierInfo** instead of void*. This is just cleaner. No performance change.
llvm-svn: 60843
2008-12-10 19:40:23 +00:00
Ted Kremenek 1aed3ddffa Remove unneeded assertion.
llvm-svn: 60559
2008-12-04 22:47:11 +00:00
Ted Kremenek baedbf47f6 Use 'free' to release PerIDCache since it was allocated using calloc().
llvm-svn: 60556
2008-12-04 22:09:37 +00:00
Ted Kremenek 73a4d28758 PTH:
Use an array instead of a DenseMap to cache persistent IDs -> IdentifierInfo*.  This leads to a 4% speedup at -fsyntax-only using PTH.

llvm-svn: 60452
2008-12-03 01:16:39 +00:00
Ted Kremenek 33eeabda61 - Remove PTHManager.cpp. Move all of its functions to PTHLexer.cpp since some of the internal methods are used by PTHLexer (their implementations are intertwined.) This enables some important inlining opportunities at -O3.
- Don't construct an std::vector<Token> prior to feeding PTH tokens to the Preprocessor.  Stream them off the PTH file directly.

llvm-svn: 60447
2008-12-03 00:38:03 +00:00
Ted Kremenek af058b5696 Preprocessor:
- Added method "setPTHManager" that will be called by the driver to install
  a PTHManager for the Preprocessor.
- Fixed some comments.
- Added EnterSourceFileWithPTH to mirror EnterSourceFileWithLexer.

llvm-svn: 60437
2008-12-02 19:46:31 +00:00
Ted Kremenek 498b6210fc Added PTHManager, a utility class that will be used by Preprocessor to lazily create PTHLexer objects for pre-tokenized files.
llvm-svn: 60436
2008-12-02 19:45:05 +00:00
Douglas Gregor 90abb6dead Objective-C keywords are not always identifiers. Some are also C++ keywords
llvm-svn: 60373
2008-12-01 21:46:47 +00:00
Daniel Dunbar 1f3d7849a8 Add LangOptions marker for assembler-with-cpp mode and use to define
__ASSEMBLER__ properly. Patch from Roman Divacky (with minor
formatting changes). Thanks!

llvm-svn: 60362
2008-12-01 18:55:22 +00:00
Ted Kremenek 1f50dc899f PTHLexer now owns the Token vector.
llvm-svn: 60136
2008-11-27 00:38:24 +00:00
Daniel Dunbar 5c4cc09498 Comment fix.
llvm-svn: 59997
2008-11-25 00:20:22 +00:00
Chris Lattner 03c4041cb5 make the 'to match this' diagnostic a note.
llvm-svn: 59921
2008-11-23 23:17:07 +00:00
Chris Lattner e3d20d9545 Convert IdentifierInfo's to be printed the same as DeclarationNames
with implicit quotes around them.  This has a bunch of follow-on 
effects and requires tweaking to a whole lot of code.  This causes
a regression in two tests (xfailed) by causing it to emit things like:

  Line 10: duplicate interface declaration for category 'MyClass1' ('Category1')

instead of:

  Line 10: duplicate interface declaration for category 'MyClass1(Category1)'

I will fix this in a follow-up commit.

As part of this, I had to start switching stuff to use ->getDeclName() instead
of Decl::getName() for consistency.  This is good, but I was planning to do this
as an independent patch.  There will be several follow-on patches
to clean up some of the mess, but this patch is already too big.

llvm-svn: 59917
2008-11-23 21:45:46 +00:00
Chris Lattner f3cb394f41 Fix a weird inconsistency with hex floats. Previously the lexer
would not eat the "-1" in "0x0p-1", but LiteralSupport would accept
it when extensions are on.  This caused strangeness and failures 
when hexfloats were properly treated as an extension (not error)
in LiteralSupport.

llvm-svn: 59865
2008-11-22 07:39:03 +00:00
Chris Lattner 59acca5874 remove the NumericLiteralParser::Diag helper method, inlining it into
its call sites.  This makes it more explicit when the hasError flag is
getting set and removes a confusing difference in behavior between
PP.Diag and Diag in this code.

llvm-svn: 59863
2008-11-22 07:23:31 +00:00
Chris Lattner 1ef2028205 Move the Preprocessor::Diag methods inline. This has the interesting
(and carefully calculated) effect of allowing the compiler to reason
about the aliasing properties of DiagnosticBuilder object better,
allowing the whole thing to be promoted to registers instead of
resulting in a ton of stack traffic.

While I'm not very concerned about the performance of the Diag() method
invocations, I *am* more concerned about their code size and impact on the
non-diagnostic code.  This patch shrinks the clang executable (in 
release-asserts mode with gcc-4.2) from 14523980 to 14519816 bytes.  This
isn't much, but it shrinks the lexer from 38192 to 37776, PPDirectives.o
from 31116 to 28868 bytes, etc.

llvm-svn: 59862
2008-11-22 07:03:46 +00:00
Chris Lattner fdabe83c67 inline a method into its only two call sites.
llvm-svn: 59860
2008-11-22 06:42:31 +00:00
Chris Lattner 014156e108 actually, this version isn't really needed.
llvm-svn: 59859
2008-11-22 06:22:39 +00:00
Chris Lattner 57dab26be1 remove a sneaky version of Diag hiding in PreprocessorLexer.
llvm-svn: 59858
2008-11-22 06:20:42 +00:00
Chris Lattner 6d27a16b95 Change the Lexer::Diag method to not magically silence warnings,
force the caller to check instead.  This eliminates the need (and the
risk!) of weird null DiagnosticBuilder's floating around.

llvm-svn: 59856
2008-11-22 02:02:22 +00:00
Chris Lattner 427c9c1763 Split the DiagnosticInfo class into two disjoint classes:
one for building up the diagnostic that is in flight (DiagnosticBuilder)
and one for pulling structured information out of the diagnostic when
formatting and presenting it.

There is no functionality change with this patch.

llvm-svn: 59849
2008-11-22 00:59:29 +00:00
Ted Kremenek 6b3ced2b15 In PTHLexer::DiscardToEndOfLine() use Lex() instead of AdvanceToken(). This handles transitions in the preprocessor state.
llvm-svn: 59845
2008-11-21 23:28:56 +00:00
Ted Kremenek 1ad05ce600 Reenable the default lexer.
llvm-svn: 59843
2008-11-21 20:51:59 +00:00
Ted Kremenek b6209858cb When creating the raw tokens for PTHLexer, make sure the token representing the file to include is checked for being an identifier.
llvm-svn: 59842
2008-11-21 20:51:15 +00:00
Ted Kremenek 72d9912b08 When creating raw tokens for the PTHLexer specially handle angled strings for #include directives.
llvm-svn: 59840
2008-11-21 19:41:29 +00:00
Ted Kremenek 53ab374d9f PTHLexer:
- Move out logic for handling the end-of-file to LexEndOfFile (to match the Lexer) class.  The logic now mirrors the Lexer class more, which allows us to pass most of the Preprocessor test cases.

llvm-svn: 59768
2008-11-21 00:58:35 +00:00
Ted Kremenek 111caaac58 PTHLexer:
- Move PTHLexer::GetToken() to be inside PTHLexer.cpp.
- When lexing in raw mode, null out identifiers.

llvm-svn: 59744
2008-11-20 19:49:00 +00:00
Ted Kremenek cbc984169f Handle another case where we should use PTHLexer as an alternative to the normal Lexer.
llvm-svn: 59736
2008-11-20 16:46:54 +00:00
Ted Kremenek 94981e1f23 PTHLexer:
- Rename 'CurToken' and 'LastToken' to 'CurTokenIdx' and 'LastTokenIdx'
  respectively.
- Add helper methods GetToken(), AdvanceToken(), AtLastToken() to abstract away
  details of the token stream. This also allows us to easily replace their
  implementation later.

llvm-svn: 59733
2008-11-20 16:32:22 +00:00
Ted Kremenek 6bc5f3ec90 Rename IsNonPragmaNonMacroLexer to IsFileLexer.
llvm-svn: 59731
2008-11-20 16:19:53 +00:00
Ted Kremenek c490c8877c Rewrote PTHLexer::Lex by digging through the sources of Lexer again. Now we can do basic macro expansion using the PTHLexer.
llvm-svn: 59724
2008-11-20 07:58:05 +00:00
Ted Kremenek 85b48c6e3a Add ugly "test harness" for PTHLexer that is not enabled by default. The
(temporary hack) to test the PTHLexer is that whenever we would create a Lexer
object we instead raw lex a memory buffer first and then use the PTHLexer. This
logic exists only to driver the PTHLexer and will be removed/changed in the
future. Note that the regular path using normal Lexer objects is what is used by
default.

llvm-svn: 59723
2008-11-20 07:56:33 +00:00
Ted Kremenek 2af3cee287 Make FIXME a hard assertion.
llvm-svn: 59695
2008-11-20 01:52:55 +00:00
Ted Kremenek b33ce32bda Preprocessor::getCurrentFileLexer() now returns a PreprocessorLexer* instead of
a Lexer*. This means it will either return the current (normal) file Lexer or a
PTHLexer.

llvm-svn: 59694
2008-11-20 01:49:44 +00:00
Ted Kremenek 300590b584 Just use the SourceLocation of SysHeaderTok when doing a callback to emit #line
information. A diff of the -E output for Cocoa.h shows that there is no change
in output.

llvm-svn: 59693
2008-11-20 01:45:11 +00:00
Ted Kremenek 6552d259d4 Assign the result of getCurrentFileLexer() to a PreprocessorLexer* instead of Lexer* (narrower interface).
llvm-svn: 59691
2008-11-20 01:35:24 +00:00
Ted Kremenek b0262c1e64 - Default initialize ParsingPreprocessorDirective, ParsingFilename, and
LexingRawMode in the ctor of PreprocessorLexer.

- PTHLexer: Use "LastToken" instead of "NumToken"

llvm-svn: 59690
2008-11-20 01:29:45 +00:00
Ted Kremenek 61915f5d4a Add (untested) implementation of PTHLexer::isNextPPTokenLParen() and PTHLexer::DiscardToEndOfLine().
llvm-svn: 59687
2008-11-20 01:16:50 +00:00
Ted Kremenek 2861cf42fe Use PreprocessorLexer::getFileID() instead of Lexer::getFileLoc(). This is an intermediate step to having getCurrentLexer() return a PreprocessorLexer* instead of a Lexer*.
llvm-svn: 59672
2008-11-19 22:55:25 +00:00
Ted Kremenek a2c3c8d71c Move more cases of using 'CurLexer' to 'CurPPLexer'.
Use PTHLexer::isNextPPTokenLParen() when using the PTHLexer.

llvm-svn: 59671
2008-11-19 22:43:49 +00:00
Ted Kremenek 11cfbb473e Add stub for PTHLexer::isNextPPTokenLParen().
llvm-svn: 59670
2008-11-19 22:42:26 +00:00
Ted Kremenek 76c3441a4e When using a PTHLexer, use DiscardToEndOfLine() instead of ReadToEndOfLine().
llvm-svn: 59668
2008-11-19 22:21:33 +00:00
Ted Kremenek 45245217bc - Move static function IsNonPragmaNonMacroLexer into Preprocessor.h.
- Add variants of IsNonPragmaNonMacroLexer to accept an IncludeMacroStack entry
  (simplifies some uses).
- Use IsNonPragmaNonMacroLexer in Preprocessor::LookupFile.
- Add 'FileID' to PreprocessorLexer, and have Preprocessor query this fileid
  when looking up the FileEntry for a file

Performance testing of -Eonly on Cocoa.h shows no performance regression because
of this patch.

llvm-svn: 59666
2008-11-19 21:57:25 +00:00
Oscar Fuentes 77543d9af0 CMake: Added some source files.
Patch contributed by Jay Foad!

llvm-svn: 59656
2008-11-19 18:46:39 +00:00
Argyrios Kyrtzidis f5e2812e69 Remove Preprocessor::CacheTokens boolean data member. The same functionality can be provided by using Preprocessor::isBacktrackEnabled().
llvm-svn: 59631
2008-11-19 14:23:14 +00:00
Chris Lattner c5cdade2df don't turn identifierinfo's into strings in diagnostics.
llvm-svn: 59602
2008-11-19 07:33:58 +00:00
Ted Kremenek a7c279ba40 Revert 59574 (caused tests to fail).
llvm-svn: 59579
2008-11-19 01:54:47 +00:00
Ted Kremenek 1b167108bb - Move static function IsNonPragmaNonMacroLexer into Preprocessor.h.
- Add variants of IsNonPragmaNonMacroLexer to accept an IncludeMacroStack entry
  (simplifies some uses).
- Use IsNonPragmaNonMacroLexer in Preprocessor::LookupFile.

Performance testing of -Eonly on Cocoa.h shows no performance regression because
of this patch.

llvm-svn: 59574
2008-11-19 00:46:18 +00:00
Ted Kremenek c7a366309d Initialize CurPPLexer in Preprocessor's constructor.
llvm-svn: 59573
2008-11-19 00:44:06 +00:00
Chris Lattner e05c4dfc42 Remove the last of the old-style Preprocessor::Diag methods.
llvm-svn: 59554
2008-11-18 21:48:13 +00:00
Chris Lattner 97b8e84bd7 remove one more Preprocessor::Diag method.
llvm-svn: 59512
2008-11-18 08:02:48 +00:00
Chris Lattner 907dfe94e1 Convert the lexer and start converting the PP over to using canonical Diag methods.
llvm-svn: 59511
2008-11-18 07:59:24 +00:00
Chris Lattner 8488c8297c This reworks some of the Diagnostic interfaces a bit to change how diagnostics
are formed.  In particular, a diagnostic with all its strings and ranges is now
packaged up and sent to DiagnosticClients as a DiagnosticInfo instead of as a 
ton of random stuff.  This has the benefit of simplifying the interface, making
it more extensible, and allowing us to do more checking for things like access
past the end of the various arrays passed in.

In addition to introducing DiagnosticInfo, this also substantially changes how 
Diagnostic::Report works.  Instead of being passed in all of the info required
to issue a diagnostic, Report now takes only the required info (a location and 
ID) and returns a fresh DiagnosticInfo *by value*.  The caller is then free to
stuff strings and ranges into the DiagnosticInfo with the << operator.  When
the dtor runs on the DiagnosticInfo object (which should happen at the end of
the statement), the diagnostic is actually emitted with all of the accumulated
information.  This is a somewhat tricky dance, but it means that the 
accumulated DiagnosticInfo is allowed to keep pointers to other expression 
temporaries without those pointers getting invalidated.

This is just the minimal change to get this stuff working, but this will allow
us to eliminate the zillions of variant "Diag" methods scattered throughout
(e.g.) sema.  For example, instead of calling:

  Diag(BuiltinLoc, diag::err_overload_no_match, typeNames,
       SourceRange(BuiltinLoc, RParenLoc));

We will soon be able to just do:

  Diag(BuiltinLoc, diag::err_overload_no_match)
      << typeNames << SourceRange(BuiltinLoc, RParenLoc));

This scales better to support arbitrary types being passed in (not just 
strings) in a type-safe way.  Go operator overloading?!

llvm-svn: 59502
2008-11-18 07:04:44 +00:00
Chris Lattner 16ba91396a Change the diagnostics interface to take an array of pointers to
strings instead of array of strings.  This reduces string copying
in some not-very-important cases, but paves the way for future 
improvements.

llvm-svn: 59494
2008-11-18 04:56:44 +00:00
Ted Kremenek 3d9740da29 - Add Lexer::isPragma() accessor for clients of Lexer that aren't friends.
- Add static method to test if the current lexer is a non-macro/non-pragma
  lexer.
- Refactor some code in PPLexerChange to use this static method.
- No performance change.

llvm-svn: 59486
2008-11-18 01:33:13 +00:00
Ted Kremenek 551c82aa7b Replace more uses of 'CurLexer->' with 'CurPPLexer->'. No performance change.
llvm-svn: 59482
2008-11-18 01:12:54 +00:00
Ted Kremenek 6b73291462 Add hooks to use PTHLexer::Lex instead of Lexer::Lex when CurLexer is null.
Performance tests on Cocoa.h (using the regular Lexer) shows no performance
difference.

llvm-svn: 59479
2008-11-18 01:04:47 +00:00
Ted Kremenek 59e003e538 Added conditional guard 'if (CurLexer)' when using SetCommentRetentionState().
This is because the PTHLexer will not support this method. Performance testing
on preprocessing Cocoa.h shows that this results in a negligible performance
difference (less than 1%).

I tried making Lexer::SetCommentRetentionState() an out-of-line function (a
precursor to making it a virtual function in PreprocessorLexer) and noticed a 1%
decrease in speed (it is called in a hot part of the Preprocessor).

llvm-svn: 59477
2008-11-18 00:43:07 +00:00
Ted Kremenek 30cd88caab Change a bunch of uses of 'CurLexer->' to 'CurPPLexer->', which should be the
alias for the current PreprocessorLexer. No functionality change. Performance
testing shows this results in no performance degradation when preprocessing
Cocoa.h.

llvm-svn: 59474
2008-11-18 00:34:22 +00:00
Ted Kremenek 68ef9fc6ae - Add 'CurPPLexer' to Preprocessor to keep track of the current
PreprocessorLexer, which will either be a 'Lexer' or 'PTHLexer'.
- Added stub field 'CurPTHLexer' to keep track of the current PTHLexer.
- Modified IncludeStackInfo to track both the current PTHLexer and
  current PreprocessorLexer.

llvm-svn: 59472
2008-11-18 00:12:49 +00:00
Chris Lattner 1b03f76113 Trivial tidying
llvm-svn: 59424
2008-11-16 20:22:05 +00:00
Ted Kremenek a0d2a1661a Using llvm::OwningPtr<> for CurLexer and CurTokenLexer. This makes both the ownership semantics of these objects explicit within the Preprocessor and also tightens up the code (explicit deletes not needed).
llvm-svn: 59249
2008-11-13 17:11:24 +00:00
Ted Kremenek 7c1e61d78b Use PushIncludeMacroStack/PopMacroStack instead of manually pushing/popping from IncludeMacroStack. This is both cleaner and makes the include stack transparently extensible.
llvm-svn: 59248
2008-11-13 16:51:03 +00:00
Ted Kremenek 66312a3ff4 Move some diagnostic handling to PreprocessorLexer.
llvm-svn: 59191
2008-11-12 23:13:54 +00:00
Ted Kremenek 3060634fba Add virtual dtor to PreprocessorLexer.
llvm-svn: 59188
2008-11-12 22:48:57 +00:00
Ted Kremenek 2f4f2dea82 Remove Lexer::LexIncludeFilename.
llvm-svn: 59186
2008-11-12 22:44:15 +00:00
Ted Kremenek 6c90efb923 Move LexIncludeFilename from Lexer to PreprocessorLexer.
PreprocessorLexer now has a virtual method "IndirectLex" which allows it to call the lex method of its subclasses.  This is not for performance intensive operations.

llvm-svn: 59185
2008-11-12 22:43:05 +00:00
Ted Kremenek 50b4f48225 Use PushIncludeMacroStack() instead of manually manipulating the include stack.
llvm-svn: 59181
2008-11-12 22:21:57 +00:00
Ted Kremenek 7cd62457c4 Add skeleton for PTH lexer.
llvm-svn: 59169
2008-11-12 21:37:15 +00:00
Argyrios Kyrtzidis c7e67a04c3 Introduce annotation tokens, a special kind of token, created and used only by the parser to replace a group of tokens with a single token encoding semantic information.
Will be fully utilized later for C++ nested-name-specifiers.

llvm-svn: 58911
2008-11-08 16:17:04 +00:00
Sanjiv Gupta 83b95cc60c Fixed build warning. No functionality change.
llvm-svn: 58503
2008-10-31 10:24:31 +00:00
Sanjiv Gupta d79592448b Made the mechanism of defining preprocessor defs for maxint, ptrdiff_t, wchar
etc more generic. For some targets, long may not be equal to pointer size. For
example: PIC16 has int as i16, ptr as i16 but long as i32.

Also fixed a few build warnings in assert() functions in CFRefCount.cpp,
CGDecl.cpp, SemaDeclCXX.cpp and ParseDeclCXX.cpp.

llvm-svn: 58501
2008-10-31 09:52:39 +00:00
Ted Kremenek 100f87d655 Initialize Suffix and Prefix to 0, even with a bad entry. Removes an uninitialized value warning from gcc.
llvm-svn: 58305
2008-10-28 00:18:42 +00:00
Chris Lattner 66a740e66e Rename Characteristic_t to CharacteristicKind
llvm-svn: 58224
2008-10-27 01:19:25 +00:00
Oscar Fuentes 07d9f9a6ec CMake: Builds and installs clang binary and libs (no docs yet). It
must be under the `tools' subdirectory of the LLVM *source* tree.

llvm-svn: 58180
2008-10-26 00:56:18 +00:00
Daniel Dunbar be947083d2 Speed up NumericLiteralParser::GetIntegerValue.
- Implement fast path when value easily fits in a uint64.
 - ~6x faster, translates to 1-2% on Cocoa.h

llvm-svn: 57632
2008-10-16 07:32:01 +00:00
Daniel Dunbar b1f64426a0 Simplify overflow-on-add check in NumericLiteralParser::GetIntegerValue.
llvm-svn: 57629
2008-10-16 06:39:30 +00:00
Chris Lattner b11c3233d8 Change FormTokenWithChars to take the token kind to form, since all clients
were setting a kind and then forming it.  This is just a minor API cleanup, 
no functionality change.

llvm-svn: 57404
2008-10-12 04:51:35 +00:00
Chris Lattner 99e7d23455 When in keep whitespace mode, make sure to return block comments that are
unterminated.

llvm-svn: 57403
2008-10-12 04:19:49 +00:00
Chris Lattner e01e758e11 Change SkipBlockComment and SkipBCPLComment to return true when in
keep comment mode, instead of returning false.  This matches SkipWhitespace.

llvm-svn: 57402
2008-10-12 04:15:42 +00:00
Chris Lattner 4d96344c19 Add a new mode to the lexer which enables it to return all characters,
even whitespace, as tokens from the file.  This is enabled with
L->SetKeepWhitespaceMode(true) on a raw lexer.  In this mode, you too
can use clang as a really complex version of 'cat' with code like this:

  Lexer RawLex(SourceLocation::getFileLoc(SM.getMainFileID(), 0),
               PP.getLangOptions(), File.first, File.second);
  
  RawLex.SetKeepWhitespaceMode(true);
  
  Token RawTok;
  RawLex.LexFromRawLexer(RawTok);
  while (RawTok.isNot(tok::eof)) {
    std::cout << PP.getSpelling(RawTok);
    RawLex.LexFromRawLexer(RawTok);
  }

This will emit exactly the input file, with no canonicalization or other
translation.  Realistic clients actually do something with the tokens of
course :)

llvm-svn: 57401
2008-10-12 04:05:48 +00:00
Chris Lattner 924efdf623 Stop the preprocessor from poking the lexer's private parts.
llvm-svn: 57399
2008-10-12 03:31:33 +00:00
Chris Lattner 097a8b8777 Fix a couple more places that poke KeepCommentMode unnecesarily.
llvm-svn: 57398
2008-10-12 03:27:19 +00:00
Chris Lattner 8637abd333 add a new inKeepCommentMode() accessor to abstract the KeepCommentMode
ivar.

llvm-svn: 57397
2008-10-12 03:22:02 +00:00
Chris Lattner e3f863a388 fix misleading comment.
llvm-svn: 57396
2008-10-12 01:34:51 +00:00
Chris Lattner 7c2e9809b1 Simplify raw mode lexing by treating an unterminate /**/ comment the
same we we do an unterminated string or character literal.  This makes
it so we can guarantee that the lexer never calls into the 
preprocessor (which would be suicide for a raw lexer).

llvm-svn: 57395
2008-10-12 01:31:51 +00:00
Chris Lattner 6b0c5ad096 add a comment.
llvm-svn: 57394
2008-10-12 01:23:27 +00:00
Chris Lattner 50c9050037 Change how raw lexers are handled: instead of creating them and then
using LexRawToken, create one and use LexFromRawLexer.  This avoids
twiddling the RawLexer flag around and simplifies some code (even 
speeding raw lexing up a tiny bit).

This change also improves the token paster to use a Lexer on the stack
instead of new/deleting it. 

llvm-svn: 57393
2008-10-12 01:15:46 +00:00
Chris Lattner 79ef843533 silence some release-assert warnings.
llvm-svn: 57391
2008-10-12 00:28:42 +00:00
Chris Lattner 87e97ea7b8 improve a comment.
llvm-svn: 57389
2008-10-12 00:23:07 +00:00
Chris Lattner 1b0a00a4c9 __CONSTANT_CFSTRINGS__ should be defined even in C mode, otherwise the CFSTR
won't expand to the builtin.  This fixes rdar://6248329

llvm-svn: 57164
2008-10-06 07:43:09 +00:00
Chris Lattner ac7ed9a71a move __FLT_EVAL_METHOD__, __FLT_RADIX__, and __DECIMAL_DIG__ into
target indep code.

llvm-svn: 57139
2008-10-05 21:49:27 +00:00
Chris Lattner 6da2f0dd2e suck the rest of the FP macros out of the targets into the PP
llvm-svn: 57137
2008-10-05 21:40:58 +00:00
Chris Lattner 5cd8351808 start moving fp macros over
llvm-svn: 57134
2008-10-05 20:40:30 +00:00
Chris Lattner 6512a88984 move a bunch more integer sizing out of target-specific code into
target indep code.  

Note that this changes functionality on PIC16: it defines __INT_MAX__
correctly for it, and it changes sizeof(long) to 16-bits (to match
the size of pointer).

llvm-svn: 57132
2008-10-05 20:06:37 +00:00
Chris Lattner 4ecd753486 eliminate __USER_LABEL_PREFIX__ from the Targets.cpp file, start moving
integer size #defines over to the Preprocessor.

llvm-svn: 57130
2008-10-05 19:44:25 +00:00
Chris Lattner 248d3c4192 gcc no longer defines __block to nothing when blocks aren't enabled.
llvm-svn: 57129
2008-10-05 19:32:52 +00:00
Chris Lattner 1f7e2d5430 rearrange preprocessor macro definitions into language-specific
then target specific.

llvm-svn: 57128
2008-10-05 19:32:22 +00:00
Chris Lattner f37bafc5ca Implement PR2773, support for __USER_LABEL_PREFIX__
llvm-svn: 57127
2008-10-05 19:22:37 +00:00
Daniel Dunbar 405965323d Add Preprocessor::RemovePragmaHandler.
- No functionality change.

llvm-svn: 57065
2008-10-04 19:17:46 +00:00
Chris Lattner 59f09b6fe1 Document assumptions that NumericLiteralParser makes with an assertion.
llvm-svn: 56876
2008-09-30 20:45:40 +00:00
Chris Lattner 7ed3209332 define __PASCAL_STRINGS__ whenever -fpascal-strings is enabled.
llvm-svn: 56824
2008-09-30 00:48:48 +00:00
Chris Lattner 006579ddb5 __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ is a darwin-specific #define
llvm-svn: 56822
2008-09-30 00:46:39 +00:00
Chris Lattner fb8b8f298c Fix the root cause of PR2750 instead of the side effect.
NumericLiteral parser is not careful about overrun because
it should never be possible.  It implicitly expects that its
input matched the regex for pp-constant.  Because of this, it
knows it can't be pointing to a prefix of something that
looks like a number.  This is all fine, except that __LINE__
does not prevent implicit concatenation from happening.  Fix
__LINE__ to not do this.

llvm-svn: 56818
2008-09-29 23:12:31 +00:00
Nico Weber 378c5539c8 whitespace and comment changes, to fix grammar and 80 col violations
llvm-svn: 56776
2008-09-29 00:25:48 +00:00
Chris Lattner b03dc76499 clean up a bunch of fixme's I added, by moving
DirectoryLookup::DirType into SourceManager.h

llvm-svn: 56692
2008-09-26 21:18:42 +00:00
Chris Lattner c88a23e8d7 Fix the rest of rdar://6243860 hopefully. This requires changing FileIDInfo
to whether the fileid is a 'extern c system header' in addition to whether it
is a system header, most of this is spreading plumbing around.  Once we have that,
PPLexerChange bases its "file enter/exit" notifications to PPCallbacks to
base the system header state on FileIDInfo instead of HeaderSearch.  Finally,
in Preprocessor::HandleIncludeDirective, mirror logic in GCC: the system headerness
of a file being entered can be set due to the #includer or the #includee.

llvm-svn: 56688
2008-09-26 20:12:23 +00:00
Daniel Dunbar b5fe494a2c Update clang to pretend to be gcc-4.2.
- This really needs to be automated and configurable.

llvm-svn: 56635
2008-09-26 01:13:13 +00:00
Steve Naroff de2d5b75c2 Fix <rdar://problem/6240065> clang: __BLOCKS__ should be defined.
llvm-svn: 56503
2008-09-23 21:28:24 +00:00
Argyrios Kyrtzidis 91c3f526dc Line endings: CRLF -> LF
llvm-svn: 55829
2008-09-05 08:53:53 +00:00
Steve Naroff c84e8b779e - Implement __block.
- Replace FIXME in Preprocessor::HandleIdentifier() with a check that avoids diagnosing extension tokens that originate from macro definitions.

llvm-svn: 55639
2008-09-02 18:50:17 +00:00
Steve Naroff d450bffcf1 Pull code from last commit. will put back soon.
llvm-svn: 55637
2008-09-02 18:04:36 +00:00
Steve Naroff 95bffc74da Implement block pseudo-storage class modifiers (__block, __byref).
llvm-svn: 55635
2008-09-02 15:20:19 +00:00
Eli Friedman ee29c9c2fc Fix for PR2750; don't check for an 'e' in the trash after the token.
Note that this isn't really a complete fix; I think there are other 
potential overrun situations.  I don't really know what the best 
systematic fix is, though.

llvm-svn: 55622
2008-09-02 05:29:22 +00:00
Argyrios Kyrtzidis 75b34536d0 Rename Preprocessor::DisableBacktrack -> Preprocessor::CommitBacktrackedTokens.
llvm-svn: 55281
2008-08-24 12:29:43 +00:00
Argyrios Kyrtzidis 5d240d07f2 Add a safety check.
Make sure there's no "dangling" backtrack position when Preprocessor is destroyed.

llvm-svn: 55236
2008-08-23 12:12:06 +00:00
Argyrios Kyrtzidis bd024c7fdb Change line endings: CRLF -> LF
llvm-svn: 55235
2008-08-23 12:05:53 +00:00
Argyrios Kyrtzidis a65490c5df Allow nested backtracks.
llvm-svn: 55204
2008-08-22 21:27:50 +00:00
Chris Lattner 5d1cfa1229 various updates to match r54873 on mainline.
llvm-svn: 54874
2008-08-17 07:19:51 +00:00
Daniel Dunbar 12c9ddced1 Change Parser & Sema to use interned "super" for comparions.
- Added as private members for each because it is not clear where to
   put the common definition. Perhaps the IdentifierInfos all of these
   "pseudo-keywords" should be collected into one place (this would
   KnownFunctionIDs and Objective-C property IDs, for example).

Remove Token::isNamedIdentifier.
 - There isn't a good reason to use strcmp when we have interned
   strings, and there isn't a good reason to encourage clients to do
   so.

llvm-svn: 54794
2008-08-14 22:04:54 +00:00
Daniel Dunbar 2258aa0f27 Move some ObjC preprocessor definitions into
InitializePredefinedMacros().
 - Also now properly wired to -fobjc-gc, -fnext-runtime.

llvm-svn: 54661
2008-08-12 00:21:46 +00:00
Chris Lattner 7b18e35113 remove obsolete comment.
llvm-svn: 54652
2008-08-11 22:03:07 +00:00
Daniel Dunbar 56fdb6ae69 More #include cleaning
- Kill unnecessary #includes in .cpp files. This is an automatic
   sweep so some things removed are actually used, but happen to be
   included by a previous header. I tried to get rid of the obvious
   examples and this was the easiest way to trim the #includes in one
   fell swoop.
 - We now return to regularly scheduled development.

llvm-svn: 54632
2008-08-11 06:23:49 +00:00
Nico Weber 4c3116437c * Remove isInSystemHeader() from DiagClient, move it to SourceManager
* Move FormatError() from TextDiagnostic up to DiagClient, remove now  
  empty class TextDiagnostic
* Make DiagClient optional for Diagnostic

This fixes the following problems:

* -html-diags (and probably others) does now output the same set of  
  warnings as console clang does
* nothing crashes if one forgets to call setHeaderSearch() on  
  TextDiagnostic
* some code duplication is removed

llvm-svn: 54620
2008-08-10 19:59:06 +00:00
Argyrios Kyrtzidis b3dd1e0889 Allow the preprocessor to cache the lexed tokens, so that we can do efficient lookahead and backtracking.
1) New public methods added:
  -EnableBacktrackAtThisPos
  -DisableBacktrack
  -Backtrack
  -isBacktrackEnabled

2) LookAhead() implementation is replaced with a more efficient one.
3) LookNext() is removed.

llvm-svn: 54611
2008-08-10 13:15:22 +00:00
Chris Lattner c94ad4abcb In c89 mode accept hex fp constants as an extension:
t2.c:1:17: warning: hexadecimal floating constants are a C99 feature
long double d = 0x0.0000003ffffffff00000p-16357L;
                ^

instead of emitting a weird error message that doesn't make sense:

t2.c:1:41: error: hexadecimal floating constants require an exponent
long double d = 0x0.0000003ffffffff00000p-16357L;
                                        ^

rdar://6096838

llvm-svn: 54035
2008-07-25 18:18:34 +00:00
Ted Kremenek 0fff6d3c06 Patch by
"When dumping the tokens (-dumptokens output type), the column numbers are not
correctly shown. This patch fixes that issue."

llvm-svn: 53796
2008-07-19 19:10:04 +00:00
Argyrios Kyrtzidis 715eabcef5 Convert CRLF -> LF line endings.
llvm-svn: 53519
2008-07-12 20:28:04 +00:00
Argyrios Kyrtzidis 80b77ac394 Add Preprocessor::LookNext method, which implements an efficient way to 'take a peek' at the next token without consuming it.
llvm-svn: 53375
2008-07-09 22:46:46 +00:00
Nuno Lopes 3da38fd145 move the linux predefined macro definition to the TargetInfo, where it really belongs
llvm-svn: 53149
2008-07-05 19:32:25 +00:00
Nuno Lopes 9b6de71b7d predefine the macro linux when compiled on a linux system. this fixes the build of libtidy
llvm-svn: 53145
2008-07-05 17:58:44 +00:00
Chris Lattner 1cb0e61e98 Fix PR2252: don't warn on negating an unsigned value ever, and don't emit
'integer constant is so large that it is unsigned' warning for hex literals.

llvm-svn: 53070
2008-07-03 03:47:30 +00:00