221 Commits

Author SHA1 Message Date
Jordan Rose cc538345be Lexer: Don't warn about Unicode in preprocessor directives.
This allows people to use Unicode in their #pragma mark and in macros
that exist only to be string-ized.

<rdar://problem/13107323&13121362>

llvm-svn: 174081
2013-01-31 19:48:48 +00:00
Jordan Rose 9588a02b77 Fix comment in test/Lexer/utf8-invalid.c for updates in r173959.
llvm-svn: 173961
2013-01-30 19:29:14 +00:00
Jordan Rose f649795f84 Fix r173881 to properly skip invalid UTF-8 characters in raw lexing and -E.
This caused hangs as we processed the same invalid byte over and over.

<rdar://problem/13115651>

llvm-svn: 173959
2013-01-30 19:21:12 +00:00
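
The hang described in this commit is the classic "scanner that does not advance on bad input" problem. A minimal sketch of the idea follows. It is not Clang's code: `validSequenceLength` and `countInvalidBytes` are hypothetical names, and the validation is deliberately simplified (it does not reject overlong encodings or surrogates). It assumes C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: length of the UTF-8 sequence starting at s[pos],
// or 0 if the bytes there are not a structurally valid sequence.
// Simplified: overlong forms and surrogates are not rejected.
std::size_t validSequenceLength(std::string_view s, std::size_t pos) {
  assert(pos < s.size() && "caller must pass an in-range position");
  unsigned char lead = static_cast<unsigned char>(s[pos]);
  std::size_t len;
  if (lead < 0x80)                len = 1;  // ASCII
  else if ((lead & 0xE0) == 0xC0) len = 2;
  else if ((lead & 0xF0) == 0xE0) len = 3;
  else if ((lead & 0xF8) == 0xF0) len = 4;
  else return 0;                            // stray continuation or bad lead
  for (std::size_t i = 1; i < len; ++i) {
    if (pos + i >= s.size()) return 0;      // truncated sequence
    unsigned char c = static_cast<unsigned char>(s[pos + i]);
    if ((c & 0xC0) != 0x80) return 0;       // not a continuation byte
  }
  return len;
}

// Hypothetical scanner: counts invalid bytes while always making progress.
std::size_t countInvalidBytes(std::string_view s) {
  std::size_t invalid = 0;
  for (std::size_t pos = 0; pos < s.size();) {
    std::size_t len = validSequenceLength(s, pos);
    if (len == 0) {
      ++invalid;
      ++pos;  // skip the bad byte; without this the loop never terminates
    } else {
      pos += len;
    }
  }
  return invalid;
}
```

The key line is the `++pos` in the invalid branch: a scanner that reports the bad byte but leaves the position unchanged will process the same byte over and over.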
Jordan Rose 17441589c3 Don't warn about Unicode characters in -E mode.
People use the C preprocessor for things other than C files. Some of them
have Unicode characters. We shouldn't warn about Unicode characters
appearing outside of identifiers in this case.

There's not currently a way for the preprocessor to tell if it's in -E mode,
so I added a new flag, derived from the PreprocessorOutputOptions. This is
only used by the Unicode warnings for now, but could conceivably be used by
other warnings or even behavioral differences later.

<rdar://problem/13107323>

llvm-svn: 173881
2013-01-30 01:52:57 +00:00
Dmitri Gribenko 75bd3a8ec1 FileCheck'ize and merge tests
llvm-svn: 173714
2013-01-28 20:40:50 +00:00
Jordan Rose cccbdbf0db PR15067 (again): Don't warn about UCNs in C90 if we're raw-lexing.
Fixes a crash. Thanks, Richard.

llvm-svn: 173701
2013-01-28 17:49:02 +00:00
Jordan Rose c0cba27230 PR15067: Don't assert when a UCN appears in a C90 file.
Unfortunately, we can't accept the UCN as an extension because we're
required to treat it as two tokens for preprocessing purposes.

llvm-svn: 173622
2013-01-27 20:12:04 +00:00
Dmitri Gribenko 5a7ae8dc18 Migrate tests to -verify
llvm-svn: 173582
2013-01-26 17:11:39 +00:00
Dmitri Gribenko a5ef1517d9 FileCheck'ize test
llvm-svn: 173393
2013-01-24 23:44:04 +00:00
Jordan Rose 4246ae0089 As an extension, treat Unicode whitespace characters as whitespace.
llvm-svn: 173370
2013-01-24 20:50:50 +00:00
Jordan Rose 7f43dddae0 Handle universal character names and Unicode characters outside of literals.
This is a missing piece for C99 conformance.

This patch handles UCNs by adding a '\\' case to LexTokenInternal and
LexIdentifier -- if we see a backslash, we tentatively try to read in a UCN.
If the UCN is not syntactically well-formed, we fall back to the old
treatment: a backslash followed by an identifier beginning with 'u' (or 'U').

Because the spelling of an identifier with UCNs still has the UCN in it, we
need to convert that to UTF-8 in Preprocessor::LookUpIdentifierInfo.

Of course, valid code that does *not* use UCNs will see only a very minimal
performance hit (checks after each identifier for non-ASCII characters,
checks when converting raw_identifiers to identifiers that they do not
contain UCNs, and checks when getting the spelling of an identifier that it
does not contain a UCN).

This patch also adds basic support for actual UTF-8 in the source. This is
treated almost exactly the same as UCNs except that we consider stray
Unicode characters to be mistakes and offer a fixit to remove them.

llvm-svn: 173369
2013-01-24 20:50:46 +00:00
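
The "tentatively try to read in a UCN" step in this commit can be illustrated with a small sketch. This is not the code from LexTokenInternal or LexIdentifier: the `UCN` struct and `tryReadUCN` function are hypothetical names, and the sketch only checks syntactic well-formedness (`\u` plus 4 hex digits, or `\U` plus 8), not whether the code point is allowed in an identifier. It assumes C++17.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical result type: the decoded code point and how many
// characters of input the UCN occupied.
struct UCN {
  std::uint32_t codePoint;
  std::size_t length;
};

// Hypothetical helper: try to read \uXXXX or \UXXXXXXXX at text[pos].
// Returns std::nullopt if the text is not a syntactically well-formed UCN,
// so the caller can fall back to treating the backslash as its own token.
std::optional<UCN> tryReadUCN(std::string_view text, std::size_t pos) {
  assert(pos <= text.size() && "position must not be past the end");
  if (pos + 1 >= text.size() || text[pos] != '\\')
    return std::nullopt;

  std::size_t digits;
  if (text[pos + 1] == 'u')      digits = 4;
  else if (text[pos + 1] == 'U') digits = 8;
  else return std::nullopt;

  if (pos + 2 + digits > text.size())
    return std::nullopt;  // not enough characters left

  std::uint32_t value = 0;
  for (std::size_t i = 0; i < digits; ++i) {
    unsigned char c = static_cast<unsigned char>(text[pos + 2 + i]);
    if (!std::isxdigit(c))
      return std::nullopt;  // malformed: consume nothing
    int nibble = std::isdigit(c) ? c - '0' : std::tolower(c) - 'a' + 10;
    value = value * 16 + static_cast<std::uint32_t>(nibble);
  }
  return UCN{value, 2 + digits};
}
```

Because the function returns nothing and consumes nothing on failure, the caller keeps the old behavior for malformed input: a backslash followed by an identifier beginning with 'u' or 'U'.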
Bill Wendling 1f631645f8 Don't check lines beginning with '#', since they could contain a path with the unexpected word in them.
llvm-svn: 173306
2013-01-23 23:06:28 +00:00
Bill Wendling 958d8f2fcd The diagnostic is now a warning instead of an error. Also don't check lines beginning with '#', since they could contain a path with the unexpected word in them.
llvm-svn: 173305
2013-01-23 23:04:29 +00:00
Richard Smith e826c1a134 Add raw string literal versus C preprocessor test, suggested by James Dennett.
llvm-svn: 172660
2013-01-16 21:43:09 +00:00
Evgeniy Stepanov a8df444a1c Add __has_feature(memory_sanitizer).
llvm-svn: 170686
2012-12-20 12:03:13 +00:00
Dmitry Vyukov a53767ea22 tsan: add __has_feature(thread_sanitizer)
llvm-svn: 170314
2012-12-17 08:52:05 +00:00
Aaron Ballman 406ea51cfb Support for #pragma region/endregion for MSVC compatibility. Patch thanks to pravic!
llvm-svn: 169028
2012-11-30 19:52:30 +00:00
Nico Weber 4e270380c1 Fix crash on end-of-file after \ in a char literal, fixes PR14369.
This makes LexCharConstant() look more like LexStringLiteral(), which doesn't
have this bug. Add tests for eof after \ for several other cases.

llvm-svn: 168269
2012-11-17 20:25:54 +00:00
Andy Gibbs a8df57a962 Made the "expected string literal" diagnostic more expressive
llvm-svn: 168267
2012-11-17 19:16:52 +00:00
Nico Weber 1ed35ba2dd FileCheckize test
llvm-svn: 167680
2012-11-11 01:35:05 +00:00
Richard Smith b1b0ab41e7 Use the individual -fsanitize=<...> arguments to control which of the UBSan
checks to enable. Remove frontend support for -fcatch-undefined-behavior,
-faddress-sanitizer and -fthread-sanitizer now that they don't do anything.

llvm-svn: 167413
2012-11-05 22:21:05 +00:00
Andy Gibbs c6e68daac0 Prior to adding the new "expected-no-diagnostics" directive to VerifyDiagnosticConsumer, make the necessary adjustment to 580 test-cases which will henceforth require this new directive.
llvm-svn: 166280
2012-10-19 12:44:48 +00:00
Dmitri Gribenko 1cd2305703 Change the wording of the extension warning from
> 'long long' is an extension when C99 mode is not enabled
to
> 'long long' is a C++11 extension
while compiling in C++98 mode.

llvm-svn: 164545
2012-09-24 18:19:21 +00:00
Richard Smith 639b8d05dd When a bad UTF-8 encoding or bogus escape sequence is encountered in a
string literal, produce a diagnostic pointing at the erroneous character
range, not at the start of the literal.

llvm-svn: 163459
2012-09-08 07:16:20 +00:00
Jordan Rose b13eb8dca5 Allow -verify directives to be filtered by preprocessing.
This is accomplished by making VerifyDiagnosticConsumer a CommentHandler,
which then only reads the -verify directives that are actually in live
blocks of code. It also makes it simpler to handle -verify directives that
appear in header files, though we still have to manually reparse some files
depending on how they are generated.

This requires some test changes. In particular, all PCH tests now have their
-verify directives outside the "header" portion of the file, using the @line
syntax added in r159978. Other tests have been modified mostly to make it
clear what is being tested, and to prevent polluting the expected output with
the directives themselves.

Patch by Andy Gibbs! (with slight modifications)

The new Frontend/verify-* tests exercise the functionality of this commit,
as well as r159978, r159979, and r160053 (Andy's other -verify enhancements).

llvm-svn: 160068
2012-07-11 19:58:23 +00:00
Jordan Rose 8d63d5b8e6 Fix the location of the fixit for -Wnewline-eof.
It turns out SourceManager was treating the "one-past-the-end" location as
invalid, but then failing to set the invalid flag properly.

llvm-svn: 158699
2012-06-19 03:09:38 +00:00
Jordan Rose 127f6eef7e [-E] Emit a rewritten _Pragma on its own line.
1. Teach Lexer that pragma lexers are like macro expansions at EOF.
2. Treat pragmas like #define/#undef when printing.
3. If we just printed a directive, add a newline before any more tokens.
(4. Miscellaneous cleanup in PrintPreprocessedOutput.cpp)

PR10594 and <rdar://problem/11562490> (two separate related problems)

llvm-svn: 158571
2012-06-15 23:33:51 +00:00
Richard Smith e6799ddae8 PR12717: Clang supports hexadecimal floating-point literals in all language
modes. For languages other than C99/C11, this isn't quite a conforming
extension, and for C++11, it breaks some reasonable code containing
user-defined literals.

In languages which don't officially have hexfloats, pare back this extension
to only apply in cases where the token starts 0x and does not contain an
underscore. The extension is still not quite conforming, but it's a lot closer
now.

llvm-svn: 158487
2012-06-15 05:07:49 +00:00
Richard Smith 0948d93b7f Fix off-by-one error in UTF-16 encoding: don't try to use a surrogate pair for U+FFFF.
llvm-svn: 158391
2012-06-13 05:41:29 +00:00
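
For context, the boundary in question: U+FFFF is the last code point of the Basic Multilingual Plane and is encoded as a single UTF-16 code unit, while surrogate pairs start at U+10000. The sketch below shows a correct encoder. It is not Clang's code: `encodeUTF16` is a hypothetical name, and the comment about the wrong comparison describes one plausible form of such an off-by-one, not necessarily the exact bug that was fixed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode one Unicode scalar value as UTF-16.
std::vector<std::uint16_t> encodeUTF16(std::uint32_t cp) {
  assert(cp <= 0x10FFFF && "not a Unicode code point");
  assert(!(cp >= 0xD800 && cp <= 0xDFFF) && "surrogates are not scalar values");

  // Everything up to and including U+FFFF fits in one code unit.
  // A comparison such as `cp < 0xFFFF` would wrongly send U+FFFF
  // down the surrogate-pair path.
  if (cp <= 0xFFFF)
    return {static_cast<std::uint16_t>(cp)};

  cp -= 0x10000;  // 20 bits remain
  return {static_cast<std::uint16_t>(0xD800 + (cp >> 10)),     // high surrogate
          static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF))};  // low surrogate
}
```

For example, `encodeUTF16(0xFFFF)` yields the single unit `0xFFFF`, and `encodeUTF16(0x10000)` yields the pair `0xD800, 0xDC00`.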
James Molloy 222f27858f Add a predefine __WINT_UNSIGNED__, similar to __WCHAR_UNSIGNED__, and test them both for ARM and X86.
Use this to fully fix Sema/format-strings.c for non-x86 platforms.

Reviewed by Chandler on IRC.

llvm-svn: 156169
2012-05-04 11:23:40 +00:00
David Blaikie 83261063d1 Fix tests that weren't actually verifying anything.
Passing -verify to clang without -cc1 or -Xclang silently passes (with a
printed warning, but lit doesn't care about that). This change adds -cc1 or,
as is necessary in one case, -Xclang to fix this so that these tests are
actually verifying as intended.

I'd like to change the driver so this kind of mistake could not be made, but
I'm not entirely sure how. Further, since the driver only warns about unknown
flags in general, we could have similar bugs with misspellings of arguments
that would be nice to find.

llvm-svn: 154776
2012-04-15 22:09:44 +00:00
Seth Cantrell b0dfdfe790 %clang -cc1 -> %clang_cc1
llvm-svn: 154757
2012-04-15 04:41:49 +00:00
Seth Cantrell e83c731cad Support -Wc++98-compat-pedantic as requested:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20120409/056126.html

llvm-svn: 154655
2012-04-13 03:43:23 +00:00
Seth Cantrell 10ac7205ce C++11 no longer requires files to end with a newline
llvm-svn: 154643
2012-04-13 01:00:34 +00:00
Douglas Gregor 0598962a7b Add a query macro for C++11 N3276, decltype does not require complete
return types, from Michel Morin!

llvm-svn: 154428
2012-04-10 20:00:33 +00:00
Francois Pichet 7ebc4c1910 ext_reserved_user_defined_literal must not default to Error in MicrosoftMode. Hence create ext_ms_reserved_user_defined_literal that doesn't default to Error; otherwise MSVC headers won't parse.
Fixes PR12383.

llvm-svn: 154273
2012-04-07 23:09:23 +00:00
Douglas Gregor 9781893507 Add feature check "cxx_local_type_template_args" describing support
for templates with local template arguments, from Michel Morin! Fixes
PR12337.

llvm-svn: 153983
2012-04-04 00:48:39 +00:00
Richard Smith 5023188315 User-defined literals are done.
llvm-svn: 152396
2012-03-09 08:41:27 +00:00
Richard Smith 812924502b When checking the encoding of an 8-bit string literal, don't just check the
first codepoint! Also, don't reject empty raw string literals for spurious
"encoding" issues. Also, don't rely on undefined behavior in ConvertUTF.c.

llvm-svn: 152344
2012-03-08 21:59:28 +00:00
Richard Smith d67aea28f6 User-defined literals: reject string and character UDLs in all places where the
grammar requires a string-literal and not a user-defined-string-literal. The
two constructs are still represented by the same TokenKind, in order to prevent
a combinatorial explosion of different kinds of token. A flag on Token tracks
whether a ud-suffix is present, in order to prevent clients from needing to look
at the token's spelling.

llvm-svn: 152098
2012-03-06 03:21:47 +00:00
Richard Smith 522fa53703 Add a pile of tests for unrestricted unions, and advertise support for them.
llvm-svn: 151992
2012-03-03 23:51:05 +00:00
Jean-Daniel Dupas 7598fadd78 Merge __has_attribute tests. Patch by Jonathan Sauer!
llvm-svn: 151819
2012-03-01 17:45:53 +00:00
Sebastian Redl d89c218a2b Initializer lists are now supported.
llvm-svn: 151458
2012-02-25 20:51:27 +00:00
Richard Smith 2cca7b5ca9 Accept __has_feature(__feature__) as a synonym for __has_feature(feature) (and
likewise for __has_extension). Patch by Jonathan Sauer!

llvm-svn: 151445
2012-02-25 10:41:10 +00:00
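
The synonym rule in this commit amounts to normalizing the feature name before it is looked up. A minimal sketch follows, assuming the rule is "strip one leading and one trailing pair of underscores when both are present". `normalizeFeatureName` is a hypothetical name used only for illustration, and the sketch assumes C++17.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: map "__feature__" to "feature" and leave every
// other spelling unchanged.
std::string_view normalizeFeatureName(std::string_view name) {
  assert(!name.empty() && "a feature name is an identifier, so never empty");
  const bool wrapped = name.size() >= 4 &&
                       name.substr(0, 2) == "__" &&
                       name.substr(name.size() - 2) == "__";
  // The size check keeps a bare "__" from being treated as both the
  // prefix and the suffix at once.
  return wrapped ? name.substr(2, name.size() - 4) : name;
}
```

With this in place, a lookup keyed on `normalizeFeatureName("__cxx_lambdas__")` and one keyed on `"cxx_lambdas"` reach the same entry. Accepting the underscored spelling is useful when the plain name might be defined as a macro by user code.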
Douglas Gregor 34b2e8bb17 Clang now supports lambda expressions.
llvm-svn: 151231
2012-02-23 03:02:32 +00:00
Richard Smith 1cb2af0b3a Advertise support for constexpr.
llvm-svn: 150524
2012-02-14 22:56:17 +00:00
Eli Friedman 9436352a82 Implement warning for non-wide string literals with an unexpected encoding. Downgrade error for non-wide character literals with an unexpected encoding to a warning for compatibility with gcc and older versions of clang. <rdar://problem/10837678>.
llvm-svn: 150295
2012-02-11 05:08:10 +00:00
Aaron Ballman e1224a5067 Fixing hex floating literal support so that it handles 0x.2p2 properly.
llvm-svn: 150072
2012-02-08 13:36:33 +00:00
Aaron Ballman b97a5addd5 Hex literals without a significand no longer crash the lexer. Fixes bug 7910.
Patch by Eitan Adler.

llvm-svn: 149984
2012-02-07 13:46:03 +00:00
Eli Friedman 04342eee52 Improve the error message slightly for files that aren't using the expected UTF-8 encoding. Patch by Seth Cantrell.
llvm-svn: 148991
2012-01-25 22:34:12 +00:00