when we don't need to split.
In some cases we know that a method cannot have a different
implementation in a subclass:
- the class is declared in the main file (private)
- all the method declarations (including the ones coming from super
classes) are in the main file.
This can be improved further, but might be enough for the heuristic.
(When we are too aggressive splitting the state, efficiency suffers.
When we fail to split the state coverage might suffer.)
llvm-svn: 161681
This should speed up activities that need to access bindings by cluster,
such as invalidation and dead-bindings cleaning. In some cases all we save
is the cost of building the region cluster map, but other times we can
actually avoid traversing the rest of the store.
In casual testing, this produced a speedup of nearly 10% analyzing SQLite,
with /less/ memory used.
llvm-svn: 161636
An ASTContext's RecordLayoutInfo can only be used to look up offsets of
direct base classes, and we need the offset to make non-symbolic bindings
in RegionStore. This change makes sure that we have one layer of
CXXBaseObjectRegion for each base we are casting through.
This was causing crashes on an internal buildbot.
llvm-svn: 161621
This is an initial (unoptimized) version. We split the path when
inlining ObjC instance methods. On one branch we always assume that the
type information for the given memory region is precise. On the other we
assume that we don't have the exact type info. It is important to check
since the class could be subclassed and the method can be overridden. If
we always inline we can loose coverage.
Had to refactor some of the call eval functions.
llvm-svn: 161552
Unfortunately, generalized region printing is very difficult:
- ElementRegions are used both for casting and as actual elements.
- Accessing values through a pointer means going through an intermediate
SymbolRegionValue; symbolic regions are untyped.
- Referring to implicitly-defined variables like 'this' and 'self' could be
very confusing if they come from another stack frame.
We fall back to simply not printing the region name if we can't be sure it
will print well. This will allow us to improve in the future.
llvm-svn: 161512
The main blocker on this (besides the previous commit) was that
ScanReachableSymbols was not looking through LazyCompoundVals.
Once that was fixed, it's easy enough to clear out malloc data on return,
just like we do when we bind to a global region.
<rdar://problem/10872635>
llvm-svn: 161511
RegionStore currently uses a (Region, Offset) pair to describe the locations
of memory bindings. However, this representation breaks down when we have
regions like 'array[index]', where 'index' is unknown. We used to store this
as (SubRegion, 0); now we mark them specially as (SubRegion, SYMBOLIC).
Furthermore, ProgramState::scanReachableSymbols depended on the existence of
a sub-region map, but RegionStore's implementation doesn't provide for such
a thing. Moving the store-traversing logic of scanReachableSymbols into the
StoreManager allows us to eliminate the notion of SubRegionMap altogether.
This fixes some particularly awkward broken test cases, now in
array-struct-region.c.
llvm-svn: 161510
Warns on anti-patterns/typos in the 'size' argument to strncat. The
correct size argument should look like the following:
- strncat(dst, src, sizeof(dst) - strlen(dest) - 1);
We warn on:
- sizeof(dst)
- sizeof(src)
- sizeof(dst) - strlen(dst)
- sizeof(src) - anything
(This has been implemented in void Sema::CheckStrncatArguments().)
llvm-svn: 161440
Dynamic type inference does the right thing in this case. However, as
Jordan suggested, it would be nice to add a warning here as well.
llvm-svn: 161365
I currently have a bit of redundancy with the cast kind switch statement
inside the ImplicitCast callback, but I might be adding more casts going
forward.
llvm-svn: 161358
Instead of sprinkling dynamic type info propagation throughout
ExprEngine, the added checker would add the more precise type
information on known APIs (Ex: ObjC alloc, new) and propagate
the type info in other cases (ex: ObjC init method, casts (the second is
not implemented yet)).
Add handling of ObjC alloc, new and init to the checker.
llvm-svn: 161357
No functionality change, but from now on, any new path notes should be
tested both with plain-text output (for ease of human auditing) and with
plist output (to ensure control flow and events are being correctly
represented in Xcode).
llvm-svn: 161351
While there is no such thing as a "null reference" in the C++ standard,
many implementations of references (including Clang's) do not actually
check that the location bound to them is non-null. Thus unlike a regular
null dereference, this will not cause a problem at runtime until the
reference is actually used. In order to catch these cases, we need to not
prune out paths on which the input pointer is null.
llvm-svn: 161288
Like base constructors, delegating constructors require no further
processing in the CFGInitializer node.
Also, add PrettyStackTraceLoc to the initializer and destructor logic
so we can get better stack traces in the future.
llvm-svn: 161283
Because of this, we would previously emit NO path notes when a parameter
is constrained to null (because there are no stores). Now we show where we
made the assumption, which is much more useful.
llvm-svn: 161280
This only applies in the case where ->* is not overloaded, since it
specifically looks for BinaryOperator and not CXXOperatorCallExpr.
llvm-svn: 161275
In the following code, find the type of the symbolic receiver by
following it and updating the dynamic type info in the state when we
cast the symbol from id to MyClass *.
MyClass *a = [[self alloc] init];
return 5/[a testSelf];
llvm-svn: 161264
There is no reason why we should not track the memory which was not
allocated in the current function, but was freed there. This would
allow to catch more use-after-free and double free with no/limited IPA.
Also fix a realloc issue which surfaced as the result of this patch.
llvm-svn: 161248
engine.
The code that was supposed to split the tie in a deterministic way is
not deterministic. Most likely one of the profile methods uses a
pointer. After this change we do finally get the consistent diagnostic
output. Testing this requires running the analyzer on large code bases
and diffing the results.
llvm-svn: 161224
There's still more work to be done here; this doesn't catch reference
parameters or return values. But it's a step in the right direction.
Part of <rdar://problem/11212286>.
llvm-svn: 161214
While usually we'd use a symbolic region rather than a straight-up Unknown,
we can still generate unknowns via array subscripts with symbolic indexes.
(And if this ever changes in the future, we still shouldn't crash.)
llvm-svn: 161059
This was causing a crash in our array-to-pointer logic, since the region
was clearly not an array.
PR13440 / <rdar://problem/11977113>
llvm-svn: 161051
- Retrieves the type of the object/receiver from the state.
- Binds self during stack setup.
- Only explores the path on which the method is inlined (no
bifurcation to explore the path on which the method is not inlined).
llvm-svn: 160991
We were treating this like a CXXDefaultArgExpr, but
SubstNonTypeTemplateParmExpr actually appears when a template is
instantiated, i.e. we have all the information necessary to evaluate it.
This allows us to inline functions like llvm::array_lengthof.
<rdar://problem/11949235>
llvm-svn: 160846
instead of walking to the preceding PostStmt node. There are cases where the last evaluated
expression does not appear in the ExplodedGraph.
Fixes PR 13466.
llvm-svn: 160819
Our BugReporter knows how to deal with implicit statements: it looks in
the ParentMap until it finds a parent with a valid location. However, since
initializers are not in the body of a constructor, their sub-expressions are
not in the ParentMap. That was easy enough to fix in AnalysisDeclContext.
...and then even once THAT was fixed, there's still an extra funny case
of Objective-C object pointer fields under ARC, which are initialized with
a top-level ImplicitValueInitExpr. To catch these cases,
PathDiagnosticLocation will now fall back to the start of the current
function if it can't find any other valid SourceLocations. This isn't great,
but it's miles better than a crash.
(All of this is only relevant when constructors and destructors are being
inlined, i.e. under -cfg-add-initializers and -cfg-add-implicit-dtors.)
llvm-svn: 160810
This workaround is fairly lame: we simulate the first element's constructor
and destructor and rely on the region invalidation to "initialize" the rest
of the elements.
llvm-svn: 160809
This modifies BugReporter and friends to handle CallEnter and CallExitEnd
program points that came from implicit call CFG nodes (read: destructors).
This required some extra handling for nested implicit calls. For example,
the added multiple-inheritance test case has a call graph that looks like this:
testMultipleInheritance3
~MultipleInheritance
~SmartPointer
~Subclass
~SmartPointer
***bug here***
In this case we correctly notice that we started in an inlined function
when we reach the CallEnter program point for the second ~SmartPointer.
However, when we reach the next CallEnter (for ~Subclass), we were
accidentally re-using the inner ~SmartPointer call in the diagnostics.
Rather than guess if we saw the corresponding CallExitEnd based on the
contents of the active path, we now just ask the PathDiagnostic if there's
any known stack before popping off the top path.
(A similar issue could have occured without multiple inheritance, but there
wasn't a test case for it.)
llvm-svn: 160804
- Some cleanup(the TODOs) will be done after ObjC method inlining is
complete.
- Simplified CallEvent::getDefinition not to require ISDynamicDispatch
parameter.
- Also addressed Jordan's comments from r160530.
llvm-svn: 160768
value by scanning the path, rather than assuming we have visited the '?:' operator
as a terminator (which sets a value indicating which expression to grab the
final ternary expression value from).
llvm-svn: 160760
to fix all the issues. Currently the code is essentially unmaintained and buggy, and
needs major revision (with coupled enhancements to the analyzer core).
llvm-svn: 160754
As pointed out by Anna, we only differentiate between explicit message sends
This also adds support for ObjCSubscriptExprs, which are basically the same
as properties in many ways. We were already checking these, but not emitting
nice messages for them.
This depends on the llvm::PointerIntPair change in r160456.
llvm-svn: 160461
instead push the terminator for the branch down into the basic blocks of the subexpressions of '&&' and '||'
respectively. This eliminates some artifical control-flow from the CFG and results in a more
compact CFG.
Note that this patch only alters the branches 'while', 'if' and 'for'. This was complex enough for
one patch. The remaining branches (e.g., do...while) can be handled in a separate patch, but they
weren't immediately tackled because they were less important.
It is possible that this patch introduces some subtle bugs, particularly w.r.t. to destructor placement.
I've tried to audit these changes, but it is also known that the destructor logic needs some refinement
in the area of '||' and '&&' regardless (i.e., their are known bugs).
llvm-svn: 160218
uninitialized variable use, walk back over branches where we've reached all the
non-null successors, not just cases where we've reached all successors.
llvm-svn: 160206
Previously we were using the static type of the base object to inline
methods, whether virtual or non-virtual. Now, we try to see if the base
object has a known type, and if so ask for its implementation of the method.
llvm-svn: 160094
C++ method calls and C function calls both appear as CallExprs in the AST.
This was causing crashes for an object that had a 'free' method.
<rdar://problem/11822244>
llvm-svn: 160029
In order to accomplish this, we now build the callee's stack frame
as part of the CallEnter node, rather than the subsequent BlockEdge node.
This should not have any effect on perceived behavior or diagnostics.
This makes it safe to re-enable inlining of member overloaded operators.
llvm-svn: 160022
This was a regression introduced during the CallEvent changes; a call to
FunctionDecl::hasBody was also being used to replace the decl found by
lookup with the actual definition. To keep from making this mistake again
(particularly if/when we start inlining Objective-C methods), this commit
adds a "getDefinition()" method to CallEvent, which should do the right
thing under any circumstances.
llvm-svn: 159940
We use LazyCompoundVals to avoid copying the contents of structs and arrays
around in the store, and when we need to pass a struct around that already
has a LazyCompoundVal we just use the original one. However, it's possible
that the first field of a struct may have a LazyCompoundVal of its own, and
we currently can't distinguish a LazyCompoundVal for the first element of a
struct from a LazyCompoundVal for the entire struct. In this case we should
just drop the optimization and make a new LazyCompoundVal that encompasses
the old one.
PR13264 / <rdar://problem/11802440>
llvm-svn: 159866
Our current inlining support (specifically RegionStore::enterStackFrame)
doesn't know that calls to overloaded operators may be calls to non-static
member functions, and that in these cases the first argument should be
treated as 'this'. This caused incorrect results and sometimes crashes.
The long-term fix will be to rewrite RegionStore::enterStackFrame to use
CallEvent and its subclasses, but for now we can just disable these
problematic calls by classifying them under a new CallEvent,
CXXMemberOperatorCall.
llvm-svn: 159692
in the call graph had been inlined but for whatever reason we did not inline some
of its callees.
Also, fix a related traversal bug where we meant to do a BFS of the callgraph but
instead were doing a DFS.
llvm-svn: 159577
This involved refactoring some common pointer-escapes code onto CallEvent,
then having MallocChecker use those callbacks for whether or not to consider
a pointer's /ownership/ as escaping. This still needs to be pinned down, and
probably we want to make the new argumentsMayEscape() function a little more
discerning (content invalidation vs. ownership/metadata invalidation), but
this is a good improvement.
As a bonus, also remove CallOrObjCMessage from the source completely.
llvm-svn: 159557
This ended allowing quite a bit of cleanup, and some minor changes.
- CallEvent makes it easy to use hasNonZeroCallbackArg more aggressively, which
we check in order to avoid false positives with callbacks that might release
the object.
- In order to support this for functions which consume their arguments, there
are two new ArgEffects: DecRefAndStopTracking and DecRefMsgAndStopTracking.
These act just like StopTracking, except that if the object only had a
return count of +1 it's now considered released instead (so we still get
use-after-free messages).
- On the plus side, we no longer have to special-case
+[NSObject performSelector:withObject:afterDelay:] and friends.
- The use of IdentifierInfos in the method summary cache is now hidden; only
the ObjCInterfaceDecl gets passed around most of the time.
- Since we cache all "simple" summaries and check every function call, there is
no real benefit to having NULL stand in for default summaries anymore.
- Whitespace, unused methods, etc.
Even more simplification to come when we get check::postCall and can unify all
these other post* checks.
llvm-svn: 159555
This is intended to replace CallOrObjCMessage, and is eventually intended to be
used for anything that cares more about /what/ is being called than /how/ it's
being called. For example, inlining destructors should be the same as inlining
blocks, and checking __attribute__((nonnull)) should apply to the allocator
calls generated by operator new.
llvm-svn: 159554
Previously:
...the comment said DFS...
...the WorkList being instantiated said BFS...
...and the implementation was actually DFS...
...due to an unintentional change in 2010...
...and everything kept working anyway.
This fixes our std::deque implementation of BFS, but switches back to a
SmallVector-based implementation of DFS.
We should probably still investigate the ramifications of DFS vs. BFS,
especially for large functions (and especially when we hit our block path
limit), since this might completely change our memory use. It can also mask
some bugs and reveal others depending on when we halt analysis. But at least
we will not have this kind of little mistake creep in again.
llvm-svn: 159397
The implicit global allocation functions do not have valid source locations,
but we still want to treat them as being "system header" functions for the
purposes of how they affect program state.
llvm-svn: 159160
We don't handle exceptions yet, so we treat them as sinks. ExprEngine
hardcodes messages that are known to raise Objective-C exceptions like -raise,
but it was only checking for +raise:format: and +raise:format:arguments: on
NSException itself, not subclasses.
<rdar://problem/11724201>
llvm-svn: 159010
This commits sets the grounds for more aggressive use after free
checking. We will use the Relinquished sate to denote that someone
else is now responsible for releasing the memory.
llvm-svn: 158850
target Objective-C runtime down to the frontend: break this
down into a single target runtime kind and version, and compute
all the relevant information from that. This makes it
relatively painless to add support for new runtimes to the
compiler. Make the new -cc1 flag, -fobjc-runtime=blah-x.y.z,
available at the driver level as a better and more general
alternative to -fgnu-runtime and -fnext-runtime. This new
concept of an Objective-C runtime also encompasses what we
were previously separating out as the "Objective-C ABI", so
fragile vs. non-fragile runtimes are now really modelled as
different kinds of runtime, paving the way for better overall
differentiation.
As a sort of special case, continue to accept the -cc1 flag
-fobjc-runtime-has-weak, as a sop to PLCompatibilityWeak.
I won't go so far as to say "no functionality change", even
ignoring the new driver flag, but subtle changes in driver
semantics are almost certainly not intended.
llvm-svn: 158793
Per Anna's comment, this is a better way to handle "to-do list"-type failures.
This way we'll know if any of the features get fixed; in an XFAIL file, /all/
the cases have to be fixed before lit would tell us anything.
llvm-svn: 158791
The default global placement new just returns the pointer it is given.
Note that other custom 'new' implementations with placement args are not
guaranteed to do this.
In addition, we need to invalidate placement args, since they may be updated by
the allocator function. (Also, right now we don't properly handle the
constructor inside a CXXNewExpr, so we need to invalidate the placement args
just so that callers know something changed!)
This invalidation is not perfect because CallOrObjCMessage doesn't support
CXXNewExpr, and all of our invalidation callbacks expect that if there's no
CallOrObjCMessage, the invalidation is happening manually (e.g. by a direct
assignment) and shouldn't affect checker-specific metadata (like malloc state);
hence the malloc test case in new-fail.cpp. But region values are now
properly invalidated, at least.
The long-term solution to this problem is to rework CallOrObjCMessage into
something more general, rather than the morass of branches it is today.
<rdar://problem/11679031>
llvm-svn: 158784
This happens in C++ mode right at the declaration of a struct VLA;
MallocChecker sees a bind and tries to get see if it's an escaping bind.
It's likely that our handling of this is still incomplete, but it fixes a
crash on valid without disturbing anything else for now.
llvm-svn: 158587
Specifically, although the bitmap context does not take ownership of the
buffer (unlike CGBitmapContextCreateWithData), the data buffer can be extracted
out of the created CGContextRef. Thus the buffer is not leaked even if its
original pointer goes out of scope, as long as
- the context escapes, or
- it is retrieved via CGBitmapContextGetData and freed.
Actually implementing that logic is beyond the current scope of MallocChecker,
so for now CGBitmapContextCreate goes on our system function exception list.
llvm-svn: 158579
We already didn't track objects that have delegates or callbacks or
objects that are passed through void * "context pointers". It's a
not-uncommon pattern to release the object in its callback, and so
the leak message we give is not very helpful.
llvm-svn: 158532
This does not actually give us the right behavior for reinterpret_cast
of references. Reverting so I can think about it some more.
This reverts commit 50a75a6e26a49011150067adac556ef978639fe6.
llvm-svn: 158341
These casts only appear in very well-defined circumstances, in which the
target of a reinterpret_cast or a function formal parameter is an lvalue
reference. According to the C++ standard, the following are equivalent:
reinterpret_cast<T&>( x)
*reinterpret_cast<T*>(&x)
[expr.reinterpret.cast]p11
llvm-svn: 158338
While collections containing nil elements can still be iterated over in an
Objective-C for-in loop, the most common Cocoa collections -- NSArray,
NSDictionary, and NSSet -- cannot contain nil elements. This checker adds
that assumption to the analyzer state.
This was the cause of some minor false positives concerning CFRelease calls
on objects in an NSArray.
llvm-svn: 158319
to addition.
We should not to warn in case the malloc size argument is an
addition containing 'sizeof' operator - it is common to use the pattern
to pack values of different sizes into a buffer.
Ex:
uint8_t *buffer = (uint8_t*)malloc(dataSize + sizeof(length));
llvm-svn: 158219
CmpRuns.py can be used to compare issues from different analyzer runs.
Since it uses the issue line number to unique 2 issues, adding a new
line to the beginning of a file makes all issues in the file reported as
new.
The hash will be an opaque value which could be used (along with the
function name) by CmpRuns to identify the same issues. This way, we only
fail to identify the same issue from two runs if the function it appears
in changes (not perfect, but much better than nothing).
llvm-svn: 158180
I falsely assumed that the memory spaces are equal when we reach this
point, they might not be when memory space of one or more is stack or
Unknown. We don't want a region from Heap space alias something with
another memory space.
llvm-svn: 158165
Add a concept of symbolic memory region belonging to heap memory space.
When comparing symbolic regions allocated on the heap, assume that they
do not alias.
Use symbolic heap region to suppress a common false positive pattern in
the malloc checker, in code that relies on malloc not returning the
memory aliased to other malloc allocations, stack.
llvm-svn: 158136
When we timeout or exceed a max number of blocks within an inlined
function, we retry with no inlining starting from a node right before
the CallEnter node. We assume the state of that node is the state of the
program before we start evaluating the call. However, the node pruning
removes this node as unimportant.
Teach the node pruning to keep the predecessors of the call enter nodes.
llvm-svn: 157860
-Wsometimes-uninitialized diagnostics to make it clearer that the cause
of the issue may be a condition which must always evaluate to true or
false, rather than an uninitialized variable.
To emphasize this, add a new note with a fixit which removes the
impossible condition or replaces it with a constant.
Also, downgrade the diagnostic from -Wsometimes-uninitialized to
-Wconditional-uninitialized when it applies to a range-based for loop,
since the condition is not written explicitly in the code in that case.
llvm-svn: 157511
-Wsometimes-uninitialized. This detects cases where an explicitly-written branch
inevitably leads to an uninitialized variable use (so either the branch is dead
code or there is an uninitialized use bug).
This chunk of warnings tentatively lives within -Wuninitialized, in order to
give it more visibility to existing Clang users.
llvm-svn: 157458
The new debug.ExprInspection checker looks for calls to clang_analyzer_eval,
and emits a warning of TRUE, FALSE, or UNKNOWN (or UNDEFINED) based on the
constrained value of its (boolean) argument. It does not modify the analysis
state though the conditions tested can result in branches (e.g. through the
use of short-circuit operators).
llvm-svn: 156919
Moves the bool bail-out down a little in SemaChecking - so now
-Wnull-conversion and -Wliteral-conversion can fire when the target type is
bool.
Also improve the wording/details in the -Wliteral-conversion warning to match
the -Wconstant-conversion.
llvm-svn: 156826
We check the address of the last element accessed, but with 0 calculating that
address results in element -1. This patch bails out early (and avoids a bunch
of other work at that).
Fixes PR12807.
llvm-svn: 156769
to reason about.
As part of taint propagation, we now allow creation of non-integer
symbolic expressions like a cast from int to float.
Addresses PR12511 (radar://11215362).
llvm-svn: 156578
We report a leak at a point a leaked variable is no longer accessible.
The statement that happens to be at that point is not relevant to the
leak diagnostic and, thus, should not be highlighted.
radar://11178519
llvm-svn: 156530
RegionStore, so be explicit about it and generate UnknownVal().
This is a hack to ensure we never produce undefined values for a value
coming from a compound value. (The undefined values can lead to
false positives.)
radar://10127782
llvm-svn: 156446
disruptive, but it allows RegionStore to better "see" through casts that reinterpret arrays of values
as structs. Fixes <rdar://problem/11405978>.
llvm-svn: 156428
don't reason about.
Self is just like a local variable in init methods, so it can be
assigned anything like result of static functions, other methods ... So
to suppress false positives that result in such cases, stop tracking the
checker-specific state after self is being assigned to (unless the
value is't being assigned to is either self or conforms to our rules).
This change does not invalidate any existing regression tests.
llvm-svn: 156420
This involves keeping track of three separate types: the symbol type, the
adjustment type, and the comparison type. For example, in "$x + 5 > 0ULL",
if the type of $x is 'signed char', the adjustment type is 'int' and the
comparison type is 'unsigned long long'. Most of the time these three types
will be the same, but we should still do the right thing when the
comparison value is out of range, and wraparound should be calculated in
the adjustment type.
This also re-disables an out-of-bounds test; we were extracting the symbol
from non-additive SymIntExprs, but then throwing away the integer.
Sorry for the large patch; both the basic and range constraint managers needed
to be updated together, since they share code in SimpleConstraintManager.
llvm-svn: 156361
SValBuilder should return an UnknownVal() when comparison of int and ptr
fails. Previous to this commit, it went on assuming that we are dealing
with pointer arithmetic.
PR12509, radar://11390991
llvm-svn: 156320
The logical change is that the integers in SymIntExprs may not have the same type as the symbols they are paired with. This was already the case with taint-propagation expressions created by SValBuilder::makeSymExprValNN, but I think those integers may never have been used. SimpleSValBuilder should be able to handle mixed-integer-type SymIntExprs fine now, though, and the constraint managers were already being defensive (though not entirely correct). All existing tests pass.
The logic in evalBinOpNN has been simplified so that conversion is done as late as possible. As a result, most of the switch cases have been reduced to do the minimal amount of work, delegating to another case when they can by substituting ConcreteInts and (as before) reversing the left and right arguments when useful.
Comparisons require special handling in two places (building SymIntExprs and evaluating constant-constant operations) because we don't /know/ the best type for comparing the two values. I've approximated the rules in Sema [C99 6.3.1.8] but it'd be nice to refactor Sema's actual algorithm into ASTContext.
This is also groundwork for handling mixed-type constraints better than we do now.
llvm-svn: 156270