-
Notifications
You must be signed in to change notification settings - Fork 13.7k
[analyzer][clangsa] Add new option to alpha.security.cert.InvalidPtrChecker #67663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-static-analyzer-1 @llvm/pr-subscribers-clang ChangesThis PR is a continuation of the Phabricator review https://reviews.llvm.org/D154603 The invalidation of pointer pointers returned by subsequent calls to genenv is Full diff: https://github.com/llvm/llvm-project/pull/67663.diff 7 Files Affected:
diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst
index dbd6d7787823530..1487714c72f239c 100644
--- a/clang/docs/analyzer/checkers.rst
+++ b/clang/docs/analyzer/checkers.rst
@@ -2399,13 +2399,34 @@ pointer. These functions include: getenv, localeconv, asctime, setlocale, strerr
char *p, *pp;
p = getenv("VAR");
- pp = getenv("VAR2");
- // subsequent call to 'getenv' invalidated previous one
+ setenv("SOMEVAR", "VALUE", /*overwrite*/1);
+ // call to 'setenv' may invalidate p
*p;
// dereferencing invalid pointer
}
+
+The ``InvalidatingGetEnv`` option is available for treating getenv calls as
+invalidating. When enabled, the checker issues a warning if getenv is called
+multiple times and their results are used without first creating a copy.
+This level of strictness might be considered overly pedantic for a standard
+getenv implementation.
+
+To enable this option, use:
+``-analyzer-config alpha.security.cert.env.InvalidPtr:InvalidatingGetEnv=true``.
+
+By default, this option is set to *false*.
+
+When this option is enabled, warnings will be generated for scenarios like the
+following:
+
+.. code-block:: c
+
+ char* p = getenv("VAR");
+ char* pp = getenv("VAR2"); // assumes this call can invalidate `env`
+ strlen(p); // warns about accessing invalid ptr
+
alpha.security.taint
^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
index 65c1595eb6245dd..b4f65c934bf483b 100644
--- a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
+++ b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
@@ -997,6 +997,15 @@ let ParentPackage = ENV in {
def InvalidPtrChecker : Checker<"InvalidPtr">,
HelpText<"Finds usages of possibly invalidated pointers">,
+ CheckerOptions<[
+ CmdLineOption<Boolean,
+ "InvalidatingGetEnv",
+ "Regard getenv as an invalidating call (as per POSIX "
+ "standard), which can lead to false positives depending on "
+ "implementation.",
+ "false",
+ InAlpha>,
+ ]>,
Documentation<HasDocumentation>;
} // end "alpha.cert.env"
diff --git a/clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp
index aae1a17bc0ae53e..ac879a208e4e962 100644
--- a/clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp
@@ -25,12 +25,20 @@
using namespace clang;
using namespace ento;
+
namespace {
+
class InvalidPtrChecker
: public Checker<check::Location, check::BeginFunction, check::PostCall> {
private:
- BugType BT{this, "Use of invalidated pointer", categories::MemoryError};
+ static const BugType *InvalidPtrBugType;
+ // For accurate emission of NoteTags, the BugType of this checker should have
+ // a unique address.
+ void InitInvalidPtrBugType() {
+ InvalidPtrBugType = new BugType(this, "Use of invalidated pointer",
+ categories::MemoryError);
+ }
void EnvpInvalidatingCall(const CallEvent &Call, CheckerContext &C) const;
@@ -38,6 +46,15 @@ class InvalidPtrChecker
CheckerContext &C) const;
// SEI CERT ENV31-C
+
+ // If set to true, consider getenv calls as invalidating operations on the
+ // environment variable buffer. This is implied in the standard, but in
+ // practice does not cause problems (in the commonly used environments).
+ bool InvalidatingGetEnv = false;
+
+ // GetEnv can be treated invalidating and non-invalidating as well.
+ const CallDescription GetEnvCall{{"getenv"}, 1};
+
const CallDescriptionMap<HandlerFn> EnvpInvalidatingFunctions = {
{{{"setenv"}, 3}, &InvalidPtrChecker::EnvpInvalidatingCall},
{{{"unsetenv"}, 1}, &InvalidPtrChecker::EnvpInvalidatingCall},
@@ -51,7 +68,6 @@ class InvalidPtrChecker
// SEI CERT ENV34-C
const CallDescriptionMap<HandlerFn> PreviousCallInvalidatingFunctions = {
- {{{"getenv"}, 1}, &InvalidPtrChecker::postPreviousReturnInvalidatingCall},
{{{"setlocale"}, 2},
&InvalidPtrChecker::postPreviousReturnInvalidatingCall},
{{{"strerror"}, 1},
@@ -62,6 +78,10 @@ class InvalidPtrChecker
&InvalidPtrChecker::postPreviousReturnInvalidatingCall},
};
+ // The private members of this checker corresponding to commandline options
+ // are set in this function.
+ friend void ento::registerInvalidPtrChecker(CheckerManager &);
+
public:
// Obtain the environment pointer from 'main()' (if present).
void checkBeginFunction(CheckerContext &C) const;
@@ -78,13 +98,18 @@ class InvalidPtrChecker
CheckerContext &C) const;
};
+const BugType *InvalidPtrChecker::InvalidPtrBugType;
+
} // namespace
// Set of memory regions that were invalidated
REGISTER_SET_WITH_PROGRAMSTATE(InvalidMemoryRegions, const MemRegion *)
// Stores the region of the environment pointer of 'main' (if present).
-REGISTER_TRAIT_WITH_PROGRAMSTATE(EnvPtrRegion, const MemRegion *)
+REGISTER_TRAIT_WITH_PROGRAMSTATE(MainEnvPtrRegion, const MemRegion *)
+
+// Stores the regions of environments returned by getenv calls.
+REGISTER_SET_WITH_PROGRAMSTATE(GetenvEnvPtrRegions, const MemRegion *)
// Stores key-value pairs, where key is function declaration and value is
// pointer to memory region returned by previous call of this function
@@ -94,23 +119,40 @@ REGISTER_MAP_WITH_PROGRAMSTATE(PreviousCallResultMap, const FunctionDecl *,
void InvalidPtrChecker::EnvpInvalidatingCall(const CallEvent &Call,
CheckerContext &C) const {
StringRef FunctionName = Call.getCalleeIdentifier()->getName();
- ProgramStateRef State = C.getState();
- const MemRegion *SymbolicEnvPtrRegion = State->get<EnvPtrRegion>();
- if (!SymbolicEnvPtrRegion)
- return;
-
- State = State->add<InvalidMemoryRegions>(SymbolicEnvPtrRegion);
- const NoteTag *Note =
- C.getNoteTag([SymbolicEnvPtrRegion, FunctionName](
- PathSensitiveBugReport &BR, llvm::raw_ostream &Out) {
- if (!BR.isInteresting(SymbolicEnvPtrRegion))
- return;
- Out << '\'' << FunctionName
- << "' call may invalidate the environment parameter of 'main'";
- });
+ auto PlaceInvalidationNote = [&C, FunctionName](ProgramStateRef State,
+ const MemRegion *Region,
+ StringRef Message,
+ ExplodedNode *Pred) {
+ State = State->add<InvalidMemoryRegions>(Region);
+
+ // Make copy of string data for the time when notes are *actually* created.
+ const NoteTag *Note =
+ C.getNoteTag([Region, FunctionName = std::string{FunctionName},
+ Message = std::string{Message}](
+ PathSensitiveBugReport &BR, llvm::raw_ostream &Out) {
+ if (!BR.isInteresting(Region) ||
+ &BR.getBugType() != InvalidPtrBugType)
+ return;
+ Out << '\'' << FunctionName << "' " << Message;
+ BR.markNotInteresting(Region);
+ });
+ return C.addTransition(State, Pred, Note);
+ };
- C.addTransition(State, Note);
+ ProgramStateRef State = C.getState();
+ ExplodedNode *CurrentChainEnd = C.getPredecessor();
+
+ if (const MemRegion *MainEnvPtr = State->get<MainEnvPtrRegion>())
+ CurrentChainEnd = PlaceInvalidationNote(
+ State, MainEnvPtr,
+ "call may invalidate the environment parameter of 'main'",
+ CurrentChainEnd);
+
+ for (const MemRegion *EnvPtr : State->get<GetenvEnvPtrRegions>())
+ CurrentChainEnd = PlaceInvalidationNote(
+ State, EnvPtr, "call may invalidate the environment returned by getenv",
+ CurrentChainEnd);
}
void InvalidPtrChecker::postPreviousReturnInvalidatingCall(
@@ -126,7 +168,7 @@ void InvalidPtrChecker::postPreviousReturnInvalidatingCall(
State = State->add<InvalidMemoryRegions>(PrevReg);
Note = C.getNoteTag([PrevReg, FD](PathSensitiveBugReport &BR,
llvm::raw_ostream &Out) {
- if (!BR.isInteresting(PrevReg))
+ if (!BR.isInteresting(PrevReg) || &BR.getBugType() != InvalidPtrBugType)
return;
Out << '\'';
FD->getNameForDiagnostic(Out, FD->getASTContext().getLangOpts(), true);
@@ -146,16 +188,15 @@ void InvalidPtrChecker::postPreviousReturnInvalidatingCall(
// Remember to this region.
const auto *SymRegOfRetVal = cast<SymbolicRegion>(RetVal.getAsRegion());
- const MemRegion *MR =
- const_cast<MemRegion *>(SymRegOfRetVal->getBaseRegion());
+ const MemRegion *MR = SymRegOfRetVal->getBaseRegion();
State = State->set<PreviousCallResultMap>(FD, MR);
ExplodedNode *Node = C.addTransition(State, Note);
const NoteTag *PreviousCallNote =
C.getNoteTag([MR](PathSensitiveBugReport &BR, llvm::raw_ostream &Out) {
- if (!BR.isInteresting(MR))
+ if (!BR.isInteresting(MR) || &BR.getBugType() != InvalidPtrBugType)
return;
- Out << '\'' << "'previous function call was here" << '\'';
+ Out << "'previous function call was here" << '\'';
});
C.addTransition(State, Node, PreviousCallNote);
@@ -185,6 +226,18 @@ static const MemRegion *findInvalidatedSymbolicBase(ProgramStateRef State,
// function call as an argument.
void InvalidPtrChecker::checkPostCall(const CallEvent &Call,
CheckerContext &C) const {
+
+ ProgramStateRef State = C.getState();
+
+ // Model 'getenv' calls
+ if (GetEnvCall.matches(Call)) {
+ const MemRegion *Region = Call.getReturnValue().getAsRegion();
+ if (Region) {
+ State = State->add<GetenvEnvPtrRegions>(Region);
+ C.addTransition(State);
+ }
+ }
+
// Check if function invalidates 'envp' argument of 'main'
if (const auto *Handler = EnvpInvalidatingFunctions.lookup(Call))
(this->**Handler)(Call, C);
@@ -193,14 +246,16 @@ void InvalidPtrChecker::checkPostCall(const CallEvent &Call,
if (const auto *Handler = PreviousCallInvalidatingFunctions.lookup(Call))
(this->**Handler)(Call, C);
+ // If pedantic mode is on, regard 'getenv' calls invalidating as well
+ if (InvalidatingGetEnv && GetEnvCall.matches(Call))
+ postPreviousReturnInvalidatingCall(Call, C);
+
// Check if one of the arguments of the function call is invalidated
// If call was inlined, don't report invalidated argument
if (C.wasInlined)
return;
- ProgramStateRef State = C.getState();
-
for (unsigned I = 0, NumArgs = Call.getNumArgs(); I < NumArgs; ++I) {
if (const auto *SR = dyn_cast_or_null<SymbolicRegion>(
@@ -218,8 +273,8 @@ void InvalidPtrChecker::checkPostCall(const CallEvent &Call,
C.getASTContext().getPrintingPolicy());
Out << "' in a function call";
- auto Report =
- std::make_unique<PathSensitiveBugReport>(BT, Out.str(), ErrorNode);
+ auto Report = std::make_unique<PathSensitiveBugReport>(
+ *InvalidPtrBugType, Out.str(), ErrorNode);
Report->markInteresting(InvalidatedSymbolicBase);
Report->addRange(Call.getArgSourceRange(I));
C.emitReport(std::move(Report));
@@ -243,7 +298,7 @@ void InvalidPtrChecker::checkBeginFunction(CheckerContext &C) const {
// Save the memory region pointed by the environment pointer parameter of
// 'main'.
- C.addTransition(State->set<EnvPtrRegion>(EnvpReg));
+ C.addTransition(State->set<MainEnvPtrRegion>(EnvpReg));
}
// Check if invalidated region is being dereferenced.
@@ -262,13 +317,17 @@ void InvalidPtrChecker::checkLocation(SVal Loc, bool isLoad, const Stmt *S,
return;
auto Report = std::make_unique<PathSensitiveBugReport>(
- BT, "dereferencing an invalid pointer", ErrorNode);
+ *InvalidPtrBugType, "dereferencing an invalid pointer", ErrorNode);
Report->markInteresting(InvalidatedSymbolicBase);
C.emitReport(std::move(Report));
}
void ento::registerInvalidPtrChecker(CheckerManager &Mgr) {
- Mgr.registerChecker<InvalidPtrChecker>();
+ auto *Checker = Mgr.registerChecker<InvalidPtrChecker>();
+ Checker->InvalidatingGetEnv =
+ Mgr.getAnalyzerOptions().getCheckerBooleanOption(Checker,
+ "InvalidatingGetEnv");
+ Checker->InitInvalidPtrBugType();
}
bool ento::shouldRegisterInvalidPtrChecker(const CheckerManager &) {
diff --git a/clang/test/Analysis/analyzer-config.c b/clang/test/Analysis/analyzer-config.c
index d86ca5d19219c64..22b68cd1fd4c1c0 100644
--- a/clang/test/Analysis/analyzer-config.c
+++ b/clang/test/Analysis/analyzer-config.c
@@ -11,6 +11,7 @@
// CHECK-NEXT: alpha.osx.cocoa.DirectIvarAssignment:AnnotatedFunctions = false
// CHECK-NEXT: alpha.security.MmapWriteExec:MmapProtExec = 0x04
// CHECK-NEXT: alpha.security.MmapWriteExec:MmapProtRead = 0x01
+// CHECK-NEXT: alpha.security.cert.env.InvalidPtr:InvalidatingGetEnv = false
// CHECK-NEXT: alpha.security.taint.TaintPropagation:Config = ""
// CHECK-NEXT: alpha.unix.Errno:AllowErrnoReadOutsideConditionExpressions = true
// CHECK-NEXT: alpha.unix.StdCLibraryFunctions:DisplayLoadedSummaries = false
diff --git a/clang/test/Analysis/cert/env34-c-cert-examples.c b/clang/test/Analysis/cert/env34-c-cert-examples.c
index 19ca73f34c7ee5e..7e5b9be3cba6bb0 100644
--- a/clang/test/Analysis/cert/env34-c-cert-examples.c
+++ b/clang/test/Analysis/cert/env34-c-cert-examples.c
@@ -1,15 +1,49 @@
+// Default options.
// RUN: %clang_analyze_cc1 \
// RUN: -analyzer-checker=core,alpha.security.cert.env.InvalidPtr \
// RUN: -verify -Wno-unused %s
+//
+// Test the laxer handling of getenv function (this is the default).
+// RUN: %clang_analyze_cc1 \
+// RUN: -analyzer-checker=core,alpha.security.cert.env.InvalidPtr \
+// RUN: -analyzer-config alpha.security.cert.env.InvalidPtr:InvalidatingGetEnv=false \
+// RUN: -verify -Wno-unused %s
+//
+// Test the stricter handling of getenv function.
+// RUN: %clang_analyze_cc1 \
+// RUN: -analyzer-checker=core,alpha.security.cert.env.InvalidPtr \
+// RUN: -analyzer-config alpha.security.cert.env.InvalidPtr:InvalidatingGetEnv=true \
+// RUN: -verify=pedantic -Wno-unused %s
#include "../Inputs/system-header-simulator.h"
char *getenv(const char *name);
+int setenv(const char *name, const char *value, int overwrite);
int strcmp(const char*, const char*);
char *strdup(const char*);
void free(void *memblock);
void *malloc(size_t size);
-void incorrect_usage(void) {
+void incorrect_usage_setenv_getenv_invalidation(void) {
+ char *tmpvar;
+ char *tempvar;
+
+ tmpvar = getenv("TMP");
+
+ if (!tmpvar)
+ return;
+
+ setenv("TEMP", "", 1); //setenv can invalidate env
+
+ if (!tmpvar)
+ return;
+
+ if (strcmp(tmpvar, "") == 0) { // body of strcmp is unknown
+ // expected-warning@-1{{use of invalidated pointer 'tmpvar' in a function call}}
+ // pedantic-warning@-2{{use of invalidated pointer 'tmpvar' in a function call}}
+ }
+}
+
+void incorrect_usage_double_getenv_invalidation(void) {
char *tmpvar;
char *tempvar;
@@ -18,13 +52,13 @@ void incorrect_usage(void) {
if (!tmpvar)
return;
- tempvar = getenv("TEMP");
+ tempvar = getenv("TEMP"); //getenv should not invalidate env in non-pedantic mode
if (!tempvar)
return;
if (strcmp(tmpvar, tempvar) == 0) { // body of strcmp is unknown
- // expected-warning@-1{{use of invalidated pointer 'tmpvar' in a function call}}
+ // pedantic-warning@-1{{use of invalidated pointer 'tmpvar' in a function call}}
}
}
diff --git a/clang/test/Analysis/cert/env34-c.c b/clang/test/Analysis/cert/env34-c.c
index dc7b0340c311ed5..94bc2cf95ccc9b0 100644
--- a/clang/test/Analysis/cert/env34-c.c
+++ b/clang/test/Analysis/cert/env34-c.c
@@ -1,5 +1,6 @@
// RUN: %clang_analyze_cc1 \
// RUN: -analyzer-checker=alpha.security.cert.env.InvalidPtr\
+// RUN: -analyzer-config alpha.security.cert.env.InvalidPtr:InvalidatingGetEnv=true \
// RUN: -analyzer-output=text -verify -Wno-unused %s
#include "../Inputs/system-header-simulator.h"
diff --git a/clang/test/Analysis/invalid-ptr-checker.c b/clang/test/Analysis/invalid-ptr-checker.c
new file mode 100644
index 000000000000000..d2250e03f3d3d2f
--- /dev/null
+++ b/clang/test/Analysis/invalid-ptr-checker.c
@@ -0,0 +1,50 @@
+// RUN: %clang_analyze_cc1 \
+// RUN: -analyzer-checker=alpha.security.cert.env.InvalidPtr \
+// RUN: -analyzer-config alpha.security.cert.env.InvalidPtr:InvalidatingGetEnv=false \
+// RUN: -analyzer-output=text -verify -Wno-unused %s
+//
+// RUN: %clang_analyze_cc1 \
+// RUN: -analyzer-checker=alpha.security.cert.env.InvalidPtr \
+// RUN: -analyzer-config \
+// RUN: alpha.security.cert.env.InvalidPtr:InvalidatingGetEnv=true \
+// RUN: -analyzer-output=text -verify=expected,pedantic -Wno-unused %s
+
+#include "Inputs/system-header-simulator.h"
+
+char *getenv(const char *name);
+int setenv(const char *name, const char *value, int overwrite);
+int strcmp(const char *, const char *);
+
+int custom_env_handler(const char **envp);
+
+void getenv_after_getenv(void) {
+ char *v1 = getenv("V1");
+ // pedantic-note@-1{{previous function call was here}}
+
+ char *v2 = getenv("V2");
+ // pedantic-note@-1{{'getenv' call may invalidate the result of the previous 'getenv'}}
+
+ strcmp(v1, v2);
+ // pedantic-warning@-1{{use of invalidated pointer 'v1' in a function call}}
+ // pedantic-note@-2{{use of invalidated pointer 'v1' in a function call}}
+}
+
+void setenv_after_getenv(void) {
+ char *v1 = getenv("VAR1");
+
+ setenv("VAR2", "...", 1);
+ // expected-note@-1{{'setenv' call may invalidate the environment returned by getenv}}
+
+ strcmp(v1, "");
+ // expected-warning@-1{{use of invalidated pointer 'v1' in a function call}}
+ // expected-note@-2{{use of invalidated pointer 'v1' in a function call}}
+}
+
+int main(int argc, const char *argv[], const char *envp[]) {
+ setenv("VAR", "...", 0);
+ // expected-note@-1 2 {{'setenv' call may invalidate the environment parameter of 'main'}}
+
+ *envp;
+ // expected-warning@-1 2 {{dereferencing an invalid pointer}}
+ // expected-note@-2 2 {{dereferencing an invalid pointer}}
+}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR contains several useful improvements and the implementation seems to be correct. I added some remarks in inline comments; and I'd be interested to see some results on real-world projects.
Moreover, consider adding a testcase which validates that this checker no longer "pollutes" reports from other checkers.
a8b4ebc
to
1faca07
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for improving this patch!
I found yet another opportunity to kill a lambda function ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, my only remark is a negligible grammar issue.
…hecker The invalidation of pointer pointers returned by subsequent calls to genenv is suggested by the POSIX standard, but is too strict from a practical point of view. A new checker option 'InvalidatingGetEnv' is introduced, and is set to a more lax default value, which does not consider consecutive getenv calls invalidating. The handling of the main function's possible specification where an environment pointer is also pecified as a third parameter is also considered now. Differential Revision: https://reviews.llvm.org/D154603
4967e43
to
2d0d8fa
Compare
I have rebased on the current main. |
Thanks to recent improvements in llvm#67663, InvalidPtr checker does not emit any false positives on the following OS projects: memcached, tmux, curl, twin, vim, openssl, sqlite, ffmpeg, postgres, tinyxml2, libwebm, xerces, bitcoin, protobuf, qtbase, contour, acid, openrct2
…71912) Thanks to recent improvements in #67663, InvalidPtr checker does not emit any false positives on the following OS projects: memcached, tmux, curl, twin, vim, openssl, sqlite, ffmpeg, postgres, tinyxml2, libwebm, xerces, bitcoin, protobuf, qtbase, contour, acid, openrct2. (Before the changes mentioned above, there were 27 reports, catching the `getenv` invalidates previous `getenv` results cases. That strict behaviour is disabled by default)
This PR is a continuation of the Phabricator review https://reviews.llvm.org/D154603
The invalidation of pointer pointers returned by subsequent calls to genenv is
suggested by the POSIX standard, but is too strict from a practical point of
view. A new checker option 'InvalidatingGetEnv' is introduced, and is set to a
more lax default value, which does not consider consecutive getenv calls
invalidating.
The handling of the main function's possible specification where an environment
pointer is also pecified as a third parameter is also considered now.