Releases: semgrep/semgrep
Release v1.75.0
1.75.0 - 2024-06-03
Added
- Pro: Semgrep can now track taint through tuple/list (un)packing intra-procedurally
  (i.e., within a single function). For example:

  ```python
  t = ["ok", "taint"]
  x, y = t
  sink(x) # OK, no finding
  sink(y) # tainted, finding
  ```
  (code-6935)
- Optional type matching is supported in the Pro engine for Python. For example,
  in Python, `Optional[str]`, `str | None`, and `Union[str, None]` represent the
  same type but in different type expressions. Optional type matching
  lets any optional type expression match any other optional type
  expression when used with `metavariable-type` filtering. Note that syntactic
  pattern matching still distinguishes between these types. (code-6939)
- Add support for pnpm v9 (pnpm)
- Added a new rule option `decorators_order_matters`, which allows users to make matching of decorators/non-keyword attributes stricter. The default matching for attributes is order-agnostic, but if this rule option is set to true, non-keyword attributes (e.g. decorators in Python) will be matched in order, while keyword attributes (e.g. static, inline, etc.) are not affected.

  An example use is a rule that detects any decorator placed outside of the route() decorator in Flask, since any decorator outside of the route() decorator takes no effect.

  ```python
  # bad: another.func() takes no effect
  @another.func("func")
  @app.route("route")
  def f():
      pass

  # ok: route() is the outermost decorator
  @app.route("route")
  @another.func("func")
  def f():
      pass
  ```
  (saf-435)
Fixed
- Pro: taint-mode: Fixed issue causing findings to be missed (false negatives)
  when a global or class field was tainted, and then used in a sink after two
  or more function calls. For example:

  ```
  class Test {
    string bad;

    void test() {
      bad = "taint";
      foo();
    }

    void foo() {
      bar();
    }

    void bar() {
      sink(bad); // finding no longer missed
    }
  }
  ```
  (saf-1059)
- [Mostly applicable to Pro Engine] Typed metavariables will now match against the inferred type of a binding even if a constant is propagated for that binding, if we are unable to infer a type from the constant. Previously, we would simply fail to match in this case. (saf-1060)
- Removed the URLs at the end of the log when `semgrep ci --dryrun` is run, because a dry run doesn't interact with the app, so the URLs don't make sense. (saf-924)
Release v1.74.0
1.74.0 - 2024-05-23
Fixed
- One part of interfile tainting was missing a constant propagation phase, which caused Semgrep to miss some true positives during interfile analysis.
  This fix adds the missing constant propagation. (saf-1032)
- Semgrep now matches YAML tags (e.g. `!number` in `!number 42`) correctly rather
  than ignoring them. (saf-1046)
- Upgraded Semgrep's Dockerfile parser. This brings in various
  fixes from `tree-sitter-dockerfile`,
  including minimal support for heredoc templates, support for variables in keys
  of LABEL instructions, support for multiple parameters for ADD and COPY
  instructions, and tolerance for blanks after the backslash of a line continuation.

  As a result of supporting variables in LABEL keys, the multiple key/value
  pairs found in LABEL instructions are now treated as if they each had their own
  LABEL instruction. This allows a pattern `LABEL a=b` to match `LABEL a=b c=d`
  without the need for an ellipsis (`LABEL a=b ...`). Another consequence is
  that the pattern `LABEL a=b c=d` can no longer match `LABEL c=d a=b`, but it
  will match a `LABEL a=b` instruction immediately followed by a separate
  `LABEL c=d`. (upgrade-dockerfile-parser)
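The LABEL matching behavior described above can be pictured with a small toy model in Python. This is only a sketch of the described semantics, not Semgrep's implementation: it assumes plain whitespace-separated `key=value` pairs, and the helpers `expand_labels` and `pattern_matches` are hypothetical names introduced for the example.

```python
def expand_labels(lines: list[str]) -> list[str]:
    """Rewrite each multi-pair LABEL as one LABEL instruction per pair."""
    expanded = []
    for line in lines:
        keyword, *pairs = line.split()
        if keyword.upper() == "LABEL":
            expanded.extend(f"LABEL {pair}" for pair in pairs)
        else:
            expanded.append(line)
    return expanded


def pattern_matches(pattern: list[str], target: list[str]) -> bool:
    """True if the expanded pattern occurs as consecutive instructions."""
    p, t = expand_labels(pattern), expand_labels(target)
    return any(t[i:i + len(p)] == p for i in range(len(t) - len(p) + 1))


# A single pair matches inside a multi-pair LABEL, no ellipsis needed.
print(pattern_matches(["LABEL a=b"], ["LABEL a=b c=d"]))
# Order now matters, since each pair is its own instruction.
print(pattern_matches(["LABEL a=b c=d"], ["LABEL c=d a=b"]))
# Two separate, consecutive LABEL instructions match a two-pair pattern.
print(pattern_matches(["LABEL a=b c=d"], ["LABEL a=b", "LABEL c=d"]))
```

In this model the three calls print `True`, `False`, and `True`, mirroring the three cases listed in the changelog entry.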
Release v1.73.0
1.73.0 - 2024-05-16
Added
- Added new AWS validator syntax for Secrets (scrt-278)
Fixed
- Fix `couldn't find metavar $MT in the match results` error, which may occur
  when a fully qualified name (FQN) is captured with a metavariable and a
  `metavariable-type` filter is used on it. (code-7042)
- Fixes the crash (during scan) caused by improper handling of unicode characters present in the source code. (gh-8421)
- [Pro Engine Only] Tainted values are now tracked through instantiation of React functional components via JSX. (jsx-taint)
Release v1.72.0
1.72.0 - 2024-05-08
Fixed
- Dockerfile support: Avoid a silent parsing error that was possibly accompanied
  by a segfault when parsing Dockerfiles that lack a trailing newline
  character. (gh-10084)
- Fixed bug that was preventing the use of `metavariable-pattern` with
  the aliengrep engine of the generic mode. (gh-10222)
- Added support for function declarations on object literals in the dataflow analysis.
  For example, previously taint rules would not have matched the
  following JavaScript code, but now they do.

  ```javascript
  let tainted = source()

  let o = {
    someFuncDecl(x) {
      sink(tainted)
    }
  }
  ```
  (saf-1001)
- Osemgrep only:
  Rules that have `metavariable-type` did not show up in the SARIF output. This change fixes that.
  In addition, dataflow traces were always shown in SARIF even when `--dataflow-traces` was not passed. This change fixes that as well. (saf-1020)
- Fixed bug in rule parsing preventing patternless SCA rules from being validated. (saf-1030)
Release v1.71.0
1.71.0 - 2024-05-03
Added
- Pro: const-prop: Previously inter-procedural const-prop could only infer whether
  a function returned an arbitrary string constant. Now it will be able to infer
  whether a function returns a concrete constant value, e.g.:

  ```python
  def bar():
      return "bar"

  def test():
      x = bar()
      foo(x) # now also matches pattern `foo("bar")`, previously only `foo("...")`
  ```
  (flow-61)
- Python: const-prop: Semgrep will now recognize `"..." * N` expressions as arbitrary
  constant string literals (thus matching the pattern `"..."`). (flow-75)
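For the `"..." * N` entry above, a minimal target snippet might look like the following. The names `foo`, `make_banner`, and `banner` are hypothetical, and the comment about matching restates the changelog entry rather than a verified scan result.

```python
def foo(text: str) -> str:
    """Stand-in for whatever call a rule is looking for."""
    return text


def make_banner() -> str:
    banner = "=" * 40  # a string literal repeated N times
    # Per the entry above, `banner` is now treated as an arbitrary constant
    # string, so this call is expected to match the pattern foo("...").
    return foo(banner)


print(len(make_banner()))
```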
Changed
- The `--beta-testing-secrets-enabled` option, deprecated for several months, is now removed.
  Use `--secrets` as its replacement. (gh-9987)
Fixed
- When using `semgrep --test --json`, the `config_missing_fixtests` field in the
  JSON output now reports not only rule files containing a `fix:` without a
  corresponding `.fixed` test file, but also rule files using a `fix-regex:`
  without a corresponding `.fixed` test file. The `fix:` or `fix-regex:` can be in
  any rule in the file (not just the first rule). (fixtest)
- Fixes matching for Go struct field tags metadata.
  For example, given the program:

  ```go
  type Rectangle struct {
      Top    int `json:"top"`
      Left   int `json:"left"`
      Width  int `json:"width"`
      Height int `json:"height"`
  }
  ```

  the pattern

  ```go
  type Rectangle struct {
      ...
      $NAME $TYPE $TAGS
      ...
  }
  ```

  will now match each field, and the `$TAGS` metavariable will be
  bound when used in subsequent patterns. (saf-949)
- Matching: Patterns of statements ending in ellipsis metavariables, such as

  ```
  x = 1
  $...STMTS
  ```

  will now properly extend the match range to accommodate whatever is captured by
  the ellipsis metavariable (`$...STMTS`). (saf-961)
- The SARIF output format should have the tag "security" when the "cwe"
  section is present in the rule. Moreover, duplicate tags should be
  de-duped. Osemgrep wasn't doing this before, but with this fix, now it does. (saf-991)
- Fixed bug in mix.lock parser where it was possible to fail on a Python None error. Added handler for arbitrary exceptions during lockfile parsing. (sc-1466)
- Moved `--historical-secrets` to the "Pro Engine" option group, instead of
  "Output formats", where it was previously (in error). (scrt-570)
Release v1.70.0
1.70.0 - 2024-04-24
Added
- Added guidance for resolving API token issues in CI environments. (gh-10133)
- The `osemgrep show` command supports 2 new options: `dump-ast` and `dump-pattern`.
  See `osemgrep show --help` for more information. (osemgrep_show)
- Added additional output flags which allow you to write output to multiple files in multiple formats.
  For example, the command
  `semgrep ci --text --json-output=result.json --sarif-output=result.sarif.json`
  displays text output on stdout, writes the output that would be generated by passing the `--json` flag
  to `result.json`, and writes the output that would be generated by passing the `--sarif` flag
  to `result.sarif.json`. (saf-341)
- Added an experimental feature for users to use osemgrep to format
  SARIF output. When both the flags `--sarif` and `--use-osemgrep-sarif` are specified,
  semgrep will use the OCaml implementation to format SARIF. This flag is experimental and can be removed at any time. Users must not
  rely on it being available. (saf-978)
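For scripted use of the multi-output flags above, a minimal Python sketch that assembles the same command line could look like this. The helper `build_ci_command` and its parameter names are hypothetical; only the flags themselves come from the entry above, and the command is printed rather than executed.

```python
import shlex


def build_ci_command(json_path: str, sarif_path: str) -> list[str]:
    """Build an argv list: text on stdout, JSON and SARIF written to files."""
    return [
        "semgrep",
        "ci",
        "--text",
        f"--json-output={json_path}",
        f"--sarif-output={sarif_path}",
    ]


cmd = build_ci_command("result.json", "result.sarif.json")
print(shlex.join(cmd))
# To actually run it (requires semgrep on PATH):
#   import subprocess; subprocess.run(cmd, check=False)
```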
Changed
- The main regex engine is now PCRE2 (was PCRE). While the syntax is mostly
  compatible, there are some minor instances where updates to rules may be
  needed, since PCRE2 is slightly more strict in some cases. For example, while
  we previously accepted `[\w-.]`, such a pattern would now need to be written
  `[\w.-]` or `[\w\-.]`, since PCRE2 rejects the first as having an invalid range. (scrt-467)
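To check that the two suggested rewrites above accept the same characters, here is a small sketch using Python's `re` module as a stand-in. Python's `re` is not PCRE2, so this only illustrates the character-class equivalence, not Semgrep's engine; the `samples` list and the `accepted` helper are assumptions made for the example.

```python
import re

# Hyphen placed last in the class, so it cannot start a range.
HYPHEN_LAST = re.compile(r"[\w.-]")
# Hyphen escaped, so it is always taken literally.
HYPHEN_ESCAPED = re.compile(r"[\w\-.]")

samples = ["a", "Z", "7", "_", ".", "-", " ", "/", "+"]


def accepted(pattern: re.Pattern) -> set[str]:
    """Return the sample characters the character class accepts."""
    return {ch for ch in samples if pattern.fullmatch(ch)}


print(accepted(HYPHEN_LAST) == accepted(HYPHEN_ESCAPED))
```

Both forms accept word characters, `.`, and `-`, so either rewrite should preserve the intent of a rule that previously used `[\w-.]`.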
Fixed
- Semgrep LS now waits longer for users to log in (gh-10109)
- When `semgrep ci` finishes scanning and uploads findings, it tells the
  app to mark the scan as completed. For a large number of findings, this may take a while, and marking the scan as
  completed may time out. When a scan is not marked as completed, the app
  may show that the repo is still processing, which confuses the user. This change increases the timeout (previously 20 minutes) to 30
  minutes. (saf-980)
- Fix `semgrep ci --oss-only` when secrets product is enabled. (scrt-223)
Release v1.69.0
1.69.0 - 2024-04-16
Added
- Tracing: remove support for SEMGREP_OTEL_ENDPOINT and replace with
  `--trace-endpoint <url>`.
  This change is for an internal feature for debugging performance. (saf-885)
Changed
- Passing --debug to Semgrep will not print much, unless a set of tags is specified
  via `LOG_TAGS`. You can get all debug logs with `LOG_TAGS=everything`. We do not
  want --debug's output to be enormous, as it tends not to be useful and yet causes
  some problems. Note that --debug is mainly intended for Semgrep developers; please
  ask for help if needed. (gh-10044)
- The environment variables used to select the debug-level log messages
  are now prefixed with `SEMGREP_` (or `PYTEST_SEMGREP_`) to avoid namespace
  pollution and undesired cross-application side effects.
  The supported environment variables are now `SEMGREP_LOG_TAGS`
  and `PYTEST_SEMGREP_LOG_TAGS`. (gh-10087)
- The implicit tag to show all debug-level log messages changes from
  `everything` to `all`. All debug-level messages shown by default are
  now tagged and selectable with a `default` tag. (gh-10089)
Fixed
- In generic mode (default, spacegrep engine), matching a pattern that
  ends with an ellipsis now favors the longest match rather than the shortest
  match when multiple matches are possible. For example, for a given target
  program `a a b`, the pattern `a ... b` will match `a b` as before, but
  the pattern `a ...` will now match the longer `a a b` rather than `a b`. (gh-10039)
- Fixed the inter-file diff scan issue where the removal of pre-existing findings
  didn't work properly when adding a new file or renaming an existing file. (saf-897)
Release v1.68.0
1.68.0 - 2024-04-08
Added
- Scan un-changed lockfiles in diff-aware scans (gh-9899)
- Languages: Added the QL language (used by CodeQL) to Semgrep (saf-947)
- SwiftPM parser will now report package url and reference. (sc-1218)
- Add support for Elixir (Mix) SCA parsing for pro engine users. (sc-1303)
Fixed
- Output for sarif format includes dataflow traces. (gh-10004)
- The environment variable `LOG_LEVEL` (as well as `PYTEST_LOG_LEVEL`) is
  no longer consulted by Semgrep to determine the log level. Only
  `SEMGREP_LOG_LEVEL` is consulted. `PYTEST_SEMGREP_LOG_LEVEL` is also
  consulted in the current implementation but should not be used outside of
  Semgrep's Pytest tests. This is to avoid accidentally affecting Semgrep
  when inheriting a `LOG_LEVEL` destined for another application. (gh-10044)
- Fixed swiftpm parser to no longer limit the number of packages found in the manifest file. (sc-1364)
- Fixed incorrect ecosystem being used for Elixir. Hex should be used instead of Mix. (sc-elixir)
- Fixed the match_based_ids of lockfile-only findings to differentiate between findings in cases where one rule produces multiple findings in one lockfile (sca-mid)
- Secrets historical scans: fixed a bug where historical scans could run on differential scans. (scrt-545)
Release v1.67.0
1.67.0 - 2024-03-28
Added
- `--historical-secrets` flag for running Semgrep Secrets regex rules on git
  history (requires Semgrep Secrets). This flag is not yet implemented for
  `--experimental`. (scrt-531)
Changed
- Files with the `.phtml` extension are now treated as PHP files. (gh-10009)
- [IMPORTANT] Logged in users running `semgrep ci` will now run the pro engine by default! All `semgrep ci` scans will run with our proprietary languages (Apex and Elixir), as well as cross-function taint within a single file, and other single file pro optimizations we have developed. This is equivalent to `semgrep ci --pro-intrafile`. Users will likely see improved results if they are running `semgrep ci` and did not already have additional configuration to enable pro analysis.

  The current default engine does not include cross-file analysis. To scan with cross-file analysis, turn on the app toggle or pass in the flag `--pro`. We recommend this unless you have very large repos (talk to our support to get help enabling cross-file analysis on monorepos!)

  To revert back to our OSS analysis, pass the flag `--oss-only` (or use `--pro-languages` to continue to receive our proprietary languages).

  Reminder: because we release first to our canary image, this change will only immediately affect you if you are using `semgrep/semgrep:canary`. If you are using `semgrep/semgrep:latest`, it will affect you when we bump canary to latest. (saf-845)
Fixed
- Fixed a parsing error in Kotlin when there's a newline between the class name and the primary constructor.

  This could not parse before

  ```kotlin
  class C
  constructor(arg:Int){}
  ```

  because of the newline between the class name and the constructor.

  Now it's fixed. (saf-899)
Release v1.66.2
1.66.2 - 2024-03-26
Added
- osemgrep now respects HTTP_PROXY and HTTPS_PROXY when making network requests (cdx-253)
Changed
- [IMPORTANT] The public rollout of inter-file differential scanning has been
temporarily reverted for further polishing of the feature. We will reintroduce
it in a later version. (saf-268)
Fixed
- Autofix on variable definitions should now handle the semicolon
in Java, C++, and C#. (saf-928)