Skip to content

Commit a486d4a

Browse files
authored
Always omit license details from full-scan diff request (#221)
* fix(core): always omit license details from full-scan diff request (#CE-224 follow-on) The full-scan diff request (fullscans.stream_diff) now always sets include_license_details=false, decoupled from the --exclude-license-details flag. This prevents the CE-224 truncation crash (Unterminated string / JSON parse failure on large repos, reported by the tremendous org) from recurring even when the flag is not passed. Why this is safe (no output changes): the license fields the diff endpoint can embed are never consumed off the diff. With --generate-license off, the only consumer (the legal/FOSSA artifact builder) never runs. With --generate-license on, get_license_text_via_purl re-fetches license data from the dedicated PURL endpoint and overwrites whatever the diff embedded before anything reads it. Either way the embedded payload was dead weight that only bloated the response. --exclude-license-details still works but its scope is now narrower: it controls only the dashboard report URL, not the internal diff payload. Help text updated. Core.get_added_and_removed_packages(..., include_license_details=True) remains as an explicit override seam (exercised in tests). Minor bump to 2.4.0: outputs are provably unchanged, but this is a deliberate default-behavior change (2.3.0 made the flag propagate; 2.4.0 makes the lean diff the default), which warrants a minor bump per the project's semver policy. Signed-off-by: lelia <2418071+lelia@users.noreply.github.com> * chore: trim changelog release notes * chore: require socketdev 3.1.2 * docs: note exclude license flag scope change --------- Signed-off-by: lelia <2418071+lelia@users.noreply.github.com>
1 parent ce9f0e1 commit a486d4a

7 files changed

Lines changed: 87 additions & 42 deletions

File tree

CHANGELOG.md

Lines changed: 23 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,18 @@
11
# Changelog
22

3+
## 2.4.0
4+
5+
### Changed: license details are no longer requested on the full-scan diff
6+
7+
- Full-scan diff requests now always set `include_license_details=false`, keeping
8+
large diff responses smaller and avoiding truncation crashes on large repos.
9+
- Soft breaking change for flag-scripted use: `--exclude-license-details` still
10+
controls the dashboard report URL, but no longer affects the internal diff
11+
request. Its `--help` text has been updated to reflect the narrower scope.
12+
- License artifact output is unchanged: `--generate-license` continues to fetch
13+
license details from the dedicated PURL endpoint.
14+
- Requires `socketdev>=3.1.2`.
15+
316
## 2.3.1
417

518
### New: brotli-compressed `.socket.facts.json` upload
@@ -31,40 +44,21 @@ Details:
3144

3245
### New: `--exit-code-on-api-error`
3346

34-
Adds a configurable exit code for API / infrastructure failures (timeouts,
35-
network errors, unexpected exceptions), so CI pipelines can distinguish them
36-
from blocking security findings (exit `1`):
37-
38-
```
39-
socketcli --exit-code-on-api-error 100 ...
40-
```
41-
42-
Default is `3` (the code the CLI already used for these errors), so **default
43-
behavior is unchanged** — the exit code only changes when you pass the flag.
44-
Set it to a Buildkite `soft_fail` code, or to `0` to swallow infra errors.
45-
46-
**Interaction to be aware of:** `--disable-blocking` forces exit `0` for *all*
47-
outcomes and therefore overrides `--exit-code-on-api-error`. Use the new flag
48-
*without* `--disable-blocking` if you want a custom infra-error code to take
49-
effect. See the exit-code reference in the README.
50-
51-
> A future `3.0` release is planned to make infrastructure errors exit non-zero
52-
> even under `--disable-blocking` (so outages stop being silently swallowed).
53-
> That is a breaking change and is intentionally **not** in this release.
47+
- Added `--exit-code-on-api-error` so CI can distinguish API / infrastructure
48+
failures from blocking security findings. The default remains `3`; the flag
49+
only changes behavior when set explicitly.
50+
- `--disable-blocking` still takes precedence and exits `0` for all outcomes.
5451

5552
### New: commit message auto-truncation
5653

57-
`--commit-message` values longer than 200 characters are now automatically
58-
truncated before being sent to the API, preventing HTTP 413 errors from
59-
oversized URL query parameters (common with AI-generated commit messages or
60-
`$BUILDKITE_MESSAGE`).
54+
- `--commit-message` values longer than 200 characters are now truncated before
55+
being sent to the API, preventing HTTP 413 errors from oversized query
56+
parameters.
6157

6258
### Improved: Buildkite log formatting
6359

64-
When running inside a Buildkite job (`BUILDKITE=true`), infrastructure errors
65-
emit Buildkite log section markers (`^^^ +++` / `--- :warning:`) so the error
66-
section auto-expands in the BK UI, plus a `soft_fail` hint. No effect on other
67-
CI platforms.
60+
- Infrastructure errors now emit Buildkite log section markers when
61+
`BUILDKITE=true`, making those failures easier to find in Buildkite logs.
6862

6963
### Fixed
7064

@@ -73,6 +67,7 @@ CI platforms.
7367
which was constructed without the CLI timeout and defaulted to 1200s.
7468
- `--exclude-license-details` now propagates to the full-scan diff comparison
7569
request (it was only applied to full-scan params / report URLs before).
70+
7671
## 2.2.93
7772

7873
- Bundled twelve Dependabot dependency updates: `urllib3`, `gitpython`, `python-dotenv`, `pytest`, `uv`, `cryptography`, `pygments`, `requests`, and `idna` (main app), plus `axios`, `requests`, and `flask` (e2e fixtures). `idna` 3.11 → 3.15 includes the fix for CVE-2026-45409.

pyproject.toml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ build-backend = "hatchling.build"
66

77
[project]
88
name = "socketsecurity"
9-
version = "2.3.1"
9+
version = "2.4.0"
1010
requires-python = ">= 3.11"
1111
license = {"file" = "LICENSE"}
1212
dependencies = [
@@ -16,7 +16,7 @@ dependencies = [
1616
'GitPython',
1717
'packaging',
1818
'python-dotenv',
19-
"socketdev>=3.0.33,<4.0.0",
19+
"socketdev>=3.1.2,<4.0.0",
2020
"bs4>=0.0.2",
2121
"markdown>=3.10",
2222
"brotli>=1.0.9; platform_python_implementation == 'CPython'",

socketsecurity/__init__.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,3 @@
11
__author__ = 'socket.dev'
2-
__version__ = '2.3.1'
2+
__version__ = '2.4.0'
33
USER_AGENT = f'SocketPythonCLI/{__version__}'

socketsecurity/config.py

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -705,7 +705,12 @@ def create_argument_parser() -> argparse.ArgumentParser:
705705
"--exclude-license-details",
706706
dest="exclude_license_details",
707707
action="store_true",
708-
help="Exclude license details from the diff report (boosts performance for large repos)"
708+
help=(
709+
"Exclude license details from the dashboard report URL. "
710+
"As of 2.4.0 the internal diff request always omits license details "
711+
"(they were unused there and bloated large-repo responses), so this "
712+
"flag now only affects the report link, not diff performance."
713+
)
709714
)
710715
output_group.add_argument(
711716
"--max-purl-batch-size",

socketsecurity/core/__init__.py

Lines changed: 32 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1070,14 +1070,35 @@ def get_added_and_removed_packages(
10701070
self,
10711071
head_full_scan_id: str,
10721072
new_full_scan_id: str,
1073-
include_license_details: bool = True
1073+
include_license_details: bool = False
10741074
) -> Tuple[Dict[str, Package], Dict[str, Package], Dict[str, Package]]:
10751075
"""
10761076
Get packages that were added and removed between scans.
10771077
10781078
Args:
10791079
head_full_scan_id: Previous scan (maybe None if first scan)
10801080
new_full_scan_id: New scan just created
1081+
include_license_details: Whether to ask the diff endpoint to embed
1082+
per-package license attribution/details in the response.
1083+
1084+
Defaults to ``False`` on purpose. The diff endpoint exists to
1085+
compare alerts between two scans; the license fields it can embed
1086+
are never consumed off the diff:
1087+
* When ``--generate-license`` is OFF, the only consumer of
1088+
``Package.licenseDetails``/``licenseAttrib`` (the legal/FOSSA
1089+
artifact builder) is never invoked, so the embedded license
1090+
data is parsed and then dropped on the floor.
1091+
* When ``--generate-license`` is ON, ``get_license_text_via_purl``
1092+
re-fetches license data from the dedicated PURL endpoint and
1093+
OVERWRITES whatever the diff embedded, before anything reads it.
1094+
Either way the embedded license payload is dead weight, and on
1095+
large dependency trees it inflated the diff response past ~2.3MB
1096+
and truncated it mid-string, crashing ``response.json()``
1097+
(CE-224, customer: Tremendous). Defaulting to ``False`` keeps the
1098+
diff lean with zero change to any output artifact. The parameter
1099+
is retained as an explicit override seam, not wired to the
1100+
``--exclude-license-details`` user flag (which still governs the
1101+
human-facing dashboard report URL).
10811102
10821103
Returns:
10831104
Tuple of (added_packages, removed_packages) dictionaries
@@ -1299,15 +1320,23 @@ def create_new_diff(
12991320
except OSError as e:
13001321
log.warning(f"Failed to clean up temporary file {temp_file}: {e}")
13011322

1302-
# Handle diff generation - now we always have both scans
1323+
# Handle diff generation - now we always have both scans.
1324+
#
1325+
# Note: we intentionally do NOT forward params.include_license_details
1326+
# (the --exclude-license-details user flag) into the diff request. The
1327+
# diff path never consumes embedded license data (see
1328+
# get_added_and_removed_packages docstring), so requesting it only bloats
1329+
# the response and risks the CE-224 truncation crash on large repos. The
1330+
# user flag still controls the dashboard report URL below; it just no
1331+
# longer gates this internal diff payload.
13031332
(
13041333
added_packages,
13051334
removed_packages,
13061335
packages
13071336
) = self.get_added_and_removed_packages(
13081337
head_full_scan_id,
13091338
new_full_scan.id,
1310-
include_license_details=getattr(params, "include_license_details", True)
1339+
include_license_details=False
13111340
)
13121341

13131342
# Separate unchanged packages from added/removed for --strict-blocking support

tests/core/test_sdk_methods.py

Lines changed: 18 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -95,13 +95,17 @@ def test_get_added_and_removed_packages(core):
9595
# Get two different scans to compare
9696
added, removed, all_packages = core.get_added_and_removed_packages("head", "new")
9797

98-
# Verify SDK was called correctly
98+
# Verify SDK was called correctly.
99+
# include_license_details defaults to "false": the diff path never consumes
100+
# embedded license data (license artifacts come from the PURL endpoint), so
101+
# requesting it only bloats the response and risks the CE-224 truncation
102+
# crash on large repos.
99103
core.sdk.fullscans.stream_diff.assert_called_once_with(
100104
core.config.org_slug,
101105
"head",
102106
"new",
103107
use_types=True,
104-
include_license_details="true",
108+
include_license_details="false",
105109
)
106110

107111
# Verify the results
@@ -116,6 +120,18 @@ def test_get_added_and_removed_packages(core):
116120
assert "dp2_t1" in removed # Verify transitive dependencies are also tracked
117121
assert "pypi/direct_package_1@1.6.0" in all_packages # Unchanged package is in full package map
118122

123+
def test_get_added_and_removed_packages_license_override(core):
124+
"""The include_license_details override seam still works when explicitly requested."""
125+
core.get_added_and_removed_packages("head", "new", include_license_details=True)
126+
127+
core.sdk.fullscans.stream_diff.assert_called_once_with(
128+
core.config.org_slug,
129+
"head",
130+
"new",
131+
use_types=True,
132+
include_license_details="true",
133+
)
134+
119135
def test_empty_alerts_preserved(core):
120136
"""Test that empty alerts arrays stay as empty arrays and don't become None"""
121137
# Get the scan that contains dp2 (which has empty alerts array)

uv.lock

Lines changed: 5 additions & 5 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

0 commit comments

Comments
 (0)