fix(server): avoid loading huge task results for metadata queries by contrueCT · Pull Request #3060 · apache/hugegraph

contrueCT · 2026-06-17T13:48:28Z

Purpose of the PR

Fix task metadata/recovery paths that can still load and decompress large task results, making GET /tasks, task restore, and DELETE /tasks/{id}?force=true unusable when historical tasks contain huge result payloads.

This is related to #3057 and #3059. The PR focuses on metadata-only access and cleanup paths, while large Gremlin result export/chunking can be handled separately.

Main Changes

Add scheduler APIs that allow task reads with or without result payloads.
Add metadata-only task deserialization that avoids reading ~task_result.
Keep GET /tasks/{id} default behavior compatible, and add with_result=false.
Make GET /tasks, task restore, and delete paths use metadata-only reads.
Add lightweight task vertex deletion to avoid GraphTransaction.removeVertex() force-loading indexed task vertices.
Avoid loading separated ~taskresult vertices when cleaning distributed task results.
Return a result-stripped copy for in-memory tasks when withResult=false.
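
As a rough illustration of the metadata-only read path listed above, here is a minimal sketch. `TaskRecord` and `SketchScheduler` are hypothetical stand-ins, not the classes touched by this PR, and the in-memory map stands in for the backend. In the real change the saving comes from never reading the result property from storage; the sketch only shows the shape of the `withResult` switch and the result-stripped copy.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a stored task; not HugeGraph's HugeTask.
final class TaskRecord {
    final long id;
    final String status;
    final String result; // potentially huge payload; null once stripped

    TaskRecord(long id, String status, String result) {
        this.id = id;
        this.status = status;
        this.result = result;
    }

    // Copy that keeps the metadata fields and drops the payload.
    TaskRecord withoutResult() {
        return new TaskRecord(this.id, this.status, null);
    }
}

// Hypothetical in-memory scheduler showing the withResult switch.
final class SketchScheduler {
    private final Map<Long, TaskRecord> store = new LinkedHashMap<>();

    void save(TaskRecord task) {
        this.store.put(task.id, task);
    }

    // Single-task read: the caller chooses whether the payload is returned.
    TaskRecord task(long id, boolean withResult) {
        TaskRecord task = this.store.get(id);
        if (task == null) {
            throw new NoSuchElementException("No task with id " + id);
        }
        return withResult ? task : task.withoutResult();
    }

    // List path: always metadata only, so one huge result cannot stall it.
    List<TaskRecord> list() {
        List<TaskRecord> out = new ArrayList<>();
        for (TaskRecord task : this.store.values()) {
            out.add(task.withoutResult());
        }
        return out;
    }

    public static void main(String[] args) {
        SketchScheduler scheduler = new SketchScheduler();
        scheduler.save(new TaskRecord(1L, "success", "large-result-payload"));
        System.out.println(scheduler.task(1L, true).result);
        System.out.println(scheduler.task(1L, false).result);
        System.out.println(scheduler.list().get(0).status);
    }
}
```

Returning a copy rather than nulling the field in place keeps the cached in-memory task intact for callers that still need the result.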

Verifying these changes

Trivial rework / code cleanup without any test coverage. (No Need)
Already covered by existing tests, such as (please modify tests here).
Need tests and can be verified as follows:
- mvn test -pl hugegraph-server/hugegraph-test -am -P core-test,rocksdb -Dtest=TaskCoreTest -DfailIfNoTests=false -Dcheckstyle.skip=true -Drat.skip=true
- mvn compile -pl hugegraph-server/hugegraph-api -am -DskipTests=true -Dcheckstyle.skip=true -Drat.skip=true
- mvn test-compile -pl hugegraph-server/hugegraph-test -am -P api-test,rocksdb -Dtest=TaskApiTest -DskipTests=true -Dcheckstyle.skip=true -Drat.skip=true

Does this PR potentially affect the following parts?

Dependencies (add/update license info & regenerate_known_dependencies.sh)
Modify configurations
The public API
Other effects (typed here)
Nope

Documentation Status

Doc - TODO
Doc - Done
Doc - No Need

codecov · 2026-06-17T14:17:02Z

Codecov Report

❌ Patch coverage is 59.84252% with 51 lines in your changes missing coverage. Please review.
✅ Project coverage is 29.90%. Comparing base (c3f56b5) to head (47fb2f9).

Files with missing lines	Patch %	Lines
.../apache/hugegraph/task/TaskAndResultScheduler.java	0.00%	26 Missing ⚠️
...g/apache/hugegraph/task/StandardTaskScheduler.java	70.37%	7 Missing and 1 partial ⚠️
.../main/java/org/apache/hugegraph/task/HugeTask.java	86.84%	2 Missing and 3 partials ⚠️
...pache/hugegraph/task/DistributedTaskScheduler.java	0.00%	4 Missing ⚠️
...ava/org/apache/hugegraph/task/TaskTransaction.java	76.47%	1 Missing and 3 partials ⚠️
.../org/apache/hugegraph/auth/HugeGraphAuthProxy.java	57.14%	3 Missing ⚠️
...ain/java/org/apache/hugegraph/api/job/TaskAPI.java	80.00%	1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (c3f56b5) and HEAD (47fb2f9).

HEAD has 2 fewer uploads than BASE

Flag	BASE (c3f56b5)	HEAD (47fb2f9)
	3	1

Additional details and impacted files

@@             Coverage Diff              @@
##             master    #3060      +/-   ##
============================================
- Coverage     36.10%   29.90%   -6.20%     
+ Complexity      338      264      -74     
============================================
  Files           803      804       +1     
  Lines         68291    68382      +91     
  Branches       8970     8982      +12     
============================================
- Hits          24656    20452    -4204     
- Misses        40990    45564    +4574     
+ Partials       2645     2366     -279


imbajin

Blocking: yes. Summary: Found a scheduler API compatibility regression and a failing targeted test. Evidence: TaskCoreTest#testTaskWithoutResult failed locally; codecov/project is also failing.

🔗 Please also check the failed codecov/project status: https://app.codecov.io/gh/apache/hugegraph/pull/3060

imbajin · 2026-06-17T19:22:45Z

    @Override
    public <V> Iterator<HugeTask<V>> tasks(List<Id> ids) {
-        return this.tasksWithoutResult(ids);
+        return this.tasks(ids, false);


‼️ Preserve the existing tasks(...) result contract

The new TaskScheduler default methods keep the old behavior as tasks(ids) -> tasks(ids, true), but this override changes the task/result-backed scheduler to return metadata-only tasks by default. Existing Java callers of scheduler.tasks(...).next().result() can now silently get null while scheduler.task(id) still returns the result. Please keep the no-result path opt-in by removing these overrides or delegating them to withResult=true, and pass false only from the metadata-only call sites such as REST list, restore, delete, and scheduler scans.
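
A minimal sketch of the contract being asked for, using hypothetical types (`SketchTask`, `SketchTaskScheduler`, `ResultBackedScheduler`) rather than the real scheduler classes: the legacy overload delegates to `withResult=true`, the implementation does not override it, and only callers that explicitly pass `false` get metadata-only tasks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical task type; result is null when read metadata-only.
final class SketchTask {
    final long id;
    final String result;

    SketchTask(long id, String result) {
        this.id = id;
        this.result = result;
    }
}

// Hypothetical scheduler interface mirroring the described default methods.
interface SketchTaskScheduler {
    // Primary read: the caller decides whether result payloads are loaded.
    List<SketchTask> tasks(List<Long> ids, boolean withResult);

    // Legacy overload keeps the historical contract: results included.
    default List<SketchTask> tasks(List<Long> ids) {
        return this.tasks(ids, true);
    }
}

// Implementation that deliberately does NOT override tasks(ids), so the
// no-result path stays opt-in for existing Java callers.
final class ResultBackedScheduler implements SketchTaskScheduler {
    @Override
    public List<SketchTask> tasks(List<Long> ids, boolean withResult) {
        List<SketchTask> out = new ArrayList<>();
        for (Long id : ids) {
            // Stand-in for loading (or skipping) the stored result.
            out.add(new SketchTask(id, withResult ? "result-" + id : null));
        }
        return out;
    }
}
```

Under this shape, metadata-only call sites such as list, restore, and delete would call `tasks(ids, false)` explicitly, while `tasks(ids).next().result()` style callers keep seeing results.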

imbajin · 2026-06-17T19:22:45Z

+        TaskScheduler scheduler = graph.taskScheduler();
+        CountDownLatch latch = new CountDownLatch(1);
+
+        TaskCallable<String> callable = new TaskCallable<String>() {


⚠️ Use a reloadable task callable in this regression test

This anonymous TaskCallable is persisted by class name, then the worker reloads it through TaskCallable.fromClass(). It has no public no-arg constructor, so the task fails to load as TaskCallable$1, and scheduler.waitUntilTaskCompleted(id, 10) times out. I reproduced it with mvn test -pl hugegraph-server/hugegraph-test -am -P unit-test -Dtest=TaskCoreTest#testTaskWithoutResult -DfailIfNoTests=false -ntp. Please use an existing reloadable callable or add a small static callable class for this test.
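
To make the failure mode concrete, here is a small sketch of reload-by-class-name. The `fromClass` helper below is a hypothetical loader written for this example (it uses `getConstructor().newInstance()`), not HugeGraph's actual `TaskCallable.fromClass()`, and `Callable` stands in for `TaskCallable`. It shows why a static class with a public no-arg constructor can be reloaded while an anonymous class cannot.

```java
import java.util.concurrent.Callable;

final class ReloadSketch {

    // Static nested class with a public no-arg constructor: reloadable by name.
    public static final class StaticCallable implements Callable<String> {
        public StaticCallable() {
        }

        @Override
        public String call() {
            return "done";
        }
    }

    // Anonymous class: its implicit constructor is not public.
    static Callable<String> anonymous() {
        return new Callable<String>() {
            @Override
            public String call() {
                return "done";
            }
        };
    }

    // Hypothetical loader: resolve the class by name, then instantiate it
    // through its public no-arg constructor.
    static Callable<?> fromClass(String className)
            throws ReflectiveOperationException {
        Class<?> clazz = Class.forName(className);
        return (Callable<?>) clazz.getConstructor().newInstance();
    }

    // True if the callable's class can be re-instantiated from its name.
    static boolean reloadable(Callable<?> callable) {
        try {
            fromClass(callable.getClass().getName());
            return true;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(reloadable(new StaticCallable()));
        System.out.println(reloadable(anonymous()));
    }
}
```

An anonymous callable that captures a local such as the `CountDownLatch` has the same problem, since its generated constructor takes the captured values as parameters.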

fix(task): avoid loading huge task results

47fb2f9

dosubot Bot added the size:L, api, and bug labels · Jun 17, 2026

imbajin reviewed Jun 17, 2026

contrueCT wants to merge 1 commit into apache:master from contrueCT:task/avoid-loading-task-result
