Fix eval tests #846

Merged
gspencergoog merged 4 commits into flutter:main from gspencergoog:fix_tests
Apr 1, 2026
Conversation

@gspencergoog
Collaborator

Description

This fixes the failing tests in packages/genui/examples/eval. They also aren't being run by CI, presumably because they make actual backend calls.

@gspencergoog gspencergoog requested a review from polina-c April 1, 2026 00:16
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request modifies the UI update logic by removing redundant surface updates in the SurfaceController and updating the prompt builder to instruct the AI against modifying existing surfaces. Feedback includes a security concern regarding the lack of backend enforcement for the 'no-update' policy, a suggestion to improve the robustness of surface update assertions in tests, and a recommendation to add a comment explaining the removal of the surface update trigger to clarify how reactivity is handled.

Comment on lines +120 to 122
updated.length == created.length,
'In chat setup surfaces should not be updated after initial creation',
);
Contributor

medium

The check updated.length == created.length only verifies the total number of events. It doesn't ensure that each specific surface was updated exactly once. For example, if one surface is updated twice and another is never updated, this check would still pass. A more robust check would verify the update count per surface ID.

      updated.length == created.length,
      'In chat setup surfaces should not be updated after initial creation',
    );
    for (final id in created) {
      final updateCount = updated.where((u) => u == id).length;
      reporter.expect(updateCount == 1, 'Surface $id should be updated exactly once');
    }
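
To make the failure mode described in this comment concrete, here is a minimal, self-contained Dart sketch. The function names `totalsMatch` and `notUpdatedExactlyOnce` and the sample IDs are hypothetical and not taken from the PR; the sketch assumes `created` and `updated` are lists of surface ID strings, as the suggested loop implies.

```dart
/// Aggregate check, mirroring the original assertion: compares totals only.
bool totalsMatch(List<String> created, List<String> updated) =>
    updated.length == created.length;

/// Per-ID check: returns the created IDs whose update count is not exactly 1.
/// (Hypothetical helper for illustration; not part of the eval test code.)
List<String> notUpdatedExactlyOnce(
    List<String> created, List<String> updated) {
  // Tally how many update events were recorded for each surface ID.
  final counts = <String, int>{};
  for (final id in updated) {
    counts[id] = (counts[id] ?? 0) + 1;
  }
  return [
    for (final id in created)
      if ((counts[id] ?? 0) != 1) id,
  ];
}

void main() {
  // Surface 'a' is updated twice and surface 'b' is never updated.
  final created = ['a', 'b'];
  final updated = ['a', 'a'];

  print(totalsMatch(created, updated)); // true
  print(notUpdatedExactlyOnce(created, updated)); // [a, b]
}
```

With these inputs the totals match even though neither surface was updated exactly once, which is the gap the per-ID loop is meant to close.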

gspencergoog and others added 2 commits March 31, 2026 17:45
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@gspencergoog gspencergoog force-pushed the fix_tests branch 2 times, most recently from 3a382b5 to 734ef06 Compare April 1, 2026 01:09
@gspencergoog gspencergoog merged commit 580c8a7 into flutter:main Apr 1, 2026
30 checks passed
2 participants