Skip to content

Refactor completion suggestions engine to use declarative rules#1761

Open
rolandwalker wants to merge 4 commits intomainfrom
RW/add-completion-engine-tests
Open

Refactor completion suggestions engine to use declarative rules#1761
rolandwalker wants to merge 4 commits intomainfrom
RW/add-completion-engine-tests

Conversation

@rolandwalker
Copy link
Copy Markdown
Contributor

@rolandwalker rolandwalker commented Mar 31, 2026

Description

Refactor completion suggestions engine to use declarative rules, breaking up suggest_based_on_last_token() into many small functions.

This is not perfect, as some functions such as _emit_lparen() remain large, and others such as suggest_special() are untouched. The rules are still ordered as a list, which could be finicky for future changes. An alternative is including a priority rank in the SuggestRule dataclass.

A risk is that the declarative rules make ample use of lambdas, which might impose a performance penalty.

Motivation: making the rules easier to understand and modify, and making rules easier to migrate to sqlglot, which may be more reliable and performant.

Further work could include turning the return values from the _emit functions into a list of Suggestion instances, instead of using dicts. It could also be nice to migrate the rules in SUGGEST_BASED_ON_LAST_TOKEN_RULES into a separate file completion_rules.py, in which perhaps all rules are not based on the last token.

The PR consists of four commits, which may make it easier to review

  1. tests
  2. the primary refactor to review
  3. tests
  4. make SQL parsing lazy for performance

Checklist

  • I added this contribution to the changelog.md file.
  • I added my name to the AUTHORS file (or it's already there).
  • To lint and format the code, I ran
    uv run ruff check && uv run ruff format && uv run mypy --install-types .

@rolandwalker rolandwalker self-assigned this Mar 31, 2026
@rolandwalker rolandwalker force-pushed the RW/add-completion-engine-tests branch from 132f539 to 4427504 Compare March 31, 2026 19:10
@rolandwalker rolandwalker added the codex request Codex review for an updated PR label Mar 31, 2026
@github-actions
Copy link
Copy Markdown

  1. No blocking findings on correctness or security in the PR diff (completion_engine refactor appears behavior-preserving, with targeted tests for most branches).

  2. Test gap: add a non-progressing rewind test for a binary operator token (not just comma), so the recursion guard in completion_engine.py is explicitly covered for both paths.
    Current coverage hits comma-only cases in test_completion_engine.py and test_completion_engine.py.

Validation note: I couldn’t run the suite in this environment because uv/sqlparse are unavailable (uv: command not found, ModuleNotFoundError: sqlparse).

@rolandwalker rolandwalker force-pushed the RW/add-completion-engine-tests branch from 4427504 to 9accb87 Compare March 31, 2026 19:27
@rolandwalker rolandwalker added codex request Codex review for an updated PR and removed codex request Codex review for an updated PR labels Mar 31, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 31, 2026

Codex Review:

  1. Medium: autocomplete performance regression from eager parsing

    • In completion_engine.py, suggest_based_on_last_token() now always builds SuggestContext, which always parses SQL via _build_suggest_context() -> _parse_suggestion_statement() (line 103).
    • Previously, cheap guards (word starts with digit, quote, inside quotes) returned early before parsing. That optimization is now gone, so parsing runs even in cases that immediately return [].
    • Impact: higher per-keystroke latency in common typing paths (especially quoted strings), i.e. a user-visible regression in completion responsiveness.
    • Action: restore fast-path short-circuiting before parse, or make parse/tokens lazy so rules that don’t need parse data don’t trigger it.
  2. Missing regression test for the above fast-path behavior

    • Current tests validate output correctness but not that parse is skipped for guard cases.
    • Action: add a test that monkeypatches _parse_suggestion_statement and asserts it is not called when word_before_cursor starts with digit/quote or cursor is inside '/" strings.

No security issues stood out in this PR.

I couldn’t run tests in this environment because uv/pytest are unavailable here.

@rolandwalker rolandwalker force-pushed the RW/add-completion-engine-tests branch from 9accb87 to abdf19b Compare March 31, 2026 19:47
@rolandwalker rolandwalker added codex request Codex review for an updated PR and removed codex request Codex review for an updated PR labels Mar 31, 2026
@github-actions
Copy link
Copy Markdown

Codex Review

No blocking findings in the PR diff (eb1dd47c...abdf19b) for correctness or security.

I checked the refactor in completion_engine.py against the prior branch logic and didn’t find behavior regressions in rule ordering or fallback semantics; the declarative rules preserve the old branches.

Residual risk / test gap:

  1. I could not execute the test suite in this environment because uv/pytest are unavailable (command not found), so this review is static-analysis only. Running uv run -- pytest test/pytests/test_completion_engine.py in CI/local is still important for confidence.

@rolandwalker rolandwalker removed the codex request Codex review for an updated PR label Mar 31, 2026
in preparation for a refactor
breaking up suggest_based_on_last_token() into many small functions.

This is not perfect, as some functions such as _emit_lparen() remain
large, and others such as suggest_special() are untouched.  The rules
are still ordered as a list, which could be finicky for future changes.
An alternative is including a priority rank in the SuggestRule
dataclass.

A risk is that the declarative rules make ample use of lambdas, which
might impose a performance penalty.

Motivation: making the rules easier to understand and modify, and
making rules easier to migrate to sqlglot, which may be more reliable
and performant.

Further work could include turning the return values from the _emit
functions into a list of Suggestion instances, instead of using dicts.
It could also be nice to migrate the rules in
SUGGEST_BASED_ON_LAST_TOKEN_RULES into a separate file
completion_rules.py, in which perhaps all rules are not based on the
last token.
adding some todo comments in completion_engine.py, related to tests
which are xfailed here.
so that guards such as "is inside string" can run without full parsing,
for performance.
@rolandwalker rolandwalker force-pushed the RW/add-completion-engine-tests branch from 823ea55 to bd6183c Compare April 1, 2026 11:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant