
Implement RISC-V dynamic linking #323

Draft
DrXiao wants to merge 6 commits into sysprog21:master from DrXiao:feat/dynlink-for-rv32

Conversation

@DrXiao
Collaborator

@DrXiao DrXiao commented Apr 23, 2026

The proposed changes primarily improve the ELF generation and the RISC-V backend, enabling the build system to generate a dynamically linked shecc targeting the RISC-V architecture.

Although the current changes allow both bootstrapping and the test suite to complete successfully, this is still a work in progress. The TODO items are as follows:

  • Enhance code quality, comments and commit messages.
  • Confirm the RISC-V ABI compliance.
  • Add a new script (riscv-abi.sh) to validate the RISC-V ABI.
  • Improve the documentation and README.
  • Consolidate arm.mk and riscv.mk into common build logic (e.g., configuring RUNNER_LD_PREFIX).
  • (Any new requirements ...)

Summary by cubic

Implements RV32 dynamic linking with RELA, an ABI-compliant PLT/GOT, and a __libc_start_main startup path; external calls route via PLT and the stack stays 16-byte aligned. Adds a RISC-V ABI test suite and expands CI to validate static/dynamic builds for arm and riscv with auto-detected toolchains.

  • New Features

    • RV32 dynamic linking: .rela.plt; PLT0=32B, stubs=16B; per-arch RESERVED_GOT_NUM (RV32=2, ARM=3); DYN_LINKER /lib/ld-linux-riscv32-ilp32d.so.1.
    • ELF: select REL/RELA via use_relaplt and ELF_MACHINE_*; correct dynamic tags (RELA/RELASZ/RELAENT/JMPREL/PLTREL/PLTRELSZ); GOT init honors reserved entries.
    • RISC-V codegen: generate PLT per ABI; direct calls for internal, externs via PLT; entry calls __libc_start_main; preserve ra; keep syscall trampoline for static.
    • Tests: tests/riscv-abi.sh; RV32 dynamic snapshots (hello, fib).
  • Dependencies

    • Shared cross-toolchain detection in mk/common.mk with per-arch TOOLCHAIN_CANDIDATES; CI downloads a RISC-V glibc toolchain and adds /opt/riscv/bin to PATH; static/dynamic matrices run for arm and riscv with RUNNER_LD_PREFIX auto-set under QEMU.
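
The PLT and GOT sizes listed in the summary above imply simple offset arithmetic. The following is a minimal sketch of that arithmetic, not code from this PR: the helper names and the 4-byte GOT entry size are assumptions, and it assumes the GOT slots for external functions directly follow the reserved entries.

```c
#include <stdint.h>

/* Sizes quoted in the summary above; all identifiers are hypothetical. */
enum {
    PLT0_SIZE = 32,        /* first PLT entry: 8 instructions x 4 bytes */
    PLT_STUB_SIZE = 16,    /* each later entry: 4 instructions x 4 bytes */
    GOT_ENTRY_SIZE = 4,    /* assumed: one 32-bit word per GOT slot */
    RESERVED_GOT_RV32 = 2, /* reserved GOT entries on RV32 */
    RESERVED_GOT_ARM = 3   /* reserved GOT entries on Arm */
};

/* Byte offset, from the start of .plt, of the RV32 stub for the n-th
 * external function (n starts at 0). */
static uint32_t plt_stub_offset(uint32_t n)
{
    return PLT0_SIZE + n * PLT_STUB_SIZE;
}

/* Byte offset, from the start of the GOT, of the slot for the n-th
 * external function, skipping the reserved entries. */
static uint32_t got_slot_offset(uint32_t reserved, uint32_t n)
{
    return (reserved + n) * GOT_ENTRY_SIZE;
}
```

Under these assumptions, `plt_stub_offset(0)` is 32 (just past PLT0) and `got_slot_offset(RESERVED_GOT_RV32, 0)` is 8. The PLT sizes apply to RV32 only; the Arm PLT uses a different layout.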

Written for commit ee9db4a. Summary will update on new commits.


@DrXiao
Collaborator Author

DrXiao commented Apr 23, 2026

As the apt package manager only provides a 64-bit RISC-V GNU cross-compilation toolchain, I use a 32-bit build of riscv-gnu-toolchain and rely on its artifacts for RISC-V dynamic linking development and validation.

The updated GitHub Actions workflow also downloads the 32-bit variant to validate RISC-V dynamic linking.

@jserv
Collaborator

jserv commented Apr 23, 2026

As the apt package manager only provides a 64-bit RISC-V GNU cross-compilation toolchain, I utilize riscv-gnu-toolchain, which is a 32-bit variant, and leverage its artifacts for RISC-V dynamic linking development and validation.
The updated GitHub Actions also downloads the 32-bit variant to validate RISC-V dynamic linking.

Evaluate "Run 32-bit applications on 64-bit Linux kernel", which is exactly RV32-on-RV64 userspace compatibility, not emulation.

@DrXiao
Collaborator Author

DrXiao commented Apr 24, 2026

Evaluate "Run 32-bit applications on 64-bit Linux kernel", which is exactly RV32-on-RV64 userspace compatibility, not emulation.

I'm not sure whether I understand correctly. Do you mean that the proposed changes should be verified on a RISC-V machine?

@jserv
Collaborator

jserv commented Apr 24, 2026

Do you mean that the proposed changes should be verified on a RISC-V machine?

See sysprog21/kbox#18
RISE RISC-V Runners is a managed GitHub Actions runner service that executes CI/CD workflows on real RISC-V hardware.

Introduce ELF_MACHINE_ARM32 (0x28) and ELF_MACHINE_RV32 (0xf3) to
support architecture-specific logic in future developments.
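
One example of architecture-specific logic that these constants can drive, per the PR summary, is the choice between REL and RELA relocation records. The sketch below is illustrative only: the struct layouts follow the 32-bit ELF specification, while `uses_rela` and `reloc_entry_size` are hypothetical helpers rather than functions from shecc.

```c
#include <stddef.h>
#include <stdint.h>

/* Machine IDs named in the commit message above. */
#define ELF_MACHINE_ARM32 0x28
#define ELF_MACHINE_RV32 0xf3

/* 32-bit REL record: the addend is stored in the relocated location. */
typedef struct {
    uint32_t r_offset; /* where to apply the relocation */
    uint32_t r_info;   /* symbol index and relocation type */
} rel32_t;

/* 32-bit RELA record: the addend is stored explicitly in the record. */
typedef struct {
    uint32_t r_offset;
    uint32_t r_info;
    int32_t r_addend;
} rela32_t;

/* Hypothetical helper: nonzero when the target uses RELA records. */
static int uses_rela(int machine)
{
    return machine == ELF_MACHINE_RV32;
}

/* Size of one relocation record for the target, e.g. the value a
 * linker-facing table would advertise as its entry size. */
static size_t reloc_entry_size(int machine)
{
    return uses_rela(machine) ? sizeof(rela32_t) : sizeof(rel32_t);
}
```

With these definitions, an Arm target gets 8-byte records and an RV32 target gets 12-byte records.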
@DrXiao DrXiao force-pushed the feat/dynlink-for-rv32 branch from 446b538 to 44f2f47 Compare May 9, 2026 03:52
@DrXiao
Collaborator Author

DrXiao commented May 9, 2026

RISE RISC-V Runners' documentation explicitly states that binaries must be compiled for riscv64. The FAQ entry "What architectures are supported?" says:

RISC-V 64-bit (riscv64) only. All runners execute on physical RISC-V hardware. There is no RISC-V emulation. Binaries must be compiled for riscv64.

I created another branch (feat/dynlink-for-rv32-test-rv64-runner) to verify whether these RISC-V runners can run riscv32 binaries. However, the test result indicates that the runner failed to execute a statically linked shecc.

Based on both the documentation and my test, it appears that RISE RISC-V runners lack support for 32-bit executables.

@DrXiao
Collaborator Author

DrXiao commented May 28, 2026

RISC-V calling convention:

  1. The first eight arguments are passed in a0-a7, and any extra arguments are pushed onto the stack.
  2. The stack should be 128-bit (16-byte) aligned. Section 2-1 of the RISC-V ABIs Specification states:

    2-1. Integer Calling Convention

    The stack grows downwards (towards lower addresses) and the stack pointer shall be aligned to a
    128-bit boundary upon procedure entry.

  3. Caller/Callee saved registers:

    Register ABI Name    Saver
    zero, gp, tp         (None)
    ra                   Caller
    sp                   Callee
    a0 - a7              Caller
    s0 - s11             Callee
    t0 - t6              Caller

Item 1 is handled by the register allocation phase.

The register allocator uses virtual registers (vreg0-vreg7) to allocate registers for arguments when it encounters a function call. vreg0-vreg7 are mapped to a0-a7, so the first eight arguments are naturally passed in these registers.

Since the current shecc only supports up to 8 arguments, no extra arguments need to be passed to the stack. Thus, we can skip this handling.
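
The vreg-to-register mapping described above can be sketched as follows. This is an illustration under stated assumptions, not the allocator's actual code: `arg_reg` and the constants are hypothetical names, and the only hard fact used is that a0-a7 are the consecutive hardware registers x10-x17.

```c
/* a0 is x10 in the RISC-V register file, and a0-a7 are consecutive. */
enum { RV_REG_A0 = 10, MAX_ARG_REGS = 8 };

/* Hypothetical helper: map argument virtual register index (0-7) to the
 * hardware register number of a0-a7. Returns -1 for an index that would
 * need a stack slot, which the comment above says shecc never produces. */
static int arg_reg(int vreg)
{
    if (vreg < 0 || vreg >= MAX_ARG_REGS)
        return -1;
    return RV_REG_A0 + vreg;
}
```

For instance, `arg_reg(0)` yields 10 (a0) and `arg_reg(7)` yields 17 (a7).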

Item 2 is ensured by the RISC-V code generator.

When handling the stack pointer, the code generator will guarantee that sp is always incremented or decremented by a multiple of 16 bytes.
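
One common way to guarantee this is to round every frame size up to the next multiple of 16 before adjusting sp. The helper below is a minimal sketch of that rounding; `align16` is a hypothetical name, and the code generator in this PR may compute its frame sizes differently.

```c
/* Round a frame size in bytes up to the next multiple of 16, so that
 * adding it to or subtracting it from an aligned sp keeps sp aligned. */
static unsigned align16(unsigned size)
{
    return (size + 15u) & ~15u;
}
```

For example, a 20-byte frame becomes 32 bytes, while a 16-byte frame is left unchanged.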

Item 3:

  • ra and sp: properly handled by the code generator.
  • a0 - a7: implicitly handled in the register allocation phase.
  • s0 - s11: do not need to be preserved or restored.
  • t0 - t6: do not need any handling.

Therefore, this item can also be considered complete. Further details and explanations can be found in riscv-codegen.c.

DrXiao added 5 commits June 1, 2026 21:43
This commit primarily improves the ELF handling and code generator to
enable the compiler to produce a dynamically linked executable targeting
the RISC-V architecture.

- Allow the ELF handling to generate RELA relocation table.
  - Use REL relocation when the target architecture is Arm. Otherwise,
    use RELA relocation for RISC-V.
- Improve GOT generation process.
  - Arm: reserve three entries.
  - RISC-V: reserve two entries.
- Implement PLT generation for RISC-V.
  - The generation process follows the RISC-V ABI. The first PLT entry
    uses 8 instructions to call '_dl_runtime_resolve'. The subsequent
    entry uses 4 instructions to perform an indirect function call via
    GOT.
- Refine the function call handling for the RISC-V code generator.
  - Perform a direct call for internal functions
  - Otherwise, use the PLT to perform an indirect call for external
    functions.
- Enhance the build system:
  - Allow the build system to generate dynamically linked compilers when
    targeting the RISC-V architecture.
  - Detect the sysroot path of the RISC-V GNU toolchain automatically.
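
The PLT stub in the RISC-V psABI loads its GOT slot with an auipc-plus-load pair, which needs the PC-relative offset split into a 20-bit upper part and a sign-extended 12-bit lower part. The sketch below shows that split; it assumes the stubs generated here use the same pair, and `pcrel_hi20` and `pcrel_lo12` are hypothetical helper names.

```c
#include <stdint.h>

/* Upper 20-bit field for auipc. Adding 0x800 first compensates for the
 * sign extension of the low 12 bits applied by the following load. */
static uint32_t pcrel_hi20(int32_t offset)
{
    return ((uint32_t) offset + 0x800u) >> 12;
}

/* Low 12 bits as the signed immediate used by the load instruction. */
static int32_t pcrel_lo12(int32_t offset)
{
    int32_t lo = (int32_t) ((uint32_t) offset & 0xfffu);
    return lo >= 0x800 ? lo - 0x1000 : lo;
}
```

The two parts recombine to the original offset: `(pcrel_hi20(off) << 12) + pcrel_lo12(off)` equals `off` in 32-bit arithmetic.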

Modify the 'update-snapshots' and 'check-snapshots' make targets to
include generation and validation of new snapshots for the RISC-V
architecture using dynamic linking.

The update workflow now downloads a RISC-V GNU toolchain to provide
necessary dependencies and validate the dynamically linked compiler
targeting the RISC-V architecture.

Because two architecture-specific makefile fragments contain similar
snippets for locating the cross-compilation toolchain path, this commit
consolidates them into a shared build logic, thereby reducing code
duplication.

A new shell script is introduced to validate whether generated
executables targeting RISC-V correctly comply with the RISC-V ABI.

The tests include:
- Parameter Passing: tests function calls with different numbers of
  arguments.
- Stack Alignment: validates whether the stack is always 16-byte
  aligned when calling a function.
- Return Values: confirms if the return value is correct after a
  function returns.
- External Calls: verifies whether dynamically linked programs can call
  external functions.
- Register Preservation: verifies whether the contents of function
  argument registers are properly preserved when calling a function.
- Structure Passing: validates if a small structure object can be passed
  correctly.
@DrXiao DrXiao force-pushed the feat/dynlink-for-rv32 branch from 2599d74 to ee9db4a Compare June 1, 2026 13:44