Provide more control over and access to JIT's output

Proposed by JeffreyYasskin.

This proposal is a work in progress. I'll mail it to llvmdev when it's ready. If I have any facts wrong, feel free to fix them. :)

Current State

Right now, the ExecutionEngine provides functions to manually map a GlobalValue* to a particular address, to allocate memory for GlobalValues and emit their contents into that memory, and to free some of the kinds of memory it can allocate. This leaves several needs unmet.

Unsatisfied requirements

  • There's no way to programmatically get the address range of a JITted function to pass to a disassembler like udis or gdb. The --debug-only=jit command-line flag will print this information to stderr, but it would be nice to get it from within the program.
  • There's no method to free emitted GlobalValues that aren't used anymore.
  • When recompiling a function, the JIT overwrites the original machine code with a jump to the new machine code. It's the user's responsibility to exclude other threads from this code while the compilation proceeds. To reduce the amount of time threads need to be blocked, it would be nice to separate the recompilation from the step overwriting old machine code.
  • There's no facility for garbage collecting old machine code after all users have been updated.
  • Some ModulePasses delete unused globals. The JIT keeps a reference to any global transitively referred to by anything it compiles. The JIT also forbids deleting anything it has a reference to. This effectively prevents running global optimizations on code that has already been JITted once.
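
The threading requirement above can be illustrated with a minimal sketch that separates the slow recompilation step from the short step that redirects callers. This is an analogy, not the JIT's mechanism: it swaps an atomic function pointer instead of overwriting machine code with a jump. The names PatchableSlot, recompile, versionOne and versionTwo are hypothetical and exist only for this illustration.

```cpp
#include <atomic>
#include <cassert>

using Fn = int (*)(int);

// Stand-ins for the old and the recompiled machine code of one function.
int versionOne(int X) { return X + 1; }
int versionTwo(int X) { return X + 2; }

// Hypothetical slow step: produce new code without touching the old code,
// so no other thread needs to be blocked while it runs.
Fn recompile() { return versionTwo; }

// Hypothetical indirection slot that callers go through.
class PatchableSlot {
public:
  explicit PatchableSlot(Fn Initial) : Target(Initial) {}

  // Callers always read the current target atomically.
  int call(int X) const { return Target.load(std::memory_order_acquire)(X); }

  // Fast step: redirect future callers and return the previous target, so
  // the owner can reclaim it once no thread is still executing it.
  Fn patch(Fn New) { return Target.exchange(New, std::memory_order_acq_rel); }

private:
  std::atomic<Fn> Target;
};
```

In this sketch only patch() needs to be coordinated with running threads, and the pointer it returns is the piece of old code that a later garbage-collection step would have to free.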

Proposed solution

Create a class (hierarchy?) representing JIT output. This class can have methods to get the address range of a particular emitted chunk. This class should also provide a way to delete the chunk when it's no longer used and remove the JIT's reference to the underlying GlobalValue, allowing the GlobalValue to be deleted. Add new versions of functions like getOrEmitGlobalVariable and getPointerToFunction that return an instance of this class.

Data members available in this new class:

  • void*: the address of the machine code
  • int: the size of the machine code in memory
  • relocations: (maybe?) the array of relocation objects, so the machine code can be moved by the user
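
As a rough sketch of what such a class might look like, assuming a single class rather than a hierarchy: the name JITOutput, its methods, and the release callback are all hypothetical and are not part of any existing LLVM API. Relocations are left out because their representation is still an open question. The sketch uses std::size_t for the size rather than int.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical handle for one emitted chunk of JIT output.
class JITOutput {
public:
  // Callback that returns the chunk to whoever allocated it (an assumption;
  // in the JIT this would also drop the reference to the GlobalValue).
  using Releaser = std::function<void(void *, std::size_t)>;

  JITOutput(void *Address, std::size_t Bytes, Releaser OnDiscard)
      : Addr(Address), Size(Bytes), Release(std::move(OnDiscard)) {
    assert((Addr != nullptr || Size == 0) && "a null chunk must be empty");
  }

  // The handle owns the chunk, so it can be moved but not copied.
  JITOutput(const JITOutput &) = delete;
  JITOutput &operator=(const JITOutput &) = delete;
  JITOutput(JITOutput &&Other)
      : Addr(Other.Addr), Size(Other.Size), Release(std::move(Other.Release)) {
    Other.Addr = nullptr;
    Other.Size = 0;
    Other.Release = nullptr;
  }

  ~JITOutput() { discard(); }

  // Address range [begin(), end()) of the emitted chunk.
  void *begin() const { return Addr; }
  void *end() const { return static_cast<char *>(Addr) + Size; }
  std::size_t size() const { return Size; }
  bool isLive() const { return Addr != nullptr; }

  // Give the memory back; safe to call more than once.
  void discard() {
    if (Addr && Release)
      Release(Addr, Size);
    Addr = nullptr;
    Size = 0;
  }

private:
  void *Addr;
  std::size_t Size;
  Releaser Release;
};
```

A caller could pass begin() and end() to a disassembler, and call discard() (or let the handle go out of scope) once the chunk is no longer used.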

Holes in the solution

How do other JITs manage these problems?

What should we do for the interpreter?

Comments

See Talk page for comments.
