Cassie Jones [Sun, 1 Mar 2020 20:15:16 +0000 (21:15 +0100)]
Implement variable capture
We just copy the variables into the closure environment after it gets
allocated. This involves a decent chunk of stack traffic because we have
to increment their reference counts.
In the process, I figured out that the "ivy_app_mut" optimization for
later functions isn't sound in the presence of nested functions, because
one of the earlier applications can return a shared function which isn't
acceptable to mutate. (There are ways to make it sound if you know for
sure what functions are being called).
Cassie Jones [Sun, 1 Mar 2020 16:28:31 +0000 (17:28 +0100)]
Reduce stack traffic by finding loads through SSA
Instead of loading Load instructions directly and then storing them onto
the stack, we perform loads by looking through the SSA and seeing what
instruction generated them. If it was a load, we run that load directly,
otherwise we load from the stack slot that it corresponds to.
Cassie Jones [Sun, 1 Mar 2020 14:53:18 +0000 (15:53 +0100)]
Make entry point symbols respect mangling
The entry point symbols were hardcoded in strings, now they use the
Label::Extern so they get proper printing. These should be migrated to
use the assembly instruction types, probably.
Cassie Jones [Sun, 1 Mar 2020 13:30:22 +0000 (14:30 +0100)]
Modify runtime symbol decoration
C name decoration varies across platforms. On macOS, names from C are
prefixed with an underscore. On Linux, they're not, and on 64-bit
Windows they're not.
The macOS situation observed by inspecting the symbols.
Linux compatibility issue discovered here (and above in the thread):
https://twitter.com/16kbps/status/1233955883148861440
Windows symbol decoration documented here:
https://docs.microsoft.com/en-us/cpp/build/reference/decorated-names#FormatC
> Note that in a 64-bit environment, functions are not decorated.
Currently handling this with conditional compilation, but that prevents
cross-compilation. In the future it would be good to have the formatting
method select how to print them, but that would involve switching away
from fmt::Display, since we can't pass the ABI information into that.
Alternately, we could pack the ABI information into the symbol enum.
Cassie Jones [Sun, 1 Mar 2020 02:10:11 +0000 (03:10 +0100)]
Add a compiler CLI
The compiler CLI lets you output assembly, the pretty-printed code, or a
compiled binary. It shells out to clang for assembling. You have to have
clang installed, and you have to have the runtime library in the correct
place. I need to figure out a way to embed the runtime library or
something...
Cassie Jones [Sun, 1 Mar 2020 00:48:44 +0000 (01:48 +0100)]
Add an extensible framework for adding built-ins
Now builtins can be written in assembly at the bottom, and they'll be
included and registered in the entry point on-demand by the program.
Only functions that are referenced will be included in the output.
This will need to be extended to handle functions which reference other
functions (like the handling of "true" and "false" if any of the
comparisons are included), but that's a matter of front-end changes.
Cassie Jones [Sat, 29 Feb 2020 18:28:23 +0000 (19:28 +0100)]
Implement basic code generation
This commit makes it possible to compile and assemble sixty-four.vy, and
compute the correct result!!!!!!!
In order to avoid having to do any detailed analysis on variable usage,
this currently accesses everything via the stack. This means allocating
stack space for every instruction, and then saving and loading
everything from the stack frame.
The translation of individual functions is pretty direct. Due to going
via the stack for everything, we can translate each individual
instruction on its own, in order, without looking at other instructions.
This currently doesn't implement copying data into the closures, because
the current test case doesn't do that at all.
This also hard-codes the one builtin (debug) that's referred to by the
test program.
Cassie Jones [Sat, 29 Feb 2020 16:56:45 +0000 (17:56 +0100)]
Add skeleton of x64 compilation code
We compile the globals and functions in the program into a list of
global labels and definitions. Add a bunch of instruction and etc. types
here for formatting. Then, assemble all of this together in the program.
The program realigns the stack (probably overkill?) and then calls into
the entry point symbol, then exits.
I added ivy_exit so that I wouldn't have to refer to any
platform-specific symbols in the assembly that's emitted, all of that
can be handled by the runtime. Unfortunately there will still be
platform specific behavior because of different calling conventions.
Cassie Jones [Sat, 29 Feb 2020 15:21:48 +0000 (16:21 +0100)]
Add the target-independent trans module
This handles translation of the AST to an SSA form. Currently not doing
any control flow constructs inside a function, so functions just have a
single basic block with linear SSA. This does some basic load forwarding
and de-duplication optimizations since they're straightforward.
The most important effect here is translating the globally resolved
names into offsets into the closure record, and listing the variables
that need to be copied into constructed closures.
Cassie Jones [Fri, 28 Feb 2020 20:39:37 +0000 (21:39 +0100)]
Add runtime library and example assembly
This is further exploration into what the code could look time, but this
time there were productive results. This has a few parts:
- The runtime is written as Rust code! It's almost 100% unsafe code
because I didn't want to think about safety boundaries with the
assembly code that's outside the runtime. I can probably end up making
almost all of it safe code with a little bit of work.
- The code mostly works by constructing lambdas with the runtime
function, and then lambda bodies are mostly just applying objects to
each other with the ivy_app function.
This sample program implements doing arithmetic on Church numerals, and
then uses it to run the debug print 64 times.
Cassie Jones [Fri, 28 Feb 2020 20:03:07 +0000 (21:03 +0100)]
Add memory layout docs and assembly experiments
My goal was to figure out how I intended to compile the language by
writing the assembly myself and figuring out the format of all the
objects. I pretty quickly realized that lots of this belongs as runtime
routines rather than sequences the compiler outputs, for things like
creating closures, allocating objects, etc.
What you see here is in a sort of mixed-up state because it started out
by making raw system calls (so allocate and free were more involved),
but is now set up to call into system calls via libSystem.dylib. It
doesn't do anything interesting right now, it was just set up to learn
some things.
Most of the code here is implementing type-checking for the tagged
pointer + 63-bit integer scheme, which is inspired by OCaml. The
specific object layouts are described in docs/object-layout.md.
Cassie Jones [Thu, 27 Feb 2020 18:26:23 +0000 (19:26 +0100)]
Add a pretty-printer for the AST
This uses the pretty-crate for wadler-style pretty-printers. I'm
printing them with a pretty syntax and not with s-expressions, because
it's easier to print pretty than it is to parse pretty.
Cassie Jones [Thu, 27 Feb 2020 17:27:37 +0000 (18:27 +0100)]
Add support for globals which don't get captured
Lots of top-level definitions don't need to be copied into every
closure, because they're constant for the whole program. Global
variables are never inserted into the upvars entry, so they won't be
copied into closures.
This also adds "builtin" names that are inserted into the initial
context for arithmetic and comparisons. Booleans have to be builtin due
to being returned by the comparisons, but they're still intended to be
implemented via church encoding.
Cassie Jones [Thu, 27 Feb 2020 16:58:19 +0000 (17:58 +0100)]
Parse S-expressions into the abstract syntax tree
While doing this, we handle name resolution and "upvar" captures. The
name resolution context "Env" forms a stack that has a reference to the
outer environment at each level. A new level is pushed at each lambda
that's introduced. The environment includes a list of "upvars" that have
to be copied into the closure, which is updated every time a variable
lookup finds the variable in an outer environment instead of the local
environment. This process has to use a RefCell because you can't make
the lifetimes work out with a chain of mutable references.
The parsing aside from that environment stack is relatively
straightforward parsing.
Cassie Jones [Thu, 27 Feb 2020 13:19:40 +0000 (14:19 +0100)]
An s-expression parser and an example program
This is intended to be a slightly fancier lambda calculus with
arithmetic support. The way I've written the example here suggests not
making conditionals a builtin, and instead just using church encodings.
This will probably be bad for displaying data purposes? I'm not sure how
I want to handle I/O. Maybe do an ML and put impure I/O in here?