alpaca-lang / alpaca
- Sunday, 26 March 2017, 03:12:28
Erlang
Functional programming inspired by ML for the Erlang VM
Alpaca is a statically typed, strict/eagerly evaluated, functional programming language for the Erlang virtual machine (BEAM). At present it relies on type inference rather than explicit type annotations. It was formerly known as ML-flavoured Erlang (MLFE).
Make sure the following are installed:
Make a new project with rebar3 new app your_app_name and, in the rebar.config file in your project's root folder (e.g. your_app_name/rebar.config), add the following:
{plugins, [
{rebar_prv_alpaca, ".*", {git, "https://github.com/alpaca-lang/rebar_prv_alpaca.git", {branch, "master"}}}
]}.
{provider_hooks, [{post, [{compile, {alpaca, compile}}]}]}.
Check out the tour for the language basics, put source files ending in .alp in your source folders, run rebar3 compile and/or rebar3 eunit.
Something that looks and operates a little bit like an ML on the Erlang VM with:
term() or any()
The above is still a very rough and incomplete set of wishes. In future it might be nice to have dialyzer check the type coming back from the FFI and suggest possible union types if there isn't an appropriate one in scope.
.beam binaries
Here's an example module:
module simple_example
-- a basic top-level function:
let add2 x = x + 2
let something_with_let_bindings x =
  -- a function:
  let adder a b = a + b in
  -- a variable (immutable):
  let x_plus_2 = adder x 2 in
  add2 x
-- a polymorphic ADT:
type messages 'x = 'x | Fetch pid 'x
{- A function that can be spawned to receive `messages int`
   messages, that increments its state by received integers
   and can be queried for its state.
-}
let will_be_a_process x = receive with
    i -> will_be_a_process (x + i)
  | Fetch sender ->
    let sent = send x sender in
    will_be_a_process x
let start_a_process init = spawn will_be_a_process init
Alpaca is released under the terms of the Apache License, Version 2.0
Copyright 2016 Jeremy Pierre
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Please note that this project is released with a Contributor Code of Conduct, version 1.4. By participating in this project you agree to abide by its terms. See code_of_conduct.md for details.
You can join #alpaca-lang on freenode to discuss the language (directions, improvement) or get help. This IRC channel is governed by the same code of conduct detailed in this repository.
Pull requests with improvements and bug reports with accompanying tests are welcome.
It's not very usable yet but the tests should give a relatively clear picture as to where we're going. test_files contains some example source files used in unit tests. You can call alpaca:compile({files, [List, Of, File, Names, As, Strings]}, [list, of, options]) or alpaca:compile({text, CodeAsAString}, [options, again]) for now.
Supported options are:
'test' - This option will cause all tests in a module to be type checked and exported as functions that EUnit should pick up.
{'warn_exhaustiveness', boolean()} - If set to true (the default), the compiler will print warnings regarding missed patterns in top level functions.
Errors from the compiler (e.g. type errors) are almost comically hostile to usability at the moment. See the tests in alpaca_typer.erl.
You will generally want the following two things installed:
Thanks to @tsloughter's Alpaca Rebar3 plugin it's pretty easy to get up and running.
Make a new project with Rebar3 (substituting whatever project name you'd like for alpaca_example):
$ rebar3 new app alpaca_example
$ cd alpaca_example
In the rebar.config file in your project's root folder add the following (borrowed from @tsloughter's docs):
{plugins, [
{rebar_prv_alpaca, ".*", {git, "https://github.com/alpaca-lang/rebar_prv_alpaca.git", {branch, "master"}}}
]}.
{provider_hooks, [{post, [{compile, {alpaca, compile}}]}]}.
Now any files in the project's source folders that end with the extension .alp will be compiled and included in Rebar3's output folders (provided they type-check and compile successfully of course).
For a simple module, open src/example.alp and add the following:
module example
export add/2
let add x y = x + y
The above is just what it looks like: a module named example with a function that adds two integers. You can call the function directly from the Erlang shell after compiling like this (note that Alpaca prepends alpaca_ to the module name, so in the Erlang shell you must add the prefix explicitly):
$ rebar3 shell
... compiler output skipped ...
1> alpaca_example:add(2, 6).
8
2>
Note that calling Alpaca from Erlang won't do any type checking but if you've written a variety of Alpaca modules in your project, all their interactions with each other will be type checked and safe (provided the compile succeeds).
If you have installed the prerequisites given above, clone this repository and run tests and dialyzer with:
rebar3 eunit
rebar3 dialyzer
There's no command line front-end for the compiler so unless you use @tsloughter's Rebar3 plugin detailed in the previous section, you will need to boot the Erlang shell and then run alpaca:compile/2 to build and type-check things written in Alpaca. For example, if you wanted to compile the type import test file in the test_files folder:
rebar3 shell
...
1> Files = ["test_files/basic_adt.alp", "test_files/type_import.alp"].
2> alpaca:compile({files, Files}, []).
This will result in either an error or a list of tuples of the following form:
{compiled_module, ModuleName, FileName, BeamBinary}
The files will not actually be written by the compiler, so for now the binaries described by the tuples can either be loaded directly into the running VM (see the tests in alpaca.erl) or written out manually, unless of course you're using the aforementioned rebar3 plugin.
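As a rough sketch of the manual route, the binaries could be loaded with code:load_binary/3 or written to disk with file:write_file/2. This assumes the successful result is a plain list of the tuples described above and that ModuleName is an atom. The module alpaca_load_sketch and its functions load_all/1 and write_all/2 are hypothetical helpers for illustration and are not part of Alpaca:

```erlang
-module(alpaca_load_sketch).
-export([load_all/1, write_all/2]).

%% Hypothetical helper: load each compiled module straight into the
%% running VM. Assumes Compiled is a list of
%% {compiled_module, ModuleName, FileName, BeamBinary} tuples and that
%% ModuleName is an atom.
load_all(Compiled) ->
    [begin
         {module, Mod} = code:load_binary(Mod, File, Bin),
         Mod
     end || {compiled_module, Mod, File, Bin} <- Compiled].

%% Hypothetical helper: write each binary to OutDir as ModuleName.beam
%% instead of loading it.
write_all(Compiled, OutDir) ->
    [begin
         Path = filename:join(OutDir, atom_to_list(Mod) ++ ".beam"),
         ok = file:write_file(Path, Bin),
         Path
     end || {compiled_module, Mod, _File, Bin} <- Compiled].
```

Either function crashes with a badmatch on the first module that fails to load or write, which is usually what you want in a shell session.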
Most of the basic Erlang data types are supported:
booleans, true or false
atoms, :atom
floats, 1.0
integers, 1
strings, "A string". These are encoded as UTF-8 binaries.
character lists, c"characters here"
lists, [1, 2, 3] or 1 :: 2 :: [3]
binaries, <<"안녕, this is some UTF-8 text": type=utf8>>, <<1, 2, 32798: type=int, size=16, signed=false>>, etc.
tuples, ("a", :tuple, "of arity", 4)
maps, #{:atom_key => "string value"}. These are statically typed as lists are (generics, parametric polymorphism).
records, {x=1, hello="world"} will produce a record with an x: int and hello: string field. Please see the language tour for more details.
pids, e.g. type t = int | pid int for a type that covers integers and processes that receive integers.
In addition there is a unit type, expressed as ().
Note that the tuple example above is typed as a tuple of arity 4 that requires its members to have the types string, atom, string, integer in that order.
On top of that you can define ADTs, e.g.
type try 'success 'error = Ok 'success | Error 'error
And ADTs with more basic types in unions work, e.g.
type json = int | float | string | bool
| list json
| list (string, json)
Types start lower-case, type constructors upper-case.
Integer and float math use different symbols as in OCaml, e.g.
1 + 2 -- ok
1.0 + 2 -- type error
1.0 + 2.0 -- type error
1.0 +. 2.0 -- ok
Basic comparison functions are in place and are type checked, e.g. > and < will work both in a guard and as a function but:
1 > 2 -- ok
1 < 2.0 -- type error
"Hello" > "world" -- ok
"a" > 1 -- type error
See src/builtin_types.hrl for the included functions.
Pretty simple and straightforward for now:
let length l = match l with
    [] -> 0
  | h :: t -> 1 + (length t)
The first clause doesn't start with | since it's treated like a logical OR.
Pattern match guards in clauses essentially assert types, e.g. this will evaluate to a t_bool type:
match x with
  b, is_bool b -> b
and
match x with
  (i, f), is_integer i, is_float f -> :some_tuple
will type to a tuple of integer, float.
Since strings are currently compiled as UTF-8 Erlang binaries, only the first clause will ever match:
type my_binary_string_union = binary | string
match "Hello, world" with
b, is_binary b -> b
| s, is_string s -> s
Further, nullary type constructors are encoded as atoms and unary constructors in tuples led by atoms, e.g.
type my_list 'x = Nil | Cons ('x, my_list 'x)
Nil will become 'Nil' after compilation and Cons (1, Nil) will become {'Cons', {1, 'Nil'}}. Exercise caution with the order of your pattern match clauses accordingly.
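To make the encoding concrete, here is a small Erlang sketch that converts between ordinary Erlang lists and a my_list value in the compiled representation described above. The module my_list_sketch and both of its functions are hypothetical helpers written for illustration only:

```erlang
-module(my_list_sketch).
-export([to_erlang_list/1, from_erlang_list/1]).

%% Nil is the bare atom 'Nil'; Cons (h, t) is {'Cons', {H, T}}.
to_erlang_list('Nil') -> [];
to_erlang_list({'Cons', {H, T}}) -> [H | to_erlang_list(T)].

%% Build the same representation from an ordinary Erlang list.
from_erlang_list([]) -> 'Nil';
from_erlang_list([H | T]) -> {'Cons', {H, from_erlang_list(T)}}.
```

Since 'Nil' is an ordinary atom at runtime, is_atom('Nil') is true in Erlang, which is one way an earlier, more general clause can shadow a constructor clause.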
No distinction is made syntactically between map literals and map patterns (=> vs := in Erlang), e.g.
match my_map with
#{:a_key => some_val} -> some_val
You can of course use variables to match into a map so you could write a simple get-by-key function as follows:
type my_opt 'a = Some 'a | None
let get_by_key m k =
match m with
#{k => v} -> Some v
| _ -> None
ML-style modules aren't implemented at present. For now modules in Alpaca are the same as modules in Erlang with top-level entities including:
use module.type)
let bindings.
An example:
module try
export try_map/2 -- separate multiple exports with commas
-- type variables start with a single quote:
type maybe_success 'error 'ok = Error 'error | Success 'ok
-- Apply a function to a successful result or preserve an error.
let try_map e f = match e with
    Error _ -> e
  | Success ok -> Success (f ok)
Tests are expressed in an extremely bare-bones manner right now and there aren't even proper assertions available. If the compiler is invoked with options [test], the following will synthesize and export a function called add_2_and_2_test:
let add x y = x + y
test "add 2 and 2" =
let res = add 2 2 in
match res with
4 -> :ok
| _ -> beam :erlang :error [no_match] with _ -> meaningless_return
Any test that throws an exception will fail so the above would work but if we replaced add/2 with add x y = x + (y + 1) we'd get a failing test. If you use the rebar3 plugin mentioned above, rebar3 eunit should run the tests you've written. There's a bug currently where the very first test run won't execute the tests but all runs after will (not sure why yet).
The expression that makes up a test's body is type inferenced and checked. Type errors in a test will always cause a compilation error.
An example:
let f x = receive with
(y, sender) ->
let z = x + y in
let sent = send z sender in
f z
let start_f init = spawn f init
All of the above is type checked, including the spawn and message sends. Any expression that contains a receive block becomes a "receiver" with an associated type. The type inferred for f above is the following:
{t_receiver,
{t_tuple, [t_int, {t_pid, t_int}]},
{t_arrow, [t_int], t_rec}}
This means that:
f has its own function type (the t_arrow part) but it also contains one or more receive calls that handle tuples of integers and PIDs that receive integers themselves.
f's function type is one that takes integers and is infinitely recursive.
send returns unit but there's no "do" notation/side effect support at the moment, hence the let binding. spawn for the moment can only start functions defined in the module it's called within to simplify some cross-module lookup stuff for the time being. I intend to support spawning functions in other modules fairly soon.
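For readers more used to Erlang, the receiver f can be pictured as the following plain, untyped loop. This is only an analogue written by hand under the assumption that the semantics match the description above. It is not the code the Alpaca compiler emits, and the module name receiver_sketch is hypothetical:

```erlang
-module(receiver_sketch).
-export([f/1, start_f/1]).

%% Untyped analogue of f: add each received integer to the running
%% total, send the new total back to the sender, then loop forever.
f(X) ->
    receive
        {Y, Sender} ->
            Z = X + Y,
            Sender ! Z,
            f(Z)
    end.

%% Analogue of start_f: spawn the loop with an initial total.
start_f(Init) ->
    spawn(fun() -> f(Init) end).
```

Nothing in the Erlang version stops a caller from sending a float or an atom. The receiver type is what rules that out on the Alpaca side.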
Note that the following will yield a type error:
let a x = receive with
  i -> b x + i
let b x = receive with
  f -> a x +. f
This is because b is a t_float receiver while a is a t_int receiver. Adding a union type like type t = int | float will solve the type error.
If you spawn a function which nowhere in its call graph possesses a receive block, the pid will be typed as undefined, which means all message sends to that process will be a type error.
The FFI is quite limited at present and operates as follows:
beam :a_module :a_function [3, "different", "arguments"] with
(ok, _) -> :ok
| (error, _) -> :error
There's clearly room to provide a version that skips the pattern match and succeeds if dialyzer supplies a return type for the function that matches a type in scope (union or otherwise). Worth noting that the FFI assumes you know what you're doing and does not check that the module and function you're calling exist.
Compiler error messages may be localized by calling alpaca_error_format:fmt/2. If no translation is available in the specified locale, the translation for en_US will be used.
Localization is performed using gettext ".po" files stored in priv/lang. To add a new language, say Swedish (sv_SE), create a new file priv/lang/alpaca.sv_SE.po. If you use Poedit, you may then import all messages to be translated by selecting "Catalog -> Update from POT file..." in the menu, and then pick priv/lang/alpaca.pot. The messages may be a bit cryptic. Use the en_US as an aid to understand them.
The POT file is automatically updated whenever alpaca is compiled. Updates to po-files are also picked up at the compile phase.
A very incomplete list:
self() - it's a little tricky to type. The type-safe solution is to spawn a process and then send it its own pid. Still thinking about how to do this better.
gen_server, etc.
; in OCaml for printing in a function with a non-unit result.
This has been a process of learning-while-doing so there are a number of issues with the code, including but not limited to:
alpaca_ast_gen.erl and alpaca_typer.erl. Frankly the latter is begging for a complete rewrite.
Parsing/validating occurs in several passes:
yecc for the initial rough syntax form and basic module structure. This is where exports and top-level function definitions are collected and the initial construction of the AST is completed.
Several passes internally
yecc parser), building a list of top-level internal-only and exported functions for each module. The output of this is a global environment containing all exported functions by module and an environment of top-level functions per module or a list of found errors.
At present this is based off of the sound and eager type inferencer in http://okmij.org/ftp/ML/generalization.html with some influence from https://github.com/tomprimozic/type-systems/blob/master/algorithm_w where the arrow type and type schema instantiation are concerned.
module example
export add/2
let add x y = adder x y
let adder x y = x + y
The forward reference in add/2 is permitted but currently leads to some wasted work. When typing add/2 the typer encounters a reference to adder/2 that is not yet bound in its environment but is available in the module's definition. The typer will look ahead in the module's definition to determine the type of adder/2, use it to type add/2, and then throw that work away before proceeding to type adder/2 again. It may be beneficial to leverage something like ETS here in the near term.
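One way the ETS idea might look is a small memo table keyed by module, function, and arity, so that a type inferred during look-ahead is kept rather than thrown away. This is a sketch only and is not taken from the Alpaca source. The module type_cache_sketch, its functions, and the shape of the cached type term are all assumptions:

```erlang
-module(type_cache_sketch).
-export([new/0, typ_of/3]).

%% Create a private ETS table keyed by {Module, Function, Arity}.
new() ->
    ets:new(type_cache, [set, private]).

%% Return the cached type for Key, or run InferFun once and remember
%% its result. InferFun stands in for whatever the typer would do.
typ_of(Cache, Key, InferFun) ->
    case ets:lookup(Cache, Key) of
        [{Key, Type}] ->
            Type;
        [] ->
            Type = InferFun(),
            true = ets:insert(Cache, {Key, Type}),
            Type
    end.
```

With this in place, the look-ahead for adder/2 while typing add/2 would populate the table, and the later pass over adder/2 would be a lookup instead of a second inference.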
Infinitely recursive functions are typed as such and permitted as they're necessary for processes that loop on receive. Bi-directional calls between modules are disallowed for simplicity. This means that given module A and B, calls can occur from functions in A to those in B or the opposite but not in both directions.
I think this is generally pretty reasonable as bidirectional references probably indicate a failure to separate concerns but it has the additional benefit of bounding how complicated inferencing a set of mutually recursive functions can get. The case I'm particularly concerned with can be illustrated with the following Module.function examples:
let A.x = B.y ()
let B.y = C.z ()
let C.z = A.x ()
This loop, while I believe it is possible to check, necessitates either a great deal of state tracking complexity or an enormous amount of wasted work and likely has some nasty corner cases I'm as yet unaware of.
The mechanism for preventing this is simple and relatively naive to start: entering a module during type inferencing/checking adds that module to the list of modules encountered in this pass. When a call occurs (a function application that crosses module boundaries), we check to see if the referenced module is already in the list of entered modules. If so, type checking fails with an error.
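The check described above can be sketched in a few lines of Erlang. This is an illustration of the idea rather than the typer's actual code. The module module_cycle_sketch, the function enter/2, and the error term bidirectional_module_ref are hypothetical names:

```erlang
-module(module_cycle_sketch).
-export([enter/2]).

%% Entered is the list of modules already entered in this pass.
%% Returns the extended list, or an error when Mod was already
%% entered, i.e. a call has looped back into an earlier module.
enter(Mod, Entered) ->
    case lists:member(Mod, Entered) of
        true  -> {error, {bidirectional_module_ref, Mod, Entered}};
        false -> {ok, [Mod | Entered]}
    end.
```

For the A.x, B.y, C.z loop, entering A, then B, then C succeeds, and the call from C.z back into A fails because A is already in the list.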
There is currently no "any" root/bottom type. This is going to be a problem for something like a simple println/printf function, as a simple-to-use version of this would best take a List of Any. The FFI to Erlang code gets around this by not type checking the arguments passed to it and only checking the result portion of the pattern matches.