Variations on the "leverage language from elixir" pattern.
It seems that a pattern emerged in my recent months or years of Elixir, and it looks like I am now knee-deep in this pattern : leveraging other languages from Elixir.
This leverage takes, it seems, three forms :
- Writing Elixir code that generates and runs <language> code.
- Embedding (à la Zigler) <language> code in Elixir and running it.
- Writing Elixir code that generates instructions for <language>.
And it all depends on managing <language>‘s runtimes from Elixir, which takes two forms :
- Checking for <language>‘s presence and adequation and providing instructions
- Managing the install itself from Elixir
Note that I won’t talk of Rustler which is already covered a bit on the blog and now well-known. It is a fantastic tool but does not offer the “code-in-code” approach described here. Nevertheless, a Rustler NIF in a monorepo still achieves the benefits I’m talking about here.
First Form : Writing Elixir code that generates and runs <language> code.
I was looking for a library that would allow read & write access to the three main Ms Office document types, meaning Word, Powerpoint, and Excel. From a previous life, I had used PHPOffice and it provided the required functionality.
But I would not like to truly add PHP to my stack. Chatting with another developer, I mocked what kind of reduction would happen if we leveraged PHPOffice from elixir (yes, there’s an error on line 12).
So I set out to build a POC for just that. It would not be used because I finally chose to leverage Java and the Apache POI family of libraries instead of PHPOffice (more on that later).
You first need to be able to run PHP code and ideally somehow get its output :
# Check if PHP and composer are accessible :
iex> Xhpoffice.Prerequisites.clear?()
true
# Note : this is the second pattern, inlining code in Elixir !
iex> script = ~PHP"""
$foo = 5;
elixir_return(1 + 2 + $foo);
"""
iex> result = Xhpoffice.Php.run!(script)
8
After having that, I tried to reproduce the PHPOffice basic example :
import Xhpoffice.Word
output_path = Path.join(File.cwd!(), "output.docx")
doc = make_document()
{doc, section} = add_section(doc)
{doc, _section, _object} = add_text(doc, section, "How convenient !")
doc |> write(output_path) |> run!
This rather idiomatic API goes around the mutability requirements of the original library by making the calls take a document or section explicitly, and returning them.
In the background, the document accumulates a tree of calls specific to PHPWord, generating variable IDS and keeping track of them :
[
[
{:assign, "941995C2B5135245424CA0615E6CBDE7",
{:static, "\\PhpOffice\\PhpWord\\IOFactory", "createWriter",
[{:var, "674317FBD969EA74B1FF85C880B2CB92"}, "Word2007"]}},
{:method, {:var, "941995C2B5135245424CA0615E6CBDE7"}, "save",
["<cwd>/output.docx"]}
],
[
{:assign, "536A810991F2B5C6E5C2CDE3EC99C921",
{:method, {:var, "739CFE7AD01797F26E5C8B305A674886"}, "addText",
["How convenient !"]}}
],
[
{:assign, "739CFE7AD01797F26E5C8B305A674886",
{:method, {:var, "674317FBD969EA74B1FF85C880B2CB92"}, "addSection", []}}
],
{:assign, "674317FBD969EA74B1FF85C880B2CB92",
{:new, "\\PhpOffice\\PhpWord\\PhpWord", []}}
]
It is then compiled to PHP :
<?php
$varF4F605C1FFBCF3CDC59374EBDF8B0DB8 = new \PhpOffice\PhpWord\PhpWord();
$var83F1F620C6BD472063B55A504626B8C8 = $varF4F605C1FFBCF3CDC59374EBDF8B0DB8->addSection();
$varF8440D2829E71AAC14AAFF423F0C15F9 = $var83F1F620C6BD472063B55A504626B8C8->addText("How convenient !");
$var65ADDF5337DA12DC1D21C4C819200176 = \PhpOffice\PhpWord\IOFactory::createWriter($varF4F605C1FFBCF3CDC59374EBDF8B0DB8,"Word2007");
$var65ADDF5337DA12DC1D21C4C819200176->save("<cwd>/xhpoffice/output.docx");
?>
The whole compiler is here : it does not support object values, but for this POC, did not need to. We can see that the pattern does not consist to be able to cover the full PHP language, but only the subset useful to imperatively build documents with PHPWord.
def do_compile(calls) do
spells = calls |> Enum.reverse() |> List.flatten()
output =
for spell <- spells do
eval(spell)
end
joined = Enum.join(output, "\n")
joined
end
def map_args(args), do: Enum.map_join(args, ",", &eval/1)
def dollar(id), do: "$var#{id}"
def eval({:assign, id, expr}), do: "#{dollar(id)} = #{eval(expr)};"
def eval({:new, class, args}), do: "new #{class}(#{map_args(args)});"
def eval({:var, id}), do: dollar(id)
def eval({:method, subject, name, args}), do: "#{eval(subject)}->#{name}(#{map_args(args)});"
def eval({:static, class, name, args}), do: "#{class}::#{name}(#{map_args(args)});"
def eval(value) when is_binary(value), do: Jason.encode!(value)
There’s a bit of additionnal housekeeping related to how the PHP script is ran, its dependencies injected, but irrelevant here. The idea is to only cover the needed bases for your library. You’re not going to leverage the whole of <language>‘s ecosystem anyway.
Second Form : Embedding (à la Zigler) <language> code in Elixir and running it.
I’ve shown it above with the ~PHP
sigil and this is what Zigler
provides, so this section will be quite short.
I love multi-letter sigils (elixir > 1.15) as they provide two convenient features :
- The ability to format the content of the sigil with a custom formatter
- The ability to handle the content in a custom way
If you never used custom sigils, here is the general outline :
~CUSTOM"""
custom content here
"""
# is equivalent to
sigil_CUSTOM(contents, opts)
You are free to define the sigil_CUSTOM
function and it does not need to return a binary.
As an example, the ~PHP
sigil above returns a temporary file path. In DtsBuddy
, a runtime overlay loader library I’m slowly working on for my hobby projects, the ~DTS
sigil returns an {:ok, "file", "name"}
3-tuple.
In Alzo, my ~JS
sigil only returns its content, but it is used to hook a custom formatter to the sigil. That way, editor format-on-save or mix format
properly format the sigil’s content.
defmodule Alzo.SigilJS do
@behaviour Mix.Tasks.Format
def features(_opts), do: [sigils: [:JS], extensions: []]
def format(contents, opts) do
c = """
<<'EOF'
#{contents}
EOF
"""
command = "node #{@prettier_path} --parser acorn #{c}"
port = Port.open({:spawn, command}, [:binary])
receive do
{^port, {:data, d}} -> d
_ -> contents
after
1_000 -> contents
end
end
def sigil_JS(str, _opts) do
str
end
end
We can see that prettier
is called in a Port. In case it fails, the content is left un-formatted. Note that I am not passing user-generated content but my own source code only.
In sheerlox’s Nodelix, I’m working to add the ~NODE
and ~PACKAGE
sigils to add one-off JS script execution and dependencies installation to the library.
A few caveats if you’re willing to provide multi-letter sigils :
- They hide behavior. In this case that’s also a good thing. The end user uses that sigil because of the shortcut it provides in doing plumbing (creating a temporary script folder, installing dependencies, etc). But that should be made very explicit and I think returning instruction tuples for further side-effect execution is a good compromise.
- They are unsupported under Elixir 1.15. Users that cannot upgrade are still able to call
sigil_CUSTOM
directly, but you should not write the~CUSTOM
form in your library code or tests. - String interpolation is disabled inside multi-letter sigils. It also is a good thing and there is discussion on that topic online with good reasons provided. You should provide your own interpolation implementation if you wish to do so.
Note that you can circumvent the interpolation problem with a simple macro : for example, I can provide a nodescript
macro in which a regular heredoc block can be used :
defmacro node_script(do_block) do
{:sigil_NODE, [delimiter: "\"\"\"", context: Elixir.Nodelix.SigilNode, imports: []],
[
{:<<>>, [indentation: 0],
[
Keyword.get(do_block, :do)
]},
[]
]}
end
target = "world"
node_script do
"""
console.log("Hello #{target}")
"""
# equivalent to
sigil_NODE(
"""
console.log("Hello world !")
""",
[]
)
# equivalent to
~NODE"""
console.log("Hello world !")
"""
My point with this section is that this pattern is very useful to me. If you are considering it, please make it opt-in (provide another way to achieve the outcome) and explicit (require manual handling of the final side effects).
Bonus : Syntax highlighting in the sigils
Syntax highlighting was on my plans, but I did not get to it yet. This discussion on ElixirForum made me lookup how to create a VSCode / Zed plugin to highlight foreign languages in sigils.
https://elixirforum.com/t/script-tags-in-heex-and-app-js/
The general, non-editor-specific way is to create a tree-sitter grammar that will match the opening and closing forms of the sigils, then delegate to an existing language. This is easier to do than, for example, a complete grammar for HEEX, that looks like HTML but is not HTML. If you like to write lexers and parsers you should feel right at home :-).
I’ve put up a repo with the scaffolded VSCode extension and still have to do the same for Zed which is my daily driver.
https://github.com/alzo-archi/vscode-sigil-js-highlighting-sample
Frank Dugan also shows in this thread that this is doable with a lot less ceremony in NeoVim : https://elixirforum.com/t/script-tags-in-heex-and-app-js/67992/19
Third Form : writing Elixir code that generates instructions for <language>
I particularly like this one. It is similar to the first one (writing elixir code that generates <language> code) but shifts the responsibility of the instructions interpreter to the target language.
This is what I was doing in those two posts : Canvas from Elixir and What if Liveview had DOM Access ?.
In the canvas example, I composed higher level functions from canvas primitives to render a game screen :
def render(%Game{} = game, options \\ [width: 800, height: 800]) do
w = options[:width]
h = options[:height]
operations = [
clear_screen(w, h),
draw_players(game, w, h),
draw_ball(game, w, h)
]
commands = render_commands(operations, :html5)
end
```elixir
defp clear_screen(w, h) do
[
{:fill_style, "black"},
{:fill_rect, [0, 0, w, h]}
]
end
defp draw_players(%Game{} = game, w, h) do
for player <- game.players do
[
:begin_path,
{:line_width, 20},
{:stroke_style, "red"},
{:arc, [w / 2, h / 2, Enum.min([w, h]) / 2 * 0.7, player.start_pos, player.end_pos]},
:stroke,
:close_path
]
end
end
The JS side received a list of commands and their arguments :
const commands = [4, "black", 7, 0, 0, 800, 800, 0, 5, 20, 6, "red", 8, 400.0, 400.0, 280.0, 0, 0.6283185307179586, false, 2, 3, 0, 5, 20, 6, "red", 8, 400.0, 400.0, 280.0, 2.0943951023931957, 2.7227136331111543, false, 2, 3, 0, 5, 20, 6, "red", 8, 400.0, 400.0, 280.0, 4.188790204786391, 4.81710873550435, false, 2, 3, 0, 4, "white", 8, 400.0, 400.0, 10, 0, 6.283185307179586, false, 1, 3];
Those commands were interpreted by a small interpreter mapping an integer to a command and consuming the correct number of arguments :
const run = (context, commands) => {
let len = commands.length;
while (len > 0) {
const item = commands.shift();
len--;
switch (item) {
case 8:
args = commands.splice(0, 6);
context.arc(...args);
len -= 6;
break;
Since this is tedious to write, the interpreter code itself was generated from Elixir based on a simple mapping :
@call_map %{
begin_path: {0, :call, 0},
fill: {1, :call, 0},
stroke: {2, :call, 0},
close_path: {3, :call, 0},
fill_style: {4, :prop},
line_width: {5, :prop},
stroke_style: {6, :prop},
fill_rect: {7, :call, 4},
arc: {8, :call, 6}
}
This is the pattern 1 finding its way in pattern 3 !
In the liveview example, I provided an elixir runtime and a JS handler to handle commands over time
(Note : this has been cleaned up and is actually used !) The idea was to be able to query or set DOM properties from Elixir, leveraging sequential execution and batching of calls.
def handle_event("sequential_example", _unsigned_params, socket) do
alias TestWeb.DOMStuff.Ops
box_1 = "#some_box"
box_2 = "#some_other_box"
{:noreply,
TestWeb.DOMStuff.seq_exec(
socket,
[
Ops.ignore(box_2),
Ops.get_bounding_client_rect(box_1, fn s, v -> update(s, :box_dimensions, fn _ -> v end) end),
Ops.set_height(box_2, fn s -> "#{2 * s.assigns.box_dimensions["width"]}px" end),
Ops.get_bounding_client_rect(box_2, fn s, v -> update(s, :blue_box_dimensions, fn _ -> v end) end),
Ops.un_ignore(box_2),
],
fn s -> update(s, :got_everything, fn _ -> true end) end
)}
end
Here, the Ops
module provides functions over manual writing of common operation tuples but is totally optional :
{:get_bounding_client_rect, [], "#some_other_box",
fn s, v -> update(s, :blue_box_dimensions, fn _ -> v end) end}
Again, it’s only tuples and lists all the way down, building a tree of calls, until the actual execution. That also means that this pattern is super easy to test since it’s only made of small pure functions building more and more complex expressions.
Managing language runtimes from Elixir
This seems a good idea to me from a developer’s perspective. In Xava(work in progress) I’m making progress on installing a JDK and Maven on-demand from Elixir. I will also take advantage of the jinterface-build repo I opened to provide prebuilt OtpErlang.jar
for the last three versions of OTP.
(JInterface is a Java library to make a Java app behave as an Erlang node. It is now considered legacy but stable and should be fine for my Elixir-driven office documents handling instead of doing the same ceremony over HTTP or another protocol.)
I’m emulating sheerlox’s Nodelix for Xava. There’s a bit of plumbing and verification to be sure that a runtime-downloaded language runtime is safe to use, but nothing terrible.
The opposite, which is commonly done in Nerves helper libraries, is to provide a few runtime checks or diagnostic modules that will provide you feedback on how additional components must be installed. This makes a lot of sense for embedded systems.
What’s the point ?
For dynamic systems that behave like modular monoliths, I think the runtime-install of another language runtime is a great solution. It enables lighter app nodes and heavier worker nodes with a single codebase.
But the biggest upside to me is the same you get by switching from, for example, a React SPA + Phoenix API to a LiveView app : defragmentation.
Elixir is amazing at safe orchestration of processes, so leveraging that capacity to bring third-party language solutions in a single coherent codebase is useful. I do not have to switch from an Elixir project to a Node project to a Java project to leverage those languages, and do not have to abandon other ecosystem’s strengths either.
It takes a bit of work, with the downside of having an abstraction layer built and tested, but it is possible to provide either a functional API over a third party language+library, or just an execution layer to embed a third party language in your Elixir code.
What am I doing anyway ?
- Because parts of Alzo are loaded and swapped at runtime and do not need a frontend build system :
- Runtime-defined hooks and styles for LiveView (hence the ~JS and ~CSS, not shown, sigils)
- Abstracting over DOM queries in LiveView
- Because parts of Alzo benefit from a side-effect free data manipulation language for super-admins (me) :
- Cleaning up and making OVO production-ready with both compute and time limits, after removing the weird GUI parts
- Because Alzo needs to generate static websites, but with nice WYSIWYG drag-n-drop editors, and has some canvas stuff happening :
- Managing static websites ultimately built with AstroJS (https://astro.build) at runtime, hence the need of Nodelix
- A friendly Elixir API for HTML5 Canvas
- And because Alzo needs to analyze, fill and rewrite Office documents and that I would rather use a single library for that :
- Interop with Apache POI through Java, through Jinterface
The “write an interpreter and runtime-everything” pattern seems to appear before my eyes a few times a year. I think that’s an Elixir-caused behavior that I would not have developed if I had stayed in another ecosystem. But fantastic resources exist outside of our Elixir community and it would be a shame to be unable to reach for them.