[p-dev] Compilation strategy

Paul Bone paul at plasmalang.org
Sun Oct 30 22:48:40 AEDT 2016


I'm beginning to think about the module system. The module system is also
related to the language/implementation's compilation strategy.  This is
something that, although I have some ideas, I'm willing to bet that others
have equally valuable but different ideas due to different experience.  So
I'd like to raise some options and see what people think.

Right now, plasmac compiles Plasma source (.p) files into Plasma bytecode
(.pz) files.  These can be executed by pzrun.  Adding multi-module support
and separate compilation raises two main things that need solving: linking,
and providing interface information to plasmac for building other modules.

Linking
-------

Multiple Plasma Bytecode files should be able to be linked (packaged) into a
single bytecode file.  This means that the runtime will not need to load
multiple files, and a program can consist of a single file for convenience.
This is also mostly straightforward.

However it is also important to allow load time or runtime linking.  It can
be necessary to distribute a program and its libraries as separate files (eg
for licensing reasons).  So the runtime code will need to support this
eventually; this is a long term goal.

It will also be desirable to support native code generation and distribute
both programs and libraries in binary form.

There are no outstanding questions in my mind about linking, but it is
relevant to this topic and if anyone has any other ideas I'd love to hear
them.


Interface files
---------------

Plasma should offer separate compilation.  A normal way to do this is to
allow interface files to be generated and then used to compile other
modules.  I'm generally in favor of having plasmac generate the interface
files.  But there are two alternatives I'll describe briefly.

The first approach is requiring the developer to create the interface
files.  This is what C does and seems to be not a very good idea.  That is
until we remember that OCaml also does this, and it's less awful for OCaml,
partly because they have an actual module system.  Nevertheless I'm not all
that keen on this idea.

Another option is to store the interface information within the generated .pz
file for that module.  I think it's what Java does.  AIUI this has a couple
of downsides.  It couples the interface with the code rather tightly (and
there may be reasons for developers to separate them).  It also couples the
generation of both together: a small change in the implementation rewrites
the file that holds the interface, and may force (timestamp based)
recompilation of other files.  On the other hand, having a single file can
make some things easier.

Therefore generating a separate interface file (.pzi) seems best.  This also
extends naturally to generating optimisation interface files (.pzo) to enable
inter-module optimisation (yeah, I'm thinking ahead).
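
To illustrate the timestamp point, here is a minimal Python sketch of one
way a separate .pzi could avoid triggering rebuilds when only the
implementation changes: rewrite the file only if its contents differ.  This
is an assumption for illustration, not something plasmac does;
write_if_changed is a hypothetical helper.

```python
def write_if_changed(path, new_contents):
    """Write new_contents to path only if it differs from what is there.

    Returns True if the file was (re)written, False if it was left alone.
    """
    try:
        with open(path, "r", encoding="utf-8") as f:
            if f.read() == new_contents:
                # Same interface as before: leave the file and its
                # timestamp untouched so dependents are not rebuilt.
                return False
    except FileNotFoundError:
        pass  # No previous interface file, so fall through and create it.
    with open(path, "w", encoding="utf-8") as f:
        f.write(new_contents)
    return True
```

With a single combined file this trick is not available, since the code part
changes whenever the implementation does.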

Compilation strategy
--------------------

I'm leaning towards having separate compilation options for generating
interface files and for building a module.

    Generate my_module.pzi
    $ plasmac --make-interface my_module.p

    Generate my_module.pz
    $ plasmac --compile my_module.p

Separating these build steps allows modules' implementations to form cyclic
dependencies easily.  It also makes the main compilation step easier to
parallelise, since the compilation of one module does not depend on the
compilation of another.
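
A rough Python sketch of the two-phase idea follows.  It assumes an
interface can be generated from a module's source alone; the module names
and the make_interface/compile_module functions are hypothetical stand-ins
for the two plasmac invocations above, not real tooling.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical module graph: module name -> modules it imports.
# "a" and "b" import each other, forming an implementation cycle.
IMPORTS = {
    "a": ["b"],
    "b": ["a"],
    "main": ["a", "b"],
}

def make_interface(module):
    # Stand-in for `plasmac --make-interface module.p`.
    # Assumption: needs only the module's own source.
    return f"{module}.pzi"

def compile_module(module, imports, interfaces):
    # Stand-in for `plasmac --compile module.p`.
    # Reads only the interfaces of imported modules, never their .pz files.
    needed = [interfaces[dep] for dep in imports[module]]
    return f"{module}.pz", needed

def build(imports):
    # Phase 1: generate every interface file.
    interfaces = {m: make_interface(m) for m in imports}
    # Phase 2: compile every module independently, so this can run in
    # parallel even though "a" and "b" depend on each other.
    with ThreadPoolExecutor() as pool:
        results = pool.map(
            lambda m: compile_module(m, imports, interfaces), imports)
        return dict(results)
```

Because phase 2 reads only .pzi files, the cycle between "a" and "b" never
forces one module's compilation to wait on the other's.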

A project's compilation should be driven by a separate program, just called
"plasma", that uses dependency information to decide which files need to be
rebuilt.
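
As a sketch of what such a driver might check, here is a timestamp-based
staleness test in Python.  The file layout (foo.p, foo.pzi and foo.pz side
by side in the current directory) and the function names are assumptions
made for illustration; the real "plasma" tool does not exist yet.

```python
import os

def needs_rebuild(target, inputs):
    """True if target is missing or older than any of its inputs."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(path) > target_mtime for path in inputs)

def stale_modules(imports):
    """Return the modules whose .pz file must be rebuilt.

    imports maps a module name to the list of modules it imports.
    A module's .pz depends on its own source and on the interface
    files of the modules it imports.
    """
    stale = []
    for module, deps in imports.items():
        inputs = [f"{module}.p"] + [f"{dep}.pzi" for dep in deps]
        if needs_rebuild(f"{module}.pz", inputs):
            stale.append(module)
    return stale
```

Note that a module depends on other modules' .pzi files only, so a change
that leaves an interface untouched does not make its importers stale.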

Text or binary
--------------

Binary can usually be read and written faster.  But text is easier for
humans to read if necessary.

Cycles
------

Module implementations should be able to form cycles.  But allowing cycles
for module interfaces depends on other considerations, such as whether an
interface file should include transitive information.

In general developers should attempt to avoid cycles, since they often
indicate that there are other design/architecture problems.

Transitive dependencies
-----------------------

Golang boasts fast compilation times because it doesn't have to read many
interface files, since transitive dependencies' information is provided by
the direct dependency's interface file.  It is true that Golang compilation
speeds are fast, but I don't know if this is the only reason why; it's hard
to imagine that the compiler is IO bound to the degree where this would make
a difference.  However, it _does_ reduce the IO.  It also simplifies library
distribution.  So this seems like a good idea.
Does golang support cyclic dependencies?  How can that work?
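
To make the IO difference concrete, here is a small Python sketch that
counts which interface files a compiler would open under each scheme.  The
module graph and function names are hypothetical, and this models only the
idea described above, not how the Go toolchain is actually implemented.

```python
# Hypothetical import graph: module name -> modules it imports directly.
IMPORTS = {
    "app": ["http"],
    "http": ["net"],
    "net": ["os_io"],
    "os_io": [],
}

def opened_without_transitive_info(module, imports):
    """Each interface describes only its own module, so the compiler has
    to follow imports and open every reachable interface file."""
    seen = set()
    stack = list(imports[module])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(imports[dep])
    return sorted(f"{dep}.pzi" for dep in seen)

def opened_with_transitive_info(module, imports):
    """Each interface already carries what it needs from its own
    dependencies, so only the direct imports are opened."""
    return sorted(f"{dep}.pzi" for dep in imports[module])
```

For "app" the first scheme opens three files and the second opens one.  The
cost is that each .pzi becomes larger and must be regenerated when a
dependency's interface changes.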


Please let me know if you have any thoughts on these ideas, any new ideas, or
know of any options or questions that I haven't considered.  In particular
let me know if there's something that a language implementation does/doesn't
do that is particularly helpful or painful that I may not be aware of.

Thanks.


-- 
Paul Bone
http://paul.bone.id.au

