Sunday, November 16, 2014

Writing languages


We want language writing to be as simple as possible, and it rarely is. Every language is different in its own way, and there is no single solution that fits them all. Not all languages, for instance, can be represented by traditional grammars or understood by parser technologies such as Antlr.

A good solution, then, would be made of multiple solutions, the first of which we'll discuss here. We distinguish between two processes: compiling and linking. Compiling transforms the language's syntax into a common syntax (in this case a Roslyn syntax tree). Linking runs after all code has been compiled and assigns semantic meaning to the result of the compilation process.
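To make the split concrete, here is a minimal sketch of what a two-phase language contract could look like. The interface name, the generic context parameter and the method names are assumptions made for illustration; they are not the actual Excess API. Only `SyntaxNode` and `SemanticModel` are real Roslyn types.

```csharp
using Microsoft.CodeAnalysis;

// Hypothetical contract, for illustration only (not the real Excess interface).
// TContext stands in for whatever context object the host compiler provides.
public interface ILanguageCompiler<TContext>
{
    // Phase 1 (compiling): purely syntactic. Turns the language's own syntax
    // into a Roslyn syntax node, without any semantic information.
    SyntaxNode Compile(TContext ctx, string code);

    // Phase 2 (linking): runs once all code has been compiled, so a
    // SemanticModel is available to resolve types and symbols.
    SyntaxNode Link(TContext ctx, SyntaxNode node, SemanticModel model);
}
```

Keeping the first phase free of semantic information is what lets nested languages be compiled independently before anything is resolved.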

With that we introduce the first way to create Excess languages: a Roslyn compiler. This is the lowest level of the planned compilers and consists of traditional Roslyn tree transformations. It works particularly well for languages whose syntax is C#-compatible. One clear example is the synch/asynch languages introduced in the previous post.

On to the compilation process. These compilers do not have to deal with their enclosing syntax: the asynch compiler, for example, does not need to worry about the outer asynch syntax unless it needs its parameter list or similar information, which is provided via the context. As such, it only needs to transform the user code:

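            // Compile the user code first, in case it contains other xs features.
            SyntaxNode xscode = ctx.Compile(code);

            // Replace the empty block in the template with the compiled user code.
            return CodeTemplate.ReplaceNode(
                              CodeTemplate.DescendantNodes().OfType<BlockSyntax>().First(),
                              xscode);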

The inner code may itself contain other xs features, so the compiler first uses the provided xs compiler to transform the user code. Other than that, it is a simple Roslyn template substitution, where the empty block in the template is replaced by the user code. The template looks like this:

        // The empty lambda body is the placeholder block that receives the user code.
        private static StatementSyntax CodeTemplate = SyntaxFactory.ParseStatement(@"
            Task.Factory.StartNew(() =>
            {
            });");
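The substitution and the template can be tried together in a small standalone program. This is only a sketch under some assumptions: the class name `AsynchSketch` and the helper `Wrap` are made up for illustration, the user code is parsed directly as a C# block instead of going through the xs compiler, and the project references the Microsoft.CodeAnalysis.CSharp package.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical standalone harness, not part of Excess.
static class AsynchSketch
{
    // Same template as in the post: a task whose lambda body is an empty block.
    static readonly StatementSyntax CodeTemplate = SyntaxFactory.ParseStatement(@"
        Task.Factory.StartNew(() =>
        {
        });");

    // Replaces the template's only block (the lambda body) with the user code.
    static StatementSyntax Wrap(BlockSyntax userCode)
    {
        BlockSyntax placeholder = CodeTemplate
            .DescendantNodes()
            .OfType<BlockSyntax>()
            .First();

        return CodeTemplate.ReplaceNode(placeholder, userCode);
    }

    static void Main()
    {
        // Assumption: the user code is already plain C#, so parsing it as a
        // block statement stands in for the ctx.Compile(code) call.
        var userCode = (BlockSyntax)SyntaxFactory.ParseStatement(
            "{ Console.WriteLine(42); }");

        // Print the generated statement with normalized formatting.
        Console.WriteLine(Wrap(userCode).NormalizeWhitespace().ToFullString());
    }
}
```

Note that the replacement node has to be valid wherever the placeholder sits. Here the placeholder is a lambda body, so a `BlockSyntax` is a safe choice.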

As for the linking process:

        public SyntaxNode link(ExcessContext ctx, SyntaxNode node, SemanticModel model)
        {
            // Nothing to resolve: the user code is already plain C#.
            return node;
        }

The linker does nothing here, since the user code is already written in C# and thus already has all the semantic information it needs. And that is all. It is a very simple example; however, for most languages out there not named C# 5.0 (the version released with .NET 4.5), these would be nice constructs to have.

For the future, a couple of parser utilities are planned. The obvious one is support for Antlr grammars in the compilation process, since they integrate so well with the Microsoft stack. Also planned are a couple of Excess languages to aid both compilation and linking. Stay tuned.
