×

Sign In

Sign in with your username and password to access your account.
×
Previous
Part 5 of 5

Brainfuck at Mach3

In the latest episode of our endeavor to create a Brainfuck meta-compiler we got a first working version and learned about some interesting behaviors of the v8 JIT optimizer. When leaving off, I mentioned that there are still many opportunities left to improve the performance of the generated code. In the following sections, these tricks will be explained as well as the way they were implemented.

I want to highlight, that I didn’t come up with most of these ideas. While browsing, I stumbled upon multiple websites about similar, way more evolved projects. Especially Nayuki’s compiler written in python used some cool methods and introduced the concept of “simple” loops. Furthermore, the article by Mats Linander talked about “scan-loops”, which were another neat addition to my implementation. If you are interested in more things Brainfuck you should definitely check them out. (Mats seems to be a mad man who wrote a Brainfuck-Compiler in Brainfuck capable of creating Linux binaries 👀)

Building trees 🌳

First off, we have to take a look at the old compiler. It basically consisted of a single function that recurs for every Brainfuck loop, processes it, pushes a string of JS code to an array, and returns. The compiler did mainly four things:

  • Merge repeating instructions
  • Replace instructions with JS code snippets
  • Replace loops with function calls
  • Flatten nested loops into multiple functions calling each other

All of these operations could be done on an instruction by instruction basis, allowing everything to happen in a single step. By “step” I mean a complete iteration through the source code. This time we will apply more involved optimizations to the code requiring possibly many steps, as they build upon another.

Thus we need some form of intermediate representation of the Brainfuck source code, that is easily traversable, convenient to modify, and furthermore simple to extend with functionalities. Possibly the most common data structure used to model high-level languages are trees. Every loop will be represented by a node, having either more nodes as loops and/or leaves as instructions.

Such a tree is quite nice to be implemented in object-oriented languages, as every node/leave can be a specialized type, of a basic node class. In our case, every node will inherit from a base class called ‘Instruction’. It only contains a static factory method ‘create’ that reads the current instruction, tries to merge any following repetitions, instantiates a new node, and returns it. Either a ‘DataInstruction’, ‘PointerInstruction’, ‘Loop’, or ‘OutputInstruction’ might be created. For reasons described in the first article I don’t want to deal with the lack of blocking IO in JS, just in case you are wondering why there is no input class.

The four classes and their parent classes are described below. Later on, when we tackle the different optimizations we will introduce additional even more specialized instructions.

Combined Instruction

Any instruction that might be merged with neighboring repetitions, derives from the ‘CombinedIstruction’ class. The only thing it stores is the number of repetitions found as the ‘data’ member field. The constructor takes additionally the instruction ASCII character and its possible polarities. Based on that, the sign of the ‘data’ might be switched.


class CombinedInstruction extends Instruction {
  constructor( c, pos, neg, d ) {
    super();

    if( c === pos ) {
      this.data= d;
    } else if( c === neg ) {
      this.data= -d;
    } else {
      throw Error('Wrong char for '+ this.constructor.name+': '+ c);
    }
  }
}

DataInstruction & PointerInstruction

The ‘+’ and ‘-‘ instructions in Brainfuck are represented by ‘DataInstruction’ leaves in the tree. The class derives from ‘CombinedInstruction’. Therefore no two classes for either addition or subtraction are needed. Instead, there is only this single instruction, that adds the ‘data’ value to the current cell.


class DataInstruction extends CombinedInstruction {
  constructor( c, d ) {
    super( c, ASCII.Plus, ASCII.Minus, d );
    this.offset= 0;
    this.doSet= false;
  }

  getHead() {
    if( this.doSet ) {
      return 'data[ptr+'+ this.offset+ ']=';

    } else {
      return 'data[ptr+'+ this.offset+ ']+=';
    }
  }

  print( s ) {
    s.append( this.getHead() + this.data+ ';' );
  }

  toMult( fo ) {
    return new MultInstruction( this.data, this.offset, fo );
  }
}

The PointerInstruction works the same way, with the only difference being, that the printed JS code will be distinct in the end.

IOInstruction & OutputInstruction


class IOInstruction extends Instruction {
  constructor() {
    super();

    this.offset= 0;
  }
}

class OutputInstruction extends IOInstruction {
  constructor() { super(); }

  print( s ) {
    s.append( 'pc('+ this.offset+ ');' );
  }
}

The ‘OutputInstruction’ derives from ‘IOInstruction’ and is as simple as it gets: It prints the JS-Code to call ‘pc()’ and that’s it.

Block & Loop

A code block is basically an array of instructions, or in other words a tree node. Every Brainfuck loop is represented by a block, which may contain instructions and further nested blocks. Moreover, the whole script in itself is a block. All operations on the code will be handled through the interface of this class.

Loops are both instructions and blocks. They derive from the ‘Instruction’ class and describe the entry point of a Brainfuck loop. But as they contain code to be executed, they also are blocks.

In a better world, JS would allow for multi-inheritance, and the loop class would derive from both base classes, or implement one of them (probably Instruction) as an interface. Thus I decided to use a somewhat shoehorned decorator pattern like approach. Every loop object has a block instance as member-field and implements the same methods, which often just wrap the blocks methods. This is not what I would call super nice oo-design but this was the first thing that came to mind.

Parsing

The parsing of the Brainfuck source code is handled by the constructor method of the ‘Block’ class. It iterates over every byte and creates tries to instantiate instruction objects from them via the ‘create()’ factory. A ‘null’ reference might be returned, as not every ASCII character in the source corresponds to a Brainfuck command. Comments consisting of alphabetical characters are ignored for example.


//...

while( !it.isEnd() ) {
  // End loop sub-block and jump over closing bracket
  if( it.get() === ASCII.CBracket ) {
    if( isLoop ) {
      it.next();
      return;
    }

    // Global block
    throw Error('Unexpected end of loop');
  }

  let ins= Instruction.create( it );
  if( ins ) {
    this.instr.push( ins );

    // Chache refs to all loops of the block for later optimaziations
    if( ins instanceof Loop ) {
      this.loops.push( ins );
    }
  }
}

//...

If a ‘Loop’ object is created, it internally will call the constructor of its code block member, which again will load its source code. When finished, the factory returns the loop object and the parent block resumes iterating through its code. However, now it needs to start behind the child blocks code, which would require always passing the child blocks end position back to the parent one. A better way is to use the same counter variable everywhere so that the current parsing position is always up to date.

But global variables are ugly and should be avoided. JS generally does not allow to pass function parameters by reference, with one exception: objects. Wrapping the counter in an object, enables it to be passed from parent to child without needing to be returned, as both refer to the same object in memory. As the object is used to iterate over some data it is called an iterator.

Technically the object doesn’t get passed, to the function. As you only have a reference to the object on the heap anyway, it’s this reference that is passed by value. If you want to be nit-picky, this is different from passing a stack local object via reference like you could in C/C++.


class ByteIterator {
  constructor( b ) {
    this.buff= b;
    this.idx= 0;
  }

  isEnd() { return this.idx >= this.buff.length; }

  next() { this.idx++; return this; }

  get() { return this.buff.readUInt8(this.idx); }
}

The ‘ByteIterator’ class contains a reference to the buffer allocated by ‘readFileSync’ and the counter. It provides an interface to advance the counter, get the currently indexed byte, and query whether the end of the buffer has been reached.

Transformations

After all of this setting stuff up, we are finally ready to take a look at the different operations the compiler is going to perform on the code before printing JS code. There are six stages of conversion and optimization, which are described below.

Offset fusing

Whenever we move the cell pointer we have to store the new position, so that the following commands act on the desired cell. This ultimately results in a lot of writes to the cell pointer variable that could be avoided, if the compiler converted them to temporary offsets. What do I mean by that? Let’s look at the following example: >>> ++ < -

If we look at this little Brainfuck snippet and had to describe it, we could either say: “Move the pointer three to the left, add two, move the pointer one to the left and subtract one” or more straightforward: “Add two to the cell three slots away from the left and subtract one from the one only two cells away”. One way only looks at each instruction and uses the cell pointer as the only way to refer to the current cell. The other option is to take all pointer movements into account, which are offsets from the one directly before it, and convert them to offsets from the very first instruction on. A series of movements like this -2 +1 +5 -3 +1 would be transformed into -2 -1 +4 +1 +2 .

The observant among you might have already realized, that this all well and good but it is not completely equivalent. By always keeping the value in the cell pointer variable the same, code that follows after might act on the wrong cells. Therefore we need to actually move the pointer at the very end of a transformed section of code to the cell it would have ended up at if it moved all the time. This cell is just the last one, referred to by a transformed offset.

In the example above, showing a list of transformed offsets, we need to add 2 to the cell pointer before leaving the code section. The following example shows how a snippet of printed JS code looks after it was transformed.


p+= 46;

data[p]+= 865;
p+= -7;

data[p]+= 79;
p+= 6;

/////////////////////

data[p+ 46]+= 865;

data[p+ 39]+= 79;

data[p+ 45]+= 440;

The code above is part of a little benchmark that compares both methods. The performance difference is not mind-blowing but consistent. With Chrome I got an uplift of about 10% and even 30% using Firefox.

As already mentioned above, this first step of transforming the parsed source tree is implemented as a method of the Block class which is then also wrapped by Loop . The function traverses through all nodes and their children to remove the pointer movements. Thus the cell offset has to be stored by the data and IO instructions themselves. The value is stored in the offset property of each instruction interacting with cells.


fuseOffsets( offset= 0 ) {
  this.counterOffset= offset;

  for( let i= 0; i< this.instr.length; i++ ) {
    const cur= this.instr[i];
    if( cur instanceof PointerInstruction ) {
      // Save the number of cell shifts done by the pointer move instruction
      // and replace it with null
      offset+= cur.data;
      this.instr[i]= null;

      // Save largest pointer offset
      maxPointerOffset= Math.max( maxPointerOffset, offset );

    } else if( cur instanceof DataInstruction || cur instanceof IOInstruction ) {
      if( cur instanceof IOInstruction ) {
        this.hasIO= true;
      }

      // Set the offset the data instruction should use
      cur.offset= offset;

      // Change of the counter cell
      if( offset === this.counterOffset ) {
        this.counterData+= cur.data;
      }

    } else if( cur instanceof Loop ) {
      // Recurse in child blocks/loops
      cur.fuseOffsets( offset );
      if( cur.isComplex() ) {
        this.hasComplexLoop= true;
      }
    }
  }

  this.pointerOffset= offset- this.counterOffset;

  // Do pointer fixup at the end of block/loop
  if( this.pointerOffset ) {
    this.instr.push( new PointerInstruction( ASCII.Right, this.pointerOffset ) );
  }
}

The whole method is mostly made up of a single loop iterating over each instruction inside the block. If a PointerInstruction is encountered its offset is added to an accumulator, that keeps track of the global offset. The instruction is then nulled out and therefore removed from the code. This is just to prevent unnecessary array resizing, which would require shifting around the other elements. The accumulator is initialized with a value received from the caller and used to set the offset member of any DataInstruction or IOInstruction .

If an entry to a loop is reached, fuseOffsets is called recursively with the current accumulated offset as a parameter. As every code block is required to end with the offset being set back to the value it started with, no return value is needed. To meet this constraint, a PointerInstruction needs to be appended if the current offset mismatches the initial one.

This difference is stored as pointerOffset as this is the accumulated number of cells an isolated code section moves the cell pointer. Moreover, the initial offset provided by the caller is stored in the counterOffset member. If a loop has a pointer offset of zero, this means, that the cell checked before the loop is entered, is the same one checked if the loop should be rerun. In this case, the loop behaves like a simple while-loop with a condition variable.

Last but not least, all additions and subtractions done to this cell are combined, to get an accumulated counting step done each iteration. This information is very important for later optimizations, which is why we keep track of it. In case the loop decrements the counter cell by one on each run we have the equivalent of a for-loop.

If you just asked yourself “Why would anyone be interested in writing a loop that uses a different counter cell each time it runs?”, you might be even more surprised how simple a common use case is: [>] That's it. A loop that runs and moves the context pointer by one. The result is a loop that moves the pointer to the right until a cell containing zero is found. As it searches for a cell they are referred to as “searchers” and can either run to the right or left.

Inline simple loops

What is it that a Brainfuck loop does basically? It either adds stuff to cells, moves the cell pointer, does IO, or enters another loop. In most cases, there is no IO and the loop behaves like a for-loop with a single counter cell that's decremented by one and hence does no pointer movements.

Loops located at the end of the source tree branches don’t contain any child loops and therefore do nothing else than adding or subtracting from cells. Having no child loops guarantees that the loop counter and more broadly any other cell is modified only linearly. In other words, every cell modified, will be added or subtracted by the number of loop runs. If the loop counter is decremented by one for each run, the number of loop runs is known before it starts. The value in the counter cell is the number of iterations to be made. In case of all of these constraints being met, a loop is linear, has no side effects, and therefore is referred to as “simple”. Any other loop is “complex”.

As such a simple-loop always runs for a number of N times, which is known when it is entered, it can be replaced with an optimized version. The transformed version multiplies every offset added or subtracted from a cell by N. It moreover removes the call to the JS-function containing the loop and inlines the new version instead. Finally, the counter cell is set to zero, as the following code might rely on the fact, that the counter cell was cleared by the iterations. For this specific reason a ResetInstruction is implemented.


function fn24() {
  while (data[ptr + 25]) {
    data[ptr + 26] += data[ptr + 32] * 1;
    data[ptr + 32] = data[ptr + 26] * 1;
    data[ptr + 30] += data[ptr + 26] * 1;
    data[ptr + 27] += data[ptr + 26] * 1;
    data[ptr + 26] = 0;
    ptr += 9;
    ae();
  }
}

Because now no iteration is happening side effects are forbidden. Such are IO operations, which now would only print or prompt the user once, and pointer movements that don't result in a sum of zero. They would also break linearity as they might change the counter cell’s content.


isSimple() {
  return (!this.loops.length) && (!this.pointerOffset) && !this.hasIO && (this.counterData === -1);
}

In summary, a loop can be inlined if it is simple. A loop is simple if it does no pointer movements, the counter cell gets decremented by one during each iteration, no IO happens and no other loops are entered.


inlineSimple() {
  this.loops.forEach( (l, i) => {
    // Get the simplified code of the loop and inline it
    if( l.isSimple() ) {
      let idx= this.instr.indexOf( l );
      this.instr.splice(idx, 1, ...l.simplify() );

      // Remove inlined loop from array
      this.loops[i]= null;

    // Let child block check if it can inline its children
    } else {
      l.inlineSimple();
    }
  });

  this.removeReassignment();
}

The function above is called on the root node and checks all loops in the block whether they are simple. If a loop is simple, simplify is called. This method returns the code to inline, which is inserted into the array of instructions. Lastly, the loop is nulled out. inlineSimple is called recursively for all complex loops that can't be inlined.


simplify() {
  this.instr.forEach( (ins, idx) => {
    if( ins instanceof DataInstruction ) {
      // Convert to inlined multiplication
      if( ins.offset !== this.counterOffset ) {
        this.instr[idx]= ins.toMult( this.counterOffset );

      // Don't allow changes to the counter until the end of the loop
      } else {
        this.instr[idx]= null;
      }

    } else if( ins ) {
      // If not data instruction and not null
      console.error( this );
      throw Error('This is not a simple loop!');
    }
  });

  // Reset the counter cell
  this.instr.push( new ResetInstruction( this.counterOffset ) );

  return this.instr;
}

The actual simplification process is pretty simple in itself. It’s handled by a loop that checks every instruction if it is a DataInstrutcion , which then is replaced by a MultInstruction . Changes to the counter cell are nulled out completely on the other hand, as the cell is reset at the very end anyway.


class MultInstruction extends DataInstruction {
  constructor( d, o, fo ) {
    super( ASCII.Plus, d );
    this.offset= o;
    this.factorOffset= fo;
  }

  print( s ) {
    s.append( this.getHead()+ ' data[ptr+'+ this.factorOffset+ ']*'+ this.data+ ';' );
  }
}

This instruction is a DataInstruction that stores an additional cell offset. The factorOffset is set to the value of the counterOffset member when a DataInstruction is converted.

This optimization handles another common Brainfuck pattern, the clear-loop: [-] It is used to set the current cell to zero, by decrementing it multiple times. As there aren’t any other instructions in the loop only a single ResetInstruction will be emitted, which is a perfect fit for the intended purpose of the loop. The same goes for copy loops, that are replaced with multiplications by one. One could say we hit two birds with one stone. (Plz leave me alone PETA)

Remove Reassignment

Due to the previous code transformations, we end up with a lot ResetInstruction s. The zeroed counter cells are probably being reused by other code sections, which then add or subtract values from it. Adding to zero is the same as if we just override the contents of the cell, which then would make the previous ResetInstruction redundant.

>Therefore, if it is known that a DataInstruction acts on a cell that was previously reset, the ResetInstruction can be removed. The DataInstruction class provides the doSet flag to indicate that a cell modification should override the cell's value instead of offsetting it.


removeReassignment() {
  this.instr.forEach( (ins, idx) => {
    // Find all reset instructions
    if( ins instanceof ResetInstruction ) {
      
      // Find first data interaction with the cell
      for( let i= idx+1; i!= this.instr.length; i++ ) {
        const cur= this.instr[i];
        
        // Loop may change the pointer offsets or interact with the cell
        if( cur instanceof Loop || cur instanceof PointerInstruction ) {
          break;

        // Read/Output might expect cell to be zero
        } else if( cur instanceof OutputInstruction ) {
          if( cur.offset === ins.offset ) {
            break;
          }

        // Remove the reset instruction and allow the data instruction to
        // override the cells value
        } else if( cur instanceof DataInstruction ) {
          if( cur.offset === ins.offset ) {
            cur.doSet= true;
            this.instr[idx]= null;
            break; // <- dying inside >:^(
          }
        }
      }
    }
  });
}

This method is called by inlineSimple and checks every instruction of the block whether it is a ResetInstruction . Provided it is, another loop tries to find a DataInstruction that acts on the same cell before a loop or pointer movement is encountered. These might change the pointer position, internally act on the cell, or expect it to be zero. The search is also ended early if an IOInstruction uses the cell for the same reasons. If an instruction is found, the flag is set and the ResetInstruction is nulled out.

(You might wonder why there is a highly unprofessional note next to the last break statement. I spend a considerable amount of time debugging to realize it was missing.)

Convert to branches

When giving an introduction to Brainfuck in the first article, I described how loops can be used as if statements. Resetting the counter variable in the loop via a [-] or moving the pointer to a known cell containing zero guarantees that the loop won’t be run a second time. Converting every Brainfuck loop to a while loop in JS is convenient but has a downside if it is used as an if . It introduces an unnecessary condition check at the end, to test whether it should keep running. In this case, it will always be false and therefore redundant. Therefore it would be nice if an actual if statement could be printed in the JS code instead of a more generic while loop.

To be sure, that the loop can be converted it has to be ensured that it would not run more than a single time. The cell, the pointer refers to when leaving the code block, has to be zeroed out. The simplest way we can ensure that is that the last instruction called on the cell is a ResetInstruction . Additionally, no other branches, loops, or pointer movements may be in between.


convertToBranch() {
  this.loops.forEach( (l, idx) => {
    if( l ) {
      const br= l.convertToBranch();
      if( br ) {
        // Replace instruction and loop entry
        this.loops[idx]= br;
        this.instr[ this.instr.indexOf(l) ]= br;
      }
    }
  });
}

This method just iterates over all loops and tries to convert them to Branch object. A branch will be returned on success and then used to replace the loop as instruction and in the loop array.


convertToBranch() {
  this.block.convertToBranch();

  // Offset to the cell that will be actually checked when the loop tries to run again
  const rerunCheckCell= this.block.counterOffset+ this.block.pointerOffset;

  // Iterate in reverse direction
  for( let i= this.block.instr.length-1; i >= 0; i-- ) {
    const ins= this.block.instr[i];
    // Found a reset instr first
    if( ins instanceof ResetInstruction ) {
      if( ins.offset === rerunCheckCell ) {
        return this.toBranch();
      }

    // Found data instr -> cell may not be zero
    } else if( ins === DataInstruction ) {
      if( ins.offset === rerunCheckCell ) {
        return null;
      }

    // Found loop or pointer ins -> cell may not be zero or has moved
  } else if( ins instanceof PointerInstruction || ins instanceof Loop || ins instanceof Branch ) {
      return null;
    }
  }

  return null;
}

The actual conversion is done by the loop. It calculates the cell checked for a rerun first and then iterates over its instructions in reverse. If a ResetInstruction is found that sets the cell before any other instruction is found that might modify it, we may do the transformation.

The Branch class inherits from Instruction and has its own specialized print function that emits an if statement. The other methods wrap the interface needed for later transformation steps.


class Branch extends Instruction {
  constructor( l ) {
    super();

    this.id= l.id;
    this.block= l.block;
  }

  fuseLoopFunctions() {
    this.block.fuseLoopFunctions();
    return this;
  }

  convertSearch() { this.block.convertSearch(); }

  print( s ) {
    s.append( 'if(data[ptr+'+ this.block.counterOffset+']){' );
    this.block.print( s );
    s.append( '}' );
  }

  printFunctions( s ) { this.block.printFunctions( s ); }
}

Convert Search Runners

Another pattern that can be simplified is search loops or scan loops. They move the cell pointer to the left or right until they find a cell that is zero and therefore stop. Detecting them is pretty simple. If a loop only contains pointer movements it at least behaves like a search loop and thus can be replaced by a specialized instruction.


convertSearch() {
  this.loops.forEach( (l, idx) => {
    if( l ) {
      const ins= l.convertSearch();
      if( ins ) {
        // Remove loop entry
        this.loops[idx]= null;
        this.instr[ this.instr.indexOf(l) ]= ins;
      }
    }
  });
}

convertSearch() {
  // Check that loop only contains a single pointer move instruction
  let ctr= 0;
  let onlyContainsPointerMove= this.block.instr.every( ins => {
    if( !ins ) { return true; }
    if( ins instanceof PointerInstruction ) { return !ctr++; }
    return false;
  });

  // Convert or recur
  if( onlyContainsPointerMove && ctr ) {
    return new SearchInstruction( this.block.counterOffset, this.block.pointerOffset );
  } else {
    this.block.convertSearch();
  }
  return false;
}

class SearchInstruction extends Instruction {
  constructor( off, mv ) {
    super();

    this.counterOffset= off;
    this.pointerOffset= mv;
  }

  print( s ) {
    if( this.pointerOffset > 0 ) {
      s.append( 'sfwd('+ this.counterOffset+ ','+ this.pointerOffset+ ');' );
    } else {
      s.append( 'srev('+ this.counterOffset+ ','+ Math.abs(this.pointerOffset)+ ');' );
    }
  }
}

The SearchInstruction emits JS calls to predefined functions. sfwd and srev search either in forward or reverse direction. They are internally implemented as simple loops. If you really wanted you could do some additional improvements. By using an ArrayBuffer instead of an array to hold the cells, a function like memchr could be implemented. It is very efficient as it looks at whole 32 or 64bit data words at a time instead of checking each byte after the other.

Fuse Loop functions

This is the last step before printing any JS code. Due to the lack of any form of subroutines in Brainfuck, there might be code that performs the same task in multiple locations. In such a case instead of creating multiple identical functions, all calls could go to one version. The elimination of this duplicate code reduces the total number of JS functions to be parsed, compiled, and optimized by V8. Again, this transformation is performed on a per-block basis. To make it work, it has to be ensured that the emitted functions are interchangeable.

In the Block class the method works pretty much the same as for the search-loop conversion. If the transformation succeeds an object is returned, that is used to replace the loop in the block.


fuseLoopFunctions() {
  if( !this.block.loops.length ) {
    const body= new StringRef();
    this.printFuncBody( body );

    let f= funcMap.get( body.get() );
    if( f ) {
      //console.log('Replacing function:', this.id, body.get() );
      f.duplication++;
      return f;
    }

    funcMap.set( body.get(), this );
    return this;

  } else {
    this.block.fuseLoopFunctions();
    return this;
  }
}

The actual conversion happens in the Loop class and only for leaves of the parsed tree. The way it works is super ugly but very simple. First, the body of the function is printed and cached for later. Then a lookup happens into a map of all loops by their function body. If a loop with an identical body is found, it is returned. Failing that, the own body is added to the map.

One could argue, that first doing all compilation and then check if there are any duplicates might be a waste of time. Well, it is, but I implemented this step last and therefore it ended up as the last step in the compilation. But I figure this is definitely not as bad as hashing every function body in a map.

Another approach could be comparing the Brainfuck representations of the loops. However, to achieve similar results larger loops, containing smaller ones, need to be compared as no inlining has happened yet. This could also be handled by multiple passes merging first leaves and then parent nodes.

Printing 🖨

Subsequently, all code blocks need to be printed out as JS functions. printFunctions recursively travels up the tree and prints all function bodies. In the end, the root nodes print method is called to emit the entry point.

All methods get a StringRef passed as a parameter. Its purpose is to work similarly to a Java StringBuilder . Internally it just uses basic string concatenation though, as this is the fastest way according to some benchmarks on StackOverflow.

The JS source code string also contains the implementations of intrinsic functions, called by certain instructions, like: pc , ae , sfwd , etc. These functions are appended first like a header for the rest of the program. CODE: compile

Benchmark Time

I think I can say the results speak for themselves.

Compared to the previous top performer version 4.5 the time was reduced by a factor of 1.6. This is an impressive uplift, however looking back at the older versions, the difference gets even more pronounced. V2 for example is more than three times slower and the very first V1 is outpaced by a factor of more than 15. If V3 and its improved version are ignored, we can see the continual advancements made in terms of runtime speed.

Closing Notes

Looking back at the things we touched upon until now, we had the opportunity to discuss numerous interesting topics and concepts. First, we talked about esoteric languages and more specific Brainfuck as a popular example. We built a barebones prototype with awful performance. Then we took a look at bytecode and used it to come up with a converter and interpreter executing it. As an extended tangent and more out of curiosity how well it translated into JS we tried implementing two versions of direct threading interpreters. In addition to learning what direct threading is we talked about its use case in more serious virtual machines and interpreters like JSCore. Finally, we risked the jump to meta compiling instead of interpreting. Again we started off with a pretty barebones first version and improved it to be fastest yet. Now we took it a step further and introduced several optimization passes to beat our high score.

With this, I thank you all for reading and hope you enjoyed following along. This will be the last article on this topic in the near future at least. Although I have still some ideas left, they have to wait for the time being as I would like to explore different subjects that are possibly a bit less massive.

I had all code already written during, but it took me until now to actually finish all articles about it. Some is due to other projects, work on the website itself and of course procrastina… I mean other more important stuff outside coding and the internet. Still, I hope the end did not feel too rushed, I wrote it during multiple sessions and I would have loved to include some more trivia and background info about the individual optimization strategies. For example how some of them are used by real compilers translating high-level languages to machine code. But looking at the word count (and my watch) it’s about time to call this done. At least for now. 😊

Part 4
last edited 14th Sep 21