

Common Architectures

Just as in the case of compilers, there are three major architectural models for migrators. They are classified based on the number of transformation stages performed on the original source code or an internal representation:
  • One-pass migrators,

  • Multi-pass migrators – with two subcategories:

    • Action-based multi-pass migrators,

    • Multi-pass migrators based on transformations of the abstract semantic tree,

  • Mixed migrators.

From an architectural point of view, these migrator types share two traits: each includes a syntactic analyzer for processing sentences of the source language, and none includes a semantic analyzer.

The absence of the semantic analyzer is justified because translation assumes that the source code was already subjected to semantic analysis during its development and is therefore correct.

One-pass migrators make a single pass over the sentences of the source code and translate them immediately using action-based mechanisms. This is the simplest and fastest kind of software migrator, and it uses the least memory. It is usually suitable for translations between programming languages of the same type, whose grammars are similar in structure.

The architecture of a one-pass migrator contains three components: the syntactic analyzer, the action engine and the post-processor.



The syntactic analyzer consists of a scanner and a parser, just as in the case of a compiler. The action engine translates the grammatical constructions of the source language into valid constructions of the target language.

Post-processing operations include beautification, physical structuring and other operations aimed at increasing the readability of the generated source code.
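
The three components described above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the toy source language (statements such as `x := 1;` and `print x;`), the Python-like target, and the names `scan`, `translate`, `post_process` and `migrate_one_pass` are all hypothetical and are not taken from any particular migrator.

```python
import re

# Hypothetical toy source language: "x := ( 1 + 2 ) * 3;" or "print x;"
TOKEN = re.compile(r"\s*(:=|[A-Za-z_]\w*|\d+|[-+*/();])")


def scan(line):
    """Scanner: split one source sentence into tokens."""
    line = line.rstrip()
    tokens, pos = [], 0
    while pos < len(line):
        match = TOKEN.match(line, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at column {pos}: {line!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def translate(tokens):
    """Parser plus action engine: recognize a construction, emit target code."""
    if not tokens or tokens[-1] != ";":
        raise SyntaxError("a statement must end with ';'")
    body = tokens[:-1]
    if len(body) >= 3 and body[1] == ":=":
        # Action for an assignment: ':=' becomes '='.
        return f"{body[0]} = {' '.join(body[2:])}"
    if len(body) >= 2 and body[0] == "print":
        # Action for output: the statement becomes a function call.
        return f"print({' '.join(body[1:])})"
    raise SyntaxError(f"unknown construction: {' '.join(tokens)}")


def post_process(line):
    """Post-processor: light beautification of the generated line."""
    line = re.sub(r"\(\s+", "(", line)   # no space after '('
    line = re.sub(r"\s+\)", ")", line)   # no space before ')'
    return line.rstrip()


def migrate_one_pass(source):
    """Each sentence is read once and translated immediately."""
    out = []
    for line in source.splitlines():
        if line.strip():
            out.append(post_process(translate(scan(line))))
    return "\n".join(out)


print(migrate_one_pass("x := ( 1 + 2 ) * 3;\nprint x;"))
```

Because every sentence is translated as soon as it is read, nothing but the current line has to be kept in memory, which is where the speed and low memory use of this architecture come from.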

For most languages, though, a one-pass translation is not possible, because at some point the migrator needs information that is found in a portion of the input stream that has not yet been processed. A multi-pass migrator is more suitable in this situation; just like a multi-pass compiler, it performs additional traversals of the source program. The first traversal generates an internal representation (the abstract semantic tree, or AST) on which the rest of the traversals operate.

For example, consider the following instruction sequence in a fictional language L:


I2 e7
e5: I3
e7: I5
I6 e5


Instruction I2 refers to tag e7, which is declared at instruction I5. Likewise, instruction I6 refers to tag e5, which is declared at instruction I3.

Each tag identifier must be transformed to incorporate the instruction at which it is declared. Tag e7 thus becomes e7_I5, and tag e5 becomes e5_I3. A one-pass migrator can only transform the e5 identifier, because it is the only one declared before it is used. A multi-pass migrator makes both transformations, because it has an internal representation of the source code that contains information on all declarations and usages of program identifiers.

After processing the input using a multi-pass migrator, the result is the following:


I2 e7_I5
e5_I3: I3
e7_I5: I5
I6 e5_I3
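
The tag transformation above can be reproduced with a minimal two-pass sketch. It assumes, for illustration only, that every sentence of L has the shape `[tag:] instruction [tag]`; the helper names `parse` and `migrate_multi_pass` are hypothetical, and a plain list of tuples stands in for the internal representation.

```python
def parse(line):
    """Split a sentence of L into (declared_tag, instruction, used_tag)."""
    declared = None
    if ":" in line:
        declared, line = (part.strip() for part in line.split(":", 1))
    parts = line.split()
    instruction = parts[0]
    used = parts[1] if len(parts) > 1 else None
    return declared, instruction, used


def migrate_multi_pass(source):
    # First traversal: build the internal representation of the program.
    sentences = [parse(line) for line in source.splitlines() if line.strip()]

    # Second traversal: collect every declaration, wherever it appears.
    renamed = {decl: f"{decl}_{instr}" for decl, instr, _ in sentences if decl}

    # Final traversal: emit target code, resolving forward references too.
    out = []
    for decl, instr, used in sentences:
        text = instr if used is None else f"{instr} {renamed.get(used, used)}"
        if decl:
            text = f"{renamed[decl]}: {text}"
        out.append(text)
    return "\n".join(out)


print(migrate_multi_pass("I2 e7\ne5: I3\ne7: I5\nI6 e5"))
```

The forward reference in `I2 e7` is resolved only because the table of declarations is complete before any output is produced; a single left-to-right pass would reach that sentence before seeing `e7: I5`.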


There are two distinct types of multi-pass migrators, with different approaches to the migration process: migrators based on actions and migrators based on transformations of the abstract semantic tree.

Action-based migrators pass through the source program several times, first in its original form and then in its AST form. The final pass transforms the AST into target language code, and all previous passes collect the information that this final pass needs.



Migrators based on semantic tree transformations structure the translation into several stages, each one solving a type of problem by applying transformations to the nodes of the semantic tree.
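
A staged design of this kind might look like the following sketch, in which each stage is a rule applied to every node of the tree. The `Node` class, the node kinds, and the two rules (renaming a `writeln` call to `print`, and expanding `inc(x)` into an assignment) are hypothetical examples chosen for illustration, not part of any specific migrator.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    value: str = ""
    children: list = field(default_factory=list)


def transform(node, rule):
    """Rebuild the tree bottom-up, applying one rule to every node."""
    rebuilt = Node(node.kind, node.value,
                   [transform(child, rule) for child in node.children])
    return rule(rebuilt)


def rename_builtin_calls(node):
    """Stage 1 (hypothetical): map a source built-in to its target name."""
    if node.kind == "call" and node.value == "writeln":
        return Node("call", "print", node.children)
    return node


def expand_increment(node):
    """Stage 2 (hypothetical): rewrite inc(x) as the assignment x = x + 1."""
    if node.kind == "call" and node.value == "inc" and len(node.children) == 1:
        target = node.children[0]
        plus_one = Node("binop", "+", [target, Node("literal", "1")])
        return Node("assign", "", [target, plus_one])
    return node


# Each stage solves one type of problem; the stages run in sequence.
STAGES = [rename_builtin_calls, expand_increment]


def migrate_tree(tree):
    for rule in STAGES:
        tree = transform(tree, rule)
    return tree


program = Node("block", "", [
    Node("call", "inc", [Node("name", "i")]),
    Node("call", "writeln", [Node("name", "i")]),
])
print(migrate_tree(program))
```

Keeping each rule small and independent means a new translation problem can be handled by adding a stage to the list rather than by modifying the existing ones.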



Mixed migrators combine the advantages of both types of multi-pass migrators and are usually the most suitable solution for complex migrations.