Core structure

Books index

book 1 - Lifecycles, grammars and phases

book 2 - Core structure

book 3 - Semantic analysis

book 4 - Code generation

1 The CORE lifeset

In the previous book we worked with the tree of objects created by the grammar parser. In the simple examples shown there, printing the parsed data on the output window was simple and straightforward. However, we shall soon discover that the grammar tree is not the best source for further calculations.

1.1 Working with a single lifeset

To understand this concept, we can go back to one of the examples we worked on in book 1:

Grammar definition able to parse files — Figure 1.1.1 - Grammar definition.

In that project we developed a very simple code generator that wrote the list of variable/value pairs to the output window. We took advantage of the class MyRule that Macrocoder created for us:

class MyRule: GBase {
	GString varName;
	GString value;
}

Thanks to that, using phases and extensions, our generation code was reduced to a single line of code (line 4):

extend class MyRule {
	in phase ShowData {
		do {
			system().msg << "Set variable " << varName << " to value " << value << endl;
		}
	}
}

Figure 1.1.2 - Implementation of the ShowData phase.

At this point, we decide we want the user to be able to write not only set x="abc" but also set x=1234. In other words, we want to be able to use quoted strings but also simple numbers.

This is a very simple change in the grammar:

Figure 1.1.3 - Grammar modified to also support numbers.

However, after this change, the generated classes now have a different form. The MyRule class now looks like this:

class MyRule: GBase {
	GString varName;
	
	// was: GString value;
	variant of GBase multiValue;
}

The GString value attribute is gone; instead, we have a variant of GBase multiValue. The variant will be set to an instance of QuotedValue if the user entered a quoted string or to an instance of NumericValue if the user typed a number. These last two classes will have the following definition:

class QuotedValue: GBase {
	GString text;
}

class NumericValue: GBase {
	GNumeric number;
}

Our new parser is now able to parse the following target source file:

set myText = 1234
set yourText = "ABCD"
set hisText = "His display is 15\" wide"

Line 1 would have been rejected by the previous grammar; the new one, instead, accepts it.

lifeset GRAMMAR (cauldron)
- obj1: ManySets
  - setEntries
    - child1: MyRule
      - varName = myText
      - multiValue: NumericValue
        - number = 1234
    - child2: MyRule
      - varName = yourText
      - multiValue: QuotedValue
        - text = ABCD
    - child3: MyRule
      - varName = hisText
      - multiValue: QuotedValue
        - text = His display is 15" wide

Figure 1.1.4 - Structure of objects resulting from parsing the new target source file.

So far, so good. However, as soon as we get to the code generation method, we discover that our code must be rewritten:

extend class MyRule {
	in phase ShowData {
		do {
			system().msg << "Set variable " << varName << " to value ";
			if (valid(multiValue.get().cast (QuotedValue))) {
				system().msg << multiValue.get().cast (QuotedValue).text;
			}
			if (valid(multiValue.get().cast (NumericValue))) {
				system().msg << multiValue.get().cast (NumericValue).number;
			}
			system().msg << endl;
		}
	}
}

Figure 1.1.5 - Implementation of the new extended ShowData phase.

The complete rules and target projects for this example can be downloaded at this link: ManySets2.zip.

Although this code does exactly what its previous version did, it required major rewriting because the source tree of classes changed. The code is now much longer, less readable and harder to write and maintain.

Let's remember that this is an extremely simple "one-line" example for the sake of this tutorial: any real-world generation project would include much more "action", and this approach would be very inconvenient. Macrocoding grammars are constantly upgraded and extended, following the needs and the experience gained in the coding project they support. Grammar and code generation structures must therefore be disjoint.

2 The dual-lifeset model

The solution we are about to discuss is considered the standard approach for all Macrocoder projects. It is based on two lifesets:

GRAMMAR contains the grammar-generated structures and objects; they are defined by Macrocoder according to the input grammar;
CORE contains the classes that simply represent the information we need; they are defined by the rules developer according to the logical structure she or he needs.

In the dual-lifeset model, the sequence of operations during execution is:

the parser creates the initial instances in the GRAMMAR lifeset;
a phase executed in the GRAMMAR lifeset reads the instances created by the parser and instantiates the objects in the CORE lifeset;
further phases in the CORE lifeset do the actual code generation.
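
The sequence above can be pictured outside Macrocoder as a plain two-stage pipeline. The following Python sketch is only an analogy, not Macrocoder code: the names GrammarRule, SetCommand, create_core and generate are hypothetical stand-ins for a parser-generated class, a CORE class and the two kinds of phases, and the list returned by create_core merely plays the role of the CORE cauldron.

```python
from dataclasses import dataclass


@dataclass
class GrammarRule:
    """Stand-in for a parser-generated object (GRAMMAR side)."""
    var_name: str
    value: str


@dataclass
class SetCommand:
    """Stand-in for a developer-defined object (CORE side)."""
    variable_name: str
    assigned_text: str


def create_core(grammar_objects):
    """Read the grammar objects and build the CORE objects from them."""
    core = []  # plays the role of the CORE cauldron
    for rule in grammar_objects:
        core.append(SetCommand(rule.var_name, rule.value))
    return core


def generate(core_objects):
    """Produce the output lines by looking at the CORE objects only."""
    return [
        f"Set variable {cmd.variable_name} to value {cmd.assigned_text}"
        for cmd in core_objects
    ]


# Parsing is simulated here with a hand-built grammar object.
parsed = [GrammarRule("myText", "ABCD")]
lines = generate(create_core(parsed))
```

In this sketch only create_core knows about GrammarRule, so a change in the grammar-side class would leave generate untouched.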

Let's create a dual-lifeset example with the following steps!

Step 1: create the CORE lifeset and classes

In a new file called core.fcl we define the CORE lifeset and one class named SetCommand:

lifeset CORE;

class SetCommand {
	LocString variableName;
	LocString assignedText;
}

Figure 2.2 - File core.fcl, implementation of the CORE classes.

The SetCommand class will describe one set ... statement. We used LocString instead of the basic String type for the attributes; we shall see later on how useful this type is.

Step 2: create the grammar

We define a grammar identical to the one we defined at the beginning of this book (figure 1.1.1).

Step 3: create the createCore phase

Now we have to define a phase in the GRAMMAR lifeset which will read the instances produced by the parser and create related instances in the CORE lifeset. This is the grammar.fcl file:

grammar GRAMMAR;

father-first phase createCore = 1 creates CORE;

extend class MyRule {
	in phase createCore {
		do {
			var CORE::SetCommand coreInstance;
			coreInstance.variableName.set (varName);
			coreInstance.assignedText.set (value);
			lset.CORE.enroll (coreInstance);
		}
	}
}

Figure 2.4 - File grammar.fcl: implementation of the createCore phase in the GRAMMAR lifeset.

Once again, let's analyze the source code line by line:

line 1: we extend the GRAMMAR lifeset: this action is triggered by the grammar objects;
line 3: we define a phase called createCore; the phase is defined exactly as we did so far, except for the creates CORE declaration. This declaration enables the GRAMMAR lifeset to create objects in the CORE lifeset. It also defines the execution order of phases between these two lifesets: first all phases of GRAMMAR, then all phases of CORE. As you can guess, if GRAMMAR can create in CORE, CORE will not be allowed to create in GRAMMAR (the creation graph must be acyclic).
lines 5 to 7: these lines declare that we are implementing the action for phase createCore on class MyRule;
line 8: this line creates a local variable called coreInstance of type CORE::SetCommand; the CORE:: prefix means that the type is to be taken from the CORE lifeset and not from the current one, i.e. GRAMMAR;
line 9: we fill the variableName attribute of the newly created coreInstance object by reading the data from the varName attribute coming from the grammar;
line 10: we do the same with assignedText; now the new coreInstance object contains the same values (the name of the variable and the value to be assigned) that the automatically generated MyRule object has;
line 11: here is where the magic happens; lset is an implicit parameter that refers to the current GRAMMAR lifeset; since this lifeset has at least one phase declared as creating CORE, it comes with a link called CORE that points to the object representing the CORE lifeset. In this line we are asking the CORE lifeset to enroll the instance in its cauldron.

The lset.CORE.enroll (coreInstance) line is very important because it takes the coreInstance object and places it under the ownership of the CORE lifeset cauldron. Without this action, the coreInstance object would disappear as soon as the phase method terminates; instead, it is stored in the CORE lifeset and will later have all of that lifeset's phases executed on it.

Step 4: create the generate phase

The final step is to create the generate phase in the CORE lifeset. We shall do it in a file named core_generate.fcl.

lifeset CORE;

phase generate = 1;

extend class SetCommand {
	in phase generate {
		do {
			system().msg << "Set variable " << variableName << " to value " << assignedText << endl;
		}
	}
}

Figure 2.5 - File core_generate.fcl: implementation of the generate phase in the CORE lifeset.

In this case, as declared by line 1, we are now working within the lifeset CORE. At line 3 we declared a phase named generate. We specified neither children-first nor father-first, so it defaults to the latter; however, the traversal order does not matter in this case since we are operating on only one tree level.

The code at line 8 is the same we had before and it does the same thing, except that now it takes its data from the CORE lifeset instead of taking it from the GRAMMAR lifeset.

We have now converted the simple project we started this book with to the complete dual-lifeset model. The complete rules and target projects for this example can be downloaded at this link: ManySets3.zip.

Step 5: modify the grammar to support numbers

We shall now modify the grammar to support both quoted strings and numbers again, as we did before. After having modified the grammar as in figure 1.1.3, we have to update the createCore phase method of MyRule:

grammar GRAMMAR;

father-first phase createCore = 1 creates CORE;

extend class MyRule {
	in phase createCore {
		do {
			var CORE::SetCommand coreInstance;
			coreInstance.variableName.set (varName);
	
			if (valid(multiValue.get().cast (QuotedValue))) {
				coreInstance.assignedText.set (multiValue.get().cast (QuotedValue).text);
			}
			if (valid(multiValue.get().cast (NumericValue))) {
				coreInstance.assignedText.set (multiValue.get().cast (NumericValue).number);
			}
	
			lset.CORE.enroll (coreInstance);
		}
	}
}

Figure 2.6 - File grammar.fcl: implementation of the createCore phase in the GRAMMAR lifeset to support new extended grammar.

The changes are concentrated in lines 11-16, where coreInstance.assignedText is set by reading the string from the quoted or the numeric child, according to which one is set. Apart from those changes, all the generation code inside the CORE lifeset remains unchanged.

The complete rules and target projects for this example can be downloaded at this link: ManySets4.zip.

2.1 Phases and tree upscan

In the example above, the task of creating CORE objects was totally delegated to the MyRule class. The createCore method of that class created the CORE::SetCommand object (line 8, figure 2.6); then it read its own varName attribute and transferred its value to the variableName attribute of the CORE::SetCommand object it had just created.

The handling of the "value" parameter looked a bit more complicated: the method had to probe its variant multiValue to see what type it contained (i.e. a QuotedValue or a NumericValue). Then it had to cast the variant type and access the internal fields of each type (lines 11 and 14).

This approach has two drawbacks:

it violates class information hiding: although what we need is always a string to be assigned to the assignedText attribute, we have to go into the details of each child's class;
even more importantly, it violates structural information hiding: the grammar structure of this example is relatively simple, but in many real cases we would have to dig through multiple levels of the grammar tree, entering nested variants and arrays, to reach the data.

Splitting the action

The recommended way to approach this case is to use the power of phase processing and distribute the activity among multiple classes.

The idea is to have MyRule create the CORE::SetCommand object and set the values related to its simple attributes (i.e. varName). Then, the other complex children (i.e. QuotedValue and NumericValue) will update the CORE::SetCommand instance by adding their own information.

Let's see this approach step by step.

Step 1: create the CORE instance

In this step, the MyRule class creates the CORE::SetCommand instance. Let's look at the code:

grammar GRAMMAR;

father-first phase createCore = 1 creates CORE;

extend class MyRule {
	in phase createCore {
		link of CORE::SetCommand myCoreCommand;
		do {
			var CORE::SetCommand coreInstance;
			myCoreCommand.set (coreInstance);
			coreInstance.variableName.set (varName);
			lset.CORE.enroll (coreInstance);
		}
	}
}

Figure 2.1.1 - File grammar.fcl: implementation of the createCore phase in the GRAMMAR to create the CORE::SetCommand object and store it in the myCoreCommand link.

The code above is almost identical to the one already seen in figure 2.6. Let's discuss the changes:

line 7: we added the attribute myCoreCommand of type link of CORE::SetCommand; this attribute, set at line 10, will maintain a link to the created CORE::SetCommand; this link will later be used by the children to set their own data;
line 9: we create the CORE::SetCommand as usual;
line 10: as anticipated, the created CORE::SetCommand is bound to the link attribute myCoreCommand;
line 11: the MyRule class directly sets the variableName attribute of coreInstance using its simple attribute varName;
line 12: the coreInstance object is enrolled in the CORE lifeset as we did before.
Note that nowhere does this method set the coreInstance.assignedText attribute: this action will be performed by other classes, as we shall see later on.

The figure below shows the newly added link myCoreCommand both in the grammar diagram and in the instances tree:

Figure 2.1.2 - MyRule rule diagram: grammar and instances tree with the myCoreCommand link highlighted.

Step 2: update the values

It is now time to extend the classes QuotedValue and NumericValue to have them update the CORE::SetCommand attributes themselves. Let's add the following code to the grammar.fcl source file:

extend class QuotedValue {
	in phase createCore {
		do {
			upscan(MyRule).myCoreCommand.assignedText.set(text);
		}
	}
}

extend class NumericValue {
	in phase createCore {
		do {
			upscan(MyRule).myCoreCommand.assignedText.set(number);
		}
	}
}

Figure 2.1.3 - File grammar.fcl: implementation of the createCore phase for the QuotedValue and NumericValue classes.

The implementations of QuotedValue and NumericValue are almost identical: each implements the createCore phase method with a single line of code (line 4 for QuotedValue, line 12 for NumericValue), and they differ only in the attribute they read (text or number).

Let's decompose line 4, upscan(MyRule).myCoreCommand.assignedText.set(text), part by part:

upscan(MyRule): the QuotedValue object goes upwards in the instance tree looking for the first instance of MyRule it can find; this is done by the upscan(type) function;
.myCoreCommand: once the MyRule instance is found, it goes through its link myCoreCommand; the link is valid because we used a father-first phase, which guarantees that every object executes its phase method before its children do;
.assignedText.set(text): from the link myCoreCommand, it can now access the CORE::SetCommand and its assignedText attribute; in this way, it can complete the CORE object by adding its data.

With this technique, the inner grammar elements can collaborate in the creation of a single CORE object by pushing their contents into the instance created by one of their relatives. The great advantage is that, to implement this solution, we do not need to manage the details of the grammar tree structure. To use the data of QuotedValue we must know only that:

the CORE::SetCommand instance is created by MyRule and linked by its attribute myCoreCommand;
QuotedValue is a child of MyRule.

All the work of traversing the tree is done by the upscan function.
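
To make the idea of an upward scan more concrete, here is a minimal Python sketch of the same technique. It is an analogy and not Macrocoder code: the Node base class, its parent attribute and the intermediate Wrapper level are hypothetical, and the sketch assumes that the scan starts from the parent of the calling object and returns None when no ancestor of the requested type exists.

```python
class Node:
    """Tree node that remembers its parent."""

    def __init__(self, parent=None):
        self.parent = parent

    def upscan(self, wanted_type):
        """Return the nearest ancestor of the given type, or None."""
        node = self.parent
        while node is not None:
            if isinstance(node, wanted_type):
                return node
            node = node.parent
        return None


class SetCommand:
    """Stand-in for the CORE object that is being filled."""

    def __init__(self):
        self.variable_name = None
        self.assigned_text = None


class MyRule(Node):
    def __init__(self, parent=None):
        super().__init__(parent)
        self.my_core_command = SetCommand()


class Wrapper(Node):
    """Hypothetical intermediate grammar level between rule and value."""


class QuotedValue(Node):
    def __init__(self, text, parent=None):
        super().__init__(parent)
        self.text = text

    def create_core(self):
        # The child does not care how many levels separate it from MyRule.
        self.upscan(MyRule).my_core_command.assigned_text = self.text


rule = MyRule()
value = QuotedValue("ABCD", parent=Wrapper(parent=rule))
value.create_core()
```

The extra Wrapper level is there only to show that the child reaches its MyRule ancestor without knowing how deep in the tree it sits.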

In depth: a preview on phase protection

We can take this opportunity to introduce one of the most important features of phase programming: phase protection.

As we will discover while going on with this tutorial, the code generation process consists of several tree traversals (phases), each one reaching a goal. So far, our projects were made of two phases: the first one (createCore) created the CORE objects; the second one (generate) generated the output code. It is already obvious that the execution order among phases is important: if phase generate had been executed before createCore, there would have been no CORE instances to work on.

Execution order is also important within a phase. Splitting the phase code as we did in figure 2.1.3 is possible because every MyRule creates its CORE::SetCommand object and sets its myCoreCommand link before the children try to access it. If we had defined phase createCore as children-first instead of father-first, the children would have executed their phase method before their parents and they would have found the myCoreCommand link still unset.
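
The difference between the two traversal orders can be illustrated with a short Python sketch. This is an analogy only: the Node class and the run_phase function are hypothetical and simply visit a tree either parent-before-children or children-before-parent.

```python
class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def run_phase(node, visit, father_first=True):
    """Call visit on every node of the tree in the requested order."""
    if father_first:
        visit(node)
    for child in node.children:
        run_phase(child, visit, father_first)
    if not father_first:
        visit(node)


tree = Node("MyRule", [Node("QuotedValue")])

father_first_order = []
run_phase(tree, lambda n: father_first_order.append(n.name), father_first=True)

children_first_order = []
run_phase(tree, lambda n: children_first_order.append(n.name), father_first=False)
```

With the father-first order the MyRule node is visited before QuotedValue, so whatever the parent prepares is available to the child; with the children-first order the child is visited first.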

In normal programming languages, this problem would be detected at execution time. The phase rules of Macrocoder, instead, allow prevention of such issues at compile time.

Besides the usual class access protection (private, public, etc.), Macrocoder supports compile-time instance phase protection. This protection scheme is so powerful that class protection is mostly left at its default of public.

Let's take this code snippet:

extend class MyRule {
	in phase createCore {
		link of CORE::SetCommand myCoreCommand;
		do {
			var CORE::SetCommand coreInstance;
			myCoreCommand.set (coreInstance);
			coreInstance.variableName.set (varName);
			lset.CORE.enroll (coreInstance);
		}
	}
}

As we can see at line 3, the link of CORE::SetCommand myCoreCommand declaration is placed inside the in phase createCore {...} block. The meaning of that is:

1. the myCoreCommand attribute can only be set by the createCore phase method of the object that owns it; note that we used the word "object": while the private: class protection allows members to be accessed by any instance of the hosting class, the phase protection limits this to the exact instance that owns the attribute;
2. no one can ever access the myCoreCommand attribute, either for reading or for writing, before the phase method of point 1; thus its enable phase is createCore;
3. once set, all the subsequent objects can freely access this attribute, but read-only; its finalize phase is also createCore.
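
The three rules can be mimicked in Python, with one important difference: Macrocoder enforces them at compile time, while the sketch below can only check them at run time. Everything in it (PhaseAttribute, PhaseError, the explicit caller and current_phase arguments) is hypothetical and exists only to illustrate the rules; it is not how Macrocoder implements them.

```python
class PhaseError(Exception):
    """Raised when an attribute is accessed outside its allowed window."""


class PhaseAttribute:
    """Value written by its owner in one phase and read-only afterwards."""

    def __init__(self, owner, phase):
        self._owner = owner
        self._phase = phase  # acts as both enable and finalize phase
        self._value = None
        self._is_set = False

    def set(self, caller, current_phase, value):
        if caller is not self._owner:
            raise PhaseError("only the owning instance may write")
        if current_phase != self._phase:
            raise PhaseError("write outside the enabling phase")
        self._value = value
        self._is_set = True

    def get(self):
        if not self._is_set:
            raise PhaseError("read before the attribute was set")
        return self._value


class MyRule:
    def __init__(self):
        self.my_core_command = PhaseAttribute(owner=self, phase="createCore")


rule = MyRule()
other = MyRule()
errors = []

try:
    rule.my_core_command.get()  # too early: nothing has been set yet
except PhaseError as exc:
    errors.append(str(exc))

rule.my_core_command.set(rule, "createCore", "core object")

try:
    rule.my_core_command.set(other, "createCore", "x")  # wrong instance
except PhaseError as exc:
    errors.append(str(exc))

try:
    rule.my_core_command.set(rule, "generate", "x")  # wrong phase
except PhaseError as exc:
    errors.append(str(exc))
```

After the single legal write, reading the value succeeds, while each of the three illegal accesses is rejected.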

Let's see what happens if we move the declaration of myCoreCommand out of the in phase block (see line 2):

extend class MyRule {
	link of CORE::SetCommand myCoreCommand;
	in phase createCore {
		do {
			var CORE::SetCommand coreInstance;
			myCoreCommand.set (coreInstance);
			coreInstance.variableName.set (varName);
			lset.CORE.enroll (coreInstance);
		}
	}
}

The Macrocoder compiler immediately reports an error. By default, attributes defined outside the in phase block have enable=INITIAL finalize=INITIAL. INITIAL and FINAL are two pseudo-phases representing the time before the first phase and the time after the last phase. Attributes with those rules can be set only when the object is initially built and can never be changed later on. Therefore, the command myCoreCommand.set at line 6 cannot be executed during phase createCore, because the target attribute has already been finalized.

We already worked with INITIAL attributes: in figure 2.2 we declared:

lifeset CORE;

class SetCommand {
	LocString variableName;
	LocString assignedText;
}

The two attributes variableName and assignedText are both defined at class level, thus they have enable=INITIAL finalize=INITIAL. This means that they can be set only when instancing the object, which is exactly what we did in figure 2.4. Note that, once set, these attributes cannot be changed during the evolution phases.
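
A rough Python counterpart of such "set once while building, never change later" attributes is a frozen dataclass. This is only an analogy: Python has no notion of phases, so the sketch treats construction as the only moment when values may be assigned, and the rejection happens at run time rather than at compile time.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class SetCommand:
    variable_name: str
    assigned_text: str


cmd = SetCommand(variable_name="myText", assigned_text="ABCD")

try:
    cmd.assigned_text = "changed"  # rejected: the object is already built
    changed = True
except FrozenInstanceError:
    changed = False
```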

Phase protection allows close monitoring of the evolution of the internal information, guaranteeing that:

information is always accessed only when ready
information is always consistent

3 Summary

In this book we have covered the following concepts about Macrocoder programming:

to maintain independence between the grammar and the core data model, the standard Macrocoder model recommends the usage of two separate lifesets, GRAMMAR and CORE;
the GRAMMAR lifeset, starting from the instances created by the parser, creates the initial instances in the CORE lifeset;
the action of creating the CORE instances can be spread among various grammar objects thanks to the father-first kind of phase and the upscan statement; in this way, peripheral nodes collaborate with their parents in filling the CORE objects with their initial data;
once all the CORE instances are created, they begin executing their own phases and the GRAMMAR instances are forgotten forever;
phase protection allows very strict control over which attributes are ready to be used and which, instead, may still contain incomplete data.
