Coding for Reuse Course - Module 3 Lesson 2
Executing sub-programs and decision making

In this lesson we will study how to execute a sub-program, that is, a sequence of instructions outside of the current sequence, and then return to the point of origin. This ability is critical to completing our microprocessor project. We will then study how our microprocessor makes decisions and implement the decision-making instructions.

License Terms  

The source and/or object code (subsequently referred to as code) presented in this and subsequent lessons is the property of the owner, who retains copyright protection. This code in whole or in part is licensed only for single copy per user, non-commercial use; except in the case of educational institutions, where a license of single copy per student, only as part of course curricula, is permitted.

Donald Gerard Polak (owner)

Executing sub-programs  

Sub-programs are the backbone of modern computing. They allow a common sequence of code to be shared by several different sequences and/or programs. For example, operating system calls are all sub-programs.

The ability to execute a sub-program and return to the point of origin is not unique to processors. In fact, many programmable logic controllers (PLC) and microcontrollers share this ability. However, without this ability the microprocessor would have a difficult, if not impossible, task of implementing the functions that set it apart.

In the 8085x architecture, this ability is primarily invoked by the 'CALL' and 'RET' pair of instructions. The CALL instruction is very similar to the unconditional JMP instruction, with a couple of differences. The CALL instruction requires a six-cycle instruction fetch, so from the bus cycle compatibility perspective we must request the additional two-cycle bus delay before beginning execution. The CALL instruction must then request the address to which it must vector. It first requests the lower address byte from PC + 1 (PC was already incremented on the instruction fetch) and moves PC to temp. When the first byte arrives from memory, it is placed in the lower byte of the PC. Another memory fetch is then requested from the address held in temp. Once the request is acknowledged, the temp register is incremented again. The byte from memory is placed into the upper byte of the PC. The SP is then decremented and the upper byte of the temp register is written to the address in the SP. The SP is decremented again and the lower byte of the temp register is written to the address in the SP. In this way, the return address is 'pushed' onto the top of the stack and code execution vectors to the subroutine address.
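The CALL sequence above can be followed in a small behavioral model. This is only a sketch in Python, not the course's Verilog: the Cpu class, its field names, and the flat 64 KiB memory are assumptions made for illustration, and bus timing (the delay cycles and acknowledgments) is not modeled.

```python
class Cpu:
    """Hypothetical minimal machine state: memory, PC, SP and the temp register."""
    def __init__(self):
        self.mem = bytearray(0x10000)  # flat 64 KiB memory (assumption)
        self.pc = 0
        self.sp = 0
        self.temp = 0


def call(cpu):
    """Model CALL. On entry, PC already points at the low address byte."""
    cpu.temp = cpu.pc                         # move PC to temp
    low = cpu.mem[cpu.temp]                   # first read: low address byte
    cpu.temp = (cpu.temp + 1) & 0xFFFF
    cpu.pc = (cpu.pc & 0xFF00) | low          # low byte of PC loaded
    high = cpu.mem[cpu.temp]                  # second read uses temp as address
    cpu.temp = (cpu.temp + 1) & 0xFFFF        # temp now holds the return address
    cpu.pc = (high << 8) | (cpu.pc & 0x00FF)  # high byte of PC loaded
    cpu.sp = (cpu.sp - 1) & 0xFFFF
    cpu.mem[cpu.sp] = cpu.temp >> 8           # push upper byte of return address
    cpu.sp = (cpu.sp - 1) & 0xFFFF
    cpu.mem[cpu.sp] = cpu.temp & 0xFF         # push lower byte of return address
```

For example, with a CALL 1234h opcode at address 0100h, PC is 0101h on entry; after call() the PC holds 1234h and the two bytes at the new top of stack hold the return address 0103h, low byte first.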

The 'RET' instruction is roughly equivalent to 'POP PC'. Again, it requires a six-cycle instruction fetch phase, so we must request a two-cycle bus delay. We then request a read with SP as address source. When the read request is acknowledged, we increment the SP. The read data is directed to the lower 8 bits of the program counter. When that transaction completes, another read request is posted with SP as address source. Upon acknowledgment, the SP is incremented again and the read data is directed to the upper 8 bits of the program counter.
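The RET sequence can be sketched the same way. This is a behavioral illustration in Python rather than the project's Verilog; the function name ret and the use of a plain bytearray for memory are assumptions, and the bus delay and acknowledgment handshakes are left out.

```python
def ret(mem, sp):
    """Model RET as 'POP PC'. Returns the new (pc, sp) pair."""
    low = mem[sp]               # first read: SP as address source
    sp = (sp + 1) & 0xFFFF      # SP incremented once the read is acknowledged
    high = mem[sp]              # second read: SP as address source again
    sp = (sp + 1) & 0xFFFF      # SP incremented a second time
    return (high << 8) | low, sp


# Usage: a return address of 0103h was pushed earlier at 1FFEh/1FFFh.
memory = bytearray(0x10000)
memory[0x1FFE] = 0x03
memory[0x1FFF] = 0x01
new_pc, new_sp = ret(memory, 0x1FFE)
```

The low byte comes off the stack first because CALL pushed it last.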

You might start asking the 'why' question about now, since there are many alternatives to this complex scenario. The answer is simple, yet at the same time relatively complex. Earlier MMUs (memory management units) had the ability to recognize instructions and begin address translations based upon the instruction and bus cycle. This was initially true of stack reference instructions. The stack, rather than using up valuable addressing space, was kept in a separate area of physical memory. When the instruction was recognized as a stack reference instruction, the MMU could then add bits to the address to reference a different address in physical memory only during the actual memory cycle. The same held true for code space. In order to expand code space, these processors and the MMU used an 'overlay' concept. An overlay is an area of virtual memory (i.e. address space) that references code in one part of physical memory at one time and another part at other times. In fact, this concept is used today with most modern processors. For instance, most user programs are essentially overlays.
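The idea of adding address bits only during stack cycles can be illustrated with a toy translation function. This is purely a sketch: the bank bit position, the function name translate, and the is_stack_cycle input are hypothetical and do not describe any particular MMU.

```python
STACK_BANK_BIT = 1 << 16  # hypothetical extra physical address bit


def translate(addr16, is_stack_cycle):
    """Map a 16-bit processor address to a wider physical address."""
    addr16 &= 0xFFFF
    if is_stack_cycle:
        # Stack references land in a separate bank of physical memory.
        return addr16 | STACK_BANK_BIT
    return addr16
```

With this mapping, the same 16-bit address reaches different physical locations depending on whether the bus cycle belongs to a stack reference.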

Using the 'CALL' and 'RET' instructions you may execute a sub-program and return to the point of origin. In this implementation we also encoded the 'HLSP' and 'XTHL' instructions from the Module 2, Lesson 6 Exercise.

The HLSP instruction should be a no-brainer at this point. It requires an entry in the first instruction decode set to point to the move operation. The only thing that sets it apart is that both the instruction entry and the move parameter are conditional on the S8085D architecture, so the appropriate '`ifdef' statements must precede those entries, and the '`endif' statement must be used to complete each entry.

The XTHL instruction takes a little additional effort, since we removed the differentiator bits from our state machine. The XTHL instruction requires two reads from, and two writes to, the stack. The temp register is needed to hold the current contents of the HL register until they are written to the stack. In order for the 'sel16' signal to be activated, the stack read requests must use both source bytes and both destination bytes. In order to comply with these constraints and not use any differentiator bits, the first read request is posted with SP source and MEM destination. The XTHL instruction executes as follows:

  1. A read request is posted with SP as address source (MEM as destination).

  2. When the bus request is acknowledged, the contents of HL are moved to TEMP.

  3. When edck is received, memory data is directed to L on the next clock cycle.

  4. A write request is posted with SP as address source (memory destination).

  5. TEMP lower is posted as write data and held until dack is received.

  6. A read request with SP + 1 as address source (memory destination) is posted.

  7. When edck is received, memory data is directed to H on the next clock cycle.

  8. A write request is posted with SP + 1 as address source (memory destination).

  9. TEMP upper is posted as write data and held until dack is received.

This sequence will exchange the contents of HL with the two bytes on top of the stack.
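The nine steps above can be traced in a short behavioral sketch. It is written in Python for illustration only; the function name xthl and the bytearray memory are assumptions, and the request, acknowledge, and clock handshakes are collapsed into plain reads and writes.

```python
def xthl(mem, sp, hl):
    """Exchange HL with the two bytes on top of the stack. Returns the new HL."""
    temp = hl & 0xFFFF              # step 2: HL is moved to TEMP
    new_l = mem[sp]                 # steps 1 and 3: read at SP, data directed to L
    mem[sp] = temp & 0xFF           # steps 4 and 5: TEMP lower written at SP
    sp1 = (sp + 1) & 0xFFFF
    new_h = mem[sp1]                # steps 6 and 7: read at SP + 1, data directed to H
    mem[sp1] = temp >> 8            # steps 8 and 9: TEMP upper written at SP + 1
    return (new_h << 8) | new_l


# Usage: the stack top holds 0103h and HL holds ABCDh.
memory = bytearray(0x10000)
memory[0x1FFE] = 0x03
memory[0x1FFF] = 0x01
new_hl = xthl(memory, 0x1FFE, 0xABCD)
```

Note that SP itself is unchanged; only the memory at SP and SP + 1 and the HL register are modified.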

For both of these instructions to work, the source select block requires some modifications. Up until now, the temp register was used as a sixteen-bit entity. Now we need to treat it as two eight-bit entities. This means the upper and lower bytes require separate enables, and the temp lower mux needs to be able to select between the upper and lower bytes of the temp register.
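The change to the temp register can be pictured with a small model. This is a hedged sketch in Python, not the source select block itself; the class name TempReg, the enable arguments, and the sel_upper mux control are hypothetical names chosen for illustration.

```python
class TempReg:
    """Temp register treated as two eight-bit halves with separate enables."""
    def __init__(self):
        self.upper = 0
        self.lower = 0

    def load(self, value16, en_upper, en_lower):
        # Each half is written only when its own enable is asserted.
        if en_upper:
            self.upper = (value16 >> 8) & 0xFF
        if en_lower:
            self.lower = value16 & 0xFF

    def lower_mux(self, sel_upper):
        # The lower-byte source path can present either half of temp.
        return self.upper if sel_upper else self.lower
```

With separate enables, a single byte can be captured without disturbing the other half, and the mux lets either half be written out as an eight-bit value, as XTHL needs when it writes TEMP lower and then TEMP upper.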

Project Skeleton with HLSP, XTHL, CALL, and RET instructions implemented.

Of course we will have to update our test bench in order to check the added instructions. We also will check the JMP and OUT instructions. Note that the definitions file must now be updated to define S8085D in order to run the HLSP instruction.

New Test Bench

When we run the simulation we come up with the following results:

GTKWAVE Display of the CALL Instruction

GTKWAVE Display of the JMP Instruction

GTKWAVE Display of the OUT Instruction

GTKWAVE Display of the XTHL Instruction

GTKWAVE Display of the HLSP and RET Instructions

Decision making  

Processors make decisions based on flags set by previous instructions. Flags can be thought of as artifacts of the instruction results. In most cases, the decision is made on the condition of one particular flag. The decision making requires the processor to take one course of action if the condition is met, or another if the condition is not met.

Decision making is usually based upon existing instructions. In the case of the 8085x architecture, the JMP, CALL, and RET instructions exist in both unconditional and conditional forms. The conditional versions must replicate many of the bus cycles of the unconditional versions.
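A conditional jump illustrates how the conditional form follows the unconditional one. The sketch below is a Python illustration under assumptions: the function name jump_if is hypothetical, and the choice to simply step over the two operand bytes when the condition fails (rather than model the exact bus cycles) is a simplification.

```python
def jump_if(mem, pc, condition_met):
    """Model a conditional JMP. On entry, PC points at the low address byte."""
    if condition_met:
        # Same operand reads as the unconditional JMP.
        low = mem[pc]
        high = mem[(pc + 1) & 0xFFFF]
        return (high << 8) | low
    # Condition not met: continue with the instruction after the operands.
    return (pc + 2) & 0xFFFF
```

Either way the two address bytes are accounted for, so the next instruction fetch starts at a valid opcode.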

In order to minimize microcode states, our model simplifies the decision making process. Every instruction generates a specific flag mask (in flags_r). Conditional instructions use this flag mask in an entirely different way. The flag mask in conditional instructions is used to select which flag to check. The nf flag is used to select whether to check for a set or cleared condition. The results of this check are presented as a single signal called 'match' to the microcode state machine. Therefore, even though the specific instructions must appear in the state decode tables, the net result is only two or three additional states.
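The generation of 'match' can be sketched as a mask-and-compare. This is a Python illustration, not the project's logic: the flag bit positions follow the standard 8085 PSW layout, while passing nf as a separate boolean (rather than as a bit position inside the mask) and the function name match are assumptions made to keep the example small.

```python
# Standard 8085 PSW bit positions for the documented flags.
FLAG_S = 0x80
FLAG_Z = 0x40
FLAG_AC = 0x10
FLAG_P = 0x04
FLAG_CY = 0x01


def match(flags, fmask, nf):
    """Return True when the flag selected by fmask is in the state being tested.

    nf False checks for a set flag; nf True checks for a cleared flag.
    """
    selected_is_set = (flags & fmask) != 0
    return selected_is_set != nf


# Usage: with only ZF set, a JZ-style test matches and a JNZ-style test does not.
jz_taken = match(FLAG_Z, FLAG_Z, False)
jnz_taken = match(FLAG_Z, FLAG_Z, True)
```

Because every conditional instruction reduces to this one signal, the state machine only needs to branch on match rather than on each individual flag.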

During our initial simulation, using the techniques discussed in Module 3, Lesson 1, we found and corrected errors with the CMC and STC instructions, the uf flag, and the generation of match when the fmask has the nf flag position set. This activity resulted in the following code:

Project Skeleton with decision making instructions implemented.

Of course we will want to simulate the new instructions. We need to check both the true and false states for the condition tested. That means we have to update our test bench again in order to check the added instructions.

Updated Test Bench

Here are some of the results of the new simulation:

GTKWAVE Display of the JZ and JNZ Instructions with ZF Clear.

GTKWAVE Display of the CNC and CC Instructions with CF Set.

GTKWAVE Display of the RZ and RNC Instructions with ZF clear and CF Set.

GTKWAVE Display of the JNUI and JUI Instructions after an INXr Instruction with Overflow.

GTKWAVE Display of CMC, RC and RNC Instructions with CF Initially Set.

Even though these results look good, there are some items left on our to-do list. First of all, we still need to check the operation of our flags during the POPPSW instruction. We also discovered this article while researching the operation of the UF flag; therefore, we may need to revisit that. These issues will be tackled in parallel with further development of our project in subsequent lessons.

  1. How can we think of flags?

  2. Draw the state transition diagram for our implementation of the XTHL instruction.

  3. What instructions in the 8085x architecture do we primarily use to execute a sub-program and return to the point of origin? Why would we want to?

  4. What is decision making usually based on?

  5. In our architecture, how is the flag mask used during decision making?

  6. Are decision making instructions entirely unique or can they share states with other instructions? Explain.