Coding for Reuse Course - Module 1 Lesson 2
Architectural Development and Design Partitioning

In this lesson we will study how to develop a project architecture and partition that architecture so that it can subsequently be coded for reuse.

Architectural Development

This is probably one of the most important steps in the entire design process. Architectural development imposes global constraints upon the entire project and all of its subsequent realizations.

DO NOT EVEN CONSIDER A TARGET DEVICE OR EVEN DEVICE FAMILY GOING INTO THIS STEP!!!! I know that one device vendor probably gave you a nicer golf shirt than the other one did at that last presentation, but when you begin pinpointing specific devices or families prior to this point, you become constrained by the vendor instead of the actual customer requirements. You can also become a prime candidate for the device evolution lottery. The development and introduction process takes a finite amount of time, and I have seen numerous projects delayed or even cancelled because an essential device was made obsolete before the project became generally available to customers. If you targeted a specific device or device family, what will happen to your project?

When performing this step, do not limit yourself to current customer requirements, but plan for future evolution. Unless you are designing to a fixed industry standard, those requirements will probably change during the development process.

Although partitioning may occur in iterative phases on a sub-project level, the initial project architecture, when properly executed, should remain essentially intact. If not, all I can say is "God help you."

1. Establishing and Partitioning a Project Architecture
    
1.1 General
    

One of the most frequent inefficiencies encountered when developing an architecture, or even circuitry, for a project is duplicating a previous version of that project or of similar projects. Such duplication serves as an artificial constraint, stifles innovation, and may even propagate previously undiscovered, but still serious, errors.

I have seen a circuit, duplicated in many similar designs, that mistook an output, meant to drive an optional Pierce oscillator, for an input. Since the output was ancillary, and therefore caused no immediate, measurable failures, the circuit was considered standard. The error was causing increased power consumption and noise susceptibility, decreased performance due to die heating, and eventual device failures.

Your target device may also vary significantly from the original in terms of type and availability of resources. Older devices used 2.5 micron or larger lithography and had limited, point-to-point metal layers for routing. As a consequence, most featured higher fan-in to individual cells, and those cells were the largest contributors to pin-to-pad delays. Modern devices in general production feature lithographies as small as 20 nanometers and additional metal layers with programmable access. These devices typically have lower fan-in to individual cells, and the largest contribution to pin-to-pad delays comes from routing.

This is not to say that previous versions or similar devices should not be considered when developing a project. I only suggest that these factors should be taken as guidance, rather than as a blueprint for your project.

When developing an architecture, best practice also dictates taking into account the characteristics of the target range of devices even though, at an architectural level, this is a very high-level view.

The architecture that you develop should not only satisfy the customer requirements at project inception, but anticipate what those needs will be at project completion. It should also incorporate your ideas for further utilization, expansion, and enhancements to your project. In other words, do not think only of the immediate requirements, but also of future project evolution. The best expression I have seen of this approach was a poster (I don't recall the source) that said:

  • "Do not undertake vast projects with half vast solutions".

Providing the support infrastructure necessary for project evolution during this phase of development will minimize the impact of "scope growth." Scope growth normally originates from changes in customer requirements during the project development cycle and, if unanticipated, can be a project killer. Unmitigated scope growth can be best summarized by the expression (origin unknown; paraphrased):

  • "The purpose of any good employee should be to anticipate any potential problems, develop solutions for those problems, and be prepared to execute those solutions when called upon.... However, when you're up to your a** in alligators it is difficult to remember that your original objective was to drain the swamp."

    
1.2 Isolating Preliminary Functional Blocks
    

The best approach to architectural development is to consider the project from a functional perspective, using guidance from similar devices, and then look for major functional divisions. Do not begin deciding how a block would perform its function. Build a preliminary, high-level block diagram based upon these divisions. A paper copy could be used for this block diagram; however, I strongly suggest a soft version, since the block diagram is often used in other documents such as design specifications and functional descriptions. Graphical design software, or a word processor with built-in drawing capabilities, that can group drawing elements and text objects works well for the electronic version, particularly since blocks may be frequently rearranged.

Connect the external (from a project perspective) signals that must originate from these preliminary blocks, including bidirectional signals, but do not interconnect the blocks or add input signals just yet. These originating or bidirectional signals may be either generic (when creating a new device) or specific (when replacing or enhancing an existing device).

Next, refine the block diagram by examining the signals that must originate from these major blocks and their relation to one another.

Further divide your block diagram by examining the prospective functional blocks at a high level for compliance with Rule 3.11. A good way of doing this is by writing a description of each block. If more than two or three 'ands' or 'alsos' appear in this description, the block is too big and requires further partitioning.

Socialize your work and get input from your peers.

By examining our project intellectual property, specific output signals, and other processors in general, we can see the following major functional blocks:

  1. A reset block that controls initialization and generates RESET OUT. In our project, this is a virtual rather than a physical block, since RESET OUT can be generated by the I/O and the power reset function is normally controlled by synthesis directives.

  2. A timing block which controls synchronization and originates SYSCLK and X2.

  3. An interrupt block which controls the recognition and prioritization of interrupts and originates -INTA.

  4. A bus interface block which controls the flow of data into or out of the processor. It originates A[15:8], AD[7:0], ALE, S0, S1, IO/-M, -RD, -WR, and HLDA.

  5. A serial port that terminates SID and originates SOD.

  6. A group of registers (referred to as a register file) that does not originate any output signals.

  7. An arithmetic/logic unit that modifies data going to the register file or the external bus. It does not originate any output signals.

  8. An instruction decoder/sequencer that controls instruction execution. It does not originate any output signals.

Preliminary Block Diagram

    

1.3 Develop the Interconnection Topology

    
  1. Identify those signals that can originate both externally and internally, or from multiple sources. Because of rule 3.7, we cannot add internal tristate drivers for these signals and, instead, must add an additional block to select the appropriate signal source. Connect the originating signals to this block and any additional signal or signal group(s) from this block to the termination point(s). This step becomes iterative, since adding additional blocks may introduce requirements for more selection blocks.

    In our microprocessor project, we can see that data going into the arithmetic/logical unit may originate from either the register file block or the AD input. Therefore, we must add a source selection block. For most operations, the arithmetic/logical unit requires two separate sources. Data going to the register file or AD bus can originate from this source selection block or from the arithmetic/logical block, therefore we must add a destination data selection block.

  2. Working from your initial block diagram, identify and add those external (from project perspective) input signals that each block requires to perform its suggested function. These signals may or may not fan into multiple blocks.

  3. Identify and add those derivative, single point-of-origin, single-point of destination signals or groups of signals that (from a functional viewpoint) must logically originate in one particular block and terminate in another.

    In our microprocessor project, we can see that current flags must originate in the register file and terminate in the arithmetic/logical unit for modification or comparison.

  4. Identify and add those input signals or groups of signals that (from a functional perspective) must logically emanate from one particular block and fan into multiple blocks.

    This becomes a grey area. In our project, for instance, modified flags in all but one instruction must originate in the arithmetic/logical unit and terminate in either the instruction decode and sequencing block or the register file. The sole exception is the 'POPPSW' instruction, where modified flags can originate from either the arithmetic/logical unit or the AD bus. In these types of cases, you can either add an additional selection block, or take advantage of existing resources. Since the source selection block already makes the AD bus available to the arithmetic/logical unit, we will take advantage of that resource to form a single bus, although we may revisit this decision at a later time.

  5. Optimize known bus sizes for your target range of devices. Certain bussed signals, for instance address and data, have known widths or steps of widths. Older devices, with limited routing resources, sometimes employed multiplexed busses to minimize routing. In modern and programmable devices, multiplexed busses can consume resources and even introduce additional routing constraints. In general, I prefer to size busses according to the largest data width. But there is a trade-off: extremely wide busses with high fanout can consume valuable, impedance-controlled 'long lines'. Therefore, a mix of wider bus widths and bus multiplexing should be considered in these cases. Conditional compiles with both versions could then be used to evaluate the best solution.

    For our project, we can see that address and data internally use the same busses. Furthermore, we can see that all addresses and the results of some operations are 16 bits wide. Even though the data width for the majority of operations is only 8 bits, we must design our source and destination busses to accommodate the larger, 16-bit width.

  6. Re-examine your completed block diagram to ensure that your proposed architecture will support your customer's immediate and foreseen needs. Be sure to add descriptions of any added blocks; then, whenever possible, have your customer and peers also examine the proposal. Do not try to 'hard sell' your proposal. Let it stand or fall on its own merits and make the necessary adjustments. Be prepared enough, however, to negotiate any scope growth at this point. If you followed the principles outlined in Section 1.1, you can delay or revector some of the most onerous "blue sky" proposals.
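
The source selection block from step 1, sized per step 5, could be sketched in Verilog roughly as follows. This is only an illustration under stated assumptions: the module name, port names, and select polarity are hypothetical and are not taken from the project itself. It shows a multiplexer standing in for internal tristate drivers, with the bus width set by a parameter so that the 16-bit case drives the sizing.

```verilog
// Hypothetical source selection block: a multiplexer chooses the data
// source instead of internal tristate drivers. All names are assumptions.
module src_select #(
    parameter WIDTH = 16                 // sized for the widest value carried (16-bit addresses)
) (
    input  wire [WIDTH-1:0] reg_data,    // data from the register file
    input  wire [7:0]       ad_in,       // data from the external AD bus, via the I/O block
    input  wire             sel_ad,      // 1 = select the AD bus, 0 = select the register file
    output wire [WIDTH-1:0] src_data     // selected source, toward the arithmetic/logical unit
);
    // Unsigned assignment zero-extends the 8-bit external data to the bus width.
    wire [WIDTH-1:0] ad_ext = ad_in;

    assign src_data = sel_ad ? ad_ext : reg_data;
endmodule
```

Because the width is a parameter, a narrower or wider instance can be elaborated for comparison without editing the module body.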

By applying the principles of section 1.1, section 1.2, and section 1.3 we can arrive at the following preliminary architecture for our project (note that some signals are not named in this diagram simply to improve readability on low resolution displays):

Completed Block Diagram

2. Adapting Your Project Architecture to an Intellectual Property
    
2.1 Additional Blocks
    

Because of rule 3.4 we see that I/Os must be isolated into a single, top-level block. We can also see from rules 3.5 and 3.6 that the top levels of different hierarchical functions must be connected to each other. In our project this connection is accomplished within the I/O block; however, you can instead keep the I/O block separate and add an interconnect block. This allows the I/O block to be easily swapped out. Changeable I/O is desirable, since the I/O architecture of an embedded intellectual property can vary significantly from the I/O block needed when that same IP is used for a stand-alone application. This also allows the synthesis tool to readily remove unused functions by tying the inputs used only by that function to a static value (0 or 1).

Many device vendor synthesis tools can infer global clocks. In some cases, they must be tied to a buffer in the I/O module. I generally apply those buffers without any associated compiler directives, since many ASICs will require them. However, applying a buffer to an input signal may affect the synthesis of certain vendor topologies or devices, so you may end up adding compiler directives or conditional statements later to support these vendor devices or topologies.

Also note that bi-directional signals are split where they leave and enter the I/O module. For instance, in our microprocessor IP, the AD bus has an ad_in signal set going out of the I/O module to the connected modules, and an ad_out signal set entering the I/O module from the connected modules. Because of this separation, we can remove ad_in from the bus interface, since the bus interface will never directly use it.
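
A minimal sketch of this split might look like the fragment below. It is written as a stand-alone module only so that it is self-contained. The names ad_in and ad_out follow the text, while ad_pad, ad_oe, the module name, and the enable polarity are assumptions made for illustration.

```verilog
// Hypothetical fragment of the I/O module for the bidirectional AD bus.
// ad_pad, ad_oe, and the module name are assumed names.
module io_ad (
    inout  wire [7:0] ad_pad,   // bidirectional device pins
    input  wire [7:0] ad_out,   // data from the connected modules, toward the pins
    input  wire       ad_oe,    // 1 = drive the pins (polarity assumed)
    output wire [7:0] ad_in     // data from the pins, toward the connected modules
);
    // The tristate driver lives in the I/O module; internal logic sees
    // only the unidirectional ad_in and ad_out signal sets.
    assign ad_pad = ad_oe ? ad_out : 8'bz;
    assign ad_in  = ad_pad;
endmodule
```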

Whenever possible, the control and signal sense of all I/O buffers should be the same. In other words, all I/O tristate buffers should be either bufif0 or bufif1, but not a mix of the two. They should be non-inverting in function. When this is not possible, as in the case of the buffer for rst_out_pad in our project, it is better to invert the state of the control or input signal than to change the buffer (in our example we will invert rst_in_n to generate rst_out_pad rather than inserting an inverting buffer).
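
As a rough sketch of this guideline, the fragment below uses only non-inverting bufif1 primitives for the tristate control outputs and inverts the reset signal in logic ahead of a plain non-inverting buffer. Apart from rst_in_n and rst_out_pad, which come from the text, every name here (including the choice of rd_n and wr_n as example outputs and the ctl_oe enable) is hypothetical.

```verilog
// Sketch of a consistent buffer sense in an I/O module.
// Names other than rst_in_n and rst_out_pad are assumptions.
module io_ctl_buffers (
    input  wire rst_in_n,      // active-low reset input
    input  wire rd_n,          // active-low read strobe from the bus interface
    input  wire wr_n,          // active-low write strobe from the bus interface
    input  wire ctl_oe,        // 1 = drive the control pins (polarity assumed)
    output wire rst_out_pad,   // active-high reset output pin
    output wire rd_n_pad,
    output wire wr_n_pad
);
    // Every tristate buffer is a non-inverting bufif1 with the same enable sense.
    bufif1 u_rd_buf (rd_n_pad, rd_n, ctl_oe);
    bufif1 u_wr_buf (wr_n_pad, wr_n, ctl_oe);

    // The reset output needs the opposite polarity, so the signal is inverted
    // in logic and the pad buffer itself stays non-inverting.
    wire rst_out = ~rst_in_n;
    buf u_rst_buf (rst_out_pad, rst_out);
endmodule
```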

    
2.2 Developing a Project Skeleton
    

The project skeleton is the first version of your Verilog code. It consists of a completed I/O module and black box modules for each of your physical architectural blocks, which will serve either as complete functions or as the top level modules in function hierarchies. Connect the known signals to these blocks. Additional signals can easily be inserted and unused signals removed as your project evolves.

The first project evolution can begin while you are creating the project skeleton. While you are writing the Verilog code, project the block functional requirements and add any additional signal resource dependencies or unused signals that you have not identified during your block diagram development. Add or remove these signals to/from your functional block, project hierarchy, and block diagram. The block diagram that I presented earlier incorporated this type of evolution.
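
To illustrate the general form a skeleton takes, here is a deliberately small sketch covering only the timing block and the serial port. It is not the project skeleton itself: the module names, port names, and pad suffixes are assumptions, and the reset output is simplified to a plain inversion of the reset input. The black box modules carry port lists only, and the I/O module owns the pads and connects the blocks.

```verilog
// Black box for the timing block: ports only, body to be filled in later.
module timing (
    input  wire x1,        // oscillator input
    input  wire rst_n,     // active-low reset
    output wire sysclk,    // system clock originated by this block
    output wire x2         // oscillator output
);
endmodule

// Black box for the serial port: terminates SID, originates SOD.
module serial_port (
    input  wire sysclk,
    input  wire rst_n,
    input  wire sid,
    output wire sod
);
endmodule

// I/O module: owns every pad and interconnects the black boxes.
module io (
    input  wire x1_pad,
    input  wire rst_in_n_pad,
    input  wire sid_pad,
    output wire x2_pad,
    output wire sysclk_pad,
    output wire sod_pad,
    output wire rst_out_pad
);
    wire sysclk;
    wire x2;
    wire sod;

    assign x2_pad      = x2;
    assign sysclk_pad  = sysclk;
    assign sod_pad     = sod;
    assign rst_out_pad = ~rst_in_n_pad;   // simplified: invert the active-low reset input

    timing u_timing (
        .x1     (x1_pad),
        .rst_n  (rst_in_n_pad),
        .sysclk (sysclk),
        .x2     (x2)
    );

    serial_port u_serial_port (
        .sysclk (sysclk),
        .rst_n  (rst_in_n_pad),
        .sid    (sid_pad),
        .sod    (sod)
    );
endmodule
```

Signals discovered later can be added to a black box port list and to the matching instance in one pass, which keeps the code and the block diagram in step.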

My project skeleton for our microprocessor appears as:

Skeleton

3. Exercise
 
  1. Pick a multi-functional device datasheet or description. Peripheral chips, such as the 8254 timer/counter or the 8251 USART, are good for this particular exercise.

  2. Read any available reviews for your device. Ask your peers for ideas; for instance, one of your peers might wish for an HDLC datalink controller built into the 8251 USART. Add enhancements based upon those reviews. Write a product specification from these inputs.

  3. Develop a preliminary block diagram for your enhanced project. Write descriptions of each block.

  4. Based upon your initial architecture, attach the known or anticipated I/O to each block. Add internal block connections and busses.

  5. Compare your block diagram with the vendor's block diagram if possible.

  6. Write a skeleton project for your block diagram. Add any additional signals identified during this process to your block diagram.