The RASSP Digest - Vol. 2, 4th. Qtr. 1995


A RASSP Approach to HW/SW Codesign

by Vijay K. Madisetti and James A. Debardelaben


Abstract

At least six different hardware/software codesign methodologies have been proposed for rapid prototyping in the past few years. Some of these describe the various process steps without providing specifics for implementation. Others focus more on implementation issues without explicitly considering methodology and process flow. We propose a new industry-driven rapid prototyping codesign methodology for signal processing applications which uses parametric cost estimation tools to minimize development cost, while maximizing product profits. Mathematical programming formulations are used to model the architecture selection and partitioning process steps. Our approach, as part of ARPA’s RASSP program, utilizes a hardware-less VHDL cosimulation and co-verification methodology for rapid prototyping which supports solutions for high performance signal processing applications.

1. Introduction

The design challenge for large signal processing systems is to prototype implementations that meet high-throughput requirements, while satisfying stringent physical constraints and market pressures. The current practice design methodology, shown in Figure 1, is plagued with long prototyping times, high cost of design, in-cycle silicon fabrication and test, and limited architectural exploration.

Current rapid prototyping design methodologies have overlooked an important characteristic of software prototyping. Various parametric studies based on historical project data show that software is difficult to design and test if “slack” margins for hardware CPU and memory resources are overly restrictive. In systems in which most hardware is simply COTS parts, the time and cost of software prototyping and design can dominate the schedule and budget. If physical constraints permit, the hardware platform can be relaxed to achieve significant reductions in overall development cost and time [AN95][ME95]. This principle of software prototyping is illustrated in Figure 2.

Conventional design methodologies that optimize hardware efficiency (ASICs plus COTS hardware) do not necessarily optimize the hardware plus software development cost and product profits. Market research performed by Synopsys, Inc. has shown that the demand and potential profits for a new HW/SW product can be modeled by a triangular window of opportunity [LIU95].

This characterization of the market demand is displayed in Figure 3. In order to maximize profits, the product must be on the market by the start of the demand window. Any product release delays after the demand has begun cause a steady loss in potential revenue.
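The revenue effect of a late release can be sketched numerically. The following is a minimal sketch, assuming one common simplified form of the triangular window: demand rises with slope M for W time units and then falls symmetrically, and a product released D time units late ramps up at the same slope from its release date, peaks at time W, and declines to zero at 2W. The function names and sample numbers are illustrative assumptions and are not taken from [LIU95].

```python
def on_time_revenue(slope, half_life):
    """Area of the full triangular window (base 2W, peak M*W)."""
    return slope * half_life ** 2


def delayed_revenue(slope, half_life, delay):
    """Area captured by a product released `delay` time units late."""
    if not 0 <= delay <= half_life:
        raise ValueError("this sketch only covers 0 <= delay <= half_life")
    peak = slope * (half_life - delay)          # lower peak, reached at time W
    return 0.5 * (2 * half_life - delay) * peak  # triangle from D to 2W


def lost_revenue(slope, half_life, delay):
    """Revenue given up because of the delay: M * D * (3W - D) / 2."""
    return on_time_revenue(slope, half_life) - delayed_revenue(slope, half_life, delay)


def lost_fraction(half_life, delay):
    """Share of the on-time revenue that is lost; independent of the slope."""
    return delay * (3 * half_life - delay) / (2 * half_life ** 2)


if __name__ == "__main__":
    # Hypothetical numbers: 52-week product lifetime (W = 26), 4-week slip.
    print(f"{lost_fraction(26, 4):.1%} of potential revenue lost")
```

Under these assumptions the loss grows nearly linearly for small delays, which matches the "steady loss" described above.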

To effectively address these design challenges, a unified approach that considers the efficiency of both hardware and software options is required. This approach, called HW/SW codesign, attempts to merge the hardware and software design efforts into a combined methodology that improves cycle time and quality while enhancing the exploration of the HW/SW design space.

In this paper, we present a new industry-driven codesign methodology that uses HW/SW cost and development time estimation models to drive the design process. This methodology utilizes a hardware-less, library-based cosimulation and co-verification approach for rapid prototyping. Hierarchical design verification is performed through the use of virtual prototypes [ME95], that is, VHDL software models of the hardware executing a representation of the application code. A block diagram of the methodology is shown in Figure 4. This research focuses mainly on the conceptual prototyping stage of the methodology. This prototyping process uses cost estimation models as well as performance estimation models to facilitate system-level design exploration early in the design cycle. We model the architecture selection and system partitioning process steps using mathematical programming formulations so that commercial mathematical programming packages can be used to solve these problems.

2. Survey of Previous Research

The codesign problem has been addressed in recent studies and point experiments by Thomas et al. [TA93], Kumar et al. [KA93], Gupta et al. [GD93], Kalavade et al. [KL93], and Ismail et al. [IJ95]. A detailed taxonomy of HW/SW codesign was presented by Gajski et al. [GV95]. In the taxonomy, the authors describe the desired features of a codesign methodology and show how existing tools and methods try to implement these features. However, the authors do not propose a method for implementing their process steps. The features and limitations of the latter approaches are illustrated in Table 1. In the table, we show how these approaches compare to our new industry-driven approach with respect to desired attributes of a codesign methodology. Previous approaches lack automated architecture selection tools, economic models that take advantage of the software prototyping principle illustrated in Figure 2, and the integrated development of test benches throughout the design cycle. Very few approaches even allow for true HW/SW cosimulation, in which application code executes on a simulated version of the target hardware platform.

3. Proposed Approach

In this section, we present models that can be used to facilitate the automation of the conceptual prototyping process steps, which include HW/SW partitioning, cost/schedule-driven architecture selection, and software partitioning. VHDL performance models are used to verify the conceptual prototypes before they are passed on to the virtual prototyping stage. If a conceptual prototype does not meet specifications, updated model parameters are back-annotated to previous process steps and the conceptual prototyping process is repeated [DM95].

3.1 Quantitative HW/SW Partitioning

A software-oriented approach is used to perform HW/SW partitioning [EH93]. Assuming that the signal processing algorithm is depicted in the form of a dataflow graph (DFG) G = (V, E), the first step in the codesign methodology is to partition the DFG tasks (the vertices in V) into hardware and software sets, H and S, respectively. Initially, all tasks whose throughput requirement exceeds the throughput capability of the programmable processor with the highest peak performance in the processor set are mapped to H, and the rest are mapped to S. When no feasible implementation for a given HW/SW partition can be found, the software task with the highest throughput requirement is moved to H.
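The partitioning rule described above can be sketched in a few lines. This is a minimal sketch, assuming tasks are identified by name and that throughput requirements and processor peak throughputs are given in the same units; the function names and the sample task set are hypothetical.

```python
def initial_partition(required, processor_peaks):
    """Map every task that no processor can sustain to hardware, the rest to software.

    required        -- dict: task name -> throughput requirement
    processor_peaks -- peak throughputs of the candidate programmable processors
    """
    fastest = max(processor_peaks)
    hardware = {task for task, need in required.items() if need > fastest}
    software = set(required) - hardware
    return hardware, software


def move_heaviest_to_hardware(required, hardware, software):
    """Fallback step: move the most demanding software task into hardware."""
    if not software:
        raise ValueError("no software task left to move")
    heaviest = max(software, key=required.get)
    return hardware | {heaviest}, software - {heaviest}


if __name__ == "__main__":
    # Hypothetical task set and processor library.
    required = {"fir": 300, "fft": 120, "detect": 40, "control": 5}
    hw, sw = initial_partition(required, processor_peaks=[80, 200])
    print(sorted(hw), sorted(sw))
    hw, sw = move_heaviest_to_hardware(required, hw, sw)
    print(sorted(hw), sorted(sw))
```

The fallback step would be invoked each time architecture selection fails to find a feasible implementation for the current partition.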

3.2 Cost/Schedule-Driven Architectural Selection

The next step in the design process involves selecting an architecture to implement the tasks partitioned into the sets H and S. We assume a target architecture consisting of N programmable processors and K dedicated hardware elements connected over a bus. Each processor communicates with other functional elements connected to the bus via shared memory. The goal of architectural selection in this context is to determine the number and type of programmable processors (i860, SHARC, etc.), the memory capacity, the type of bus (VME64, FB+, etc.), and the type of dedicated hardware (ASIC, FPGA, etc.) which optimally meet design and economic objectives.

The architecture selection problem can be modeled as a constrained optimization problem. Software development cost and time estimates, derived from the embedded mode intermediate COCOMO model [BO81] used by REVIC [AN95], are used to drive the optimization problem. The software development cost and time model equations are shown in (1) and (2), respectively, where L denotes the number of delivered source instructions (in thousands), C denotes the software cost per man-month, and the two effort adjustment factors account for the execution time margin and the storage margin described in [BO81][AN95]. These equations assume that the remaining adjustment factors described in [BO81] are set to their nominal values of 1.00. The recommended values for the two factors are shown in Table 2. Linear interpolation is used to determine adjustment factor values for utilizations between the given data points.
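The cost and time estimates can be illustrated with a short sketch. Equations (1) and (2) and Table 2 are not reproduced here, so the code below assumes Boehm's published embedded-mode intermediate COCOMO coefficients (effort = 2.8 L^1.20, schedule = 2.5 effort^0.32) and his execution time and main storage multipliers; the coefficients calibrated in REVIC may differ. All function and variable names are hypothetical.

```python
# Assumed (utilization, multiplier) points from Boehm's COCOMO 81 tables.
TIME_FACTORS = [(0.50, 1.00), (0.70, 1.11), (0.85, 1.30), (0.95, 1.66)]
STOR_FACTORS = [(0.50, 1.00), (0.70, 1.06), (0.85, 1.21), (0.95, 1.56)]


def adjustment_factor(utilization, table):
    """Linearly interpolate an effort adjustment factor between table points."""
    if utilization <= table[0][0]:
        return table[0][1]                      # nominal at or below 50 percent
    for (u0, f0), (u1, f1) in zip(table, table[1:]):
        if utilization <= u1:
            return f0 + (f1 - f0) * (utilization - u0) / (u1 - u0)
    raise ValueError("utilization above the last tabulated point")


def software_effort(ksloc, time_util=0.5, stor_util=0.5):
    """Effort in man-months for `ksloc` thousand delivered source instructions."""
    return (2.8 * ksloc ** 1.20
            * adjustment_factor(time_util, TIME_FACTORS)
            * adjustment_factor(stor_util, STOR_FACTORS))


def software_cost(ksloc, cost_per_man_month, time_util=0.5, stor_util=0.5):
    """Development cost: cost per man-month times estimated effort."""
    return cost_per_man_month * software_effort(ksloc, time_util, stor_util)


def software_schedule(ksloc, time_util=0.5, stor_util=0.5):
    """Development time in months."""
    return 2.5 * software_effort(ksloc, time_util, stor_util) ** 0.32


if __name__ == "__main__":
    for util in (0.5, 0.7, 0.9):
        print(util, round(software_effort(10, time_util=util), 1),
              round(software_schedule(10, time_util=util), 1))
```

Running the example shows how effort and schedule grow as the execution time margin shrinks, which is the effect the architecture selection models trade against hardware cost.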

Some example programming models that can be used to perform architecture selection are shown in Table 3. The nonlinear mixed-integer programming model shown in (3) is used to determine the number and type of programmable processors in the architecture. This processor model seeks a processor architecture that balances the software development cost against the hardware production costs and maximizes potential product profits subject to performance and form factor constraints. The main storage constraint effort adjustment factor is assumed to be 1.00. The important decision variables of the model are: u, the overall processor utilization; c, the software development cost; the number of processors of each type i; the amount of time after the demand starts that the product is released; and a variable that determines the utilization interval shown in Table 2. Other important model parameters include: H, the hardware production cost of processor i; the start time of the demand curve; M, the slope of the demand curve; the aggregate throughput required by the tasks assigned to software; the peak throughput of processor i; the area of processor i; the maximum area that can be occupied by the processors in the architecture; the power dissipated by processor i; the maximum power that can be dissipated by the processor architecture; and W, a constant equal to half the product lifetime, as shown in Figure 3.
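The trade-off captured by the processor model can be explored on small instances without a solver. The sketch below swaps the mathematical programming formulation for exhaustive enumeration and assumes a particular objective, since model (3) is not reproduced here: software cost plus hardware production cost plus revenue lost to a late release, with the loss taken as M D (3W - D) / 2 for a delay D. The COCOMO coefficients and adjustment factors are Boehm's published values, and every name and number is hypothetical.

```python
from itertools import product

# Assumed execution time adjustment factors (utilization, multiplier).
TIME_FACTORS = [(0.50, 1.00), (0.70, 1.11), (0.85, 1.30), (0.95, 1.66)]


def time_factor(u):
    """Piecewise-linear factor; None when utilization exceeds the table."""
    if u <= TIME_FACTORS[0][0]:
        return TIME_FACTORS[0][1]
    for (u0, f0), (u1, f1) in zip(TIME_FACTORS, TIME_FACTORS[1:]):
        if u <= u1:
            return f0 + (f1 - f0) * (u - u0) / (u1 - u0)
    return None


def select_processors(procs, required_tp, ksloc, cost_per_mm, slope, half_life,
                      months_until_demand, max_area, max_power, max_count=4):
    """Return (total cost, processor counts) of the cheapest feasible mix."""
    best = None
    for counts in product(range(max_count + 1), repeat=len(procs)):
        peak = sum(n * p["peak"] for n, p in zip(counts, procs))
        area = sum(n * p["area"] for n, p in zip(counts, procs))
        power = sum(n * p["power"] for n, p in zip(counts, procs))
        if peak <= 0 or area > max_area or power > max_power:
            continue                              # form factor violated
        factor = time_factor(required_tp / peak)  # utilization u
        if factor is None:
            continue                              # too little slack
        effort = 2.8 * ksloc ** 1.20 * factor     # man-months
        delay = max(0.0, 2.5 * effort ** 0.32 - months_until_demand)
        if delay > half_life:
            continue                              # outside the sketch's range
        lost = slope * delay * (3 * half_life - delay) / 2
        hardware = sum(n * p["cost"] for n, p in zip(counts, procs))
        total = cost_per_mm * effort + hardware + lost
        if best is None or total < best[0]:
            best = (total, counts)
    return best


if __name__ == "__main__":
    procs = [{"peak": 100, "cost": 10_000, "area": 1, "power": 1}]
    print(select_processors(procs, required_tp=150, ksloc=10,
                            cost_per_mm=10_000, slope=1000, half_life=24,
                            months_until_demand=12, max_area=10, max_power=10))
```

In this example three processors are chosen over two: the extra hardware costs less than the software effort added by running at 75 percent utilization. Enumeration only scales to small processor libraries, which is why a mathematical programming package is preferable for realistic problems.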

A similar mathematical programming model, shown in (5), is used to select the memory capacity. The memory model uses the same objective as the processor model to determine memory capacity. The execution time effort adjustment factor is assumed to be 1.00. Decision variables and parameters unique to this model include: y, the memory capacity; the aggregate storage required by the software tasks; u, the memory utilization; A, the area per unit of storage; the maximum allowable area for the memory architecture; P, the power dissipation per unit of storage; the maximum power that can be dissipated by the memory architecture; and K, the production cost per unit of storage. Nonlinear programming packages such as MINOS, GAMS, and GINO can be used to solve problems of this form.

The 0-1 nonlinear programming model shown in (4) is used to choose the type of dedicated hardware to implement each hardware task. The objective of (4) is to minimize the combined hardware development and production costs and maximize product profits while satisfying task throughput constraints. The decision variable is 1 if hardware task i is assigned to hardware implementation j and 0 otherwise. The model parameters include: the development cost of hardware implementation j of task i; the production cost of hardware implementation j of task i; M, the slope of the demand curve; the amount of time after the start of the market window that the product will be released due to the development time of hardware implementation j of task i; the peak throughput of hardware implementation j of task i; the required throughput of hardware task i; and W, a constant equal to half the product lifetime, as shown in Figure 3.
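For a handful of hardware tasks, the dedicated hardware choice can be illustrated by enumerating every assignment. This is a sketch under stated assumptions rather than model (4) itself: the product release delay is taken as the largest delay among the chosen implementations (implementations are assumed to be developed in parallel), and the revenue loss for a delay D is taken as M D (3W - D) / 2. The task names, costs, and throughputs are hypothetical.

```python
from itertools import product


def select_hardware(tasks, slope, half_life):
    """Return (total cost, {task: implementation kind}) or None if infeasible.

    tasks -- dict: task name -> {"required": throughput, "options": [impl, ...]}
             each impl has keys kind, dev, prod, delay, peak
    """
    names = list(tasks)
    best = None
    for combo in product(*(tasks[name]["options"] for name in names)):
        # Throughput constraint: every chosen implementation must keep up.
        if any(impl["peak"] < tasks[name]["required"]
               for name, impl in zip(names, combo)):
            continue
        delay = max(impl["delay"] for impl in combo)   # assumed: parallel development
        if delay > half_life:
            continue
        lost = slope * delay * (3 * half_life - delay) / 2
        total = sum(impl["dev"] + impl["prod"] for impl in combo) + lost
        if best is None or total < best[0]:
            best = (total, {name: impl["kind"] for name, impl in zip(names, combo)})
    return best


# Hypothetical implementation library and task set.
ASIC = {"kind": "ASIC", "dev": 500_000, "prod": 50_000, "delay": 6, "peak": 400}
FPGA = {"kind": "FPGA", "dev": 100_000, "prod": 120_000, "delay": 1, "peak": 150}
TASKS = {
    "fft": {"required": 120, "options": [ASIC, FPGA]},
    "filter": {"required": 300, "options": [ASIC, FPGA]},
}

if __name__ == "__main__":
    print(select_hardware(TASKS, slope=1000, half_life=24))
```

Here the filter task forces an ASIC because the FPGA cannot meet its throughput, while the FFT task falls to the cheaper FPGA since the ASIC already sets the release delay.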

The 0-1 linear programming model shown in (6) is used to choose the type of bus architecture [MD95]. The objective of (6) is simply to minimize interconnect cost subject to interconnect speed constraints. The decision variable is 1 if bus type i is chosen and 0 otherwise. The model parameters are the bandwidth of bus type i and the required aggregate communication speed for the architecture.
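Because exactly one bus is chosen, the bus selection problem reduces to picking the cheapest bus whose bandwidth meets the requirement. A minimal sketch follows; the bus names, costs, and bandwidths are placeholders rather than data for any real bus standard.

```python
def select_bus(buses, required_bandwidth):
    """Return the cheapest bus that satisfies the bandwidth constraint, or None."""
    feasible = [bus for bus in buses if bus["bandwidth"] >= required_bandwidth]
    if not feasible:
        return None
    return min(feasible, key=lambda bus: bus["cost"])


# Placeholder bus library (arbitrary units).
BUSES = [
    {"name": "bus_a", "bandwidth": 40, "cost": 200},
    {"name": "bus_b", "bandwidth": 80, "cost": 500},
    {"name": "bus_c", "bandwidth": 160, "cost": 1200},
]

if __name__ == "__main__":
    print(select_bus(BUSES, required_bandwidth=60))
```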

3.3 Software Partitioning

After the system architecture has been selected, the software tasks must be assigned to specific programmable processors to produce a complete HW/SW architecture. This problem can also be formulated as a 0-1 programming model, as shown in (7). The model attempts to find an optimal partition that minimizes interprocessor communication while satisfying load balancing constraints. The decision variable is 1 if software task i is assigned to processor j and 0 otherwise. The remaining parameters are the lower and upper bounds on the utilization of processor j, the required throughput of software task i, and the peak throughput of processor j. The quadratic objective function can be reformulated as a linear function using linearization techniques found in the literature [OK92], thereby allowing integer linear programming packages such as MINTO to be used to solve the problem.
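The structure of the software partitioning problem can be seen in a brute-force sketch for a few tasks. It enumerates every task-to-processor assignment instead of solving the linearized 0-1 program, and it assumes a single pair of utilization bounds shared by all processors, whereas the model allows bounds per processor. Task names, throughputs, and communication weights are hypothetical.

```python
from itertools import product


def partition_software(task_tp, proc_peak, comm, lower, upper):
    """Return (cut weight, {task: processor index}) or None if infeasible.

    task_tp   -- dict: task name -> required throughput
    proc_peak -- list of processor peak throughputs
    comm      -- dict: (task_a, task_b) -> communication volume on that edge
    lower, upper -- utilization bounds applied to every processor
    """
    tasks = list(task_tp)
    best = None
    for assignment in product(range(len(proc_peak)), repeat=len(tasks)):
        where = dict(zip(tasks, assignment))
        load = [0.0] * len(proc_peak)
        for task, proc in where.items():
            load[proc] += task_tp[task]
        # Load balancing: every processor's utilization must stay in bounds.
        if any(not lower <= load[j] / proc_peak[j] <= upper
               for j in range(len(proc_peak))):
            continue
        # Only edges whose endpoints sit on different processors cost anything.
        cut = sum(w for (a, b), w in comm.items() if where[a] != where[b])
        if best is None or cut < best[0]:
            best = (cut, where)
    return best


if __name__ == "__main__":
    task_tp = {"a": 30, "b": 30, "c": 30, "d": 30}
    comm = {("a", "b"): 10, ("b", "c"): 1, ("c", "d"): 10}
    print(partition_software(task_tp, [100, 100], comm, lower=0.4, upper=0.8))
```

The objective is quadratic in the 0-1 variables because an edge contributes only when one endpoint is on one processor and the other endpoint is on a different one, which is the term the linearization in [OK92] removes.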

4. Conclusions

Our model of the conceptual prototyping process quantitatively describes the objectives and constraints involved in HW/SW codesign. Although these models show promise, there is still much more research to be done. The software development cost and time estimates have 20% and 70% error margins, respectively, for a fairly wide range of applications. However, these estimates can be improved via calibration to a specific organization’s cost functions and objectives. We need to perform more case studies using our codesign process in order to calibrate the cost models to signal processing applications. In addition, we will continue to search for efficient model formulations and look for ways of linearizing the nonlinear processor and memory models.

Interestingly, while the rapid prototyping community has largely ignored rigorous integer programming methods in favor of “quick” simplified heuristics, other industries (e.g., AT&T in communications and airline reservation systems) routinely use optimization algorithms with tens of thousands of variables or more. The authors feel that complex nonlinear and multiobjective functions cannot be optimized via “human-in-the-optimization-loop” methods, and that any extra effort spent in the conceptual phase of the design process is time well spent.

References

[DM95] J. Debardelaben, V. Madisetti, “Hardware/Software Codesign for Signal Processing Systems - A Survey and New Results,” Proc. of the 29th Annual Asilomar Conference on Signals, Systems, and Computers, Nov. 1995.

[TA93] D. Thomas, J. Adams, H. Schmit, “A Model and Methodology for Hardware-Software Codesign,” IEEE Design & Test of Computers, September 1993, pp. 6-15.

[KA93] S. Kumar, J. Aylor, B. Johnson, W. Wulf, “A Framework for Hardware/Software Codesign,” Computer, December 1993, pp. 39-45.

[GD93] R. Gupta, G. De Micheli, “Hardware-Software Cosynthesis for Digital Systems,” IEEE Design & Test of Computers, September 1993.

[KL93] A. Kalavade, E. Lee, “A Hardware-Software Codesign Methodology for DSP Applications,” IEEE Design & Test of Computers, September 1993, pp. 16-28.

[KL94] A. Kalavade, E. Lee, “A Global Criticality/Local Phase Driven Algorithm for the Constrained Hardware/Software Partitioning Problem,” Proc. of the Third International Workshop on Hardware/Software Codesign, Sept. 1994.

[IJ95] T. Ismail, A. Jerraya, “Synthesis Steps and Design Models for Codesign,” Computer, February 1995, pp. 44-52.

[GV95] D. Gajski, F. Vahid, “Specification and Design of Embedded Hardware-Software Systems,” IEEE Design & Test of Computers, Spring 1995.

[ME95] V. Madisetti, T. Egolf, “Virtual Prototyping of Embedded Microcontroller-Based DSP Systems,” IEEE Micro, Oct. 1995.

[AN95] J. Anderson, “Projecting RASSP Benefits,” Proc. Second Annual RASSP Conf., ARPA, US Dept. of Defense, Arlington, Va., 1995.

[OK92] M. Oral, O. Kettani, “Reformulating Nonlinear Combinatorial Optimization Problems for Higher Computational Efficiency,” European J. Operational Research, Vol. 58, No. 2, 1992, pp. 236-249.

[BO81] B. Boehm, Software Engineering Economics, Prentice-Hall, Inc., Englewood Cliffs, NJ, 1981.

[MA95] V. Madisetti, “Rapid Prototyping of Application-Specific Signal Processors: Current Practice, Challenges, and Roadmap,” Proceedings of IEEE ICPP’95 Workshop on Challenges in Parallel Processing, Aug. 1995.

[LIU95] J. Liu, “Detailed model shows FPGAs’ true costs,” EDN, May 11, 1995, pp. 153-158.

[EH93] R. Ernst, J. Henkel, T. Benner, “Hardware-Software Cosynthesis for Microcontrollers,” IEEE Design & Test of Computers, Dec. 1993.

[MD95] V. Madisetti, VLSI Digital Signal Processors, IEEE Press, 1995.

Vijay K. Madisetti
School of Electrical & Computer Engineering,
Georgia Tech.
Atlanta, GA 30332-0250
vkm@ee.gatech.edu

James A. Debardelaben
School of Electrical & Computer Engineering,
Georgia Tech.
Atlanta, GA 30332-0250
jd@ee.gatech.edu


