Abstract
With the advent of smaller devices, a significant increase in the density of on-chip components has made congestion and overflow critical issues in VLSI physical design automation. In this paper, we present novel techniques for reducing congestion and minimizing overflows. Our methods are based on ripping up nets that go through the congested areas and replacing them with congestion-aware topologies. Our contributions can be summarized as follows. First, we present several efficient algorithms for finding congestion-aware Steiner trees, that is, trees that avoid congested areas of the chip. Next, we show that the novel technique of network coding can lead to further improvements in routability, reduction of congestion, and overflow avoidance. Finally, we present an algorithm for identifying efficient congestion-aware network coding topologies. We evaluate the performance of the proposed algorithms through extensive simulations.
1. Introduction
In almost any VLSI design flow, global routing is an essential stage that determines the signal interconnections. Therefore, the capability of the global router may significantly affect the design turn-around time. Moreover, the results of the global routing stage impact many circuit characteristics, such as power, timing, area, and signal integrity. Global routing poses major challenges in terms of the efficient computation of quality routes. In fact, most global routing problems, even special cases, tend to be NP-complete [1, 2].
With the advent of smaller devices, a significant increase in the density of on-chip components results in a larger number of nets that need to be routed, which, together with more stringent routing constraints, increases congestion and overflow. In this paper, we propose novel techniques for congestion avoidance and overflow reduction. Our methods are designed for the rip-up-and-reroute phase of the global routing stage. At this stage, all the nets have already been routed using a standard prerouting technique; however, some of the nets need to be rerouted due to high congestion and overflow. Our methods are based on ripping up nets that go through congested areas and replacing them with congestion-aware topologies. The proposed techniques facilitate even distribution of the routing load along the available routing areas. We propose efficient algorithms for finding congestion-aware Steiner trees that favor uncongested routes. In addition, we use the novel technique of network coding for further reduction of congestion and overflow avoidance.
1.1. Congestion-Aware Steiner Trees
The major goal of congestion-aware Steiner tree routing is to find a tree that connects the required set of nodes (pins of a net) while avoiding congested areas with a minimum penalty in terms of the total wirelength. In addition, the running time of the routing algorithm should scale well with the growing number of nets. These requirements pose several challenges in terms of the algorithm design. The first challenge is to select a cost function that adequately captures the local congestion conditions at the edges of the routing graph. Next, the algorithm should find a minimum cost tree within acceptable running time. Since finding a minimum cost Steiner tree is an NP-complete problem, the algorithm needs to use an approximation scheme or employ a heuristic approach. Finally, the proposed algorithm should ensure that the overall performance of the rip-up-and-reroute phase is satisfactory in terms of congestion mitigation and overflow reduction. In this paper, we evaluate several cost functions which take into account various factors such as wire density, overflow, and congestion history. We propose several efficient algorithms for Steiner tree routing and compare their performance. Our algorithms are based on known approximations for the Steiner tree problem, heuristic methods, and artificial intelligence techniques.
1.2. Network Coding
The basic idea of the network coding technique [3] is to enable the intermediate nodes to generate new signals by combining the signals arriving over their incoming wires. This is in contrast to the standard approach, in which each node can only forward or duplicate the incoming signals.
For example, consider the routing instance depicted in Figure 1(a). In this example, we need to route two nets, one connecting source $s_1$ with terminals $t_1$, $t_2$, and $t_3$ and the other connecting source $s_2$ with the same set of terminals. The underlying routing graph is represented by a grid as shown in Figure 1(a). Suppose that due to congestion, each edge of this graph has a residual capacity of one unit, that is, each edge can accommodate only a single wire. It is easy to verify that using traditional Steiner tree routing, only one net can be routed without an overflow. For example, Figure 1(b) shows a possible routing of a net that connects one of the sources with terminals $t_1$, $t_2$, and $t_3$. In contrast, Figure 1(c) shows that routing of both nets results in an overflow. In this example, the two nets transmit different signals, $a$ and $b$, over separate Steiner trees. Figure 1(d) shows that the network coding approach makes it possible to route both nets without overflows. With this approach, one of the terminals creates a new signal, $a \oplus b$, which is delivered to the other two terminals, while the signals $a$ and $b$ are delivered to those terminals directly. It is easy to verify that each terminal can decode the two original signals $a$ and $b$.
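Figure 1: (a) Routing instance with two sources and three terminals; (b) routing of a single net; (c) routing of both nets with Steiner trees, which results in an overflow; (d) routing of both nets with network coding.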
The network coding technique offers two distinct advantages. First, it has the potential to solve difficult cases that cannot be solved using traditional routing. For example, for the routing instance shown in Figure 1, traditional routing results in an overflow of value 1, whereas with the network coding technique there are no overflows. Second, the network coding technique can decrease the total wirelength. For example, in the routing instance shown in Figure 1, the total wirelength for the traditional routing solution is 8, whereas for the network coding solution, the total wirelength is 7.
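The decoding claim in the example of Figure 1 can be checked with a few lines of Python. This is only a sketch under stated assumptions: the signals are modeled as single bits, the coding operation is taken to be XOR, and the function names `encode` and `decode` are hypothetical.

```python
def encode(a: int, b: int) -> int:
    """Coding node: combine the two incoming one-bit signals with XOR."""
    return a ^ b


def decode(known: int, coded: int) -> int:
    """Recover the missing signal from one original signal and the coded one."""
    return known ^ coded


# Exhaustive check over all one-bit signal values.
for a in (0, 1):
    for b in (0, 1):
        coded = encode(a, b)
        # A terminal that receives a and the coded signal recovers b.
        assert decode(a, coded) == b
        # A terminal that receives b and the coded signal recovers a.
        assert decode(b, coded) == a
```

A terminal therefore needs only two incoming wires, one carrying an original signal and one carrying the coded signal, to obtain both original signals.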
1.3. Previous Work
In the past decades, researchers have strived to improve the performance of global routing algorithms (see, e.g., [4–6] and references therein). To handle the complexity of large-scale global routing, multilevel routing techniques were proposed in [7, 8]. The recently proposed BoxRouter [9, 10] is based on progressive integer linear programming (ILP) and rip-up-and-reroute techniques. A fast routing method is presented in [11]. Reference [12] proposes an approach based on the Lagrangian multiplier technique. An effective edge shifting technique is presented in [13]. Most of these previous works adopt the rip-up-and-reroute strategy. However, they usually reroute one path (i.e., a 2-pin connection) at a time. In contrast, our method reroutes entire multipin nets. We also propose to use network coding techniques to further reduce congestion and eliminate overflows.
1.4. Contribution
The paper makes the following contributions. First, we propose several algorithms for finding efficient congestion-aware Steiner trees. Second, we show that the novel technique of network coding can lead to further improvements in routability, reduction of congestion, and overflow minimization. Finally, we provide an algorithm for identifying efficient congestion-aware network coding topologies. We evaluate the performance of the proposed algorithms through extensive simulations.
2. Model
In this paper, we adopt the most commonly used global routing model. The routing topology is represented by a grid graph $G = (V, E)$, where each node $v \in V$ corresponds to a global routing cell (GCell) [10, 12] and each routing edge $e \in E$ corresponds to a boundary between two adjacent GCells. A set of nets $N$ is to be routed on this graph. Each net $n_i \in N$ connects a source node $s_i$ with a set of terminal nodes $T_i$. If there is a wire connection between two adjacent GCells, the wire must cross their boundary and utilize the corresponding routing edge. Each routing edge $e$ has a routing capacity $c(e)$, which determines the number of wires that can pass through this edge. We denote by $u(e)$ the number of wires that are currently using edge $e$.
2.1. Global Routing Metric
The goal of a global router is to minimize congestion. Some of the important metrics for a global router are defined as follows.
(i) Overflow
For each edge $e \in E$, the overflow of $e$ is defined as
$$\mathrm{ov}(e) = \max\bigl(0,\; u(e) - c(e)\bigr). \tag{1}$$
The maximum overflow is defined as
$$\mathrm{ov}_{\max} = \max_{e \in E} \mathrm{ov}(e). \tag{2}$$
Total overflow is defined as
$$\mathrm{ov}_{\mathrm{tot}} = \sum_{e \in E} \mathrm{ov}(e). \tag{3}$$
(ii) Wirelength
Total wirelength is defined as
$$\mathrm{wlen} = \sum_{e \in E} u(e). \tag{4}$$
(iii) Density
The density of edge $e$ is defined as
$$d(e) = \frac{u(e)}{c(e)}. \tag{5}$$
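The metrics of this section can be computed directly from per-edge usage and capacity. The following is a minimal sketch, assuming that overflow is the usage in excess of capacity, that wirelength counts one unit per wire per edge, and that density is usage divided by capacity; the function names, the dictionary layout, and the edge labels are hypothetical.

```python
from typing import Dict, Hashable

Edge = Hashable


def overflow(usage: int, capacity: int) -> int:
    """Number of wires on an edge in excess of its capacity."""
    return max(0, usage - capacity)


def routing_metrics(usage: Dict[Edge, int], capacity: Dict[Edge, int]) -> dict:
    """Compute overflow, wirelength, and density metrics over all edges."""
    ov = {e: overflow(usage.get(e, 0), capacity[e]) for e in capacity}
    return {
        "max_overflow": max(ov.values(), default=0),
        "total_overflow": sum(ov.values()),
        "wirelength": sum(usage.get(e, 0) for e in capacity),
        "density": {e: usage.get(e, 0) / capacity[e] for e in capacity},
    }


# Illustrative data: three edges with made-up capacities and usage.
capacity = {"e1": 2, "e2": 2, "e3": 1}
usage = {"e1": 3, "e2": 1, "e3": 3}
metrics = routing_metrics(usage, capacity)
```

In this illustrative instance, edges `e1` and `e3` are overflowed by 1 and 2 wires, respectively, so the maximum overflow is 2 and the total overflow is 3.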
2.2. Cost Functions
Our algorithms associate each edge in the graph with a cost function which captures its congestion and overflow. The cost of the tree is defined as the sum of the costs of all of its edges. Our goal is to identify trees that go through congested areas and replace them by Steiner trees or network coding topologies that go through areas with low congestion.
In this work, we consider several cost functions, described below.
Polynomial Cost Function
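We propose a cost assignment function in which the cost of an overflowed edge is a polynomial function of the sum of its density and overflow. Formally, our proposed cost function is defined as follows:
$$\mathrm{cost}(e) = \bigl(d(e) + \mathrm{ov}(e)\bigr)^{\alpha}, \tag{6}$$
where $d(e)$ and $\mathrm{ov}(e)$ are the density and overflow of edge $e$, and $\alpha$ is a constant which determines the relative penalty for the congested edges.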
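As an illustration of the polynomial cost assignment, the sketch below raises the sum of density and overflow to a constant power. The exact functional form, the function name `polynomial_cost`, and the default exponent of 2 are assumptions made for this example and are not values taken from the experiments.

```python
def polynomial_cost(usage: int, capacity: int, alpha: float = 2.0) -> float:
    """Edge cost as (density + overflow) ** alpha.

    alpha is a hypothetical default; larger values penalize congested
    edges more heavily.
    """
    density = usage / capacity
    ov = max(0, usage - capacity)
    return (density + ov) ** alpha


# An uncongested edge (1 wire, capacity 2) is cheap ...
cheap = polynomial_cost(1, 2)
# ... while an overflowed edge (3 wires, capacity 2) is much more expensive.
expensive = polynomial_cost(3, 2)
```

With these illustrative numbers the overflowed edge is 25 times more expensive than the uncongested one, which steers shortest path computations away from it.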
Exponential Cost Function
We use the cost assignment function proposed by [12]. With this cost assignment, the cost of an edge is an exponential function of its density
where is a constant which determines the penalty for overflowed edges.
History-Based Cost Function
This cost function assigns cost to an edge based on its congestion history [10, 12]. Specifically, each edge is associated with a parameter that specifies the number of times the edge has been overflowed during the previous iterations of the algorithm. That is, each time the edge with an overflow is used, the parameter is incremented by one. Then, the modified cost of the edge is defined as follows:
Here, is either the polynomial cost function (6) or exponential cost function (7). If the density of the edge is less than or equal to one, the parameter is initially set to zero.
Since we focus on the rerouting phase, we assume that each net has already been routed by a Steiner tree which connects its source to all of its terminals. Given this set of trees, we can determine the number of wires that use each edge and identify the set of congested nets. A net is referred to as congested if its Steiner tree has at least one edge with overflow.
We propose a two-phase solution for rerouting congested nets using congestion-aware topologies. In the first phase, we iteratively rip up each net and reroute it using a congestion-aware Steiner tree with the goal of minimizing the maximum overflow and the total overflow. In the second phase, we deal with the nets that remain congested at the end of the first phase and rip up and reroute pairs of congested nets using congestion-aware network coding topologies to further reduce congestion and minimize the number of overflows. Note that the nets considered in the second phase correspond to difficult cases, where congestion avoidance was not possible even after several attempts of ripping up and rerouting individual nets. Therefore, in the second phase, we consider pairs of congested nets for further improvement. The example given in Figure 1 shows the advantage of using network coding topologies for routing pairs of nets over using standard routing techniques that handle each net separately.
3. Congestion-Aware Steiner Trees
In this section, we present several techniques for finding congestion-aware Steiner trees. Our goal is to find Steiner trees that minimize congestion with a minimum penalty in terms of the overall wirelength. We would like to achieve better tradeoffs between congestion mitigation and total wirelength. These tradeoffs are useful for practical applications as in some cases, congestion mitigation is preferable to wirelength reduction, whereas in other cases, the wirelength reduction is of higher priority.
3.1. Previous Work on Steiner Tree Routing
The Steiner tree problem is a well-studied NP-complete problem [14]. There is a wealth of heuristic and approximate solutions that have been developed for this problem. The best-known approximation algorithm has an approximation ratio of 1.55 (i.e., the cost of the tree returned by the algorithm is less than 1.55 times the optimum) [15]. The best-known approximations require significant computation time, so we focus on computationally feasible and easy-to-implement approximation and heuristic solutions for constructing Steiner trees.
3.2. Algorithms for Finding Congestion-Aware Trees
As mentioned above, our goal is to rip up and reroute nets that use congested edges of the routing graph. For each net which has been ripped up, we need to find an alternative Steiner tree that uses uncongested routes. In this section, we describe five algorithms for finding congestion-aware Steiner trees. The first three algorithms use combinatorial techniques (see, e.g., [1, 16, 17]), while the last two are based on intelligent search techniques [18]. The performance of the algorithms is evaluated in Section 5.
Algorithm stTree1
This algorithm approximates a minimum cost Steiner tree by using a shortest path tree. A shortest path tree is a union of the shortest paths between the source of the net and each of its terminals. A shortest path tree can be identified by a single invocation of Dijkstra’s algorithm. However, the cost of the tree may be significantly higher than the optimum.
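The shortest path tree construction can be sketched as follows. This is a minimal illustration rather than the implementation evaluated in Section 5: the adjacency-dictionary graph representation, the function names, and the small example graph with made-up edge costs are all assumptions.

```python
import heapq


def dijkstra(graph, source):
    """Return the predecessor map of a shortest path tree rooted at source.

    graph[u][v] is the (congestion-aware) cost of edge (u, v).
    """
    dist, pred = {source: 0.0}, {}
    heap, tie = [(0.0, 0, source)], 1
    while heap:
        d, _, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v], pred[v] = d + w, u
                heapq.heappush(heap, (d + w, tie, v))
                tie += 1
    return pred


def st_tree1(graph, source, terminals):
    """Union of the shortest paths from the source to every terminal."""
    pred = dijkstra(graph, source)
    edges = set()
    for t in terminals:
        v = t
        while v != source:
            edges.add((pred[v], v))
            v = pred[v]
    return edges


# Hypothetical example: the direct edge s-t2 is expensive (e.g., congested).
graph = {
    "s": {"a": 1, "t2": 5},
    "a": {"s": 1, "t1": 1, "t2": 1},
    "t1": {"a": 1},
    "t2": {"a": 1, "s": 5},
}
tree = st_tree1(graph, "s", ["t1", "t2"])
```

In this example both terminals are reached through node `a`, so the expensive edge between `s` and `t2` is avoided.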
Algorithm stTree2
This algorithm constructs the tree in an iterative manner. We iteratively process the terminals of the net in increasing order of their distance from the source $s$. More specifically, we first find a shortest path $P_1$ between $s$ and the first terminal $t_1$. Then, we assign zero cost to all edges that belong to $P_1$ and find a shortest path $P_2$ between $s$ and the second terminal $t_2$ with respect to the modified costs. The idea behind this algorithm is to encourage sharing of edges between different paths. That is, if an edge belongs to $P_1$, it can be used in $P_2$ at no additional cost. In general, when finding a shortest path to terminal $t_k$, all edges that belong to the paths of previously processed terminals are assigned zero cost. This algorithm requires one invocation of Dijkstra’s algorithm per terminal, but it typically returns a lower-cost tree than Algorithm stTree1.
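The iterative construction can be sketched as follows. The sketch assumes strictly positive edge costs and orders the terminals by their shortest path cost from the source, which is one possible reading of "distance"; the function names, the undirected-edge representation via `frozenset`, and the example graph are hypothetical.

```python
import heapq


def shortest_path(graph, source, target, free_edges):
    """Dijkstra from source to target; edges in free_edges cost zero."""
    dist, pred = {source: 0.0}, {}
    heap, tie = [(0.0, 0, source)], 1
    while heap:
        d, _, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            cost = 0.0 if frozenset((u, v)) in free_edges else w
            if d + cost < dist.get(v, float("inf")):
                dist[v], pred[v] = d + cost, u
                heapq.heappush(heap, (d + cost, tie, v))
                tie += 1
    path, v = [], target
    while v != source:
        path.append(frozenset((pred[v], v)))
        v = pred[v]
    return dist[target], path


def st_tree2(graph, source, terminals):
    """Grow the tree terminal by terminal; tree edges are reused for free."""
    order = sorted(terminals,
                   key=lambda t: shortest_path(graph, source, t, set())[0])
    tree = set()
    for t in order:
        _, path = shortest_path(graph, source, t, tree)
        tree.update(path)
    return tree


# Hypothetical example with made-up edge costs.
graph = {
    "s": {"a": 2, "b": 2},
    "a": {"s": 2, "t1": 1, "t2": 1.5},
    "b": {"s": 2, "t2": 1},
    "t1": {"a": 1},
    "t2": {"a": 1.5, "b": 1},
}
tree = st_tree2(graph, "s", ["t1", "t2"])
total_cost = sum(graph[u][v] for u, v in (tuple(e) for e in tree))
```

In this example the shortest path tree would route `t2` through `b` for a total cost of 6, whereas the iterative construction reuses the edge between `s` and `a` and obtains a tree of cost 4.5.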
Algorithm stTree3
This is a standard approximation algorithm with an approximation ratio of 2 (i.e., the cost of the tree returned by the algorithm is at most twice the optimal cost). Specifically, with this algorithm, we find a shortest path between each pair of nodes in the set that consists of the source and the terminals of the net. Then, we construct a complete graph $G'$ such that each node in $G'$ corresponds to a node in this set. The weight of an edge in $G'$ is equal to the minimum length of the path between the two corresponding nodes in the routing graph $G$. The algorithm then finds a minimum spanning tree in $G'$. Next, each edge of this spanning tree is substituted by the corresponding shortest path in $G$, which results (after removing redundant edges) in a Steiner tree in $G$ that connects the source with the terminals.
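The steps of this approximation can be sketched as follows. The sketch makes several assumptions: the minimum spanning tree of the complete graph is computed with a simple version of Prim's algorithm, redundant edges are removed by keeping a breadth-first spanning tree of the expanded paths and pruning leaves that are neither the source nor a terminal, and all names and the example graph are hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Shortest path distances and predecessors from source."""
    dist, pred = {source: 0.0}, {}
    heap, tie = [(0.0, 0, source)], 1
    while heap:
        d, _, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v], pred[v] = d + w, u
                heapq.heappush(heap, (d + w, tie, v))
                tie += 1
    return dist, pred


def st_tree3(graph, source, terminals):
    required = [source] + list(terminals)
    sp = {r: dijkstra(graph, r) for r in required}
    # Prim's algorithm on the complete graph over the required nodes,
    # weighted by shortest path distance in the routing graph.
    in_tree, closure_edges = [source], []
    while len(in_tree) < len(required):
        u, v = min(((a, b) for a in in_tree for b in required if b not in in_tree),
                   key=lambda ab: sp[ab[0]][0][ab[1]])
        in_tree.append(v)
        closure_edges.append((u, v))
    # Substitute each edge of the spanning tree by its shortest path.
    edges = set()
    for u, v in closure_edges:
        pred = sp[u][1]
        while v != u:
            edges.add(frozenset((pred[v], v)))
            v = pred[v]
    # Remove redundant edges: keep a BFS spanning tree of the union ...
    adj = {}
    for e in edges:
        a, b = tuple(e)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    parent, queue = {source: None}, [source]
    for u in queue:
        for v in adj.get(u, ()):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    tree = {frozenset((v, p)) for v, p in parent.items() if p is not None}
    # ... and repeatedly prune leaves that are not required nodes.
    pruned = True
    while pruned:
        pruned = False
        degree = {}
        for e in tree:
            for x in e:
                degree[x] = degree.get(x, 0) + 1
        for e in list(tree):
            if any(degree[x] == 1 and x not in required for x in e):
                tree.discard(e)
                pruned = True
    return tree


# Hypothetical example with made-up edge costs.
graph = {
    "s": {"a": 2, "b": 2},
    "a": {"s": 2, "t1": 1, "t2": 1.5},
    "b": {"s": 2, "t2": 1},
    "t1": {"a": 1},
    "t2": {"a": 1.5, "b": 1},
}
tree = st_tree3(graph, "s", ["t1", "t2"])
```

In this example the spanning tree of the complete graph connects `s` to `t1` and `t1` to `t2`, and both expanded paths pass through node `a`, which becomes the Steiner node of the resulting tree.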
Algorithm stTree4
Algorithm stTree4 is an intelligent search-based algorithm. Our approach is inspired by Algorithm A*. Algorithm A* is a shortest path algorithm that uses a heuristic function to determine the order of visiting nodes of the graph in order to improve its running time. Specifically, for each node $v$, we define $h(v)$ to be the maximum distance between node $v$ and a terminal which has not yet been visited. The distance between $v$ and a terminal is defined as the minimum number of hops that separate them in the routing graph. Algorithm stTree4 follows the same steps as Algorithm stTree2, but it uses Algorithm A* with the heuristic function $h(v)$ to find shortest paths.
Algorithm stTree5
Algorithm stTree5 is also based on Algorithm A*. It follows the same steps as Algorithm stTree3, but it uses Algorithm A* with the same heuristic function as in Algorithm stTree4.
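The heuristic search used by the last two algorithms can be illustrated with a single-target A* search on a grid. This sketch is an assumption-laden simplification: it searches toward one terminal at a time, uses the hop (Manhattan) distance to that terminal as the heuristic, and gives every edge that is not listed in the cost dictionary a cost of 1. The function names and the example are hypothetical. Note that the hop-count heuristic guarantees an optimal path only when every edge cost is at least 1; with smaller costs it biases the search toward paths with few hops.

```python
import heapq


def grid_neighbors(node, width, height):
    """Yield the 4-connected neighbors of a grid cell."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)


def hops(u, v):
    """Minimum number of hops between two grid cells."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def a_star(width, height, cost, source, target):
    """cost maps frozenset({u, v}) to an edge cost; unlisted edges cost 1."""
    g, pred, closed = {source: 0.0}, {}, set()
    heap = [(hops(source, target), source)]
    while heap:
        _, u = heapq.heappop(heap)
        if u == target:
            break
        if u in closed:
            continue
        closed.add(u)
        for v in grid_neighbors(u, width, height):
            if v in closed:
                continue
            ng = g[u] + cost.get(frozenset((u, v)), 1.0)
            if ng < g.get(v, float("inf")):
                g[v], pred[v] = ng, u
                heapq.heappush(heap, (ng + hops(v, target), v))
    path, v = [target], target
    while v != source:
        v = pred[v]
        path.append(v)
    return path[::-1], g[target]


# Hypothetical 3x3 grid in which the two bottom-row edges are congested.
cost = {
    frozenset({(0, 0), (1, 0)}): 10.0,
    frozenset({(1, 0), (2, 0)}): 10.0,
}
path, total = a_star(3, 3, cost, (0, 0), (2, 0))
```

In this example the search detours through the middle row instead of using the two congested edges of the bottom row.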
4. Network Coding Techniques
In this section, we use network coding techniques in order to achieve further improvement in terms of minimizing congestion and reducing the number of overflows. The network coding technique makes it possible, under certain conditions, for different nets to share edges. For example, in the graph depicted in Figure 1(c), there are two minimum Steiner trees, one transmitting the signal of the first source and the other transmitting the signal of the second source. These two trees clash at the middle edge, resulting in an overflow. This conflict can be resolved by coding at a single node, which effectively allows the two trees to share certain edges. Similarly, our algorithm will identify pairs of nets that share terminals and then apply network coding techniques to reduce overflow.
4.1. Previous Work on Network Coding
The problem of routing multiple nets with shared terminals is related to the problem of establishing efficient multicast connections in communication networks. The network coding technique was proposed in a seminal paper by Ahlswede et al. [3]. It was shown in [3, 19] that the capacity of a multicast network, that is, the number of packets that can be sent simultaneously from the source node to all terminals, is equal to the minimum size of a cut that separates the source from each terminal. Li et al. [19] proved that linear network codes are sufficient for achieving the capacity of the network. In a subsequent work, Koetter and Médard [20] developed an algebraic framework for network coding. This framework was used by Ho et al. [21] to show that linear network codes can be efficiently constructed through a randomized algorithm. Jaggi et al. [22] proposed a deterministic polynomial-time algorithm for finding feasible network codes in multicast networks. An initial study of the applicability of network coding for improving the routability of VLSI designs appears in [23]. In [24], Gulati and Khatri used network coding for improving the routability of FPGAs, focusing on finding nets that are suitable for network coding. To the best of our knowledge, this is the first work that proposes efficient algorithms for finding congestion-aware network coding topologies for VLSI circuits.
4.2. Network Coding Algorithm
We proceed to present the algorithm we use for constructing congestion-aware coding networks that reduce congestion and overflow.
The algorithm includes the following steps. First, we identify the subset of nets that go through edges with overflow. Second, we identify pairs of nets in this subset that share at least three common terminals. Next, we check, for each such pair of nets, whether we can replace the Steiner trees of the two nets by a more efficient routing topology with respect to congestion and overflow.
More specifically, let $n_1$ and $n_2$ be a pair of congested nets that share at least three terminals. Let $s_1$ and $s_2$ be the source nodes of these nets, and let $T_1$ and $T_2$ be their sets of terminals. We denote the set of terminals shared by $n_1$ and $n_2$ by $T = T_1 \cap T_2$. We also denote by $T_1' = T_1 \setminus T$ the set of terminals of $n_1$ that do not belong to $T$. Similarly, we denote by $T_2' = T_2 \setminus T$ the set of terminals of $n_2$ that do not belong to $T$. Next, we find two congestion-aware Steiner trees, $\phi_1$ and $\phi_2$, that connect $s_1$ to $T_1'$ and $s_2$ to $T_2'$, respectively. These trees can be identified by one of the algorithms presented in Section 3. The number of wires that use each edge is updated after finding $\phi_1$ and $\phi_2$.
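The selection of candidate pairs and the split of their terminals into shared and private sets can be sketched with plain set operations. The data layout (a dictionary that maps a net name to its source and terminal set), the function name, and the example nets are hypothetical.

```python
from itertools import combinations


def coding_candidates(nets, congested, min_shared=3):
    """Return (net1, net2, shared, private1, private2) for every pair of
    congested nets that shares at least min_shared terminals.

    nets maps a net name to a (source, set of terminals) tuple.
    """
    result = []
    for n1, n2 in combinations(sorted(congested), 2):
        t1, t2 = nets[n1][1], nets[n2][1]
        shared = t1 & t2
        if len(shared) >= min_shared:
            result.append((n1, n2, shared, t1 - shared, t2 - shared))
    return result


# Hypothetical nets: n1 and n2 share three terminals, n3 shares only one.
nets = {
    "n1": ("s1", {"t1", "t2", "t3", "t4"}),
    "n2": ("s2", {"t1", "t2", "t3"}),
    "n3": ("s3", {"t1", "t5"}),
}
pairs = coding_candidates(nets, {"n1", "n2", "n3"})
```

Only the pair of `n1` and `n2` qualifies here; terminal `t4` is private to `n1` and would be served by a separate congestion-aware Steiner tree.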
Finally, we find a congestion-aware network coding topology $\Phi$ that connects the two sources to the common set of terminals in an iterative manner. First, we let $\Phi$ be a Steiner tree that connects the source $s_1$ of the first net with the common terminals. All edges of $\Phi$ are always assigned zero cost. We then sort the common terminals in increasing order of their distance (in the original graph) from the source $s_2$ of the second net and process them in that order. For each terminal $t$, we reverse all the edges in the path in $\Phi$ between $s_1$ and $t$ and find a shortest path $P$ between $s_2$ and $t$. Then, for each link $(v, w) \in P$ we perform the following procedure. If the reverse link $(w, v)$ exists in $\Phi$, we remove $(w, v)$ from $\Phi$; otherwise, we add $(v, w)$ to $\Phi$. A sample execution of this procedure is shown in Figure 2. It is easy to verify that the algorithm produces a feasible network coding topology, that is, a topology that ensures that for each common terminal $t$ there are two edge-disjoint paths that connect $s_1$ and $s_2$ with $t$. The formal description of the algorithm for identifying the network coding topology, referred to as Algorithm NC, is given in Algorithm 1.
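The path reversal step at the core of this procedure can be illustrated on the simplest case of a single common terminal, where the initial topology is one path from the first source to the terminal. This is not Algorithm NC itself but a sketch of one of its iterations under that simplifying assumption; the directed-arc representation, the function names, and the example graph with unit costs are hypothetical.

```python
import heapq


def dijkstra_path(arcs, source, target):
    """arcs maps (u, v) to a non-negative cost; returns the arcs of a
    shortest path from source to target."""
    adj = {}
    for (u, v), w in arcs.items():
        adj.setdefault(u, []).append((v, w))
    dist, pred = {source: 0.0}, {}
    heap, tie = [(0.0, 0, source)], 1
    while heap:
        d, _, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj.get(u, ()):
            if d + w < dist.get(v, float("inf")):
                dist[v], pred[v] = d + w, u
                heapq.heappush(heap, (d + w, tie, v))
                tie += 1
    path, v = [], target
    while v != source:
        path.append((pred[v], v))
        v = pred[v]
    return path[::-1]


def add_second_source(arcs, phi, s2, t):
    """phi is the set of arcs of a path from the first source to t."""
    residual = dict(arcs)
    for (u, v) in phi:
        residual.pop((u, v), None)  # a used arc cannot be used again ...
        residual[(v, u)] = 0.0      # ... but it can be traversed in reverse for free
    new_phi = set(phi)
    for (u, v) in dijkstra_path(residual, s2, t):
        if (v, u) in new_phi:
            new_phi.remove((v, u))  # the reverse link exists: remove it
        else:
            new_phi.add((u, v))     # otherwise add the new link
    return new_phi


# Hypothetical graph: every undirected edge has unit cost in both directions.
arcs = {}
for u, v in [("s1", "a"), ("a", "b"), ("b", "t"), ("s2", "b"), ("a", "t")]:
    arcs[(u, v)] = 1.0
    arcs[(v, u)] = 1.0
phi = {("s1", "a"), ("a", "b"), ("b", "t")}
topology = add_second_source(arcs, phi, "s2", "t")
```

In this example the initial path from `s1` blocks the only direct route from `s2` to `t`. The shortest path from `s2` traverses the link between `a` and `b` in reverse, which removes that link from the topology and leaves two edge-disjoint paths, one from `s1` through `a` and one from `s2` through `b`.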
Figure 2: Sample execution of Algorithm NC, panels (a)–(e).
After the execution of the algorithm, we determine whether the total cost of the two Steiner trees that serve the unshared terminals and the network coding topology is less than the total cost of the original Steiner trees of the two nets. If there is a reduction in terms of cost, the two original Steiner trees are replaced by the new topologies.
Our experimental results, presented in Section 5, show that the number of coding opportunities is relatively small. However, by applying the network coding technique on a limited number of nets, we can achieve a significant reduction in the number of overflows. Also, since the network coding technique is applied to a limited number of nets, the overhead in terms of the number of additional required gates is relatively small.
5. Performance Evaluation
We have evaluated the performance of our algorithms using the ISPD98 routing benchmarks [25]. All the experiments are performed on a 3.2 GHz Intel Xeon dual-core machine. In all experiments, we first run the Steiner tree tool Flute [26] in order to determine the initial routing of all nets in the benchmark. Next, we perform an iterative procedure, referred to as Phase 1, which processes each net with overflows and checks whether an alternative Steiner tree of lower cost and with a smaller number of overflows exists; if so, it rips up the existing tree and replaces it with the alternative one. This phase uses one of the algorithms described in Section 3. Phase 1 terminates when four consecutive iterations yield the same cost and the same number of overflows, indicating that further reduction in the number of overflows is unlikely.
Next, we check whether the application of the network coding technique can further reduce the number of overflows. This phase is referred to as Phase 2. We first identify pairs of nets that have overflowed edges and share at least three terminals. We then apply Algorithm NC, presented in Section 4, to find an alternative network coding topology and perform rip-up and reroute if such a topology is beneficial in terms of reducing congestion and eliminating overflows.
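The termination rule of Phase 1 can be sketched as a small driver loop. The callback `reroute_pass`, which is assumed to perform one rip-up-and-reroute pass over all congested nets and return the resulting cost and number of overflows, the iteration cap, and the sample sequence of results are all hypothetical.

```python
def run_phase1(reroute_pass, max_iters=100, patience=4):
    """Repeat rerouting passes until `patience` consecutive passes report
    the same (cost, overflows) pair, or until max_iters passes have run."""
    history = []
    for _ in range(max_iters):
        history.append(reroute_pass())
        if len(history) >= patience and len(set(history[-patience:])) == 1:
            break
    return history


# Made-up sequence of (cost, overflows) results; the loop stops at the
# fourth consecutive identical result and never requests the last entry.
results = iter([(10, 5), (8, 3), (7, 2), (7, 2), (7, 2), (7, 2), (6, 1)])
history = run_phase1(lambda: next(results))
```

The cap on the number of passes is a safeguard added for this sketch; the stopping criterion described in the text is only the repetition of the same cost and number of overflows.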
The experimental results are shown in Figures 3(a)–3(c). The figures present average performance over all ten benchmarks. The cost function for this set of experiments was set according to (6). We have observed that larger values of the penalty constant in (6) yield fewer overflows but result in larger running times and increased wirelength. We also note that for Phase 1, Algorithm stTree3 shows the best performance in terms of reducing the total number of overflows as well as reducing the maximum overflow. In fact, Algorithm stTree3 eliminates all overflows in all benchmarks, except for ibm4, as given in Table 1. We also note that Algorithms stTree4 and stTree5 yield Steiner trees with a smaller total wirelength. This is due to the fact that the intelligent search algorithms favor paths that have a small hop count.
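Figure 3: Experimental results averaged over all ten benchmarks, panels (a)–(c), and running times of the algorithms, panel (d).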
We observe that the network coding technique results in a considerable reduction of both the total number of overflows and the maximum overflow. Furthermore, for each pair of nets for which we perform network coding, the number of required gates is small. Moreover, in all cases that we have encountered, the network coding operations can be performed over the finite field GF(2), that is, each encoding node can be implemented with a single XOR gate. Such gates incur minimum overhead, because they can serve as buffers for long wires. An example of how network coding can help in reducing the wirelength of two nets in the ibm1 benchmark is shown in Figure 4.
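Figure 4: Example of network coding reducing the wirelength of two nets in the ibm1 benchmark, panels (a) and (b).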
Figure 3(d) compares the running times of the different Algorithms for Phase 1 and Algorithm NC for Phase 2. As expected, Algorithm is one of the fastest algorithms, whereas Algorithm and have running times comparable to Algorithm . Moreover, Algorithms and are faster than their counterparts (Algorithms and , resp.). This is due to the fact that intelligent search methods speed up the search by preferring nodes closer to the destination.
In the second set of experiments, we evaluated the performance of the three cost functions described in Section 2.2 on the ISPD98 benchmarks using Algorithm stTree3 for Phase 1 and Algorithm NC for Phase 2. Each cost function was used with a fixed value of its constant; for the history-based cost function given by (8), the underlying cost function was the polynomial one. The results are shown in Table 2. The polynomial cost function showed the best performance in terms of overflows.
In another set of experiments, we worked on slightly modified ISPD98 benchmark files. The modification included reducing the vertical and horizontal capacity by one unit. For Phase 1, we used Algorithm stTree3 and then applied Algorithm NC in Phase 2. The polynomial cost function given by (6) was used to check the performance on these more congested cases. The results are given in Table 3.
We have also conducted the same experiment on the output of MaizeRouter [13] for several selected benchmarks. In these experiments, we iteratively reduced the vertical and horizontal capacity of the ISPD98 benchmarks until MaizeRouter produced overflows. Then, we used the output of MaizeRouter as input to Phase 1 using Algorithm stTree3, and after that, we applied Algorithm NC in Phase 2. The cost function used was the polynomial one. The results are shown in Table 4. The results demonstrate that Algorithm stTree3 combined with Algorithm NC performs well and can contribute to further reduction of the number of overflows.
6. Conclusions
In this paper, we presented several efficient techniques for the rip-up-and-reroute stage of the global routing process. Our techniques are based on ripping up nets that go through highly congested areas and rerouting them using efficient Steiner tree algorithms. We have considered several efficient Steiner tree algorithms as well as several cost functions that take into account congestion and overflow. We have also studied the application of the novel technique of network coding and showed that it can efficiently handle difficult routing cases, facilitating reduction in the number of overflows.
Acknowledgment
A preliminary version of this paper appeared in the proceedings of the 2009 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) [27]. This work was supported in part by NSF grant CNS-0954153.