A Study On Border Gateway Protocol Computer Science

Essay add: 26-02-2016, 13:02   /   Views: 21

The Internet is divided into domains, also called Autonomous Systems. Each AS is usually administered by a single entity, such as a company or an ISP. The protocol currently deployed to distribute routing information between domains is the Border Gateway Protocol (BGP). In BGP, external BGP (eBGP) sessions are established to exchange routes with neighbouring AS's. BGP routes are distributed inside an AS by means of internal BGP (iBGP) sessions. A BGP route is composed of a prefix, a Next-Hop (NH), and a set of attributes. The attributes are used in the BGP decision process. The NH is the address of a router at the border of the domain. This router is able to farther forward traffic toward the destinations belonging to the prefix. Initially, on iBGP sessions routers were only allowed to advertise routes that were received on eBGP sessions. Thus, re-distributing BGP routes to all the routers of an AS required to set-up a full-mesh of iBGP sessions in the AS. This leads to scalability issues in ASs with hundreds of routers. Today, the trend is to use Route Reflectors (RR) in large ASs. RR re-advertise routes learned from one iBGP peer to another iBGP peer without having to form adjacency with each other and there for reduces the number of iBGP sessions to be configured.

However, Route Reflection has a negative impact on the diversity of routes available in the network. In large networks, route reflection contributes to a situation where each BGP router lacks complete network reachability information about external networks. This is an important issue as routers may not be able to quickly swap to an alternate route in case of a route failure. There for when a route become failure, BGP routers lose reachability to the destination and have to wait for BGP to converge inside the AS prior to join the destination again. BGP convergence may take a few tens of seconds depending on the value of BGP timers and on number of routes that fail. If BGP routers had route diversity, the switch over to an alternate route would take less than a second [1].

Literature Review

The BGP slow convergence was a highly discussed area and many researchers have suggested different enhancements for improving BGP convergence speed. In [2], Labovitz et al. says that recovery from a failure impacting inter-domain routes takes three minutes in average. This research work examines the latency in Internet path failure, failover and repair due to the convergence properties of inter-domain routing. This experiment has shown that inter-domain routers in the packet switched Internet may take tens of minutes to reach a consistent view of the network topology after a fault. These delays occurred due to temporary routing table oscillations formed during the operation of BGP path selection process on Internet Backbone routers. During these periods of delayed convergence, the Internet routes will experience intermittent loss of connectivity, as well as increased packet loss and latency. This research work was carried out with time duration of two years by conducting experiments on key portions of the Internet infrastructure. The results represented several properties of convergence and duration of failures like multi-homed failover averages about three minutes and might trigger oscillations lasting as long as fifteen minutes. Further it has also shown that these delays will grow with respect to the number of AS's in the internet. This experiment was done by injecting over 250,000 routing faults into geographically and topologically diverse peering sessions with five major commercial internet service providers and measured the impact of these faults through both end-to-end measurements and by logging ISP backbone routing table changes.

In another research work [3] by Wang et al. showed that routing changes happened following a failure contribute towards significant end-to-end packet loss. This study has proved that end-to-end Internet path performance degradation is correlated with routing dynamics. This work focuses on factors such as topological properties, routing policies and iBGP configurations that affect the routing performance degradation. Extensive measurement has been done that involves controlled routing updates through two tier-1 ISP's and active probes of a diverse set of end-to-end paths on the Internet to replicate the impact of routing changes on data plane performance. This study highlights that routing changes contribute to significant packet loss and also shows that common routing policies, iBGP configurations of ISP's and MRAI timer value can directly affect the end-to-end path performance during routing changes. This experiment has been carried out using becon prefixes with regulated routing changes that actively probe a host in every one hour. Using data plane performance metrics like Packet loss, Packet Delay and Out-of-order packets are used to measure the impact of routing events on end-to-end performance.

Another research work [4] proposes technique to improve BGP convergence time. This study is based on the fact that BGP takes substantial amount of time and messages to converge and stabilize following the failure of some nodes in the Internet. This paper suggests a minor modification to BGP that eliminates the convergence problem pointed out and ease the communication complexity of BGP by facilitating a way to propagate node/link failure information faster, while route update information to propagate somewhat slower. This is achieved in BGP by allowing withdrawal messages to propagate with no delay as fast as the network forwards them, while announcements propagate as they do in BGP with a delay at each node of one minRouteAdver, where minRouteAdver is the amount of time BGP enforces between the sending of consecutive announcements from one router to its neighbours. Its default value is 30 secs. The convergence latency is due to link failure information traverse the network recursively. In other words, following the failure of a destination network or links towards destination, there are some incorrect information (route withdrawal update) floating around the network for a long period of time. During this period some routers may relay on this false information and propagate it towards other networks. This false information has significant negative impact on BGP convergence both in the case of fail-over and fail-down. The improvement was carried out using a flushing algorithm that quickly removes the withdrawal information based on the AS_PATH traversed.

In research work [5] which also aimed at improving BGP convergence time, has presented a new mechanism for improving the convergence properties of path vector algorithms, such as BGP. Using the route's path vector information, two consistency assertions have developed for path vector routing algorithms that are used to compare similar routes and identify infeasible routes. This work makes use of the information provided in the AS path to define route consistency assertions and used these assertions to identify infeasible routes. Two apply these assertions in BGP, mechanisms to signal failure, policy withdrawal and traffic engineering are provided. This approach was implemented and deployed in a BGP test bed and evaluated using simulation. In the network test bed, these enhancements improved BGP convergence time for a failure withdrawal from 30.3 seconds to 0.3 second and the convergence time after a route change improved from 64.9 seconds to 0.1 second. In simulation tests with a 60 AS network topology, the convergence time after a failure withdrawal improved from 3337.0 seconds to 19.5 seconds and the convergence time after a route failover improved from 471.2 seconds to 93.9 seconds.

Another study [6] highlights the root cause of BGP slow convergence as the path exploration which is not addressed effectively within the existing BGP framework. The path exploration phenomenon is inherent in all path vector protocols. The main reason for path exploration is the dependency among paths propagated through the network. Addressing this problem in BGP is significantly difficult as the AS paths exchanged between BGP routers are summarized. This research work explains why existing BGP framework cannot effectively counter path exploration and propose a mechanism called “Forward edge sequence numbers” to mark the AS paths with additional path dependency information. Then these researchers have also developed an enhanced path vector algorithm called EPIC which is intended to limit path exploration and facilitates faster convergence. This research work has demonstrated that EPIC was able to achieve significant improvement in routing convergence when compared to BGP and other solutions.

However, research work called “R-BGP - Staying Connected In a Connected World” [7] claims that reducing BGP convergence speed is not sufficient to ensure the level of reliability required by delay sensitive applications. This study insists on the fact that when internet links status become intermittent, the dynamics of BGP may cause of several minutes of packet loss. This loss occurs even there exists multiple paths between sender and receiver AS's. The objective of this work is to ensure that Internet AS's stay connected as long as the underlying network is connected and the proposed solution called “R-BGP” works by pre-computing a couple of strategically chosen failover paths. R-BGP guarantees that a domain will not get disconnected from any destination as long as it has a policy complaint path to that destination after convergence. This can be implemented using a few practical and simple modification to BGP, and requires to announce only one path per neighbour. “R-BGP” guarantees that AS learns redundancy paths from its neighbouring AS and these paths are not distributed with the classical implementation of BGP. The redundancy paths are used if the primary routes advertised by BGP fail. R-BGP reduces the transient traffic loss inside an AS caused by link failures from 22% for edge links and 14% for core links down to zero percent. In “R-BGP”, instead of trying to reduce the period of convergence it focus on protecting data forwarding. It isolates the data plane from any harmful impact that convergence might cause. When BGP is waiting to converge to the preferred route, “R-BGP” set the data plane to forward packets on pre-computed fail-over paths. Its fail-over design addresses two important factors: “Ensuring Low Overhead” and “Guaranteeing Continuous Connectivity”. Thus, packet forwarding can continue unaffected throughout the convergence.

In another study called “MIRO:-Multi Path Inter-domain Routing” [8], proposes a solution to receive multiple paths to external destinations for an AS. The researchers figure out that the current inter-domain routing protocol limits each router to use a single route for each destination prefix, which is not feasible for the diverse requirements of end users. Some proposals were offering route alternate for source routing where end hosts or edge routers select end-to-end paths. But source routing leaves transit AS's with very little control and introduces difficult scalability and security challenges. This work proposes a multi path inter-domain routing protocol called MIRO which offers great flexibility, while giving transit AS's control over the traffic flow through their infrastructure and avoiding state explosion in disseminating reachability information. MIRO learns default routes via the existing BGP protocol and arbitrary pairs of AS's can negotiate the use of additional paths. MIRO retains the simplicity of BGP and remains backwards compatible with BGP to allow for incremental deploy ability. This works claims, experiments with Internet topology and routing data highlights that MIRO offers substantial flexibility for path selection with reasonable overhead. The researchers criticize existing inter-domain routing methods and other methods like source routing and overlay networks for their limited control and inefficiency in intermediate AS's routing.

MIRO Protocol Design

This study proposes extending BGP into a multi-path routing protocol while keeping the scalability, control for intermediate AS's and backward compatibility. The key features of MIRO are AS-level Path Vector routing for scalability, Pull-based Route retrieval for backward compatibility and scalability, Bilateral Negotiation between AS's to contain complexity, Selective Export of extra routes for scalability and to give control to intermediate AS's and tunnelling in the data plane to direct packets along the chosen routes.

AS-Level Path-Vector Protocol:- Researchers argue that routing at the AS level is the right choice. Because each AS comes under single authority making the AS eligible for trust and policy specification and routing at the AS level is more scalable than at the link level. Apart from these, each AS can hide its internal structure and adjust the flow of traffic without affecting the AS path. Finally, following AS level routing makes it easy to verify the performance and reliability of a route conforms to an AS-level service contract.

Pull-based Route Retrieval:- The default routes provided by BGP is sufficient for many AS's. However, having each AS propagate alternate routes to every neighbour would significantly limit the scalability of inter-domain routing. In MIRO, instead of pushing extra routes to all neighbours, it has AS's actively solicit alternate routes only when needed. For example, in Figure 1, AS A is the only AS that is unsatisfied with its default route (ABEF). As a result, AS A asks AS B to advertise alternative routes, possibly including a routing policy (e.g., “avoid routes traversing AS E”) in the request. All other ASes simply use their default routes and incur no additional overhead.

ASes that have not deployed our multi-path extensions to BGP can continue to use today's version of exterior gateway routing protocol. For example, ASes C and F do not need to run the enhanced protocol for AS A to be able to query AS B for extra routing options. Each AS can decide on its own whether to deploy the enhanced protocol so that a value-added service could be offered.

Bilateral negotiation Between ASes: - MIRO is based on bilateral negotiation between ASes, where one AS asks another to advertise alternate routes. In Fig. 1 negotiating with AS B would be sufficient for AS A to learn a path to AS F that covers AS E. In bilateral negotiations, the AS that initiates the negotiation is called as requesting AS and the other AS is called as the responding AS. AS A may request several ASes (e.g. B and D) to advertise additional paths with the goal of discovering paths that avoid traversing via AS E. Also, with respond to a request an AS may provide additional paths obtained from another negotiation as new candidates.

MIRO defaults to the single-path routing provided by conventional BGP but allows ASes to negotiate alternate paths as needed. This provides flexibility where needed while remaining backwards compatible with BGP. Compared to source routing, MIRO gives transit ASes more control over the flow of traffic in their networks. An evaluation on realistic AS-level topologies shows that MIRO exposes much of the underlying path diversity in Internet.

In another study called “Quantifying the BGP routes diversity inside a tier-1 network” [9] proves the impact of BGP convergence on the network after a fault. It highlights that in a network with Route Reflectors, most routers do not possess multiple routes with alternate NHs for most destinations. Thus, if a route fails, the route loses reachability to the destination of the route. Routers have to wait for BGP to converge inside the AS before being able to join the destination again. In this work, researchers quantify the diversity of BGP routes inside a tier-1 network. The analysis shows that the use of route-reflection leads to a very poor route diversity compared to an iBGP full-mesh. Most routers inside a tier-1 network know only a single external route in eBGP origin. This research work identifies two causes for this lack of diversity. First, some routes are never selected as best by any router inside the network, but are known only to some border routers. Second, among the routes those are selected as best by at least one other router, a few are selected as best by a majority of the routers, preventing the propagation of many routes inside the AS. This study shows that the main reason for this diversity loss is how BGP chooses the best routes among those available inside the AS.

The study was carried out using a simulated model of a Tier-1 ISP. For this, researchers used the physical topology of the network like links and IGP weights, as well as the configuration of the BGP routers. In order to actually replicate the topology, researchers have obtained the “Adj-RIB-In” of the route-reflectors of the actual network. Because the BGP routes present in the “Adj-RIB-In” of internal routers do not always contain the information about which eBGP peer actually originated a route, some reverse-engineering of the route origin was necessary. The reason for routes obtained from route-reflector is because it may contain significant number of eBGP peering. Two cases are possible when trying to find the entry point of a route:

- The BGP next-hop of the route is the IP address of an external peer. In this case the attention was paid to advertise this route from the external peer found to the internal router with which the external peer has established the eBGP session.

- The BGP next-hop of the route is the IP address of an internal router because this router has been configured with next-hop-self. The originating external peer that advertised the route to the internal router has to be found. To find it, researchers rely on the AS path information. They search for eBGP peers belonging to the leftmost AS on the AS path that have an eBGP peering with the internal router. To ensure that the simulation model is correct, researchers have validated the conversion by injecting the routes in the model and then checked the routes computed by the model against the original best routes seen in the route-reflectors.

To ensure that the simulation model is correct, researchers have validated the conversion by injecting the routes in the model and then checked the routes computed by the model against the original best routes seen in the route-reflectors.

Example of Route Diversity Loss:-

When relying on an iBGP full-mesh, all the external routes selected as best by the border routers are known to all other routers inside the AS. An iBGP full-mesh is thus "ideal" in terms of the diversity of the routes known to all routers inside an AS, at the cost of a large number of iBGP sessions. Even this "ideal" situation might hide some eBGP routes when a border router has multiple eBGP sessions or when it does not choose as its best route one among its eBGP-learned ones. This would happen if one of its non eBGP-learned routes has a higher local-pref or smaller AS path length than its eBGP-learned routes. A loss in diversity will thus occur only because of this order of the rules of BGP decision process.

The two main causes for loss of diversity. Prefix p is advertised to AS4 by 3 neighbouring ASes (AS1, AS2 and AS3), two of them at border router BR2 and another at border router BR1.eBGP sessions are indicated by solid lines, while iBGP sessions by dashed lines. Arrows indicate the propagation of a route from one router to another. Only the best route chosen by BR2, let us call it pbest BR2, will be propagated inside AS4. The best route propagated by BR1, assuming it is the external one (pbest BR1), will also be propagated within AS4. Route reflector RR is on the iBGP propagation path of both routes pbest BR1 and pbest BR2, hence it will choose at most one of these as best route, which we call pbest RR. As there is one route-reflector in AS4, all other routers are clients, hence because of the iBGP propagation rules RR will redistribute its best route to all its clients except the one from which it learned the route. To prevent this loss of diversity, several solutions can be envisioned. First, one can change the location of the eBGP peerings so as to minimize the loss of routes at the border routers. Changing the location of eBGP peerings is typically not practically feasible because it depends on the slots available on the routers and the geographical constraints about where peers can connect to the routers inside the AS. Secondly, to reconfigure the iBGP graph by adding and removing iBGP peerings between routers and another solution is to redistribute more than a single route.

Studies have also been completed in the area of route resiliency towards distant destinations. In a research paper “Achieving sub-50 milliseconds recovery upon bgp peering link failures” [10] by C. Filsfils proves that BGP external peering links fail as frequently as intra domain links for short period of time. This paper proposes a new fast re-route mechanism where routers are prepared to react swiftly to intra-domain link failures. This method allows each router to pre-compute a protection tunnel (IP tunnel) to an alternate next-hop which can be used to reach the same external destinations as via the protected link. However, this method requires a new type of routes in BGP. Protected routes are advertised inside AS through iBGP sessions. The researchers propose a BGP based auto-discovery technique that allows each router to learn the candidate protection tunnels for its links. Each router selects the best protection tunnels for its links and when it detects an inter-domain link failure, it immediately encapsulates the packets to send them through the protection tunnel. This solution is applicable for the links between large transit ISPs and also for the links between multi-homed stub networks and their providers.

A different approach is taken in “Advertising Multiple Next-hop Routes in BGP” [11] by M.Bhatia to obtain higher NH diversity in participating routers by establishing an extension to BGP that allows multiple route advertisements for a single prefix. However, research work “Routing Oscillations using BGP Multiple Paths Advertisement” [12] by Van den Schrieck shows that the previously said tunnel extension may lead to BGP route flapping. A study on BGP convergence time “BGP convergence in much less than a second” by Filsfils has proposed a routing table architecture that depends on the knowledge of backup next hops to reduce BGP convergence rate. The proposed architecture is BGP Prefix Independent Convergence (BGP PIC) and has to be used along with extended IP tunnel as prescribed above to achieve desired results. Another study features “Reliability-aware iBGP Route Reflection Topology Design” [13] by L. Xiao, J. Wang, and K. Nahrstedt consider designing of robust iBGP topologies to minimize the chances of iBGP sessions failures and the number of iBGP sessions that might fail. They address the problem of finding reliable route reflection topologies for iBGP networks, which is of great importance to increase the robustness of iBGP operations. They present an initial reliability model and two metrics, iBGP expected lifetime and expected session loss to evaluate the reliability of route reflection topologies and to investigate the design problem. But this approach does not guarantee next hop diversity in routers. Packet loss may occur when iBGP sessions are taken down as part of maintenance of routers because diverse next hops are not readily available at these routers.

Several router vendors like Juniper have implemented a BGP extension called 'external best' [14] which is useful if router prefers a route learned on an iBGP session from the routes received on eBGP sessions. When 'external-best' option is enabled, router advertises its best eBGP route to its iBGP peers. This method increases the next hop diversity in some routers of the AS. However, it does not entirely resolve the next hop diversity problem.

If link between R22 and R12 is configured as secondary link using LOC_PREF attribute, the external route learned for prefix P by R22 is not advertised to other routers in AS2. There for R21, R23 and R24 would not be having Next hop diversity for prefix P via R22 in its BGP routing table. With 'external-best' option enabled on R22, it advertise its eBGP route to RR R24 resulting in R24 achieves route diversity for prefix P. However, the best route in R24 is via R21, it will not advertise route with next hop as R22 to other routers. Consequently, other routers end up with single next hop diversity for prefix P.

A novel architecture has been proposed for route distribution inside an AS by implementing a server that distributes external routes to all the BGP routers in an Autonomous System in a research paper titled “Design and implementation of a routing control platform”[15] by M. Caesar. In its current implementation, server propagates only a single route per destination to each router inside AS. Such architecture eliminates the burden of designing iBGP topologies and also promising a future for diverse routing.


1 Improving route diversity through the design of iBGP topologies, Tomonori Takeda, Eiji Oki, NTT Japan 2008

2 C. Labovitz, A. Ahuja, A. Bose, and F. Jahanian, “Delayed internet routing convergence,” in ACM SIGCOMM 2000, August 2000.

3 F. Wang, Z. M. Mao, J.Wang, L. Gao, and R. Bush, “A measurement study on the impact of routing events on end-to-end internet path performance,” in ACM SIGCOMM 2006, September 2006.

4 A. Bremler-Barr, Y. Afek, and S. Schwarz, “Improved BGP convergence via ghost flushing,” in IEEE INFOCOM, March 2003.

5 D. Pei, X. Zhao, L. Wang, D. Massey, A. Mankin, S. F. Wu, and L. Zhang, “Improving BGP convergence through consistency assertions,” in IEEE INFOCOM, June 2002.

6 J. Chandrashekar, Z. Duan, Z.-L. Zhang, and J. Krasky, “Limiting path exploration in BGP,” in IEEE INFOCOM, March 2005.

7 N. Kushman, S. Kandula, D. Katabi, and B. M. Maggs, “R-BGP: Staying connected in a connected world,” in 4th USENIX Symposium on Networked Systems Design & Implementation (NSDI'07), April 11-13th 2007.

8 W. Xu and J. Rexford, “Miro: multi-path interdomain routing,” in ACM SIGCOMM 2006, September 2006.

9 S. Uhlig and S. Tandel, “Quantifying the BGP routes diversity inside a tier-1 network,” in Proceedings of Networking 2006, Coimbra, Portugal, May 15-19th 2006.

10 O. Bonaventure, C. Filsfils, and P. Francois, “Achieving sub-50 milliseconds recovery upon bgp peering link failures,” in Proceedings of the 2005 ACM CoNext, 2005, pp. 31-42.

11 M. Bhatia, “Advertising multiple nexthop routes in BGP,” August 2006, internet draft, draft-bhatia-bgp-multiple-next-hops-01

12 V. Van den Scrieck and O. Bonaventure, “Routing oscillations using BGP multiple paths advertisement,” June 2007, internet draft, draftvandenschrieck-bgp-add-paths-oscillations-00.txt

13 L. Xiao, J. Wang, and K. Nahrstedt, “Reliability-aware iBGP route reflection topology design,” in 11th IEEE International Conference on Network Protocols (ICNP), November 2003.

14 Juniper Networks, Inc., “Configuring BGP routing - advertising routes: bgp advertise-best-external-to-internal,” 2007, http: // html/bgp-config10.html.

15 M. Caesar, D. Caldwell, N. Feamster, J. Rexford, A. Shaikh, and J. van der Merwe, “Design and implementation of a routing control platform,” in (NSDI), May 2005.

Article name: A Study On Border Gateway Protocol Computer Science essay, research paper, dissertation