Abstracts for Randy H. Katz

The EECS Research Summary for 2003

Backup Path Allocation based on a Correlated Link Failure Probability Model in Overlay Networks

Weidong Cui
(Professors Randy H. Katz and Ion Stoica)

Communication reliability is a desired property in computer networks. One key technology to increase the reliability of a communication path is to provision a disjoint backup path. One of the main challenges in implementing this technique is that two paths that are disjoint at the IP or overlay layer may share the same physical links. As a result, although we may select a disjoint backup path at the overlay layer, one physical link failure may cause the failure of both the primary and the backup paths.

In this project, we propose a solution to address this problem. The main idea is to take into account the correlated link failure at the overlay layer. More precisely, our goal is to find a route for the backup path to minimize the joint path failure probability between the primary and the backup paths. To demonstrate the feasibility of our approach, we perform extensive evaluations under both single and double link failure models. Our results show that, in terms of robustness, our approach is near optimal and is up to 60% better than no backup path reservation and is up to 30% better than using the traditional shortest disjoint path algorithm to select the backup path.

More information (http://www.cs.berkeley.edu/~wdc/) or

Send mail to the author : (wdc@eecs.berkeley.edu)

An Architecture for Availability and Performance in Wide-Area Service Composition

Bhaskaran Raman
(Professor Randy H. Katz)

Novel value-added services will be the driving force behind future communication networks and the mobile Internet. In this context, we consider the idea of service composition, where independent service components deployed and managed by providers are assembled to enable a novel composed service. Two simple examples of composition are shown in Figure 1.

We address several challenges with respect to availability as well as performance of composed services. Using a trace-based study, we find that it is possible to define a notion of failure which lasts for several tens of seconds or more. Such long outages can be predicted with timeouts of much shorter durations--within about 1.2-1.8 seconds on most Internet paths, and such timeouts happen quite infrequently in absolute terms, only about once an hour or less. These point at the usefulness and feasibility of a failure detection mechanism.

We develop an architecture for service composition based on an overlay network of service clusters. The overlay network is virtual-circuit based, with the implication that we can do failure recovery without waiting for the failure information to propagate and stabilize across the network. We deploy a composed text-to-speech service application as well as our recovery mechanisms in a wide-area testbed and run long-running experiments to evaluate the improvement in availability due to our recovery algorithms. We observe dramatic improvements in availability. In quantitative terms, we are able to reduce the system downtime by factors of up to 2-10 in many cases. This shows the usefulness of our architecture from the point of view of the end-user.

Figure 1: Examples of service composition

More information (http://www.eecs.berkeley.edu/~bhaskar) or

Send mail to the author : (bhaskar@eecs.berkeley.edu)

Fast Optical Network Restoration

Fang Yu
(Professor Randy H. Katz)
(Micro) 02-032

Many carriers are migrating away from SONET ring restoration toward mesh restoration through intelligent O-E-O cross-connects (XC). In mesh restoration, when failure occurs, traffic along the service path needs to be quickly rerouted to the restoration path. Due to the nature of distributed protocols, several restoration paths may compete for limited capacity on the same link, which

The reason for crank back can either be different paths competing for the same channel, which we call glare, or there may not be enough free bandwidth available at all. Our research aims to develop mechanisms to reduce these two sources of crank back. We developed channel selection methods [1] that considerably reduced glare, while at the same time maintained large groups of contiguous channels (for high bandwidth connections) across multiple parallel links. In addition, we proposed and analyzed a hybrid solution [2] that utilized a centralized restoration path server to optimize the restoration path selection, yet utilized distributed control to compute service paths and set up service/restoration paths. Our objective was to achieve very fast provisioning, fast restoration upon network failure, and efficient use of capacity. To support this, we also presented an efficient restoration path selection algorithm.

US Patent, "Novel Link Selection Schemes in Cross-Connects," B. Cortez, B. D. Doverspike, R. Sinha, and F. Yu, filed December 2002.
F. Yu, D. Wang, R. K. Sinha, G. Li, B. Doverspike, and C. Kalmanek, "Hybrid Centralized/Distributed Approach to Optical Network Restoration," Proc. Optical Fiber Communication, March 2003.

More information (http://www.cs.berkeley.edu/~fyu) or

Send mail to the author : (fyu@eecs.berkeley.edu)

Traffic Matrix Estimation for the Delta Routing Protocol

George Porter and Minwen Ji1
(Professor Randy H. Katz)
Compaq/HP Systems Research Center

Large enterprises connect their locations together by building a network (sometimes called an Intranet) out of leased, private communications lines. Although these lines are dedicated to the company's traffic, and thus provide good quality of service, they are very expensive. Because of this, they are not always overprovisioned. HP's Systems Research Center (Palo Alto, California) has developed the delta routing protocol, which allows nodes on a company's network to communicate with each other using both private lines as well as virtual private network (VPN) tunnels over the shared Internet. Each node in the network makes a local decision as to which path (the private network or the shared Internet) is best and forwards data accordingly. Because each node uses only local information, it is possible that it will forward traffic to a part of the network that is congested. In this case, that distant node will have to forward such "excess" traffic over VPN tunnels rather than the private network.

Using this protocol as a base, our contribution is to install measurement agents into each node in the company's network to perform traffic matrix estimations in order to make smarter forwarding decisions. A traffic matrix is a data structure that captures the amount of traffic at each point in the network. With such data, nodes would be able to avoid sending traffic to a portion of the network that is congested. Instead, it would be able to tunnel the traffic around the congestion to a point on the private network where it will be able to continue to the destination. Unfortunately, in a large network it is impractical to give each node a consistent notion of the traffic matrix. Instead, we propose that nodes directly measure flows that they forward and use estimates for flows that travel through another part of the network. These estimates come from periodic flow measurements broadcast from other nodes. A private network using delta routing should be able to respond to congestion if it occurs by shedding excess load to the less predictable, but more flexible, shared Internet.

The delta routing protocol is currently implemented in the ns simulator, and the improved protocol which includes flow measurement and traffic-matrix estimation functionality is in progress. Simulations are going to be performed on both artifically generated networks as well as a model of Compaq's private network.

1HP Systems Research Center

Send mail to the author : (gporter@eecs.berkeley.edu)

User Experiments of Using Congestion Pricing to Allocate Access Link Bandwidth

Jimmy Shih
(Professor Randy H. Katz)

Congestion pricing, which varies prices according to load, more efficiently allocates scarce resources by modifying user behaviors. We study its effectiveness and user acceptance for allocating bandwidth at a local area network access link. We conducted two iterations of prototyping, deployment, and evaluation with about ten users. In the first iteration, we offered users three sizes of connectivity and varied their prices. We found that limiting user usages with different bandwidth sizes is not suitable for dealing with short bursts. In the second iteration, we offered users three levels of quality of service (QoS) that differed on degree of traffic smoothing and charged once every 15 minutes. We found that this scheme is effective because we can easily entice users to select a lower QoS, one with more smoothing applied to their traffic, by increasing the price for higher quality. Using a survey, users state that they would be willing to use the scheme because they only need to make a purchasing decision at most once every 15 minutes. Through simulations, we show that if half of the users in a large network can be enticed to have their traffic smoothed, then the burstiness at the network’s access link can be reduced by 20-30%.

More information (http://www.cs.berkeley.edu/~jshih) or

Send mail to the author : (jshih@eecs.berkeley.edu)

OverQoS: Offering QoS in the Internet

Lakshminarayanan Subramanian, Ion Stoica1, Hari Balakrishnan2, and Randy Katz3
(Professor Randy H. Katz)

We propose OverQoS, an architecture for providing Internet QoS using overlay networks. OverQoS empowers third-party providers to offer enhanced network services to their customers using the notion of a controlled loss virtual link (CLVL). The CLVL abstraction bounds the loss-rate experienced by the overlay traffic in order to eliminate the unexpected losses due to cross traffic in the underlying network.

We build a CLVL abstraction along every overlay link and treat all flows traversing the link as part of a bundle. OverQoS uses specific scheduling policies at the overlay nodes to distribute the available bandwidth and the loss amongst the competing flows in a bundle. We show that OverQoS can provide differential rate allocations, statistical bandwidth and loss assurances, and enable explicit-rate congestion control algorithms.

2Professor, MIT

More information (http://www.cs.berkeley.edu/~lakme) or

Send mail to the author : (lakme@eecs.berkeley.edu)

Guaranteed Protection for Independent Link Failure in Optical Networks

Li Yin and Fang Yu
(Professor Randy H. Katz)

Failure resilience is one of the desired properties in communication networks. A lot of schemes have been proposed to deal with the failure in optical layers, however, most existing protection schemes consider single link failures only. This is not realistic because it implies a dependency between link failures. Our algorithm aims to break this constraint and develop a new protection scheme for a more general model of independent link failures.

The problem can be formulated as follows: for a given primary path, find a backup path with minimum cost in terms of reserved bandwidth while providing a hard constraint on the protection success rate. We first model the problem using integer linear programming. Then we propose an approximation algorithm that can reduce the communication and computation overhead while guaranteeing the hard constraint on the protection success rate.

In our preliminary experiments, we investigate the performance of our algorithm in a network with 78 nodes and 122 links. The network topology is obtained from AT&T that is used in the ROLEX simulation. We evaluate the performance with respect to protection success rate, lightpath blocking rate, and wavelength usage ratio under various link failure rates and target protection success rates. Preliminary results show that our algorithm can always guarantee the target protection rate with a high bandwidth utilization rate. In addition, we compare our algorithm with the exclusive path protection and shared path protection schemes. Our algorithm can achieve a lower request blocking ratio and more efficient bandwidth usage while meeting the requirement of the pre-determined target protection success rate.

Send mail to the author : (yinli@eecs.berkeley.edu)

Resource Management in IP Telephony Networks

Matthew Caesar and Dipak Ghosal1
(Professor Randy H. Katz)

To provide good QoS to calls and maximize network resource utilization in IP Telephony networks, it is necessary to design a differentiated service architecture that includes call admission and call redirection policies. In this work, we first study a call admission policy based on congestion sensitive pricing. The scheme uses congestion pricing to preferentially admit users who place a higher value on making a call while simultaneously maintaining a high utilization of IP Telephony Gateway (ITG) voice ports. Second, we design a call redirection policy to select the best ITG to serve the call. The policy balances load to improve network efficiency and incorporates QoS sensitivity to improve call quality. We present a performance study of the call admission and the redirection policies using both analytical and simulation models. The techniques studied in this paper can be combined into a single resource management solution that can improve network resource utilization, provide differentiated service, and maximize provider revenue.

1Professor, UC Davis

More information (http://www.cs.berkeley.edu/~mccaesar/) or

Send mail to the author : (mccaesar@eecs.berkeley.edu)

A Scalable Broadcast Federation Architecture

Mukund Seshadri and Yatin Chawathe1
(Professor Randy H. Katz)
(NSF) EIA-0122599

Several protocols have been proposed to enable efficient multi-point communication (multicast or broadcast) both at the IP layer and at the application or overlay layer. However, no single protocol is universally deployed. In fact, we believe that none of the existing broadcast protocols will become the sole dominant Internet broadcasting technology. Instead, we expect islands of non-interoperable broadcast connectivity to exist. To address this, we propose an architecture that enables the composition of different multicast-capable networks, each with their own multicast or broadcast protocols, to provide an end-to-end broadcast service. We call this architecture the broadcast federation. In order to achieve high throughput scalability, our implementation is based on a clustered broadcast gateway (which is the centerpiece of the federation system). We design the broadcast federation architecture, implement the clustered broadcast gateway, and obtain results on the scalability of these gateways.

1ATT Research, Menlo Park, CA.

Send mail to the author : (mukunds@eecs.berkeley.edu)

An Overlay Policy Protocol for Wide Area Routing

Sharad Agarwal and Chen-Nee Chuah1
(Professor Randy H. Katz)

An increasing number of ASes have been connecting to the Internet through the BGP inter-domain routing protocol. With increasing stress on the scale of this system and increasing reliance on Internet connectivity, more participants want additional functionality from inter-domain routing that BGP was not designed to handle. For example, we believe that the recent trend toward multihomed stub networks exhibits an intent to achieve fault tolerant and load balanced connectivity to the Internet. However, BGP today offers route fail-over times as long as 15 minutes, and very limited control over incoming traffic across multiple wide area paths. More research literature and the news media are calling for stemming malicious or erroneous routing announcements. We propose a policy control architecture, OPCA, that runs as an overlay network on top of BGP. OPCA will allow ASes to negogiate and apply policy changes to routing that address these issues. The proposed architecture and protocol will co-exist and interact with the existing routing infrastructure and will allow for a scalable rollout of the protocol.

1Professor, UC Davis

More information (http://www.cs.berkeley.edu/~sagarwal) or

Send mail to the author : (sagarwal@eecs.berkeley.edu)

Detecting Shared Bottlenecks

Sridhar Machiraju
(Professor Randy H. Katz)

It is known that alternative paths to normal Internet routes could be used to improve connectivity. This can be achieved through well placed overlay routers. In order to prevent interference with each other, different overlay flows would benefit from knowledge of bottlenecks that they share. We are currently investigating techniques by which we can decide if two flows share the same bottleneck. Our goal is to do make this decision based on the correlation of the throughputs observed by the flows. Detecting such correlation is complicated because the short-term dependence of TCP throughput on packet drops is not very well understood. Also, the packet drops themselves may not be evenly distributed among flows in case of droptail queues.

Send mail to the author : (machi@eecs.berkeley.edu)

Applications and Services Infrastructure on Top of Optical Networking

Tal Lavian and John Strand1
(Professor Randy H. Katz)

The current optical infrastructure has abundant bandwidth, but limited services. A new service infrastructure is needed. Current optical networking is limited in computation and applications. We are looking at some services and architectural issues, such as: ways to build a light path among service providers; methods of building a service composition of the light-path between technologies, i.e., Access, Metro, and Long Haul; ways to share the expensive infrastructure among service providers; and finding the right available optical resource among providers. It is hard to add computation to the core of the optical infrastructure. We seek to determine types of intelligence the can be added to the edge of the optical core. We also are investigating how to build a service directory of storage, backup, computation, optical VPN, and available Lambdas.

Examples of services for research are: setting the right optical link for grid computing and finding the best instance of storage area network (SAN) and determining what is best; disaster recovery over Metro, including storage and computation; providing optical VPN on demand to a remote site needing 10 Gb/s for 25 minutes; providing a link with HDTV quality (for a remote medical operation lastin two hours); and providing periodic network backup capabilities (preferred to a backup bunker) without the need for local tapes. Finally, it is important to let the applications negotiate the resources' availability and control the network path creation.

I. Monga, B. Schofield, and F. Travostino, “EvaQ8—Abrupt, High Throughput Digital Evacuations over Agile Optical Networks,” IEEE Workshop on Disaster Recovery Networks, New York City, NY, June 2002.
“A Programmable Service Platform for Internet Service Architecture,” IEEE Int. Conf. Telecommunications, Papeete, Tahiti, February 2003.
1Outside Adviser (non-EECS), AT&T Research

Send mail to the author : (tlavian@eecs.berkeley.edu)

New Internet Architecture Resulting from Optical Networking

Tal Lavian and John Strand1
(Professor Randy H. Katz)

The optical networking infrastructure and DWDM have changed dramatically in the last few years. The bandwidth in the core (not accessed yet) has become a commodity, changing many networking assumptions. Recent advancements in optical technologies have created a radical mismatch between the optical transmission world and the electrical forwarding/routing world. Currently, a single strand of optical fiber can transmit more bandwidth than the entire Internet core. Furthermore, some of the basic characteristics of optical networking are different from traditional L3 networking.

Impendence mismatch has many dimensions. One example is the bottlenecks that are located at the peering points between the current ASs and ISPs. The peering is done in L3 and above. New sets of services can be built in L1-L2 and a dynamic light path can be created. Provisioning is a long and inefficient process. New technologies have emerged to build the capabilities of automatic switched optical networks (ASON) in the Long Haul and resilient packet rings (RPR) in the Metro. The capabilities of new optical technologies are limited by current Internet architecture. We are analyzing solutions to bridge the mismatch between optical and electrical transmission. One of our fundamental research goals is to determine what is needed to build a new Internet architecture that utilizes the new advancements in optical networking.

I. Monga, B. Schofield, and F. Travostino, "EvaQ8--Abrupt, High Throughput Digital Evacuations over Agile Optical Networks," IEEE Workshop on Disaster Recovery Networks, New York City, NY, June 2002.
"A Programmable Service Platform for Internet Service Architecture," IEEE Int. Conf. Telecommunications, Papeete, Tahiti, February 2003.
1Outside Adviser (non-EECS), AT&T Research

Send mail to the author : (tlavian@eecs.berkeley.edu)

Clustering Web Content for Efficient Replication

Yan Chen, Lili Qiu1, Weiyu Chen, and Luan Nguyen2
(Professor Randy H. Katz)
California Micro, (DARPA) N66061-99-2-8913, Ericsson, Nokia, and Siemens

Recently there has been an increasing deployment of content distribution networks (CDNs) that offer hosting services to Web content providers. In this project, we first compare the un-cooperative pulling of Web contents used by commercial CDNs with the cooperative pushing. Our results show that the latter can achieve comparable users' perceived performance with only 4-5% of replication and update traffic compared to the former scheme. Therefore, we explore how to efficiently push content to CDN nodes. Using trace-driven simulation, we show that replicating content in units of URLs can yield a 60-70% reduction in clients' latency, compared to replicating in units of Web sites. However, it is very expensive to perform such a fine-grained replication.

To address this issue, we propose to replicate content in units of clusters, each containing objects which are likely to be requested by clients that are topologically close. To this end, we describe three clustering techniques, and use various topologies and several large Web server traces to evaluate their performance. Our results show that the cluster-based replication achieves 40-60% improvement over the per Web site based replication. In addition, by adjusting the number of clusters, we can smoothly trade off the management and computation cost for better client performance.

To adapt to changes in users' access patterns, we also explore incremental clusterings that adaptively add new documents to the existing content clusters. We examine both offline and online incremental clusterings, where the former assumes access history is available while the latter predicts access pattern based on the hyperlink structure. Our results show that the offline clusterings yield close to the performance of the complete re-clustering at much lower overhead. The online incremental clustering and replication cut down the retrieval cost by 4.6-8 times compared to no replication and random replication, so it is especially useful to improve document availability during flash crowds.

1Microsoft Research
2Undergraduate (EECS)

More information (http://www.cs.berkeley.edu/~yanchen) or

Send mail to the author : (yanchen@eecs.berkeley.edu)

Internet Iso-bar: A Scalable Overlay Distance Monitoring System

Yan Chen and Lili Qiu1
(Professor Randy H. Katz)
California Micro, (DARPA) N66061-99-2-8913, Ericsson, Nokia, and Siemens

Estimating end-to-end Internet distance can benefit many applications and services, such as efficient overlay construction, overlay routing and location, and peer-to-peer systems. While there have been many proposals on Internet distance estimation, how to design and build a scalable, accurate, and dynamic monitoring system is still an open issue. To this end, we propose an overlay distance monitoring system, Internet Iso-bar, which yields good estimation accuracy with much better scalability and far less overhead for online monitoring than the existing systems. Internet Iso-bar clusters hosts based on the similarity of their perceived network distance, and chooses the center of each cluster as a monitor site. The distance between two hosts is estimated using inter- or intra-cluster distances. Evaluation using real Internet measurements shows that Internet Iso-bar achieves high estimation accuracy and stability with much smaller measurement overhead than Global Network Positioning (GNP) [1]. Furthermore, by adjusting the number of clusters, we can smoothly trade off the measurement and management cost for better estimation accuracy.

T. S. E. Ng and H. Zhang, "Predicting Internet Network Distance with Coordinates-based Approaches," Proc. IEEE Infocom, New York, NY, June 2002.
1Microsoft Research

More information (http://www.cs.berkeley.edu/~yanchen) or

Send mail to the author : (yanchen@eecs.berkeley.edu)

Route Flap Damping Exacerbates Internet Routing Convergence

Zhuoqing Morley Mao, Ramesh Govindan1, and George Varghes2
(Professor Randy H. Katz)
California MICRO Program, Ericsson, Nokia, Siemens, and Sprint

Route flap damping is a widely deployed mechanism in core routers that limits the widespread propagation of unstable BGP routing information. Originally designed to suppress route changes caused by link flaps, flap damping attempts to distinguish persistently unstable routes from routes that occasionally fail. It is considered to be a major contributor to the stability of the Internet routing system.

We show in this project that, surprisingly, route flap damping can significantly exacerbate the convergence times of relatively stable routes. For example, a route to a prefix that is withdrawn exactly once and re-announced can be suppressed for up to an hour (using the current RIPE recommended damping parameters). We show that such abnormal behavior fundamentally arises from the interaction of flap damping with BGP path exploration during route withdrawal. We study this interaction using a simple analytical model, and understand through the use of simulations the impact of various BGP parameters on its occurrence. Finally, we outline a preliminary proposal to modify a route flap damping scheme (based on ignoring monotonic route changes that flap damping triggers) that removes the undesired interaction in all the topologies we studied.

1Professor, USC
2Professor, UC San Diego

More information (http://www.cs.berkeley.edu/~zmao) or

Send mail to the author : (zmao@eecs.berkeley.edu)