My part of this tutorial summarized the current state of knowledge regarding feedback and reliability for memoryless channels. The focus was mostly on the papers below: in particular, this paper on hard deadlines, this paper on unreliable feedback for erasure channels, and this paper on soft deadlines and noisy feedback. On the side-information side, I touched on this paper, and on the universality side, I touched on this paper on compound channels.
The linked papers above contain more detailed bibliographic information for researchers who want to learn more about the area.
The long-term goal is to better understand the nature of
information in interactive and networked contexts. My thesis just opened the door and showed
that there were many questions in this area and that we had a hope of
solving them. We have built considerably upon that since. The
following works should be of interest. For convenience, we
have divided them up into partially overlapping sub-areas:
If the above tutorial slides are too long for your taste, the second-best way to understand some of the cutting-edge results in this area is to view this presentation, which I gave at EPFL in July 2006. It is a version of a talk that I have also given at ETH Zurich, Berkeley, Stanford, MIT, Yale, Princeton, Caltech, USC, Cornell, BU, and UC Davis. Partial results in this direction were also presented at UMich Ann Arbor in Spring '05 and at UIUC in Fall '04.
My key contribution here is the realization that if our interest is in streaming data, then block-codes can be very suboptimal, especially when feedback is available. This is true not only in the case of hard-latency requirements, but also when the deadlines are softer. Architecturally, these results suggest that block or packet-level retransmissions are not a good way to do feedback error correction. Rather, incremental redundancy and variable-length coding strategies should be used to balance forward and feedback error correction.
Delay is the most basic cost systems must pay for reliability. This paper shows that the reliability function for streaming bits with fixed delay can be very different from the reliability function for blocks with fixed block-length in the presence of feedback for hard deadlines. This corrects a result of Pinsker (in png graphics form: [1], [2], [3], [4], [5], [6]) which claims otherwise. As a part of this work, we derive a general upper bound (the uncertainty-focusing bound) on the probability of bit-error with respect to delay and show that the bound is tight for the Binary Erasure Channel and any channel with nonzero zero-error capacity with feedback. This bound is much higher than the corresponding bound for block codes. We also give a different bound for cases without feedback and show how it can be beaten with feedback for all generic DMCs. This illustrates how codes with feedback can exploit an additional degree of freedom (flow control) in order to improve reliability with respect to latency.
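For orientation, the symmetric form of the uncertainty-focusing bound can be sketched in terms of Gallager's function. This is a from-memory sketch rather than a verbatim restatement of the paper, so treat the exact parametrization as an assumption:

```latex
% Gallager's function for a DMC P(y|x) with input distribution q:
E_0(\rho) = -\log \sum_{y} \Bigl[\sum_{x} q(x)\,P(y\mid x)^{1/(1+\rho)}\Bigr]^{1+\rho}
% Symmetric uncertainty-focusing bound on the fixed-delay exponent with feedback:
E_a(R) = E_0(\rho^{*}) \quad\text{where } \rho^{*} \text{ solves } R = \frac{E_0(\rho^{*})}{\rho^{*}}
% contrasted with the sphere-packing bound that governs fixed-length block codes:
E_{sp}(R) = \sup_{\rho \ge 0}\,\bigl[E_0(\rho) - \rho R\bigr]
```

The point of the comparison is that the focusing bound sits above the sphere-packing bound at rates below capacity, which is the sense in which feedback changes the delay-reliability tradeoff.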
The uncertainty-focusing bound itself first appeared (though without that name) in my student Tunc Simsek's doctoral thesis, but there are many other results in both the paper and the thesis itself.
Gives an anytime generalization of orthogonal signaling and shows how one can theoretically achieve maximally energy-efficient communication in a delay-universal way (receivers get to pick their desired delays; delay is not imposed by the transmitter). The code itself is based on a variation of pulse-position modulation, and the results here are partially generalized to Verdu's capacity-per-unit-cost formulation. The interesting feature here is how simple the analysis and proof of optimality become for this infinite-bandwidth channel.
This expands on the "delay not blocks" journal paper by considering joint source/channel coding in which the channel is noisy and the source is nonuniform. In the block-coding case, it had been established earlier by Csiszar that traditional separation fails to achieve the best possible block exponents. We give new (and higher) upper bounds and achievable schemes when a hard end-to-end delay constraint is considered. When feedback is available, we give even higher bounds as well as a scheme for the erasure channel that meets the bounds. This scheme suggests that for many channels, separation is actually possible in the engineering sense by using appropriate variable-length codes at the interface. It also effectively says that the delay-reliability gains we get by moving to nonblock codes with feedback for channel coding are not merely a manifestation of the traditional joint vs separate gap in block-coding.
Cheng Chang and Anant Sahai, "Sequential random coding error exponents for degraded broadcast channels," Allerton, 2005
These two papers show how anytime (delay-universal) reliability can be achieved over the MAC and degraded broadcast channels using ML decoding and no feedback. The techniques of our Slepian-Wolf paper are adapted to these contexts.
Calculates the exact anytime capacity for the Gilbert-Elliott (GE) channel with noiseless feedback of both the output and the channel state. Shows that the probability of error can be made to decrease doubly exponentially with delay at all rates below capacity and that, furthermore, this can be done in a delay-universal, or anytime, fashion. A simpler form of the hybrid-control-based approach used in this paper from Allerton 2004.
Calculates the exact feedback anytime reliabilities for packet erasure channels where the packets can be of variable length but data must be communicated explicitly within the body of the packet. Shows the reliability advantage to having only an average constraint on the packet length as opposed to having to use fixed sized packets.
Considers a channel that either adds white Gaussian noise or independently erases the input. Calculates the feedback anytime reliability function for the channel and shows how to achieve it by using a hybrid control strategy involving both a queue and a time-varying linear system.
This talk sketched out a vision and a back-of-the-envelope analysis for the anytime relaying problem. It argued that the meta-data describing what corrections have to be made at one relay and propagated to the other relays is a significant factor and causes a reduction in the anytime reliability as we pass through multiple relays, even if we let the minimum delay increase. I was not able to make this intuition precise, however.
This work deals with getting bounds on implementation complexity for iterative decoders by leveraging the error-exponent techniques we developed in the context of hard-deadline decoding with fixed delay, and generalizing them from linear senses of time to the idea of graph-based neighborhoods. At a conceptual level, it asks what it should mean, asymptotically, for a code to be capacity-achieving in the context of real-world communication using systems that have limited power budgets.
My key contribution here is to show that, under a very natural model of ASIC-style computation, it is impossible to achieve Shannon's dream of arbitrarily low probability of bit error with only a finite amount of total energy per bit. Moreover, if the range of communication is short, it is not worth aiming for a capacity-achieving code, since the savings in transmit power will be overwhelmed by the increased implementation power required.
This paper looks at lossless source coding and shows that the complexity of encoding plus that of decoding must satisfy a fundamental limit in order to achieve a desired probability of bit error.
This paper shows that the total energy-per-bit for any code must go to infinity as it tries to approach certainty, even if it is allowed to adjust its rate. This is what happens when we count both encoding and decoding complexity in addition to transmit power.
This paper introduces a simple model for a general iterative parallel decoder and asks how the power spent in the receiver for decoding trades off with the power used at the transmitter. It shows that the total power spent in the transmitter and receiver combined must go to infinity as the end-to-end probability of error tends to zero and introduces the concept of the waterslide curve that captures this. Moreover, it is shown that for AWGN channels, the optimal power used at the transmitter can depend on the application setting and the technology used in the receiver. It is generally slightly different from the predictions made by using Shannon capacity alone and can be interpreted as a soft counterpart to the computational cutoff rate that applies for iterative decoding.
My main contribution in this line of research is to show how communication systems can be made robust to limitations in the feedback path for both hard-deadline and soft-deadline problems. Prior approaches (like the Schalkwijk-Kailath scheme) have been very non-robust to even minor degradations of the feedback path.
If the above tutorial slides are too long for your taste, the second-best way to understand the cutting-edge results in this area is to view this presentation, which I gave at Berkeley in Fall 2006 and which combines/extends results that I unveiled at the 2006 Berkeley/Stanford BAD Day as well as the Kailath Colloquium at Stanford in Summer 2006. Further results were unveiled at the ITA workshop in San Diego in January '07 and are captured in this presentation given at the University of Tokyo in March '08.
This paper studies the hard deadline problem for streaming data in the context of packet erasure channels (a semi-practical model for networks). Instead of perfect feedback, it has unreliable feedback that must pass through an erasure channel itself. The uncertainty-focusing bound can still be asymptotically achieved at high enough rates. This approach also works for binary erasure channels in both directions (so sequence numbers are out of the question). Perhaps most interestingly, this work shows that even if the system is "half-duplex" in the sense that a feedback transmission occurs at the cost of giving up a forward one, it is still worth spending some of the channel uses on feedback because this can reduce the end-to-end delay by a very large factor relative to forward-only error correction. A side-effect of the techniques is to show that the symmetric uncertainty-focusing bound is achievable with perfect feedback at high rates for any DMC whose matrix contains a zero.
This is the converse paper to accompany this paper. It gives a proof for the hallucination bound for block codes and states the result in the fixed-delay setting.
Takes an errors and erasures perspective that corresponds to a case of communicating a bitstream with "soft" end-to-end deadlines and access to a feedback channel. Detected errors (erasures) are presumed to be relatively low cost and only undetected errors are penalized. In this context, we show that not only can the Burnashev bound be beaten, but that the system can be made robust to noise in the feedback link, as long as the feedback capacity is larger than the target undetected error exponent. The final paper on this is still being written.
This work studies the reliability function for block communication with noisy feedback relative to expected block-length -- the soft-deadline problem for block messages. Two major issues are addressed: how the noisy feedback can be used to reduce the probability of undetected error, and how synchronization can be maintained between the encoder and decoder across the retransmissions that must inevitably occur.
The final paper shows that at all rates, up to half the Burnashev bound is attainable using any noisy DMC in the feedback path, as long as it has enough capacity. At rates close to capacity, the achievable reliability functions approach the Burnashev bound. The key ideas are to combine erasure-decoding and randomly hashed feedback to get the most utility out of the noisy feedback link, and to use anytime codes over the feedback channel to maintain synchronization between encoder and decoder.
My key contribution here is to make precise the "delay-sensitivity" of control applications in a language compatible with the asymptotic analyses that information theory can provide and to point out that meeting control's requirements will require communication architectures to be engineered with streaming delay in mind. The surprising fact is that while control intuitively seems to be an interactive setting in which short messages are generated one after the other with each being required at the receiver before the next message is ready, the technical results suggest that it is actually intimately connected to the streaming setting.
Another key contribution has been to partially resolve the longstanding Witsenhausen counterexample by examining a vector version of the problem, deriving a new bound for it, and showing that explicit signaling strategies attain performance within a constant factor of 1.5 of optimal in the infinite-dimensional limit, and within a factor of 8 in the scalar case. The results have also begun to be generalized to neighboring problems.
Shows the equivalence between stabilization of linear systems over a noisy communication channel and anytime reliable communication with noiseless feedback, introduces the anytime rate region and relates it to the vector control problem, addresses the issue of intrinsic delays in control systems, and gives random constructions for memoryless observers and their corresponding controllers. Along the way, it shows how to generalize Schalkwijk/Kailath's scheme for the AWGN channel to the sequential setting. By using the vector case results in the context of an erasure channel, it shows that differentiated service can be required to satisfy control application demands.
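As a rough summary of the kind of statement involved (my paraphrase; the exact conditions are in the paper), for a scalar unstable plant, holding a moment of the state bounded is tied to the channel's feedback anytime capacity:

```latex
% Scalar plant with bounded disturbance w_t, control u_t, and |\lambda| > 1:
x_{t+1} = \lambda x_t + u_t + w_t
% Roughly: stabilization in the \eta-th moment sense over a noisy channel
% requires the feedback anytime capacity, evaluated at anytime reliability
% \alpha = \eta \log_2 |\lambda|, to exceed the intrinsic rate of the plant:
C_{\mathrm{anytime}}(\eta \log_2 |\lambda|) > \log_2 |\lambda|
```

The rate condition $R > \log_2|\lambda|$ alone (the Shannon-capacity-style statement) is not sufficient; the reliability parameter is what ties control performance to the anytime framework.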
This builds on classical linear systems tools and includes interesting ideas based on "writing-on-dirty-paper" and precoding for communicating information through control systems. It also shows that Shannon capacity is not a sharp enough concept for control over noisy communication channels.
This paper introduces an asymptotic vector version of Witsenhausen's counterexample and shows how strategies related to joint source/channel encoding as well as Dirty-Paper coding can be adapted to become nonlinear control strategies. Quite surprisingly, the problem turns out to be related to other classical information-theoretic problems and has an interpretation in terms of suppressing interference in communication systems. This is perhaps the simplest unsolved point-to-point information theory problem. Even so, from a distributed control perspective we are able to obtain many interesting results. First, we get the first improvement over Witsenhausen's lower bound in 40 years. Second, we are able to show that, in an asymptotic sense, we can give approximately optimal strategies in terms of control performance that are no more than a factor of 2 from the optimal regardless of the values of the problem parameters. This is a significant improvement over the prior bound of infinity.
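For readers unfamiliar with the setup, the scalar Witsenhausen counterexample (this part is standard) is the two-stage problem:

```latex
% Two-stage scalar Witsenhausen problem:
x_1 = x_0 + u_1, \qquad y = x_1 + z, \qquad x_2 = x_1 - u_2
% with x_0 ~ N(0, \sigma^2) and z ~ N(0, 1) independent,
% u_1 = \gamma_1(x_0)  (the first controller sees the state perfectly),
% u_2 = \gamma_2(y)    (the second controller sees it only through noise),
% and total cost
J(\gamma_1, \gamma_2) = k^2\,\mathbb{E}[u_1^2] + \mathbb{E}[x_2^2]
```

The first controller pays for input power while the second pays for residual state; because the second controller never sees $x_0$ directly, the first can use its action to communicate implicitly through the state, and this nonclassical information pattern is what makes the problem hard. The vector version above replicates this structure in high dimension.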
This tightens the bounds in the above paper for the asymptotic regime and shows that we can get within a factor 1.5 using the Witsenhausen cost function. Furthermore, it gets the bound completely tight for the case of zero second-stage cost.
This pair of papers follows up on our previous paper to give a valid bound for finite vector-length versions of the Witsenhausen counterexample (including the scalar case introduced by Witsenhausen himself) and show that lattice-quantization based strategies are approximately optimal for all dimensions --- achieving a cost within a constant factor of optimal. Numerically, this constant is at most 17 for the scalar case. This line of work represents the first provably positive result for Witsenhausen's counterexample in 40 years. The CDC paper gives an elementary proof of a new bound for the scalar case in the spirit of Witsenhausen's own bound, while the ConCom paper uses a more sophisticated sphere-packing argument that builds on the approach we used for studying decoder complexity.
This paper explores the neighborhood of the Witsenhausen counterexample in two different senses. First, we consider what would happen if the two controllers (that we dub "weak" and "blurry" because of their costs and observations respectively) had different relationships with each other. It turns out that all the other arrangements (acting simultaneously or in the reverse order: blurry before weak) result in linear optimal control laws. Even if the two controllers maintain their traditional order but become adversaries to each other, linear remains optimal. This strengthens the case that it is implicit communication that is making the problem hard, since in all of these alternatives there is no reasonable desire for implicit communication, despite the information pattern being nonclassical. The paper also considers generalizing the cost terms and shows that the asymptotic techniques that we had developed continue to work and that the core difficulty of the problem remains even without the particularly extreme structure of the cost terms in the original Witsenhausen Counterexample.
Most of this paper is dedicated to showing that it is possible to achieve the random-coding error exponent with fixed delay for streaming data over compound DMCs (i.e., channels known only to lie within some uncertainty set). The interesting aspect from a control context is that it gives an easy-to-check sufficient condition for the stabilizability of an unstable system over such compound channels. This sufficient condition only depends on the compound channel capacity and the alphabet size.
Anant Sahai and Hari Palaiyanur, "A simple encoding and decoding strategy for stabilization over discrete memoryless channels" presented at the 2005 Allerton Conference
These two papers give sufficient conditions for stabilization in a relatively low-complexity way. In particular, the first marries the CISS 05 work on the wideband channel to the control problem of my CDC 04 paper and shows why it is possible to just use simple observers/encoders. The second adapts sequential decoding to this problem and shows that the controllers/decoders can also be simple. This simplicity comes at a cost in performance, though: the schemes given here are unable to attain the performance promised by the uncertainty-focusing bound.
Early version appears at MTNS 2006
This is an application of the "fortification" ideas of the "delay not blocks" journal paper within the context of stabilization using a noisy feedback channel. This maintains a somewhat low-complexity approach while actually hitting the uncertainty-focusing bound for cases when a small noiseless "flow-control" channel is available to supplement the main noisy feedback channel. It also uses event-based sampling to both simplify the system and to overcome a technical requirement for finite channel output alphabets.
Examines the control of linear systems over noisy communication channels, with a particular focus on the LQG problem. Presents the sequential rate distortion framework and gives information patterns for which it tightly characterizes the achievable performance for the control system.
This is an early version that is mostly subsumed by this paper. It remains of interest for the work in section 7 on relaxing the information pattern. The results there are intriguing, but I have not had the chance to fully follow up on them.
Re-examines Witsenhausen's famous counterexample, which showed that linear control need not be optimal for LQG problems without a classical information pattern. Uses ideas from quantization to show that there are nonlinear strategies that can do asymptotically infinitely better than the best linear strategies for a particular limit of problems.
This paper solves the lossy source-coding problem for exponentially unstable Markov processes, completely characterizes the nature of information within them, and shows that not all bits are alike even for a single source with a simple additive difference-distortion measure. Resolves a long-standing open problem in information theory (completing progress made by Berger and Gray in the 70s) and demonstrates the usefulness of the information-embedding perspective in understanding information.
Considers the streaming problem for distributed lossless source-coding. The role of feedback is played by the receiver side-information. Upper and lower bounds are given for the reliability function with fixed delay for this problem, and it is shown that despite the Slepian-Wolf result in terms of achievable rates, the delay performance of distributed coding can be much worse than that of non-distributed coding. This turns out to be related to an earlier result by Jelinek in the context of buffer overflows.
Extends the uncertainty-focusing bound arguments of this paper to the case of parallel source streams that desire unequal error protection.
Looks at the source-coding analog of the "anytime" or delay-universal idea. Here, the channels are noiseless and we show how to do distributed compression of streaming sources at all rates within the Slepian-Wolf rate region. These results hold for memoryless processes and the decoders can achieve the same error exponents with delay for both known statistics decoded using ML as well as universally when the statistics are unknown. Because the code is universal over delay, it allows us to achieve asymptotically zero probability of error with delay for all generic sources. In this sense, it provides a variable delay analog of truly lossless communication that continues to hold for generic distributed sources.
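As a toy illustration of the random-binning idea that underlies such distributed codes (this is my own sketch with made-up parameters, not the streaming construction from the paper): the encoder transmits only a short hash (bin index) of its block, and the decoder uses its side information to search the bin for a consistent sequence.

```python
import random
from itertools import combinations

# Toy Slepian-Wolf-style random binning. Illustrative only: a real streaming
# code operates causally on an infinite stream; here we use one small block.

N = 16        # source block length (toy size)
R_BITS = 12   # bits per bin index; must exceed the conditional uncertainty of X given Y

def make_hash(n_bits, out_bits, rng):
    """Random linear hash over GF(2), represented as a 0/1 matrix."""
    return [[rng.randrange(2) for _ in range(n_bits)] for _ in range(out_bits)]

def apply_hash(H, x):
    """Compute the bin index H*x over GF(2)."""
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)

def decode(H, bin_index, y, max_flips=2):
    """Search sequences within Hamming distance max_flips of y for a hash match."""
    for d in range(max_flips + 1):
        for flips in combinations(range(len(y)), d):
            cand = list(y)
            for i in flips:
                cand[i] ^= 1
            if apply_hash(H, cand) == bin_index:
                return cand
    return None

rng = random.Random(1)
H = make_hash(N, R_BITS, rng)
x = [rng.randrange(2) for _ in range(N)]   # source block seen by the encoder
y = list(x)
y[3] ^= 1                                  # side information: x with one bit flipped
x_hat = decode(H, apply_hash(H, x), y)
print(x_hat is not None and apply_hash(H, x_hat) == apply_hash(H, x))  # prints True
```

When the hash rate R_BITS/N exceeds the conditional uncertainty of X given Y, the bin almost surely contains a unique sequence near the side information, which is the essence of the achievability argument; the universal results above say the same exponents are attainable even when the statistics are unknown.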
This paper shows how sequential decoding (stack algorithm) can be applied to decode distributed source codes where the side-information is only available at the receiver. The arguments extend directly to joint source-channel coding for distributed settings.
This expands on the "delay not blocks" journal paper by considering joint source/channel coding in which the channel is noisy and the source is nonuniform. In the block-coding case, it had been established earlier by Csiszar that traditional separation fails to achieve the best possible block exponents. We give new (and higher) upper bounds and achievable schemes when a hard end-to-end delay constraint is considered. When feedback is available, we give even higher bounds as well as a scheme for the erasure channel that meets the bounds. This scheme suggests that for many channels, separation is actually possible in the engineering sense by using appropriate variable-length codes at the interface.
This extends some of the results of this paper to the context of lossy coding with a per-letter peak-distortion constraint.
This paper studies sequential source coding in the average entropy rate and average distortion context rather than looking at an explicit representation in terms of bits. The problem is cast as an optimization problem on an appropriate convex set of probability measures. Existence and properties of optimal sequential codes are explored. A sequential rate distortion theorem is proved and a construction given to show that in general, a ``causality gap'' exists --- i.e. sequentiality will not come for free.
These notes were written to help me understand the consequences of Neuhoff and Gilbert's 1982 paper on causal source coding. They extend the argument to joint source/channel coding and are put up here because some students have found them helpful in understanding the big picture.
My main contribution in this work has been to develop the beginnings of a theory of communication problem hierarchies and explicit equivalences, in the spirit of complexity theory in computer science. This theory has revealed that Shannon's traditional source-channel separation theorem is merely one of many equivalent separation theorems that could be stated. It has also enabled us to get a more nuanced understanding of the nature of information in exponentially unstable processes. The promise of this work is to move us toward a principled approach to communication system architecture in the absence of a single separation theorem, although there is still much to be done before that promise is realized.
First formal introduction of the equivalence perspective toward information flows. This paper shows the equivalence between interactive stabilization of linear systems over a noisy communication channel and anytime reliable communication with noiseless feedback. In the vector case, it shows how a single communication problem can be equivalent to a collection of other problems thereby providing a foundation to study the need for differentiated service.
The equivalence perspective is used here in a non-interactive setting of lossy communication of exponentially unstable Markov processes. The nature of information within them is completely characterized by showing an equivalence to a collection of two qualitatively distinct communication problems --- anytime communication of certain critical bits, and classical communication of other bits that determine the end-to-end distortion. This demonstrates the usefulness of the information-embedding perspective in understanding information.
This work gives universal non-asymptotic bounds on reliability functions for lossless source coding that depend only on the alphabet size, the rate, and an upper bound on the entropy. We realized that it is very natural to just know that a compound channel has a certain minimum capacity or that a compound source has an upper bound on its entropy. However, such interesting sets of channels and sources are decidedly not convex, and so the usual saddle-point approaches do not give useful bounds. It is also not enough to say that asymptotically we will do as well at runtime as a code tuned to the distribution. The reliability function is most useful at design-time to adequately provision a system to make sure that the desired performance will be obtained robustly. We adapted an approach from a problem of Gallager, but had to correct a nontrivial error along the way, since a subtle non-convexity was overlooked in his original solutions. Because the bounds developed here also hold non-asymptotically, they can be turned around to apply when alphabet sizes grow with block-length.
This shows how to robustly achieve the empirical first-order capacity of channels whose states are ``modeled'' as an individual sequence. Our contribution here was to greatly reduce the amount of feedback (relative to a scheme of Shayevitz and Feder) to asymptotically nothing by adapting the chunk/block based coding strategies that we had used earlier for improving the reliability of fixed-delay coding using unreliable feedback for erasure channels.
This resolves the rate-distortion function for Berger's classic source-coding game with a switcher that is allowed non-causal access to the underlying random source realizations. The idea here is to find a way to understand what can happen in situations where the uncertain source is neither completely free nor completely random. (For conceptual motivation, consider a video camera that is being pointed by a non-blind human. Even if independent activities are happening in different directions, the human is pointing the camera based on knowing what is going on. Modeling this as a random sampling of the different independent activities is clearly an over-idealization of the problem that assumes that the human's goals are entirely orthogonal to the source-code's goals.) We give a single-letter optimization that computes the relevant rate-distortion function.
Shows how to do universal decoding for streaming codes. The traditional block-code-oriented MMI decoders no longer apply directly and are adapted to work on trees. This is done both for distributed lossless source coding and for compound channels. We show it is possible to achieve the appropriate error exponents with fixed delay.
This gives a computable and non-trivial upper bound on the dirty-paper capacity when there is an unknown block fade on the "dirt." This builds on a technique of Khisti, Wornell, and Lapidoth. In many cases, our bound shows that almost the entire DPC gain is wiped out. In this sense, it is a somewhat surprising result since it is qualitatively different from the usual results for compound channels.
This is not a joke, but a serious look at the problem of speeding up the journal review process (which takes more than two years in many disciplines) by incentivizing timely reviewing. The tension is explored between providing public reputations to encourage good citizenship and keeping the identities of referees anonymous. This is actually an application of information-theoretic security and flow-control ideas to peer-to-peer networks with selfish users. Anyway, this is very preliminary work, but it gives us hope. We are preparing a longer full paper on this topic.
Gives a very general recipe for boosting the reliability function with block-length for network channel-coding problems with noiseless feedback (MAC, broadcast, relay, etc.). Essentially, it argues that block-coding reliability functions in such a context are uninteresting, since the average nature of the power constraint lets us boost them as large as we would like even with fixed block-lengths. It does resolve a longstanding question left unanswered in Ozarow's classical paper: namely, what the reliability function is for the Gaussian MAC with feedback at points other than the symmetric rate point.
Takes a problem introduced by Jack Wolf, and gives it a secrecy dimension. Uses the idea of information-theoretic security (leaking information) to explore in what sense "binning" is all that we can do in network source coding scenarios. Has a partial result showing that any entropically-efficient scheme for broadcast source coding must be "like binning" in that it must leak significant information to eavesdroppers, even if the source code has noisefree access to the recipients' side information.
"A bit is a bit is a bit" seems to have become the motto of this digital age. Though we often take it for granted, the existence of such a common currency for information is far from obvious. While in practice it has often been justified by the rapid advancement of digital electronics and the rise of a communications infrastructure to match, its meaningfulness rests upon the fundamental theorems of information theory --- the philosophical foundation that helps us gain insight into problems of communication. This remarkable convergence of digital electronics, radio, photonics, computation, and theory certainly must be counted among the great accomplishments of the latter half of the 20th century.
Broadly speaking, there are two sides to information theory. The better known side deals with the transmission of bits over channels. The noisy channel coding theorems establish the fundamental limitations for reliable transmission of bits over noisy channels. But there is another side to information theory. It relates to the encoding of sources with respect to some criterion of fidelity. The rate distortion theorems establish fundamental tradeoffs between fidelity and the length of the encoding in bits. These two sides of information theory are joined together by the information transmission theorems relating to source/channel separation. Philosophically speaking, these theorems establish that the notion of "reliable transmission" used for discussing channels is compatible with the notion of "bits" used for encoding sources.
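In symbols (these are the standard statements, included for concreteness): for a discrete memoryless channel and a memoryless source with distortion measure $d$,

```latex
% Channel capacity (noisy channel coding theorem):
C = \max_{p(x)} I(X;Y)
% Rate-distortion function:
R(D) = \min_{p(\hat{x}\mid x)\,:\,\mathbb{E}[d(X,\hat{X})] \le D} I(X;\hat{X})
% Information transmission (separation): distortion D is achievable over the
% channel essentially if and only if  R(D) < C  (per source symbol per channel use).
```

The separation statement is what licenses treating "bits" as a common currency: any source whose rate-distortion requirement fits under the channel's capacity can be communicated through a bit-pipe interface.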
However, there are still many issues in modern communication systems for which information-theoretic understanding eludes us. Networking in particular has a whole host of them, even though from a communication-centric viewpoint the classical network layers are mathematically justified by the classical theorems of information theory that inspire us to conceptually separate the reliable transport of bits from their use by any given application. Yet Ephremides and Hajek insightfully chose to entitle their survey article "Information Theory and Communication Networks: An Unconsummated Union!" and they comment:
The interaction of source coding with network-induced delay cuts across the classical network layers and has to be better understood. The interplay between the distortion of the source output and the delay distortion induced on the queue that this source output feeds into may hold the secret of a deeper connection between information theory and networks. Again, feedback and delay considerations are important.
These networking issues are often discussed in the context of "information streams," where the data source continues to generate information forever rather than producing a single isolated message. The phrase that often arises is "Quality of Service" or QoS. We all have an intuitive sense for what this means, but in this area our intuition reaches beyond our mathematical understanding. Shannon's great success can be thought of as the establishment of bit-rate as a fundamental (one is tempted to say the fundamental) QoS parameter for communication. But in modern networks, people have noticed that streams emerging from different real-world sources seem to behave differently. What is appropriate for a classic file transfer may not be suitable for real-time multimedia signals.
Our quest then is to extend Shannon's program to go beyond bitrate. To do so, we consciously follow in Shannon's footsteps. We need simple examples on which to build our theory. The fact that few real people wish to communicate the results of coin tosses to each other does not diminish the usefulness of Bernoulli processes in building our information theoretic insights. Multimedia signals, while practically very important, are also quite complex to work with since at the moment the meaningful concepts of "fidelity" are primarily defined implicitly with reference to human preferences. In this research program, we have decided to begin instead with simple examples taken from control and dynamical systems.
In control contexts, the simplest objective is to keep the modeled system stable. To do this, communicating the values of the underlying unstable process from one place to another is an important issue, since the location where control actions are applied can be physically distinct from the location where observations are made. In practice, people are already using quantizers and digital implementations to control unstable physical systems, and there has been some isolated theoretical work on trying to extend source coding theory to unstable processes. Recently, the work of Sekhar Tatikonda in his doctoral thesis has taken us significantly forward in understanding the information theoretic limitations of control systems. Even so, a major gap has remained. Until our work, there were no information transmission theorems to tie the source coding of unstable processes to communication over noisy channels.
In fact, the entire class of unstable sources has eluded complete understanding in an information theoretic sense. By these we mean the non-stationary processes that often arise in system-theoretic modeling contexts. Although they have proven themselves to be valuable parts of the system modeler's toolbox, the fundamental difficulty with unstable processes is that they tend to grow with time and have asymptotically unbounded variance. Because of this, their "streaming" character through time cannot be ignored. We believe that extending information theory to address unstable sources might be the conceptual key to understanding the deeper issue of "streaming" in general, and hence provide the platform we need on which to build the next level of fundamental understanding of QoS.
In my own doctoral research with Sanjoy Mitter, I took the first steps in this program by starting with a very simple example of a random walk on the infinite line with known initial condition. I reviewed how, even with a noiseless channel, a block source code cannot be used to track this unstable process. I then illustrated how any attempt to cascade a source encoder that works over a noiseless link with a conventional channel encoder is doomed to failure, because the Shannon sense of reliable transmission is not sufficient for our task. This showed that in the context of unstable processes, a "bit" is not necessarily a "bit," and I made this idea much more precise by introducing a delay-dependent notion of intrinsic meaning for streams of bits that shows how bit-streams can be fundamentally different from each other even if they have the same bit-rate.
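The cascade failure is easy to see in a simulation. The sketch below is my own toy illustration, not the construction from the thesis: it tracks a +/-1 random walk by sending its increment bits through a channel code that is "reliable" in the Shannon sense, meaning each decoded bit is still wrong with some small fixed probability. Every bit error flips one reconstructed increment, and the resulting offset persists in the running sum forever, so the tracking error drifts without bound.

```python
import random

def track(T, bit_error_p, seed=0):
    """Track a +/-1 random walk from its increment bits, where each bit
    survives decoding with probability 1 - bit_error_p (a stand-in for
    Shannon-reliable but imperfect transmission). Returns the worst
    tracking error |x - xhat| seen over T steps."""
    rng = random.Random(seed)
    x = xhat = 0
    worst = 0
    for _ in range(T):
        step = rng.choice([-1, 1])
        x += step
        # a decoding error flips the reconstructed increment; the
        # resulting offset of 2 stays in the running sum forever
        xhat += step if rng.random() >= bit_error_p else -step
        worst = max(worst, abs(x - xhat))
    return worst

print(track(50000, 0.0))   # noiseless decoding: perfect tracking, error 0
print(track(50000, 0.01))  # small fixed bit-error rate: the error drifts
```

The point is architectural: no fixed per-bit error probability, however small, suffices, because the errors accumulate in the unstable state.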
Next, I introduced a new parametric notion of reliable transmission and its associated channel capacity, which we call anytime capacity, and suggested how anytime decoders can be used to avoid this problem for our simple random walk. Unlike classical decoders, an anytime decoder will eventually correctly recover the values of all the bits originally sent. I then used a random coding argument and a connection with error exponents to show that the anytime capacities of the binary erasure channel and the power-constrained AWGN channel are non-zero even without any feedback.
Finally in my thesis, I showed that the random walk can be tracked using an anytime channel code to transport the bit-stream, and gave a partial information transmission theorem for unstable scalar Markov processes. This partial theorem establishes that unstable processes are fundamentally different from memoryless or stationary processes, but does not fully characterize the unstable case. The traditional source-channel separation theorems imply that the bit-streams used to encode all memoryless or stationary processes are qualitatively alike as long as we are willing to tolerate delays: all they need is a channel with sufficient capacity. But even if we are willing to live with delay, unstable processes require more from a channel than mere Shannon capacity.
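A back-of-the-envelope version of the argument (my own paraphrase; the symbols here are mine) shows why Shannon capacity alone cannot suffice. For the scalar process \(X_{t+1} = a X_t + W_t\) with \(a > 1\), a bit decoded wrongly \(d\) steps in the past has had its effect amplified by the dynamics to an estimation error of order \(a^d\), so for the \(\eta\)-th moment of the tracking error,

```latex
\mathbb{E}\,\bigl|X_t - \hat{X}_t\bigr|^{\eta}
  \;\gtrsim\; \sum_{d \ge 0} a^{\eta d}\,
  \Pr\{\text{earliest undetected bit error is at delay } d\}.
```

The sum stays bounded only if the probability of error on a bit decays exponentially with the time elapsed since it was sent, roughly like \(2^{-\alpha d}\) with \(\alpha > \eta \log_2 a\). This is an anytime-reliability requirement: Shannon reliability drives the error probability to zero only as the block length grows, and provides no such exponential guarantee at every delay.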
Special thanks to our past and present research sponsors: