This book is about the newest service to arrive on the Internet: interactive voice and video. IP telephony technology has evolved amazingly fast since it first appeared on the market in 1995. But this evolution alone wouldn't account for VoIP's growing success if there weren't a general recognition of the importance of the Internet in our day-to-day activities.
In 1991, the Internet was still a maze without a map where only addicts really felt at home. As students we used to print and exchange our bookmarks - as soon as someone had found a really good ftp site with anonymous access, he would copy and distribute the listing. Any good book on the Internet inevitably ended with hundreds of pages of such bookmarks. The Internet was a nice tool, especially email and chat, but really you had to enjoy being stuck in front of your black and white computer screen for hours.
Then, in 1993, with the World Wide Web, came the first revolution. The Internet became colorful and our piles of paper bookmarks were instantly obsolete, replaced by links on web pages. Soon there were too many links for one person to keep track of. So two US students came up with the idea of keeping bookmarks up to date for other people and created Yahoo!. It was a good thing indeed. Thus, with all the bookmark sites and search engines, the Internet became less intimidating for people who were not UNIX gurus. It became feasible for a literature student to get an Internet account.
By 1996 every student had an email address and most of them actually used it. But for most people, the Internet was still a toy. It was too complex to reach the mass market, and would never really count outside universities and research labs. In one sentence: 'the Internet will really count when your grandmother can use it.'
It was around that time that we saw the first attempts to build an Internet telephony gateway. The first prototype was a strange hybrid. The telephone interface was a modem with speakerphone capabilities. The problem with this speakerphone modem was that there was no way to simultaneously play a sound and record from a PC. Actually you could use the modem only to dial the destination number. Some sound boards had a driver that made it possible to simultaneously play and record ('full-duplex'), but no telephone interface, so you had to wire the sound-board line-in jack to the modem microphone, and the modem speaker to the sound-board line-out jack. Of course, some software was needed to turn this into an Internet telephony gateway, but in 1996 there was already good Internet telephony freeware available, such as VAT, and adding some code to interface with the modem wasn't too difficult: when an incoming call arrives, pick up the line, play a welcome prompt, get the destination number with DTMF touchtones (the modem had that capability), relay the information to the destination gateway that would use the modem to dial the right number, and spawn the Internet telephony software.
This was only a very crude, one-line gateway, but its potential was immense. The telephone network is the prime example of a technology that really counts. There are over 800 million telephone lines in the world, and a five-year-old can use a telephone. Obviously, being able to carry even a small fraction of all telephone conversations over the Internet would have been a major achievement. Many research labs suddenly realized that it could be a second Internet revolution and tried to evaluate whether it was possible to build a more sophisticated gateway.
This was only three years ago, but already today more and more companies are running their operations without the use of a single 'blackphone' of a PBX. And tomorrow, our grandmother may still use a regular analogue telephone, but the likelihood of this phone being connected to a Softswitch (IP Central Office) will be very high. By now, there is no doubt that the full potential of the Internet will eventually be reached. With its first revolution, the World Wide Web, the Internet got a face; with the IP telephony revolution, the Internet has got a voice!
At first glance the technology behind IP telephony may seem almost trivial. It isn't. In particular it is much more complex than one-way media streaming (used by TV or radio broadcasting on the net) because the latency between the talker and the listener must remain very low, while streaming applications can use very large buffers.
Here is a short list of some topics one needs to explore before understanding the subtleties of IP telephony:
the characteristics of the human ear, especially its perception of echo and delay;
the voice compression and packetization technologies;
silence suppression and comfort noise generation;
echo cancellation technologies;
the shortcomings of the Internet protocol regarding real-time applications: delay, jitter, packet loss;
the strategies to overcome these limitations: buffering, redundancy, timestamps, differentiated services;
the characteristics of packetized voice traffic, and how it cohabits with non-real-time data flows;
in-band data transmission in the telephone network (DTMF, fax, modem);
telephone signaling protocols (ISDN Q.931, SS7 ISUP, etc.), and all connection sequences (regular calls, calls to intelligent services, calls while the network is overloaded);
IP telephony protocols (ITU H.323 and associated protocols, MGCP, IETF MEGACO and SIP);
politics, to better sort out technology-driven controversies and business-driven controversies.
Very soon in our exploration of VoIP, it became apparent that this was an area that brought together many different types of expertise in the networking, speech and telephony areas. In order to build a good VoIP team, it would be necessary for each expert to update the others on the most essential aspects of his or her domain that are relevant for VoIP.
You will find this book useful if you face issues such as:
which gateway should I choose for my corporate converged network? What standard?
can I replace my backbone trunks with VoIP links transparently?
can I get rid of all the telephone wiring?
can I prevent VoIP from overloading my network? How does it get through firewalls?
will my 2 Mbit/s IP access be sufficient for my brand new 100-operator VoIP call center?
can I use VoIP on a network with dynamic DHCP addressing?
how do I fax through the net?
will I be able to use VoIP for multipoint conferences or broadcasting?
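The call-center question above lends itself to a rough back-of-the-envelope check. The sketch below is illustrative only: it assumes 20 ms packetization, 40 bytes of uncompressed IPv4/UDP/RTP headers per packet, every operator on a call at once, and no silence suppression, and it ignores link-layer framing and signaling traffic. The function name and the two example coders (G.711 at 64 kbit/s and G.729 at 8 kbit/s) are choices made for this example, not a recommendation.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no header compression


def voip_bandwidth_kbps(codec_kbps, calls, frame_ms=20):
    """Approximate one-way IP bandwidth (kbit/s) for `calls` simultaneous streams."""
    packets_per_second = 1000 / frame_ms
    # Bytes of coded speech carried in each packet.
    payload_bytes = codec_kbps * 1000 / 8 * frame_ms / 1000
    packet_bits = (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8
    return calls * packet_bits * packets_per_second / 1000


for name, rate in (("G.711", 64), ("G.729", 8)):
    need = voip_bandwidth_kbps(rate, calls=100)
    verdict = "fits" if need <= 2000 else "does not fit"
    print(f"{name}: {need:.0f} kbit/s per direction, {verdict} in 2 Mbit/s")
```

Under these assumptions, 100 simultaneous G.729 calls need about 2.4 Mbit/s in each direction, so a 2 Mbit/s access would be too small unless silence suppression, header compression or a lower number of simultaneous calls reduces the load.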
On the other hand, you will be disappointed if you look for a list of vendors and a discussion of each product, or simply a quick overview of the technology.
This book assumes a working knowledge of IP and ISDN networking. This doesn't mean that our reader should be familiar with all socket options or BGP4 routing, but we do not explain the basics of IP routing such as the significance of an IP address or TCP port. If you feel you need some clarification on the Internet protocol, we would recommend the books written by Christian Huitema. Similarly we assume the reader knows what Setup, Alerting or Connect ISDN messages mean, although we explain their role in the context of H.323.
Other than that, we have tried to avoid an excessive use of pointers to external documentation in the text itself. We know it can be extremely time consuming to download and read full RFCs or ITU recommendations when you need to check only a detail, and therefore we have inserted small digests wherever they are useful. For instance, you will find descriptions of popular voice and video coders in the H.323 section. We have also provided a large list of definitions for the acronyms used in the book, as it seems that each new telecom application has to create its own vocabulary. Of course, the relevant pointers are gathered in the reference sections.
Relation to standards
In just three years IP telephony has evolved from one-port, do-it-yourself gateways to backbone hardware supporting 120 ports per PCI slot. And this is just the tip of the iceberg: standards are evolving twice as fast. It was even harder for us because we decided not to choose between the emerging standards and to describe H.323, SIP and MGCP. People are often confused and ask us: Who will be the winner? But in fact only H.323 and SIP can be seen as direct competitors: both aim at being implemented by 'intelligent' multimedia endpoints, and the common opinion that H.323 is 'much more complex' than SIP relates mainly to the fact that H.323 uses a binary ASN.1 encoding, which is not a serious obstacle for any programmer. MGCP is more an effort to create a light stimulus-based protocol for gateways and appliances and can be used in conjunction with both H.323 and SIP. The SIGTRAN group efforts are not described here, as they aim at transparently transporting SS7 signaling across an IP network, which means the Internet is used as a simple trunk. To us it seemed SIGTRAN would rather belong in a book on PSTN technology.
We have tried to keep the manuscript current, but there are unavoidable delays in the publication process, and by the time you read these lines we know that some standards will have evolved. In particular we have based some of the chapters on material gathered from IETF Internet drafts, which is explicitly discouraged by IETF because these documents are not stable. In practice though, people rarely rewrite drafts entirely, and we hope that the background material gathered in the book will be sufficient to allow you to catch the train and join the community of engineers who improve VoIP technology day after day.
We have provided pointers to most of the relevant standards, drafts, mailing lists and web sites. Nothing can really replace daily participation in the standardization process: for each line written in a standard document, there are perhaps 100 lines worth of emails, discussions and drafts that really put this single line into perspective. Sometimes a controversy arises and gives birth to one of those 'everybody's right' sentences (such as the paragraph regarding G.723.1 and G.729 in H.323) that nobody can understand without remembering the previous discussions. We have tried to take this 'behind the scenes' data into account in this book, but we encourage you to form your own opinion using the mailing lists.
The future of multimedia over IP
The speed at which IP telephony has become an industry is truly amazing, but we are far from having reached maturity.
There are still a lot of technical shortcomings in today's products. In general, products are about two years behind the standards. For instance, at the time of writing there was just one gateway and one IP phone software package that supported mid-call call redirection, although it was already defined in H.323v1. This is one of the most basic and most widely used services in today's PSTN. Everybody boasts 'added value services' and still, amazingly, nobody really has the basics right yet.
Another striking example is the lack of a standardized URL format to trigger an H.323 IP phone call. The very big software manufacturers have resisted any attempt to do so. Because of this you have to put a different button for each flavor of H.323 phone out there.
There are also some economic issues to resolve. Historically, almost all Internet sites were in the US. Consequently the ISPs in the rest of the world had to pay 100 per cent of the cost of the leased data lines to the US. The situation in the phone network is different: each carrier pays 50 per cent of the leased line. The Internet arrangement is unfair and cannot last much longer: the web traffic imbalance is more in the 30/70 range, and of course interactive voice and video is symmetrical. Interestingly, voice over IP seems to develop equally fast in the US, Europe and Asia, which should help in the promotion of a truly international Internet.
Despite these minor issues that will quickly be resolved, the future of IP telephony and video seems bright. Soon the SDH and SONET transport networks will carry more data than voice (this is already the case on the BT network), and at this point using packetized voice is an obvious choice. Some will still argue that packetized voice doesn't mean IP. Why wouldn't we use Frame Relay or ATM? There are two simple reasons for this. First, soon 99.9 per cent of the data generated by individuals and corporations will be IP, and introducing an ATM or Frame Relay layer to the end customer just for voice makes no sense. Second, ATM and Frame Relay don't scale in terms of connectivity. Switched Virtual Circuits are too slow for today's sporadic data exchanges, where everybody pings everybody, and Permanent Virtual Circuits, although fine for backbones or intranets when there are only a few dozen nodes to connect, make no sense on an open network.
It is true that IP has a latency problem on low-speed links, but in tomorrow's networks with xDSL connections this issue will be resolved, and at this stage we will probably realize that large packets use bandwidth much more efficiently for video. If we project ourselves a few years forward, with these xDSL lines connected to multi-gigabit backbones, hardware IP phones with state-of-the-art echo cancellers, and a new wideband coder standardized by ITU, we will have better sound quality than today's ISDN network, and of course video. This is not so far away. Already in Canada some fortunate people have 2 Mbit/s lines in their homes for less than $100 per month.
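One contributor to the latency problem on low-speed links is serialization delay: a voice packet queued behind a large data packet has to wait until that packet has been clocked onto the line. The short sketch below illustrates the effect; the 1500-byte packet size and the two link speeds are example values chosen for illustration, and the function name is ours.

```python
def serialization_delay_ms(packet_bytes, link_kbps):
    """Time in milliseconds to clock one packet onto a link of the given speed."""
    # bits divided by kbit/s gives milliseconds
    return packet_bytes * 8 / link_kbps


for link_kbps in (64, 2000):
    delay = serialization_delay_ms(1500, link_kbps)
    print(f"1500-byte packet on {link_kbps} kbit/s: {delay:.1f} ms")
```

With these example values the wait drops from 187.5 ms on a 64 kbit/s link to 6 ms on a 2 Mbit/s link, which is why faster access lines make large packets much less of a concern for interactive voice.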
And we may have even greater surprises. Tomorrow's mobile telephony networks, such as UMTS, will basically be wireless data transmission units able to send and receive megabytes per second. Why invent protocols for the multimedia applications running on those phones when it is already obvious that they will have to talk IP? Maybe our next mobile phone will be an IP phone.
We need to thank many people for their contributions, support and help, without which this book would not have reached its goals, especially those who regularly attend IETF, ETSI TIPHON and ITU SG 16 meetings and contribute to numerous IP telephony-related mailing lists.
We are particularly indebted to those like Scott Petrack, Christian Huitema, Dave Oran, Louise Spergel, Dale Skran, Gur Kimchi, Jonathan Rosenberg, Henning Schulzrinne, Jim Toga, Max Morris, Mike Buckley and Jeff Pulver who through their valuable contributions participated in setting up the Internet Telephony revolution.
At CNET, our thanks go to Michel Dudet, Gerard, Sylvie, Catherine, Marcel, Michel, Bernard, Bertrand, Cyril, Pierrick, Jean Jacques, Soleiman, Christophe, Sebastien and Frank.
We would also like to thank the folks at VocalTec, Elad Sion, Eran Barak, Lior Moscovici, Alon Cohen, Doron Zinger and Bayard Gardineer, for their support and comments.