Lesson 1: TCP/IP Protocols

The TCP/IP protocols were developed in the 1970s specifically for use on a packet-switching network built by the United States Department of Defense. That network was then known as the ARPANET and has since grown into the Internet. The TCP/IP protocols have also been associated with the UNIX operating systems since early in their history. Thus, these protocols predate the personal computer, the OSI reference model, the Ethernet protocol, and most other elements that today are considered the foundations of computer networking. Unlike other protocols that perform some of the same functions, such as Novell's Internetwork Packet Exchange (IPX), TCP/IP was never the product of a single company, but rather has been a collaborative effort from the very beginning.

After this lesson, you will be able to

List the layers of the TCP/IP protocol stack and locate the TCP/IP protocols in the OSI reference model
Understand the function of the Address Resolution Protocol (ARP)
Describe the various functions of the Internet Control Message Protocol (ICMP)
Describe the properties of TCP/IP's various application layer protocols

Estimated lesson time: 45 minutes

In addition to not being restrained in any way by copyrights, trademarks, or other publishing restrictions, the non-proprietary nature of the TCP/IP standards also means that the protocols are not limited to any particular computing platform, operating system, or hardware implementation. This platform independence was the chief guiding principle of the TCP/IP development effort, and many of the protocol features are designed to make it possible for any computer with networking capabilities to communicate with any other networked computer using TCP/IP.

The TCP/IP standards are published in documents called Requests for Comments (RFCs) by the Internet Engineering Task Force (IETF). The list of RFCs contains documents that define protocol standards in various stages of development, but also contains informational, experimental, and historical documents that range from the fascinating to the downright silly. These documents are in the public domain and are accessible from many Internet Web and FTP sites. For links to the standards, see the IETF home page at http://www.ietf.org.

NOTE

Once a document is published by the IETF as an RFC and assigned a number, that document never changes. If the IETF publishes a revised version of an RFC at a later time, it assigns the document a new number. The RFC-INDEX file, which contains the complete listing of the published documents, contains cross-references that indicate when RFCs make other documents obsolete or when they have been made obsolete by other documents.

TCP/IP Layers

The TCP/IP protocols were developed long before the OSI reference model was, but they operate using layers in much the same way. Splitting the networking functionality of a computer into a stack of separate protocols, rather than creating a single monolithic protocol, has several advantages, including the following:

Platform independence Separate protocols make it easier to support a variety of computing platforms. Creating or modifying protocols to support new physical layer standards or networking application programming interfaces (APIs) doesn't require modification of the entire protocol stack.
Quality of service Having multiple protocols operating at the same layer makes it possible for applications to select the protocol that provides only the level of service they require.
Simultaneous development Because the stack is split into layers, the development of the various protocols can proceed simultaneously, using personnel who are uniquely qualified in the operations of the particular layers.

NOTE

For more information on the OSI model and the functions of its various layers, see Lesson 2: The OSI Reference Model, in Chapter 1, "Networking Basics."

TCP/IP has its own four-layer networking model, which is defined in RFC 1122, "Requirements for Internet Hosts—Communication Layers." The layers are roughly analogous to the OSI model, as shown in Figure 8.1.

Figure 8.1 The seven-layer OSI reference model versus the four TCP/IP protocol layers

The four TCP/IP layers are as follows:

Link The TCP/IP protocol suite includes rudimentary link layer protocols, such as the Serial Line Internet Protocol (SLIP) and the Point-to-Point Protocol (PPP). However, TCP/IP does not include physical layer specifications or complex local area network (LAN) protocols such as Ethernet and Token Ring. Therefore, while TCP/IP does maintain a layer that is comparable to the OSI data-link layer, in most cases the protocol operating at that layer is not part of the TCP/IP suite. TCP/IP does, however, include the Address Resolution Protocol (ARP), which can be said to function at least partially at the link layer, because it provides services to the internet layer above it.
Internet The internet layer is exactly equivalent to the network layer of the OSI model. IP is the primary protocol operating at this layer, and provides data encapsulation, routing, addressing, and fragmentation services to the protocols at the transport layer above it. Two additional protocols, called the Internet Control Message Protocol (ICMP) and the Internet Group Management Protocol (IGMP), also operate at this layer.
Remember, the word internet, in this instance, is a generic reference to an internetwork, not to the Internet. Be careful not to confuse the two.
Transport The transport layer is equivalent to the layer of the same name in the OSI model. The TCP/IP suite includes two protocols at this layer, the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP), which provide connection-oriented and connectionless data transfer services, respectively.
Application The TCP/IP protocols at the application layer can take several different forms. Some protocols, such as the File Transfer Protocol (FTP), are applications in themselves, while others, such as the Hypertext Transfer Protocol (HTTP), provide services to applications.

The following sections examine some of the protocols that operate at the various layers of the TCP/IP protocol stack.

SLIP and PPP

SLIP and PPP are link layer protocols that systems use for wide area connections using telephone lines and other types of physical connections. SLIP is defined in RFC 1055, "A Nonstandard for Transmission of IP Datagrams over Serial Lines." PPP is more complex than SLIP, and uses additional protocols to establish a connection between two systems. These protocols are defined in separate documents, including RFC 1661, "The Point-to-Point Protocol," and RFC 1662, "PPP in HDLC-like Framing." For more information about SLIP and PPP, see Lesson 3: SLIP and PPP, in Chapter 5, "Data-Link Layer Protocols."

ARP

ARP, as defined in RFC 826, "Ethernet Address Resolution Protocol," occupies an unusual place in the TCP/IP suite. ARP provides a service to IP, which seems to place it in the link layer (or the data-link layer of the OSI model). However, its messages are carried directly by data-link layer protocols and are not encapsulated within IP datagrams, which is a good reason for calling it an internet (or network) layer protocol. Whichever layer you assign it to, ARP provides an essential service when TCP/IP is running on a LAN.

The TCP/IP protocols rely on IP addresses to identify networks and hosts, but when the computers are connected to an Ethernet or Token Ring LAN, they must eventually transmit the IP datagrams using the destination system's hardware address. ARP provides the interface between the IP addressing system used by IP and the hardware addresses used by the data-link layer protocols.

When IP constructs a datagram, it knows the IP address of the system that is the packet's ultimate destination. That address may identify a computer connected to the local network or a system on another network. In either case, IP must determine the hardware address of the system on the local network that will receive the datagram next. To do this, IP generates an ARP message and broadcasts it over the LAN. The format of the ARP message is shown in Figure 8.2.

Figure 8.2 The ARP message format

The functions of the ARP message fields are as follows:

Hardware Type (2 bytes) This field identifies the type of hardware addresses in the Sender Hardware Address and Target Hardware Address fields. For Ethernet networks, the value is 1; IEEE 802 networks, including Token Ring, use the value 6.
Protocol Type (2 bytes) This field identifies the type of addresses in the Sender Protocol Address and Target Protocol Address fields. The hexadecimal value for IP addresses is 0800 (the same as the Ethertype code for IP).
Hardware Size (1 byte) This field specifies the size of the addresses in the Sender Hardware Address and Target Hardware Address fields, in bytes. For Ethernet and Token Ring networks, the value is 6.
Protocol Size (1 byte) This field specifies the size of the addresses in the Sender Protocol Address and Target Protocol Address fields, in bytes. For IP addresses, the value is 4.
Opcode (2 bytes) This field specifies the function of the packet: ARP Request, ARP Reply, RARP Request, or RARP Reply.
Sender Hardware Address (6 bytes) This field contains the hardware address of the system generating the message.
Sender Protocol Address (4 bytes) This field contains the IP address of the system generating the message.
Target Hardware Address (6 bytes) This field contains the hardware address of the system for which the message is destined. In ARP Request messages, this field is left blank.
Target Protocol Address (4 bytes) This field contains the IP address of the system for which the message is intended.
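
The field list above can be sketched in Python with the standard struct module. This is only an illustration, assuming an Ethernet network carrying IPv4 addresses; the function names and the sample addresses used with them are hypothetical, and the numeric opcode values (1 through 4) come from the ARP and RARP specifications rather than from the field list itself.

```python
import socket
import struct

# Assumed layout for an Ethernet/IPv4 ARP message; sizes follow the field list.
# "!" selects network byte order with no padding, giving 28 bytes in total.
ARP_LAYOUT = "!HHBBH6s4s6s4s"

# Opcode numbers as assigned in RFC 826 (ARP) and RFC 903 (RARP).
OPCODES = {1: "ARP Request", 2: "ARP Reply", 3: "RARP Request", 4: "RARP Reply"}


def format_mac(raw: bytes) -> str:
    """Render a 6-byte hardware address as colon-separated hexadecimal."""
    return ":".join(f"{byte:02x}" for byte in raw)


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Pack an ARP Request; the Target Hardware Address is left as zeros."""
    return struct.pack(
        ARP_LAYOUT,
        1,                            # Hardware Type: Ethernet
        0x0800,                       # Protocol Type: IP
        6,                            # Hardware Size, in bytes
        4,                            # Protocol Size, in bytes
        1,                            # Opcode: ARP Request
        sender_mac,                   # Sender Hardware Address
        socket.inet_aton(sender_ip),  # Sender Protocol Address
        b"\x00" * 6,                  # Target Hardware Address (blank)
        socket.inet_aton(target_ip),  # Target Protocol Address
    )


def parse_arp(message: bytes) -> dict:
    """Unpack the first 28 bytes of an ARP message into named fields."""
    size = struct.calcsize(ARP_LAYOUT)
    htype, ptype, hsize, psize, opcode, sha, spa, tha, tpa = struct.unpack(
        ARP_LAYOUT, message[:size]
    )
    return {
        "hardware_type": htype,
        "protocol_type": ptype,
        "hardware_size": hsize,
        "protocol_size": psize,
        "opcode": OPCODES.get(opcode, "unknown"),
        "sender_hardware_address": format_mac(sha),
        "sender_protocol_address": socket.inet_ntoa(spa),
        "target_hardware_address": format_mac(tha),
        "target_protocol_address": socket.inet_ntoa(tpa),
    }
```

Passing the output of build_arp_request to parse_arp returns the same values by field name, which is a quick way to confirm that the byte counts in the list add up to 28.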

NOTE

The Reverse Address Resolution Protocol (RARP) performs the opposite function of ARP. RARP was once used by diskless workstations because it enables a system to discover its IP address by transmitting its hardware address to a RARP server. RARP is a progenitor of the Bootstrap Protocol (BOOTP) and the Dynamic Host Configuration Protocol (DHCP), which are used to automatically configure TCP/IP clients. RARP itself is rarely used today.

The process by which IP uses ARP to discover the hardware address of the destination system is as follows:

IP packages transport layer information into a datagram, inserting the IP address of the destination system into the Destination IP Address field of the IP header.
IP compares the network identifier in the destination IP address to its own network identifier and determines whether to send the datagram directly to the destination host or to a router on the local network.
IP generates an ARP Request packet containing its own hardware address and IP address in the Sender Hardware Address and Sender Protocol Address fields. The Target Protocol Address field contains the IP address of the datagram's next destination (host or router), as determined in Step 2. The Target Hardware Address field is left blank.
The system passes the ARP Request message down to the data-link layer protocol, which encapsulates it in a frame and transmits it as a broadcast to the entire local network.
Each system on the LAN receives the ARP Request message and reads the contents of the Target Protocol Address field. If the Target Protocol Address value does not match the system's own IP address, the system silently discards the message and takes no further action.
If the system receiving the ARP Request message recognizes its own IP address in the Target Protocol Address field, it generates an ARP Reply message. The system copies the two sender address values from the ARP Request message into the respective target address values in the ARP Reply and copies the Target Protocol Address value from the request into the Sender Protocol Address field in the reply. The system then inserts its own hardware address into the Sender Hardware Address field.
The system transmits the ARP Reply message as a unicast message back to the computer that generated the request, using the hardware address in the Target Hardware Address field.
The system that originally generated the ARP Request message receives the ARP Reply and uses the newly supplied value in the Sender Hardware Address field to encapsulate the datagram in a data-link layer frame and transmit it to the desired destination as a unicast message.
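
The decision in Step 2 and the request and reply handling in Steps 3 through 6 can be sketched as follows. This is a simplified model rather than a real protocol implementation: the ArpMessage class, the function names, and the idea of a single default gateway are assumptions made for illustration, and addresses are kept as strings instead of raw bytes.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

ARP_REQUEST = 1
ARP_REPLY = 2
BLANK_HW = "00:00:00:00:00:00"


@dataclass
class ArpMessage:
    """Hypothetical in-memory form of the address fields in an ARP message."""
    opcode: int
    sender_hw: str
    sender_ip: str
    target_hw: str
    target_ip: str


def next_hop(local_interface: str, destination: str, default_gateway: str) -> str:
    """Step 2: send directly if the destination is on the local network,
    otherwise send to a router (assumed here to be one default gateway)."""
    network = ipaddress.ip_interface(local_interface).network
    if ipaddress.ip_address(destination) in network:
        return destination
    return default_gateway


def make_request(my_hw: str, my_ip: str, target_ip: str) -> ArpMessage:
    """Step 3: fill in the sender fields and leave the target hardware blank."""
    return ArpMessage(ARP_REQUEST, my_hw, my_ip, BLANK_HW, target_ip)


def handle_request(request: ArpMessage, my_hw: str, my_ip: str) -> Optional[ArpMessage]:
    """Steps 5 and 6: discard requests for other systems, otherwise reply."""
    if request.opcode != ARP_REQUEST or request.target_ip != my_ip:
        return None  # silently discarded
    return ArpMessage(
        ARP_REPLY,
        sender_hw=my_hw,                # the answer the requester is waiting for
        sender_ip=request.target_ip,    # copied from the request
        target_hw=request.sender_hw,    # requester's addresses become the targets
        target_ip=request.sender_ip,
    )
```

For example, a host configured as 192.168.1.10/24 would resolve 192.168.1.20 directly, but would resolve its gateway's address when the destination is 10.0.0.5.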

The ARP specification requires TCP/IP systems to maintain a cache of hardware addresses that the system has recently discovered using the protocol. This prevents systems from flooding the network with separate ARP Request broadcasts for each datagram transmitted. When a system transmits a file in multiple TCP segments, for example, only one ARP transaction is usually required, because IP checks the ARP cache for a hardware address before generating a new ARP request. The interval during which unused ARP information remains in the cache is left up to the individual implementation, but it is usually relatively short, so as to prevent the system from using outdated address information.
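
A cache of this kind might look something like the following sketch. The class name, the 120-second lifetime, and the injectable clock are all assumptions chosen for illustration; actual implementations choose their own intervals and data structures.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ArpCache:
    """Minimal ARP cache sketch: entries expire after a fixed lifetime."""

    def __init__(self, lifetime_seconds: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # The lifetime is an assumed value; the interval is implementation-specific.
        self._lifetime = lifetime_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def add(self, ip_address: str, hardware_address: str) -> None:
        """Record a newly discovered address along with its expiry time."""
        expires_at = self._clock() + self._lifetime
        self._entries[ip_address] = (hardware_address, expires_at)

    def lookup(self, ip_address: str) -> Optional[str]:
        """Return the cached hardware address, or None if absent or expired.

        A None result is the point at which IP would generate a new
        ARP Request broadcast.
        """
        entry = self._entries.get(ip_address)
        if entry is None:
            return None
        hardware_address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[ip_address]  # drop outdated information
            return None
        return hardware_address
```

Because lookup is consulted before every transmission, a file sent in many TCP segments triggers only one broadcast as long as the entry has not expired.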

TIP

The TCP/IP protocol stack in the Windows-based operating systems includes a utility called Arp.exe, which you can use to manipulate the contents of the ARP cache. When you manually add a hardware address into the cache this way, it remains there permanently, which can help to reduce the broadcast traffic on your network. For more information about Arp.exe, see Lesson 2: TCP/IP Utilities, in Chapter 10, "TCP/IP Applications."

IP

IP is the protocol that is responsible for carrying the data generated by nearly all of the other TCP/IP protocols from the source system to its ultimate destination. For detailed information about IP and its functions, see Lesson 1: IP, in Chapter 6, "Network Layer Protocols."

ICMP

The Internet Control Message Protocol (ICMP), as defined in RFC 792, is another protocol that IP uses to perform network administration tasks. ICMP is considered to be an internet (or network) layer protocol, despite the fact that it carries no application data and its messages are carried within IP datagrams. Although it uses only one message format, ICMP performs many different functions, which are generally divided into errors and queries.

The ICMP message format is illustrated in Figure 8.3.

Figure 8.3 The ICMP message format

The functions of the ICMP message fields are as follows:

Type (1 byte) This field contains a code that specifies the basic function of the message.
Code (1 byte) This field contains a code that indicates the specific function of the message.
Checksum (2 bytes) This field contains a checksum computed on the entire ICMP message; it is used for error detection.
Data (variable) This field may contain information related to the specific function of the message.
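
As a rough sketch of how these four fields fit together, the following Python code packs an ICMP message and computes its checksum. The function names are hypothetical; the checksum shown is the 16-bit one's complement sum used throughout the TCP/IP suite, calculated with the Checksum field temporarily set to zero.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold any carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_icmp_message(msg_type: int, code: int, data: bytes = b"") -> bytes:
    """Pack Type, Code, and Checksum, followed by the Data field."""
    unchecked = struct.pack("!BBH", msg_type, code, 0) + data
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBH", msg_type, code, checksum) + data


def is_valid(message: bytes) -> bool:
    """A received message is intact if its checksum recomputes to zero."""
    return internet_checksum(message) == 0
```

A receiver can run is_valid over the whole message; any single corrupted byte changes the sum and causes the check to fail.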

ICMP Error Message Types

Reporting errors of various types is the primary function of ICMP. IP is a connectionless protocol, so there are no internet/network layer acknowledgments returned to the sending system, and even the transport layer acknowledgments returned by TCP are generated only by the destination end system. ICMP functions as a monitor of internet layer communications, enabling intermediate or end systems to return error messages to the sender. For example, when a router has a problem processing a datagram during the journey to its destination, it generates an ICMP message and transmits it back to the source system. The source system may then take action to alleviate the problem in response to the ICMP message. The Data field in an ICMP error message contains the entire 20-byte IP header of the datagram that caused the problem, plus the first 8 bytes of the datagram's own Data field. The following sections examine the various types of ICMP error messages.

Destination Unreachable Messages

When an intermediate or end system attempts to forward a datagram to a resource that is inaccessible, it can generate an ICMP Destination Unreachable message and transmit it back to the source system. Destination Unreachable messages all have a Type value of 3; the Code value specifies exactly what resource is unavailable, using values shown in Table 8.1. For example, when a router fails to transmit a datagram to the destination system on a local network, it returns a Host Unreachable message to the sender. If the router can't transmit the datagram to another router, it generates a Net Unreachable message. If the datagram reaches the destination system but the designated transport layer or application layer protocol is unavailable, the system returns a Protocol Unreachable or Port Unreachable message.

Table 8.1 ICMP Destination Unreachable error messages


Code	Description
0	Net Unreachable
1	Host Unreachable
2	Protocol Unreachable
3	Port Unreachable
4	Fragmentation Needed And Don't Fragment Was Set
5	Source Route Failed
6	Destination Network Unknown
7	Destination Host Unknown
8	Source Host Isolated
9	Communication With Destination Network Is Administratively Prohibited
10	Communication With Destination Host Is Administratively Prohibited
11	Destination Network Unreachable For Type Of Service
12	Destination Host Unreachable For Type Of Service
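
One way a program might interpret these values is with a simple lookup keyed on the Code field, as in this sketch. The function and constant names are hypothetical; the descriptions are copied from Table 8.1.

```python
DESTINATION_UNREACHABLE = 3  # Type value shared by every message in Table 8.1

UNREACHABLE_CODES = {
    0: "Net Unreachable",
    1: "Host Unreachable",
    2: "Protocol Unreachable",
    3: "Port Unreachable",
    4: "Fragmentation Needed And Don't Fragment Was Set",
    5: "Source Route Failed",
    6: "Destination Network Unknown",
    7: "Destination Host Unknown",
    8: "Source Host Isolated",
    9: "Communication With Destination Network Is Administratively Prohibited",
    10: "Communication With Destination Host Is Administratively Prohibited",
    11: "Destination Network Unreachable For Type Of Service",
    12: "Destination Host Unreachable For Type Of Service",
}


def describe_unreachable(message: bytes) -> str:
    """Read the Type and Code bytes of an ICMP message and describe the error."""
    msg_type, code = message[0], message[1]
    if msg_type != DESTINATION_UNREACHABLE:
        raise ValueError("not a Destination Unreachable message")
    return UNREACHABLE_CODES.get(code, f"Unknown code {code}")
```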

Source Quench Messages

Source Quench messages have a Type value of 4 and function as a rudimentary flow control mechanism for the internet layer. When a router's memory buffers are nearly full, it can send a Source Quench message to the source system, which instructs it to slow down its transmission rate. When the Source Quench messages cease, the sending system can gradually increase the rate again.

Redirect Messages

Routers generate ICMP Redirect messages (Type 5) to inform a host or another router that there is a more efficient route to a particular destination. Many internetworks have a matrix of routers that enables packets to take different paths to a single destination, as shown in Figure 8.4. If System 1 sends a packet to Router A in an attempt to get it to System 2, Router A forwards the packet to Router B, but it also transmits an ICMP Redirect message back to System 1, informing it that it can send packets destined for System 2 directly to Router B.

Figure 8.4 ICMP Redirect messages enable routers to inform other systems of more efficient routes.

The ICMP Redirect message's Data field contains the usual 28 bytes from the datagram in question (the 20-byte IP header plus eight bytes of IP data) plus an additional 4-byte Gateway Internet Address field, which contains the IP address of the router that the system should use from now on when transmitting datagrams to that particular destination. By altering its practices, the source system saves a hop on the packet's path through the internetwork and lessens the processing burden on Router A.

Time Exceeded Messages

When a TCP/IP system creates an IP datagram, it inserts a value in the IP header's Time To Live (TTL) field that each router processing the datagram reduces by one during the packet's journey through the internetwork. Should the TTL value reach zero during the journey, the last router to receive the packet discards it and transmits an ICMP Time Exceeded (Type 11, Code 0) message back to the sender, informing it that the packet has not reached its destination and telling it why. This is called a Time To Live Exceeded In Transit message.
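
The TTL countdown can be modeled in a few lines. This sketch is a simulation only, with no real packets involved; the function name, the router labels, and the shape of the returned dictionary are assumptions made for illustration.

```python
from typing import Dict, List

TIME_EXCEEDED = 11            # ICMP Type value
TTL_EXCEEDED_IN_TRANSIT = 0   # ICMP Code value


def send_datagram(initial_ttl: int, routers: List[str]) -> Dict[str, object]:
    """Walk a datagram through the routers on its path, in order.

    Each router reduces the TTL by one. A router that reduces it to zero
    discards the datagram and reports a Time Exceeded error to the sender.
    """
    ttl = initial_ttl
    for router in routers:
        ttl -= 1
        if ttl <= 0:
            return {
                "delivered": False,
                "reported_by": router,
                "icmp_type": TIME_EXCEEDED,
                "icmp_code": TTL_EXCEEDED_IN_TRANSIT,
            }
    return {"delivered": True, "remaining_ttl": ttl}
```

With three routers on the path, an initial TTL of 2 causes the second router to report the error, while an initial TTL of 4 lets the datagram arrive with one unit to spare.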

NOTE

The Time To Live Exceeded In Transit message is the basis for the functionality of the Traceroute program included in most TCP/IP implementations. For more information about Traceroute, see Lesson 2: TCP/IP Utilities, in Chapter 10, "TCP/IP Applications."

Another type of Time Exceeded message is used when a destination system is attempting to reassemble datagram fragments and one or more fragments fail to arrive in a timely manner. The system then generates a Fragment Reassembly Time Exceeded (Type 11, Code 1) message and sends it back to the source system.

ICMP Query Message Types

The other function of ICMP messages is to carry requests to another system for some type of information and also to return the replies containing that information. These are called ICMP query messages. ICMP query messages are not reactions to an outside process, as error messages are. However, external programs, such as the TCP/IP utility Ping, can generate query messages as part of their functionality.

Because query messages aren't generated in response to an external problem, their Data fields do not contain the IP header information from another datagram. Instead, the various types of query messages include more divergent information in the Data field, according to their functions. The following sections examine the most important query message types.

Echo Request and Echo Reply Messages

The Echo Request (Type 8, Code 0) and Echo Reply (Type 0, Code 0) messages form the basis for the TCP/IP Ping utility, and are essentially a means to test whether another system on the network is up and running. Both messages contain two-byte Identifier and two-byte Sequence Number subfields in the Data field, which are used to associate requests and replies, plus a certain amount of padding, as dictated by the Ping program. Ping functions by generating a series of Echo Request messages and transmitting them to a destination system specified by the user. The destination system, on receiving the messages, reverses the values of the Source IP Address and Destination IP Address fields, changes the Type value from 8 to 0, recalculates the checksum, and transmits the messages back to the sender. When Ping receives the Echo Reply messages, it assumes that the destination system is functioning properly.
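
The request and reply transformation described above can be sketched as follows. The code builds the ICMP portion of the messages only and does not send anything; the function names are hypothetical, and swapping the IP addresses is left to the IP layer, which is outside this sketch.

```python
import struct

ECHO_REQUEST = 8
ECHO_REPLY = 0


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def _assemble(msg_type: int, body: bytes) -> bytes:
    """Prefix the body with Type, Code 0, and a freshly computed checksum."""
    checksum = internet_checksum(struct.pack("!BBH", msg_type, 0, 0) + body)
    return struct.pack("!BBH", msg_type, 0, checksum) + body


def build_echo_request(identifier: int, sequence: int, padding: bytes = b"") -> bytes:
    """Create an Echo Request carrying Identifier, Sequence Number, and padding."""
    return _assemble(ECHO_REQUEST, struct.pack("!HH", identifier, sequence) + padding)


def build_echo_reply(request: bytes) -> bytes:
    """Turn a request into a reply: same Data field, Type changed from 8 to 0."""
    if request[0] != ECHO_REQUEST:
        raise ValueError("not an Echo Request message")
    return _assemble(ECHO_REPLY, request[4:])


def matches(request: bytes, reply: bytes) -> bool:
    """Associate a reply with its request by Identifier and Sequence Number."""
    return reply[0] == ECHO_REPLY and request[4:8] == reply[4:8]
```

A Ping-like program would call build_echo_request once per probe with an increasing sequence number, then use matches to pair each incoming reply with the probe that produced it.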

NOTE

For more information about the Ping program, see Lesson 2: TCP/IP Utilities, in Chapter 10, "TCP/IP Applications."

Router Solicitation and Router Advertisement Messages

Router Solicitation (Type 10, Code 0) and Router Advertisement (Type 9, Code 0) messages do not truly constitute a routing protocol, because they don't provide information about the efficiency of particular routes, but they do enable a TCP/IP system to discover the address of a default gateway on the local network. The process begins with a workstation broadcasting a Router Solicitation message to the local network. The routers on the network respond with unicast Router Advertisement messages, each of which contains the responding router's IP address and other information. The workstation can then use the information in these replies to configure the default gateway entry in its routing table.

TCP and UDP

TCP and UDP are the TCP/IP transport layer protocols. All application layer protocols use either TCP or UDP to transmit data across the network, depending on the services they require. For more information about these two protocols, see Lesson 1: TCP and UDP, in Chapter 7, "Transport Layer Protocols."

Application Layer Protocols

The protocols that operate at the application layer are no longer concerned with the network communication issues addressed by the link, internet, and transport layer protocols. These protocols are designed to provide communications between client and server services on different computers, and are not concerned with how the messages get to the other system.

Application layer protocols use different combinations of protocols at the lower layers to achieve the level of service they require. For example, servers use HTTP and FTP to transmit entire files to client systems, and it is essential that those files be received without error. These protocols, therefore, use a combination of TCP and IP to achieve connection-oriented, reliable communications. DHCP and Domain Name System (DNS), on the other hand, exchange small messages between clients and servers that can easily be retransmitted if necessary, so they use the connectionless service provided by UDP and IP.

Some of the most commonly used TCP/IP application layer protocols are as follows.

Hypertext Transfer Protocol (HTTP) HTTP is the protocol used by Web clients and servers to exchange file requests and the files themselves. A client browser opens a TCP connection to a server and requests a particular file, and the server replies by sending that file, which the browser displays as a Web page. HTTP messages also contain a variety of fields containing information about the communicating systems.
File Transfer Protocol (FTP) FTP is a protocol used to transfer files between TCP/IP systems. An FTP client can browse through the directory structure of a connected server and select files to download or upload. FTP is unique in that it uses two separate ports for its communications. When an FTP client connects to a server, it uses TCP port 21 to establish a control connection. When the user initiates a file download, the program opens a second connection using port 20 for the data transfer. This data connection is closed when the file transfer is complete, but the control connection remains open until the client terminates it. FTP is also unusual in that on most TCP/IP systems, FTP is a self-contained application, rather than a protocol used by other applications.
Simple Mail Transfer Protocol (SMTP) SMTP is the protocol that e-mail servers use to transmit messages to each other across the Internet.
Post Office Protocol version 3 (POP3) POP3 is one of the protocols that e-mail clients use to retrieve their messages from an e-mail server.
Domain Name System (DNS) TCP/IP systems use DNS to resolve Internet host names into the IP addresses they need to communicate.
Dynamic Host Configuration Protocol (DHCP) DHCP is a protocol that workstations use to request TCP/IP configuration parameter settings from a server.
Simple Network Management Protocol (SNMP) SNMP is a network management protocol used by network administrators to gather information about various network components. Remote programs—called agents—gather information and transmit it to a central network management console using SNMP messages.
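
The choice between TCP and UDP shows up directly in the type of socket an application opens. The following sketch maps each protocol in the list to its usual transport and server port; the table and function names are hypothetical, the port numbers other than FTP's are the standard well-known assignments rather than values taken from this lesson, and the table shows only the most common transport for each protocol (DNS, for instance, can also use TCP).

```python
import socket

# Assumed summary table: protocol name -> (transport, well-known server port).
SERVICES = {
    "HTTP": ("TCP", 80),
    "FTP": ("TCP", 21),    # control connection; port 20 carries the data
    "SMTP": ("TCP", 25),
    "POP3": ("TCP", 110),
    "DNS": ("UDP", 53),
    "DHCP": ("UDP", 67),
    "SNMP": ("UDP", 161),
}


def socket_type_for(protocol: str) -> int:
    """Return the socket type an application would request for a protocol.

    SOCK_STREAM selects the connection-oriented service provided by TCP;
    SOCK_DGRAM selects the connectionless service provided by UDP.
    """
    transport, _port = SERVICES[protocol.upper()]
    return socket.SOCK_STREAM if transport == "TCP" else socket.SOCK_DGRAM


def server_port_for(protocol: str) -> int:
    """Return the port on which a server for the protocol usually listens."""
    return SERVICES[protocol.upper()][1]
```

An application would pass the returned socket type to socket.socket(socket.AF_INET, ...) when creating its connection, leaving the details of delivery to the lower layers.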

NOTE

For more information about TCP/IP services such as DNS and DHCP, see Lesson 1: TCP/IP Services, in Chapter 10, "TCP/IP Applications."

Exercise 8.1: TCP/IP Layers and Protocols

Specify the layer of the TCP/IP protocol stack at which each of the following protocols operates.

DHCP
ARP
IP
UDP
POP3
ICMP
SMTP
TCP
DNS
SLIP