| SUMMARY |
| Protocol | Internet Cache Protocol |
| Protocol suite | TCP/IP |
| Layer | Application Layer |
| Ports | 3130 (UDP) |
| DESCRIPTION |
The Internet Cache Protocol (ICP) is a protocol used for coordinating web caches. Its purpose is to find the most appropriate location from which to retrieve a requested object when multiple caches are in use at a single site. The goal is to use the caches as efficiently as possible, and to minimize the number of remote requests to the originating server.
Caches in a hierarchy relate to one another as parents, children, or siblings. A parent usually sits closer to the internet connection than its children. If a child cache cannot find an object, it sends the query to the parent cache, which fetches the object, caches it, and passes it on to the child. A parent resolves cache misses; a sibling does not. Siblings are caches of equal hierarchical status whose purpose is to distribute the load among themselves.
When a request comes into one cache in a cluster of siblings, ICP is used to query adjacent caches for the object being requested. If the adjacent cache has the object, it will be transferred from the adjacent cache, instead of being queried from the original server. This is often called a "near miss" - the object was not found in the cache (a "miss") but it was loaded from a nearby cache, instead of from a remote server.
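The selection logic described above can be sketched as follows. This is only an illustration under simplifying assumptions: the `Peer` class, the `choose_source` function, and the string reply values are hypothetical names invented for this example, and real caches also weigh timeouts, round-trip times, and access rules.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Peer:
    name: str        # hypothetical label for a neighbor cache
    is_parent: bool  # True for a parent, False for a sibling


def choose_source(replies: Dict[str, str], peers: List[Peer]) -> Optional[str]:
    """Pick where to fetch an object from, given the ICP replies received.

    `replies` maps a peer name to "HIT" or "MISS"; peers that did not
    answer are simply absent. Returns a peer name, or None to indicate
    that the object should be fetched from the origin server.
    """
    # 1. Any neighbor (sibling or parent) that holds the object wins:
    #    this is the "near miss" case.
    for peer in peers:
        if replies.get(peer.name) == "HIT":
            return peer.name
    # 2. Otherwise only a parent will resolve the miss on our behalf.
    for peer in peers:
        if peer.is_parent and replies.get(peer.name) == "MISS":
            return peer.name
    # 3. No usable neighbor: go to the origin server.
    return None
```

For instance, with one sibling and one parent configured, a sibling HIT is preferred over a parent MISS, and a lone sibling MISS falls through to the origin server.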
ICP Message Format
The ICP message format consists of a 20-byte header followed by a variable-sized payload.
NOTE: All fields must be represented in network byte order. ICP messages must not exceed 16,384 bytes in length.
- Opcode
The following table shows the currently defined ICP opcodes:
| Value | Name |
|---|---|
| 0 | ICP_OP_INVALID |
| 1 | ICP_OP_QUERY |
| 2 | ICP_OP_HIT |
| 3 | ICP_OP_MISS |
| 4 | ICP_OP_ERR |
| 5 - 9 | UNUSED |
| 10 | ICP_OP_SECHO |
| 11 | ICP_OP_DECHO |
| 12 - 20 | UNUSED |
| 21 | ICP_OP_MISS_NOFETCH |
| 22 | ICP_OP_DENIED |
| 23 | ICP_OP_HIT_OBJ |
- Version
The ICP protocol version number.
- Message Length
The total length of the ICP message in bytes.
- Request Number
An opaque identifier. When responding to a query, this value must be copied into the reply message.
- Options
Option flags that allow extension of this version of the protocol in certain, limited ways.
| Value | Flag | Description |
|---|---|---|
| 0x40000000 | ICP_FLAG_SRC_RTT | This flag is set in an ICP_OP_QUERY message indicating that the requester would like the ICP reply to include the responder's measured RTT to the origin server. |
| 0x80000000 | ICP_FLAG_HIT_OBJ | This flag is set in an ICP_OP_QUERY message indicating that it is okay to respond with an ICP_OP_HIT_OBJ message if the object data will fit in the reply. |
- Option Data
A four-octet field to support optional features. The following ICP feature makes use of this field: the ICP_FLAG_SRC_RTT option uses the low 16 bits of Option Data to return RTT measurements. The ICP_FLAG_SRC_RTT option is further described below.
- Sender Host Address
The IPv4 address of the host sending the ICP message. This field should probably not be trusted over what is provided by getpeername(), accept(), and recvfrom(). There is some ambiguity over the original purpose of this field. In practice it is not used.
- Payload
The contents of the Payload field vary depending on the Opcode, but most often it contains a null-terminated URL string.
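As an illustration of the layout above, the header can be packed and parsed with Python's `struct` module. This is a sketch, not a reference implementation: the field widths (1, 1, 2, 4, 4, 4, and 4 octets) are taken from RFC 2186 rather than from the text above, the version value 2 assumes ICP version 2, and the function names `pack_message` and `unpack_message` are hypothetical.

```python
import socket
import struct

# Opcode (1), Version (1), Message Length (2), Request Number (4),
# Options (4), Option Data (4), Sender Host Address (4) = 20 bytes.
# "!" selects network byte order with no padding.
ICP_HEADER = struct.Struct("!BBHIII4s")
ICP_MAX_LEN = 16384  # maximum total message size in bytes
ICP_VERSION = 2      # assumption: ICP version 2


def pack_message(opcode, reqnum, payload, options=0, option_data=0,
                 sender="0.0.0.0"):
    """Build a complete ICP message: 20-byte header plus payload bytes."""
    length = ICP_HEADER.size + len(payload)
    if length > ICP_MAX_LEN:
        raise ValueError("ICP message exceeds 16,384 bytes")
    header = ICP_HEADER.pack(opcode, ICP_VERSION, length, reqnum,
                             options, option_data, socket.inet_aton(sender))
    return header + payload


def unpack_message(data):
    """Parse a received datagram into a dict of header fields and payload."""
    if len(data) < ICP_HEADER.size:
        raise ValueError("datagram shorter than the 20-byte ICP header")
    (opcode, version, length, reqnum,
     options, option_data, sender) = ICP_HEADER.unpack_from(data)
    if length != len(data):
        raise ValueError("Message Length field does not match datagram size")
    return {
        "opcode": opcode,
        "version": version,
        "length": length,
        "reqnum": reqnum,
        "options": options,
        "option_data": option_data,
        "sender": socket.inet_ntoa(sender),
        "payload": data[ICP_HEADER.size:],
    }
```

Rejecting a datagram whose Message Length disagrees with its actual size is a design choice of this sketch; the Sender Host Address is parsed but, as noted above, should not be relied upon.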
ICP Opcodes
- ICP_OP_INVALID
A placeholder to detect zero-filled or malformed messages. A cache must never intentionally send an ICP_OP_INVALID message. ICP_OP_ERR should be used instead.
- ICP_OP_QUERY
A query message. NOTE: this opcode has a different payload format than most of the others. First is the requester's IPv4 address, followed by a URL. The Requester Host Address is not that of the cache generating the ICP message, but rather the address of the cache's client that originated the request. The Requester Host Address is often zero filled. An ICP message with an all-zero Requester Host Address should be taken as one where the requester address is not specified; it does not indicate a valid IPv4 address.
- ICP_OP_SECHO
Similar to ICP_OP_QUERY, but for use in simulating a query to an origin server. When ICP is used to select the closest neighbor, the origin server can be included in the algorithm by bouncing an ICP_OP_SECHO message off its echo port. The payload is simply the null-terminated URL.
NOTE: the echo server will not interpret the data (i.e. we could send it anything). This opcode is used to tell the difference between a legitimate query or response, random garbage, and an echo response.
- ICP_OP_DECHO
Similar to ICP_OP_QUERY, but for use in simulating a query to a cache which does not use ICP. When ICP is used to choose the closest neighbor, a non-ICP cache can be included in the algorithm by bouncing an ICP_OP_DECHO message off its echo port. The payload is simply the null-terminated URL.
NOTE: one problem with this approach is that while a system's echo port may be functioning perfectly, the cache software may not be running at all.
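The ICP_OP_QUERY payload described above (a four-octet Requester Host Address followed by a null-terminated URL) can be sketched as below. The helper names `build_query_payload` and `parse_query_payload` are hypothetical, and encoding the URL as ASCII is an assumption of this example.

```python
import socket


def build_query_payload(url, requester="0.0.0.0"):
    """Return the ICP_OP_QUERY payload: requester address + URL + NUL.

    The default all-zero address means "requester not specified".
    """
    return socket.inet_aton(requester) + url.encode("ascii") + b"\x00"


def parse_query_payload(payload):
    """Split an ICP_OP_QUERY payload into (requester, url).

    `requester` is None when the address field is zero filled.
    """
    if len(payload) < 5 or b"\x00" not in payload[4:]:
        raise ValueError("payload lacks an address or a null-terminated URL")
    addr = payload[:4]
    requester = None if addr == b"\x00" * 4 else socket.inet_ntoa(addr)
    # The URL ends at the first NUL byte after the address field.
    url = payload[4:].split(b"\x00", 1)[0].decode("ascii")
    return requester, url
```

ICP_OP_SECHO and ICP_OP_DECHO carry only the null-terminated URL, so they would omit the leading four address octets.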
One of the following six ICP opcodes is sent in response to an ICP_OP_QUERY message. Unless otherwise noted, the payload must be the null-terminated URL string. Both the URL string and the Request Number field must be exactly the same as in the ICP_OP_QUERY message.
- ICP_OP_HIT
An ICP_OP_HIT response indicates that the requested URL exists in this cache and that the requester is allowed to retrieve it.
- ICP_OP_MISS
An ICP_OP_MISS response indicates that the requested URL does not exist in this cache. The querying cache may still choose to fetch the URL from the replying cache.
- ICP_OP_ERR
An ICP_OP_ERR response indicates some kind of error in parsing or handling the query message (e.g. invalid URL).
- ICP_OP_MISS_NOFETCH
An ICP_OP_MISS_NOFETCH response indicates that this cache is up, but is in a state where it does not want to handle cache misses. An example of such a state is during a startup phase where a cache might be rebuilding its object store. A cache in such a mode may wish to return ICP_OP_HIT for cache hits, but not ICP_OP_MISS for misses. ICP_OP_MISS_NOFETCH essentially means "I am up and running, but please don't fetch this URL from me now."
- ICP_OP_DENIED
An ICP_OP_DENIED response indicates that the querying site is not allowed to retrieve the named object from this cache. Caches and proxies may implement complex access controls. This reply must be interpreted to mean "you are not allowed to request this particular URL from me at this particular time."
- ICP_OP_HIT_OBJ
Just like an ICP_OP_HIT response, but the actual object data has been included in this reply message. Many requested objects are small enough that it is possible to include them in the query response and avoid the need to make a subsequent HTTP request for the object.
- UNRECOGNIZED OPCODES
ICP messages with unrecognized or unused opcodes should be ignored, i.e. no reply generated. The application may choose to note the anomalous behavior in a log file.
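The reply rules above can be summarized in a small dispatch routine. This is a hedged sketch: the `IcpOpcode` enum simply mirrors the opcode table, while `classify_reply` and its return strings are hypothetical names chosen for this example, not part of the protocol.

```python
from enum import IntEnum


class IcpOpcode(IntEnum):
    INVALID = 0
    QUERY = 1
    HIT = 2
    MISS = 3
    ERR = 4
    SECHO = 10
    DECHO = 11
    MISS_NOFETCH = 21
    DENIED = 22
    HIT_OBJ = 23


# The six opcodes that may be sent in response to ICP_OP_QUERY.
REPLY_OPCODES = {
    IcpOpcode.HIT, IcpOpcode.MISS, IcpOpcode.ERR,
    IcpOpcode.MISS_NOFETCH, IcpOpcode.DENIED, IcpOpcode.HIT_OBJ,
}


def classify_reply(opcode, reqnum, expected_reqnum, url, expected_url):
    """Map a reply to "hit", "miss_fetchable", "do_not_fetch", or "ignore"."""
    try:
        op = IcpOpcode(opcode)
    except ValueError:
        return "ignore"          # unrecognized or unused opcode
    if op not in REPLY_OPCODES:
        return "ignore"          # not a valid response to a query
    if reqnum != expected_reqnum or url != expected_url:
        return "ignore"          # does not match the outstanding query
    if op in (IcpOpcode.HIT, IcpOpcode.HIT_OBJ):
        return "hit"
    if op is IcpOpcode.MISS:
        return "miss_fetchable"  # the replying cache may still be asked
    return "do_not_fetch"        # ERR, MISS_NOFETCH, or DENIED
```

Treating a mismatched Request Number or URL as "ignore" is a choice made for this sketch; an application may prefer to log such replies.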
ICP Option Flags
| Value | Flag | Description |
|---|---|---|
| 0x80000000 | ICP_FLAG_HIT_OBJ | This flag is set in an ICP_OP_QUERY message indicating that it is okay to respond with an ICP_OP_HIT_OBJ message if the object data will fit in the reply. |
| 0x40000000 | ICP_FLAG_SRC_RTT | This flag is set in an ICP_OP_QUERY message indicating that the requester would like the ICP reply to include the responder's measured RTT to the origin server. |
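A brief sketch of how these flags might be combined in the Options field, and how the RTT could be read back from the low 16 bits of Option Data. The function names are hypothetical, and the check that the reply also carries ICP_FLAG_SRC_RTT is an assumption of this example rather than something stated above.

```python
ICP_FLAG_HIT_OBJ = 0x80000000
ICP_FLAG_SRC_RTT = 0x40000000


def query_options(want_hit_obj=False, want_src_rtt=False):
    """Build the 32-bit Options value for an ICP_OP_QUERY message."""
    options = 0
    if want_hit_obj:
        options |= ICP_FLAG_HIT_OBJ
    if want_src_rtt:
        options |= ICP_FLAG_SRC_RTT
    return options


def rtt_from_reply(options, option_data):
    """Return the RTT carried in a reply, or None if none was supplied.

    Assumption: the responder signals a valid measurement by setting
    ICP_FLAG_SRC_RTT in the reply's Options field.
    """
    if not options & ICP_FLAG_SRC_RTT:
        return None
    return option_data & 0xFFFF  # low 16 bits of Option Data
```

Masking with `0xFFFF` discards the upper 16 bits of Option Data, which this sketch treats as unused.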
| GLOSSARY |
- Cache
A cache is a special high-speed storage mechanism. It can be either a reserved section of main memory or an independent high-speed storage device. Two types of caching are commonly used in personal computers: memory caching and disk caching.
- ICP
ICP (Internet Cache Protocol) is a protocol used for coordinating web caches. Its purpose is to find out the most appropriate location to retrieve a requested object from in the situation where multiple caches are in use at a single site.
- Payload
Payload, or mission bit stream, is the data, such as a data field, block, or stream, being processed or transported: the part that represents user information and user overhead information. It may include user-requested additional information, such as network management and accounting information. Note that the payload does not include system overhead information for the processing or transportation system.
- Remote
In networks, remote refers to files, devices, and other resources that are not connected directly to your workstation. Resources at your workstation are considered local.
| REFERENCES |
RFCs
[RFC 2186] Internet Cache Protocol (ICP), version 2.
[RFC 2187] Application of Internet Cache Protocol (ICP), version 2.
[RFC 3143] Known HTTP Proxy/Caching Problems.