|
| SUMMARY | |
| Protocol | HyperText Transfer Protocol |
| Protocol suite | TCP/IP |
| Layer | Application Layer |
| Type | File transfer protocol |
| Latest Version | HTTP 1.1 |
| Ports | HTTP: 80, 8008, 8080 (TCP) server; S-HTTP: 80 (TCP) server; HTTPS: 443 (TCP) server over SSL/TLS |
| Related protocols | WebDAV, Web Distributed Authoring and Versioning |
| URI | http, https |
| MIME subtype | application/http, message/http, message/s-http |
| Working groups | HTTP, HyperText Transfer Protocol; WTS, Web Transaction Security; WebDAV, WWW Distributed Authoring and Versioning |
|
| DESCRIPTION |
The HyperText Transfer Protocol (HTTP) is an application-level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems. Messages are passed in a format similar to that used by Internet Mail and the Multipurpose Internet Mail Extensions (MIME). Although designed to be extensible to almost any document format, HTTP is the de facto standard for transferring files (text, graphic images, sound, video, and other multimedia files) on the World Wide Web.
HTTP concepts include (as the Hypertext part of the name implies) the idea that files can contain references to other files whose selection will elicit additional transfer requests. Any Web server machine contains, in addition to the Web page files it can serve, an HTTP daemon, a program that is designed to wait for HTTP requests and handle them when they arrive. Your Web browser is an HTTP client, sending requests to server machines. When the browser user enters file requests by either "opening" a Web file (typing in a Uniform Resource Locator or URL) or clicking on a hypertext link, the browser builds an HTTP request and sends it to the Internet Protocol address (IP address) indicated by the URL. The HTTP daemon in the destination server machine receives the request and sends back the requested file or files associated with the request. (A Web page often consists of more than one file.)
HTTP operates over TCP connections, usually to port 80, though this can be overridden and another port used. After a successful connection, the client transmits a request message to the server, which sends a reply message back. HTTP messages are human-readable, and an HTTP server can be manually operated with a command such as telnet server 80.
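The request/reply cycle described above can be reproduced without a browser. The following is a minimal sketch, not a full client: it writes the same text one would type into `telnet server 80` onto a TCP socket. The function names `build_get_request` and `fetch` are illustrative, and the `Connection: close` header is an assumption added so that the server ends the connection after one reply.

```python
import socket


def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the bytes a person would type into `telnet host 80`."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close, so recv() sees end-of-stream
        "",                   # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str = "/", port: int = 80) -> bytes:
    """Open a TCP connection, send one request, and read the whole reply."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_get_request(host, path))
        chunks = []
        while chunk := conn.recv(4096):  # empty bytes means the server closed
            chunks.append(chunk)
    return b"".join(chunks)
```

Calling `fetch("server")` with a reachable host name would return the raw reply, status line and headers included, exactly as telnet would display it.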
Secure HTTP (S-HTTP) is a superset of HTTP that allows messages to be encapsulated in various ways. Encapsulations can include encryption, signing, or MAC-based (message authentication code) authentication. This encapsulation can be recursive, so a message can have several security transformations applied to it. S-HTTP also includes header definitions to provide key transfer, certificate transfer, and similar administrative functions. S-HTTP is very flexible in what it allows the programmer to do. It also offers the potential for substantial user involvement in, and oversight of, the authentication and encryption activities.
HTTPS is HTTP encapsulated in an SSL/TLS stream.
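As a rough illustration of this encapsulation, the sketch below first opens an ordinary TCP connection and then wraps it in a TLS stream with Python's standard `ssl` module. The ordinary HTTP request text is then written to the wrapped socket unchanged. The helper names `make_client_context` and `open_https_stream` are assumptions made for this example.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Return a TLS client context that verifies the server certificate."""
    return ssl.create_default_context()


def open_https_stream(host: str, port: int = 443) -> ssl.SSLSocket:
    """Connect over TCP, then wrap the connection in TLS."""
    context = make_client_context()
    raw = socket.create_connection((host, port), timeout=10)
    try:
        # server_hostname enables SNI and host name checking
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

Once `open_https_stream` returns, the caller can `sendall()` a plain HTTP request on the returned socket; the TLS layer encrypts it in transit.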
HTTP Message
- Methods
| OPTIONS | Represents a request for information about the communication options available on the request/response chain identified by the Request-URI. (RFC 2068) |
| GET | Retrieve whatever information (in the form of an entity) is identified by the Request-URI. (RFC 2068) |
| HEAD | Identical to GET except that the server MUST NOT return a message-body in the response. (RFC 2068) |
| POST | Used to request that the destination server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line. (RFC 2068) |
| PUT | Requests that the enclosed entity be stored under the supplied Request-URI. (RFC 2068) |
| DELETE | Requests that the origin server delete the resource identified by the Request-URI. (RFC 2068) |
| TRACE | Used to invoke a remote, application-layer loop-back of the request message. (RFC 2068) |
| LINK | Establishes one or more Link relationships between the existing resource identified by the Request-URI and other existing resources. (RFC 1945) |
| UNLINK | Removes one or more Link relationships from the existing resource identified by the Request-URI. (RFC 1945) |
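The difference between GET and HEAD listed above can be observed with a small self-contained experiment. This sketch starts a throwaway server on the loopback interface with Python's standard `http.server` and queries it with `http.client`. The handler `DemoHandler`, the constant `BODY`, and the function `compare_get_and_head` are hypothetical names used only for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<html><body>hello</body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    def _send_head(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_head()
        self.wfile.write(BODY)

    def do_HEAD(self):
        self._send_head()  # same headers as GET, but no message body

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def compare_get_and_head():
    """Return {method: (status, Content-Length header, body)} for GET and HEAD."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = {}
    try:
        for method in ("GET", "HEAD"):
            conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
            conn.request(method, "/")
            response = conn.getresponse()
            results[method] = (
                response.status,
                response.getheader("Content-Length"),
                response.read(),
            )
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return results
```

Both requests report the same status and `Content-Length`, but only the GET result carries the body bytes; the HEAD result's body is empty.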
- HTTP Status Codes
| 1xx | Informational |
| 100 | Continue. |
| 101 | Switching protocols. |
| 2xx | Success |
| 200 | OK; the request was fulfilled. |
| 201 | Created--a new resource was created, typically following a POST command. |
| 202 | Accepted--accepted for processing, but processing is not completed. |
| 203 | Non-authoritative information--the returned metainformation comes from a local or third-party copy, not the origin server. |
| 204 | No content--request received but no information exists to send back. |
| 205 | Reset content. |
| 206 | Partial content. |
| 226 | IM used. |
| 3xx | Redirection |
| 301 | Moved--the data requested has a new location and the change is permanent. |
| 302 | Found--the data requested has a different URL temporarily. |
| 303 | See other--the response can be found under a different URI and should be retrieved with GET. |
| 304 | Not Modified--the document has not changed since the copy the client already holds. |
| 305 | Use proxy. |
| 4xx | Error seems to be in the client |
| 400 | Bad request--syntax problem in the request or it could not be satisfied. |
| 401 | Unauthorized--the client is not authorized to access data. |
| 402 | Payment required--indicates a charging scheme is in effect. |
| 403 | Forbidden--access is not permitted even with authorization. |
| 404 | Not found--server could not find the given resource. |
| 405 | Method not allowed. |
| 406 | Not acceptable. |
| 407 | Proxy authentication required. |
| 408 | Request timeout. |
| 409 | Conflict. |
| 410 | Gone. |
| 411 | Length required. |
| 412 | Precondition failed. |
| 413 | Request entity too large. |
| 414 | Request URI too large. |
| 415 | Unsupported media type. |
| 426 | Upgrade Required. |
| 5xx | Error seems to be in the server |
| 500 | Internal Error--the server could not fulfill the request because of an unexpected condition. |
| 501 | Not implemented--the server does not support the facility requested. |
| 502 | Bad gateway--a gateway or proxy received an invalid response from the upstream server. |
| 503 | Service unavailable. |
| 504 | Gateway timeout. |
| 505 | HTTP version not supported. |
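Because the first digit of a status code names its class, a client can react sensibly even to a code it does not recognize. The sketch below shows one possible way to do that in Python, using the standard `http.HTTPStatus` enumeration for the reason phrases. The function name `describe_status` and the class labels are choices made for this example.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe_status(code: int) -> str:
    """Return 'code phrase (class)', falling back on the class alone."""
    status_class = STATUS_CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "unregistered code"  # not known to the standard library
    return f"{code} {phrase} ({status_class})"
```

For example, `describe_status(404)` yields `404 Not Found (Client error)`, while an unassigned code such as 299 is still classified as a success.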
|
|
| EXAMPLES |
The simplest HTTP message is "GET URL", to which the server replies by sending the
named document.
Here is a sample HTTP/1.1 exchange:
GET URL HTTP/1.1
Host: Hostname
< HTTP/1.1 200 OK
< Date: Fri, 07 Jun 2002 10:21:59 GMT
< Server: Microsoft-IIS/5.0
< Accept-Ranges: bytes
< Connection: Keep-Alive
< Last-modified: Wed, 05 Jun 2002 22:49:15 GMT
< Content-type: text/html
< Content-length: 1579
...
Or if the document doesn't exist:
< HTTP/1.1 404 Not Found
< Connection: Close
< Content-Type: text/plain
< 404 Not Found
< The requested file (or whatever you requested) couldn't be found!
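A client reads such a reply by splitting it at the first blank line: the status line and header fields come before it, the body after it. The following sketch parses a reply like the 404 example above. The function name `parse_response` is hypothetical, and the sketch assumes the whole reply is already in memory and uses CRLF line endings.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP reply into (version, code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return version, code, reason, headers, body
```

Header names are lowercased on the way in because HTTP treats `Content-type` and `Content-Type` as the same field.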
In addition to GET requests, clients can also send HEAD and POST requests, of which
POSTs are the most important. POSTs are used for HTML forms and other operations
that require the client to transmit a block of data to the server. After sending the
header and the blank line, the client transmits the data. The header must have
included a Content-Length: field, which permits the server to determine when all the
data has been received. |
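The role of Content-Length in a POST can be sketched from both sides. In this illustration the client builds the request with a length that matches the body, and the server side skips the header lines and then reads exactly that many bytes. The names `build_post_request` and `read_request_body` are assumptions for this example, as is the form-encoded content type.

```python
import io


def build_post_request(host: str, path: str, body: bytes,
                       content_type: str = "application/x-www-form-urlencoded") -> bytes:
    """Client side: header, blank line, then the data block."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def read_request_body(stream) -> bytes:
    """Server side: skip the header lines, then read Content-Length bytes."""
    content_length = 0
    stream.readline()  # request line, e.g. b"POST /submit HTTP/1.1\r\n"
    while True:
        line = stream.readline()
        if line in (b"\r\n", b"\n", b""):  # blank line (or EOF) ends the header
            break
        name, _, value = line.decode("iso-8859-1").partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())
    return stream.read(content_length)
```

Passing `io.BytesIO(build_post_request("example.com", "/submit", b"name=Ada&lang=en"))` to `read_request_body` returns the original form data, and any bytes that follow on the same stream are left unread.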
|
| PROTOCOL RELATIONS |
[Diagram: protocol relations between TCP and HTTP. Legend: parent layer, child layer.]
|
| GLOSSARY |
|
Client A client is a program that requests services from another program. It is the client part of a client-server architecture. Typically, a client is an application that runs on a personal computer or workstation and relies on a server to perform some operations. For example, an e-mail client is an application that enables you to send and receive e-mail.
Daemon A daemon is a process that runs in the background and performs a specified operation at predefined times or in response to certain events. The term daemon comes from UNIX, though many other operating systems support daemons under other names. Windows, for example, refers to daemons as System Agents and services. Typical daemon processes include print spoolers, e-mail handlers, and other programs that perform administrative tasks for the operating system. The term comes from Greek mythology, where daemons were guardian spirits.
Encryption The translation of data into a secret code. Encryption is the most effective way to achieve data security. To read an encrypted file, you must have access to a secret key or password that enables you to decrypt it. Unencrypted data is called plain text; encrypted data is referred to as cipher text.
There are two main types of encryption: asymmetric encryption (also called public-key encryption) and symmetric encryption.
Gateway A network device used to translate between two different protocols. Used to interconnect two networks that use incompatible protocols. It is a node on a network that serves as an entrance to another network. In enterprises, the gateway is the computer that routes the traffic from a workstation to the outside network that is serving the Web pages. In homes, the gateway is the ISP that connects the user to the internet.
In enterprises, the gateway node often acts as a proxy server and a firewall. The gateway is also associated with both a router, which uses headers and forwarding tables to determine where packets are sent, and a switch, which provides the actual path for the packet in and out of the gateway.
The term also refers to a ground-based computer system that switches data and voice signals between satellites and terrestrial networks. It is also an earlier term for router, now obsolete in that sense.
Header In many disciplines of computer science, a header is a unit of information that precedes a data object. In a network transmission, a header is part of the data packet and contains transparent information about the file or the transmission. In file management, a header is a region at the beginning of each file where bookkeeping information is kept. The file header may contain the date the file was created, the date it was last updated, and the file's size. The header can be accessed only by the operating system or by specialized programs.
In word processing, a header is one or more lines of text that appear at the top of each page of a document. Once you specify the text that should appear in the header, the word processor automatically inserts it.
Hypertext A special type of database system, invented by Ted Nelson in the 1960s, in which objects (text, pictures, music, programs, and so on) can be creatively linked to each other. When you select an object, you can see all the other objects that are linked to it. You can move from one object to another even though they might have very different forms. For example, while reading a document about Mozart, you might click on the phrase Violin Concerto in A Major, which could display the written score or perhaps even invoke a recording of the concerto. Clicking on the name Mozart might cause various illustrations of Mozart to appear on the screen. The icons that you select to view associated objects are called Hypertext links or buttons.
Hypertext systems are particularly useful for organizing and browsing through large databases that consist of disparate types of information. There are several Hypertext systems available for Apple Macintosh computers and PCs that enable you to develop your own databases. Such systems are often called authoring systems. HyperCard software from Apple Computer is the most famous.
IP address IP address is an identifier for a computer or device on a TCP/IP network. Networks using the TCP/IP protocol route messages based on the IP address of the destination. The format of an IP address is a 32-bit numeric address written as four numbers separated by periods. Each number can be zero to 255. For example, 1.160.10.240 could be an IP address. Within an isolated network, you can assign IP addresses at random as long as each one is unique. However, connecting a private network to the Internet requires using registered IP addresses (called Internet addresses) to avoid duplicates.
The four numbers in an IP address are used in different ways to identify a particular network and a host on that network. Five regional Internet registries -- ARIN, RIPE NCC, LACNIC, APNIC and AFRINIC -- assign Internet addresses, historically from the following three classes.
Class A - supports 16 million hosts on each of 126 networks
Class B - supports 65,000 hosts on each of 16,000 networks
Class C - supports 254 hosts on each of 2 million networks
The supply of unassigned IPv4 addresses is running out, so a classless scheme called CIDR (Classless Inter-Domain Routing) has replaced the system based on classes A, B, and C. The longer-term remedy is IPv6, which uses 128-bit addresses.
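The arithmetic behind the 32-bit format and the per-class host counts can be checked with a few lines of Python. This is an illustrative sketch: `dotted_to_int` and `hosts_per_network` are hypothetical helper names, and the host-bit widths (24, 16 and 8 for classes A, B and C) are the traditional classful values.

```python
import ipaddress


def dotted_to_int(address: str) -> int:
    """Convert dotted-quad notation to the underlying 32-bit number."""
    return int(ipaddress.IPv4Address(address))


def hosts_per_network(host_bits: int) -> int:
    """Usable hosts: the all-zeros and all-ones host numbers are reserved."""
    return 2 ** host_bits - 2


CLASS_HOST_BITS = {"A": 24, "B": 16, "C": 8}
HOSTS_BY_CLASS = {name: hosts_per_network(bits) for name, bits in CLASS_HOST_BITS.items()}
```

`HOSTS_BY_CLASS` holds 16,777,214 for class A, 65,534 for class B and 254 for class C, which are the exact figures behind the rounded numbers in the list above.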
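MAC MAC (Media Access Control) names both a sublayer of the data link layer and the hardware address (MAC address) that uniquely identifies each node of a network. In IEEE 802 networks, the Data Link Control (DLC) layer of the OSI Reference Model is divided into two sublayers: the Logical Link Control (LLC) layer and the Media Access Control (MAC) layer. The MAC layer interfaces directly with the network medium. Consequently, each different type of network medium requires a different MAC layer. In the S-HTTP description above, MAC instead stands for message authentication code, a keyed checksum that lets the receiver verify a message's integrity and origin.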
On networks that do not conform to the IEEE 802 standards but do conform to the OSI Reference Model, the node address is called the Data Link Control (DLC) address.
MIME MIME (Multipurpose Internet Mail Extensions) is a specification for formatting non-ASCII messages so that they can be sent over the Internet. Many e-mail clients now support MIME, which enables them to send and receive graphics, audio, and video files via the Internet mail system. In addition, MIME supports messages in character sets other than ASCII.
There are many predefined MIME types, such as GIF graphics files and PostScript files. It is also possible to define your own MIME types.
In addition to e-mail applications, Web browsers also support various MIME types. This enables the browser to display or output files that are not in HTML format.
Method In HTTP, a method is the verb in the request line (GET, POST, and so on) that tells the server what action to perform on the resource. In object-oriented programming, a method is a procedure that is executed when an object receives a message. A method is really the same as a procedure, function, or routine in procedural programming languages. The only difference is that in object-oriented programming, a method is always associated with a class.
Proxy An intermediary program which acts as both a server and a client for the purpose of making requests on behalf of other clients. Requests are serviced internally or by passing them, with possible translation, on to other servers. A proxy must interpret and, if necessary, rewrite a request message before forwarding it. Proxies are often used as client-side portals through network firewalls and as helper applications for handling requests via protocols not implemented by the user agent.
S-HTTP An extension to the HTTP protocol to support sending data securely over the World Wide Web. Not all Web browsers and servers support S-HTTP. Another technology for transmitting secure communications over the World Wide Web -- Secure Sockets Layer (SSL) -- is more prevalent. However, SSL and S-HTTP have very different designs and goals so it is possible to use the two protocols together. Whereas SSL is designed to establish a secure connection between two computers, S-HTTP is designed to send individual messages securely. Both protocols have been submitted to the Internet Engineering Task Force (IETF) for approval as a standard.
Server A computer or device on a network that manages network resources. For example, a file server is a computer and storage device dedicated to storing files. Any user on the network can store files on the server. A database server is a computer system that processes database queries. Servers are often dedicated, meaning that they perform no other tasks besides their server tasks. On multiprocessing operating systems, however, a single computer can execute several programs at once. A server in this case could refer to the program that is managing resources rather than the entire computer.
URL URL (Uniform Resource Locator) is the global address of documents and other resources on the World Wide Web. The first part of the address indicates what protocol to use, and the second part specifies the IP address or the domain name where the resource is located.
For example, the two URLs below point to two different files at the host www.webpage.com. The first specifies an executable file that should be fetched using the FTP protocol; the second specifies a Web page that should be fetched using the HTTP protocol:
ftp://www.webpage.com/example.exe
http://www.webpage.com/index.html
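The two parts of a URL described above can be separated programmatically. As a small sketch, Python's standard `urllib.parse.urlsplit` does the splitting; the wrapper name `describe_url` is an assumption for this example.

```python
from urllib.parse import urlsplit


def describe_url(url: str):
    """Return (protocol, host, path) for a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path
```

Applied to the two example URLs, the host and the general shape are the same; only the protocol part and the file path differ.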
WWW WWW (World Wide Web) is a system of Internet servers that support specially formatted documents. The documents are formatted in a markup language called HTML (HyperText Markup Language) that supports links to other documents, as well as graphics, audio, and video files. There are several applications called Web browsers that make it easy to access the World Wide Web; two of the most popular are Netscape Navigator and Microsoft's Internet Explorer.
Web browser Also named browser.
Web page Web page is a document on the World Wide Web. Every Web page is identified by a unique URL (Uniform Resource Locator).
WebDAV WebDAV (Web-based Distributed Authoring and Versioning) is a set of extensions to the HTTP protocol which allows users to collaboratively edit and manage files on remote web servers. WebDAV features XML properties on metadata, locking - which prevents authors from overwriting each other's changes - namespace manipulation and remote file management.
WebDAV is sometimes referred to as DAV.
|
|
| REFERENCES |
Related links:
HTTP status codes
RFCs:
[RFC 1945] Hypertext Transfer Protocol -- HTTP/1.0.
[RFC 2145] Use and interpretation of HTTP version numbers.
[RFC 2169] A Trivial Convention for using HTTP in URN Resolution.
[RFC 2227] Simple Hit-Metering and Usage-Limiting for HTTP.
[RFC 2291] Requirements for a Distributed Authoring and Versioning Protocol for the World Wide Web.
[RFC 2295] Transparent Content Negotiation in HTTP.
[RFC 2296] HTTP Remote Variant Selection Algorithm -- RVSA/1.0.
[RFC 2310] The Safe Response Header Field.
[RFC 2518] HTTP Extensions for Distributed Authoring -- WEBDAV.
[RFC 2560] X.509 Internet Public Key Infrastructure Online Certificate Status Protocol - OCSP.
[RFC 2585] Internet X.509 Public Key Infrastructure Operational Protocols: FTP and HTTP.
[RFC 2616] Hypertext Transfer Protocol -- HTTP/1.1.
Defines MIME media subtypes application/http and message/http.
Defines URI scheme http:.
Obsoletes: RFC 2068.
[RFC 2617] HTTP Authentication: Basic and Digest Access Authentication.
Obsoletes: RFC 2069.
[RFC 2660] The Secure HyperText Transfer Protocol.
[RFC 2774] An HTTP Extension Framework.
[RFC 2817] Upgrading to TLS Within HTTP/1.1.
Updates: RFC 2616.
[RFC 2818] HTTP Over TLS.
Defines URI scheme https:.
[RFC 2936] HTTP MIME Type Handler Detection.
[RFC 2964] Use of HTTP State Management.
[RFC 2965] HTTP State Management Mechanism.
Obsoletes: RFC 2109.
[RFC 3143] Known HTTP Proxy/Caching Problems.
[RFC 3205] On the use of HTTP as a Substrate.
[RFC 3229] Delta encoding in HTTP.
[RFC 3230] Instance Digests in HTTP.
[RFC 3675] .sex Considered Dangerous.
[RFC 3875] The Common Gateway Interface (CGI) Version 1.1.
Obsolete RFCs:
[RFC 2068] Hypertext Transfer Protocol -- HTTP/1.1.
Obsoleted by: RFC 2616.
[RFC 2069] An Extension to HTTP : Digest Access Authentication.
Obsoleted by: RFC 2617.
[RFC 2109] HTTP State Management Mechanism.
Obsoleted by: RFC 2965.
|