What is HTTP?

         

Brett_Tabke

10:13 pm on Feb 13, 2001 (gmt 0)




With the current discussions about virtual hosting, many questions have come in by email about "HTTP". What is it?

HTTP is short for Hypertext Transfer Protocol. It is an agreed-upon standard that forms the basis of the web: the communications protocol that web servers and browsers use to exchange data.

On one side is the web server, which listens for requests. That "listening" is done on specific ports. Ports are just assigned numbers that a computer uses to manage data flow. There are assigned ports for email, news, HTTP, and other protocols. When an inbound connection arrives on a port, the port number tells the receiving machine which software should handle the request. The default port for HTTP is 80. Sometimes you will see port numbers included in web URLs:

[webmasterworld.com:80]
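To see where the port sits in a URL, here is a minimal Python sketch using the standard library's urllib.parse. The "http://" prefix, the DEFAULT_PORTS table, and the effective_port helper are assumptions added for illustration; the link above shows only the host and port.

```python
from urllib.parse import urlsplit

# Assumed for illustration: the well-known default port for each scheme.
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url):
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    # parts.port is None when the URL does not name a port explicitly.
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]

print(effective_port("http://webmasterworld.com:80/"))  # port given in the URL
print(effective_port("http://webmasterworld.com/"))     # falls back to the default
```

Both calls report port 80, which is why the ":80" is normally left off.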

The web server is the "server", and a browser or other agent is the "client". The client makes a request of the server via HTTP, and it is up to the web server software to fill the request.

What does an HTTP request look like? Here is a request to Webmaster World:
GET /index.htm HTTP/1.0
User-Agent: Mozilla/4.71 (Windows 98;US) Opera 3.62 [en]
Accept: image/gif, image/x-xbitmap, image/jpeg, image/png, */*
Cookie: show=1; activefolder=inbox; lastvisitinfo=[snip]; passwordcookie=****; usernamecookie=Brett_Tabke; subject=Webmaster+Jobs:; lastactivevisit=982098186
Host: www.webmasterworld.com

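The first line is the request line: "GET" is the method, "/index.htm" is the file being requested from the root of the site, and "HTTP/1.0" is the protocol version. The User-Agent header identifies the client, and a "Referer" header (that is the spelling used in the HTTP specification) may be sent as well. The "Accept" line tells the server which media and file formats the agent supports. Any cookies the site set earlier are sent back in the "Cookie" line. Finally, the "Host" field names the site we are trying to reach, which is how a server hosting several sites on one IP address tells them apart.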
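As a rough illustration of how the pieces of that request fit together, here is a small Python sketch that splits the request text into its request line and headers. The parse_request function is a hypothetical helper, not part of any server, and the Cookie line is left out of the sample because the original contains a snipped value.

```python
def parse_request(raw):
    """Split raw HTTP request text into (method, path, version, headers)."""
    lines = raw.replace("\r\n", "\n").split("\n")
    # The request line has three space-separated parts.
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:  # a blank line ends the header section
            break
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers

raw = (
    "GET /index.htm HTTP/1.0\r\n"
    "User-Agent: Mozilla/4.71 (Windows 98;US) Opera 3.62 [en]\r\n"
    "Accept: image/gif, image/x-xbitmap, image/jpeg, image/png, */*\r\n"
    "Host: www.webmasterworld.com\r\n"
    "\r\n"
)
method, path, version, headers = parse_request(raw)
print(method, path, version)
print(headers["host"])
```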

Other HTTP methods include:
POST: sends data from the browser to the web site. Most often used in web forms to post data to the server. When I finish writing this and click "send message", the message will be "posted" to the web site.
HEAD: returns only the headers that describe the document you are requesting. It does not return the document itself. Most often used to check the age of a web page.

There are some other HTTP methods one can use, but those are the most common.
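To make the difference between these methods concrete, here is a hedged Python sketch that builds the raw text of a HEAD and a POST request. The build_request helper, the "/post.cgi" path, and the form field are all made up for illustration; they are not how this forum actually works.

```python
def build_request(method, path, host, body=b"",
                  content_type="application/x-www-form-urlencoded"):
    """Build the raw bytes of an HTTP/1.0 request."""
    lines = [f"{method} {path} HTTP/1.0", f"Host: {host}"]
    if body:
        # A request that carries data says what it is and how long it is.
        lines.append(f"Content-Type: {content_type}")
        lines.append(f"Content-Length: {len(body)}")
    # A blank line separates the headers from the (optional) body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body

# HEAD asks only for the headers describing the document.
head_req = build_request("HEAD", "/index.htm", "www.webmasterworld.com")

# POST carries form data in the body. Path and field are hypothetical.
body = b"subject=What+is+HTTP%3F"
post_req = build_request("POST", "/post.cgi", "www.webmasterworld.com", body)

print(head_req.decode("ascii"))
print(post_req.decode("ascii"))
```

The only structural difference is that the POST has a body after the blank line, plus the two headers that describe it.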

HTTP is used by browsers as well as spiders and other robotic programs.

After the request is made, the server will issue an HTTP response:

HTTP/1.1 200 OK
Date: Tue, 13 Feb 2001 21:59:03 GMT
Server: Apache/1.3.4 (Unix) FrontPage/4.0.4.3 PHP/3.0.14
Set-Cookie: lastvisitinfo=[snip]; expires=Thu, 15-Mar-2001 21:59:04 GMT;
Connection: close
Content-Type: text/html

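First comes the status line: the HTTP version the server supports, followed by a status code and message. Common codes include "200 OK", "302 Moved Temporarily" (a redirect), and "404 Not Found".
The Date header is the current server date. The Server header gives the server name and often the installed software. Set-Cookie asks the client to store a cookie, Connection gives the connection type, and Content-Type gives the media type of the document.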
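Status codes are grouped by their first digit, which a short Python sketch can show. The parse_status_line helper and the STATUS_CLASSES table are illustrative names of my own; the grouping itself (2xx success, 3xx redirection, 4xx client error, and so on) comes from the HTTP specification.

```python
# First digit of the status code -> general meaning.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def parse_status_line(line):
    """Split a status line into (version, code, reason, status class)."""
    version, code, reason = line.strip().split(" ", 2)
    code = int(code)
    return version, code, reason, STATUS_CLASSES[code // 100]

for line in ("HTTP/1.1 200 OK",
             "HTTP/1.1 302 Moved Temporarily",
             "HTTP/1.1 404 Not Found"):
    print(parse_status_line(line))
```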

There are many other headers that web servers can send.

If you would like to play around with HTTP headers yourself, you can use Telnet to connect to your server and issue requests. Just telnet to your server on port 80, and issue a request - two returns signify the end of a request:

telnet www.yourhost.com 80
GET / HTTP/1.0
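If you would rather script it than type into Telnet, the same exchange can be sketched in Python with a plain socket. This is only a rough equivalent: raw_get and split_head are hypothetical helper names, the 10 second timeout is an arbitrary choice, and I have added a Host line so that it also works on virtual hosts.

```python
import socket

def raw_get(host, port=80, path="/"):
    """Send a bare HTTP/1.0 GET and return everything the server sends back."""
    # The blank line (two returns) marks the end of the request.
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # an HTTP/1.0 server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)

def split_head(response):
    """Separate the status line and headers from the document body."""
    head, _, body = response.partition(b"\r\n\r\n")
    return head.decode("iso-8859-1"), body

# Example (needs network access; substitute your own server name):
#   head, body = split_head(raw_get("www.yourhost.com"))
#   print(head)
```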

Further reading: HTTP at the W3C:
[w3.org]

GWJ

3:54 pm on Feb 14, 2001 (gmt 0)



Just an aside. I heard HTML 4 is the last recommendation that the W3C is going to make.

Brian

Xoc

7:06 am on Feb 15, 2001 (gmt 0)




That is because HTML is superseded by XHTML. XHTML 1.0 is HTML 4.01 slightly reformulated so that it becomes a subset of XML. You can basically make any page qualify as XHTML by coding your HTML a little more strictly. In particular, all tags must be in lower case. All attribute values must be surrounded by quotes. And all opening tags must have a closing tag. See [w3.org...] for all the details.

But all that is irrelevant, because HTML and HTTP are two separate things. HTTP is what gets the info from one place to another. HTML is how the info can be encoded.

grnidone

8:32 pm on Feb 15, 2001 (gmt 0)



Xoc,

Love that link. How does one do tags that don't have a "partner", i.e. <br>?

Do you do them as an empty tag? <br/>

-G

Xoc

10:09 pm on Feb 15, 2001 (gmt 0)




Use <br />. The space before the / is important. This makes all of the old browsers happy as well as being compliant with XHTML. The old browsers treat the / as the start of an attribute that they don't understand. So they ignore the attribute. It's a hack that works in all cases that I know of, including all versions of Netscape, Internet Explorer, and Lynx.

Be careful about using /> to close tags that *do* have an infrequently used ending tag, such as <meta> or <input>. I have found that some search engines seem to improperly parse the tag and get confused. So I use <meta ...></meta> instead of <meta ... />.
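If you want a quick sanity check of your markup, any XML parser will do as a rough test. The Python sketch below uses the standard library's xml.etree.ElementTree; the is_well_formed helper and the sample fragments are made up for illustration. Note that it only checks XML well-formedness (closed tags, quoted attributes). It does not validate against the XHTML DTD and will not catch upper-case tag names.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

samples = [
    '<p>line one<br />line two</p>',       # empty tag closed with " />"
    '<p>line one<br>line two</p>',         # <br> is never closed
    '<p class=intro>text</p>',             # attribute value not quoted
    '<meta name="a" content="b"></meta>',  # explicit closing tag
]
for sample in samples:
    print(is_well_formed(sample), sample)
```

The first and last fragments pass; the unclosed <br> and the unquoted attribute are rejected.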