Forum Moderators: phranque

Squeezing a webpage into one IP packet

Trying to mimic Google's page load performance

lammert

1:31 pm on Dec 15, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One of the very impressive things about Google is the page load time of its homepage. They achieved that with multiple data centers close to the visitor, but also by making the HTML code of their homepage small enough to fit into one IP packet.

For one of my sites I am trying to do the same. (The packet compression technique, not the multiple data centers :)) I have read somewhere that the maximum size of a data packet on the internet is about 1300 bytes, but I don't know whether this depends on the settings of my network interface (my website runs in a LAMP environment), or whether a page may be split into more packets even when it would fit into one larger packet.

Is there any way I can check how many packets are sent by my webserver for one page, or how many packets my browser receives? My Apache logs give information about the transfer size in bytes per page, but unfortunately not about packet counts.

raedthakur

4:32 am on Mar 5, 2008 (gmt 0)

10+ Year Member



This is something I haven't thought of. So let's say we leave the multiple data centers for some time later and go ahead with the IP packet idea: not only will the code have to be super sleek, but the site would also have minimal functionality, because even a normal index page of a site with minimal content would be more than a single IP packet.

So I don't really see the logical use of making a page like this unless you are creating something to rival Google; in that case you'd better start saving up for the multiple data centers.

lammert

8:18 am on Mar 5, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This is not for a typical e-commerce site :). I want to use this technique on a kind of service site (a question-answer dialogue like a search engine, but not a search engine) where people are expected to interact with the site intensively.

I have no problems if some of the site furniture like external JavaScript or CSS files or small images is loaded at the first visit. I can set expire headers to get those loaded only once. But what I do need is fast response on subsequent requests. Due to the nature of the site, users may be in remote locations on dial-up, and they may lose interest in the site if loading pages takes too long. My goal is a response time of less than 1 second, independent of the location and internet connection of the user.

What I am still looking for is some sort of low level tool to see how many IP packets my webserver is sending out for each page served. Until the project really takes off, the web server is shared with other sites so just counting packets with the iptables firewall is not enough. I need something at a slightly higher level which can distinguish packets based on the domain name they are served from and the MIME type or URL. I could assign a dedicated IP address to this project though to make it somewhat easier.
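In the meantime, a rough estimate is possible straight from the Apache transfer sizes, assuming a typical TCP payload of 1460 bytes per packet on an Ethernet path (this is a sketch, not a substitute for a packet capture — retransmissions, headers, and smaller MTUs are ignored):

```python
import math

MSS = 1460  # typical TCP payload per packet: 1500 (Ethernet MTU) - 20 (IP) - 20 (TCP)

def estimate_packets(transfer_bytes: int, mss: int = MSS) -> int:
    """Rough number of full-size TCP segments needed for one response body."""
    if transfer_bytes <= 0:
        return 0
    return math.ceil(transfer_bytes / mss)

# A page that fits in one packet vs. a typical small page:
print(estimate_packets(1300))   # 1
print(estimate_packets(4500))   # 4
```

Feeding the per-page byte counts from the Apache access log through this gives at least a first approximation per URL.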

jtara

6:56 pm on Mar 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You could use a protocol analyzer, such as Ethereal (now called Wireshark).

But, really, there's no need.

Packets on the Internet are generally limited by the maximum Ethernet MTU of 1500 bytes.

While an IP packet can be as large as 65,535 bytes, packets are fragmented so as to fit into the path MTU. MTU = "maximum transmission unit". The path MTU is simply the lowest of all of the MTUs on all of the links between the end points. In practice it is found with path MTU discovery: packets are sent with the "don't fragment" bit set, and any router whose link has a smaller MTU replies with an ICMP "fragmentation needed" message.

Most if not all IP packets will encounter an Ethernet link somewhere along the way, if nowhere else then at the user's premises. Therefore, for all practical purposes, the maximum size of an IP packet (or, technically, fragment) on the Internet is 1500 bytes.

Subtracting the 20-byte IP header and the 20-byte TCP header, that leaves 1460 bytes for data payload (the TCP maximum segment size, or MSS).

Making allowance for encapsulation (say, the user has a DSL modem that uses PPP encapsulation, or perhaps they are connected from a home PC to a corporate network through an encrypted link) you'd need to knock it down a bit more to account for the encapsulation header(s).

Your 1300 estimate sounds good. 1400 is probably OK, and will handle most encapsulation scenarios.
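The arithmetic behind those numbers can be sketched like this (the 8-byte PPPoE overhead is a typical value, not a guarantee; other tunnels cost more):

```python
IP_HEADER = 20   # bytes, IPv4 header without options
TCP_HEADER = 20  # bytes, TCP header without options

def tcp_payload(link_mtu: int, encapsulation_overhead: int = 0) -> int:
    """Largest TCP payload per packet for a given link MTU and extra encapsulation."""
    return link_mtu - encapsulation_overhead - IP_HEADER - TCP_HEADER

print(tcp_payload(1500))      # plain Ethernet: 1460
print(tcp_payload(1500, 8))   # DSL with PPPoE (8 bytes overhead): 1452
```

Keeping the page a comfortable margin below these figures (hence 1300–1400) leaves room for encapsulations you can't predict.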

Specific packets may be shorter, say, because your web server can't produce data fast enough (unlikely, though).

I wouldn't worry about it. Keep your pages to less than 1300-1400 bytes.

This is probably going to buy you less than you think. Almost all IP stacks today are "windowed", meaning that the sender doesn't require an acknowledgment of each packet before sending the next. They just send away, using a "lazy acknowledgment" scheme, and stop only once "n" packets are in flight without an acknowledgment. Of course, if a packet is lost, they may have to retransmit from that point onward.
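To see what windowing means for a small page, here is an idealized sketch of how many round trips a response takes under classic TCP slow start, assuming an initial congestion window of 3 segments that doubles every round trip with no losses (an assumption; real stacks vary):

```python
def round_trips(num_segments: int, initial_window: int = 3) -> int:
    """Round trips to deliver num_segments under idealized slow start:
    the window starts at initial_window segments and doubles each round."""
    trips, window, sent = 0, initial_window, 0
    while sent < num_segments:
        sent += window   # send a full window this round trip
        window *= 2      # slow start: window doubles per round
        trips += 1
    return trips

print(round_trips(1))   # a one-packet page: 1 round trip
print(round_trips(10))  # 3 + 6 segments after two rounds, so: 3
```

So a one-packet page and a three-packet page cost the same single round trip under these assumptions; the real savings of aggressive trimming show up only once a page would otherwise spill into an extra window.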

If you need to be truly responsive, read up on the "Nagle Algorithm". But that likely won't be of interest unless you are trading stocks in milliseconds, with a fiber connection to the exchange. (Something I've done programming for.)

One second? Piece of cake!

fabricator

3:15 am on May 15, 2008 (gmt 0)

10+ Year Member



Apply gzip compression to the HTML, using Perl/PHP. Check the Accept-Encoding request header to ensure the web client supports it.

It's not hard to make a webpage smaller if you dump all the whitespace and newline characters.
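A minimal sketch of that idea in Python (the whitespace stripping and header check are illustrative; a real LAMP setup would do this in PHP or let Apache's mod_deflate handle it):

```python
import gzip
import re

def prepare_body(html: str, accept_encoding: str) -> tuple:
    """Strip whitespace between tags, then gzip if the client allows it.
    Returns (body_bytes, value for the Content-Encoding response header)."""
    minified = re.sub(r">\s+<", "><", html).strip().encode("utf-8")
    if "gzip" in accept_encoding.lower():
        return gzip.compress(minified), "gzip"
    return minified, "identity"

html = "<html>\n  <body>\n    <p>hello</p>\n  </body>\n</html>"
body, encoding = prepare_body(html, "gzip, deflate")
print(encoding)                # gzip
print(gzip.decompress(body))   # round-trips to the minified HTML
```

Note that gzip adds a fixed header of its own, so for pages already well under one packet the minification alone may matter more than the compression.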