Nuts and Bolts Tips

How many of these do you know?

         

GoogleGuy

6:07 am on Jul 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Okay, I'm waiting for something to finish compiling, so I thought I'd write up a few webmaster tips. Most of these are on our help pages somewhere, but not many people know all of these tidbits. So here goes:

Tip #1: Use If-Modified-Since (IMS). IMS lets your webserver tell Googlebot whether a page has changed since the last time the page was fetched. If the page hasn't changed, we can re-use the content from the last time we fetched that page. That in turn lets the bot download more pages and save bandwidth. I highly recommend that you check to see if your server is configured to support If-Modified-Since. It's an easy win for static pages, and sometimes even pages with parameters can benefit from IMS.
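
As a rough illustration of what IMS support means on the server side, here is a minimal Python sketch of the decision a server makes when a request carries an If-Modified-Since header. The function name conditional_status and the example dates are hypothetical; most web servers already do this for static files, so treat it as a picture of the logic, not as something you would normally write yourself.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(ims_header, last_modified):
    """Pick an HTTP status for a GET request.

    ims_header    -- raw If-Modified-Since header value, or None
    last_modified -- timezone-aware datetime of the page's last change
    Returns 304 (Not Modified) or 200 (send the full page).
    """
    if not ims_header:
        return 200  # no conditional header: send the full page
    try:
        since = parsedate_to_datetime(ims_header)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    if since.tzinfo is None:
        # Treat a date without zone information as UTC (an assumption).
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200
```

A 304 response carries no body, which is where the bandwidth saving comes from: the crawler keeps the copy it already has.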

Tip #2: You can use wildcards in robots.txt, and patterns can end in '$' to indicate the end of a name. So if you don't want Googlebot to fetch any PDF files, for example, you could say
Disallow: /*.pdf$
Don't forget that in the robots.txt file, all URL patterns need a leading "/" anchor to be valid. Leaving it out is a pretty common webmaster error (maybe the most common robots.txt mistake), so keep it in mind and save yourself some angst. :)
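
To see how a pattern like the one above behaves, here is a small Python sketch that turns a robots.txt pattern into a regular expression. This is only one plausible reading of the rules described in the tip ('*' matches any run of characters, a trailing '$' pins the end of the name); it is not Googlebot's actual matcher, and the function names are made up for illustration.

```python
import re


def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]  # strip the end-of-name marker
    # Escape literal pieces, then join them with "any run of characters".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Without '$' the pattern acts as a prefix; with it, the path must end there.
    return re.compile(body + (r"\Z" if anchored else ""))


def is_blocked(path, disallow_pattern):
    """True if the path matches the Disallow pattern from its first character."""
    return robots_pattern_to_regex(disallow_pattern).match(path) is not None
```

Under this reading, is_blocked("/docs/report.pdf", "/*.pdf$") is true, while a URL with something after ".pdf" (a query string, say) is only caught if you drop the '$'.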

Tip #3: Googlebot also permits an "Allow" directive in robots.txt. This lets you specifically flag areas that are okay to crawl. When there are two directives that could apply, we follow the longest (i.e. most specific) directive. See
[google.com...]
for an example.
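
The "longest directive wins" rule can be sketched in a few lines of Python. This assumes plain prefix matching with no wildcards, and it assumes that the first rule wins when two matching rules have the same length (the tip doesn't say what happens in a tie). The paths and the function name is_allowed are hypothetical.

```python
def is_allowed(path, rules):
    """Decide whether a path may be crawled.

    rules is a list of (directive, prefix) pairs such as
    ("Disallow", "/private/"). The longest matching prefix decides;
    a path that matches no rule is allowed.
    """
    best = None
    for directive, prefix in rules:
        if path.startswith(prefix):
            # Keep the longest (most specific) matching rule seen so far.
            if best is None or len(prefix) > len(best[1]):
                best = (directive, prefix)
    return best is None or best[0].lower() == "allow"


rules = [
    ("Disallow", "/private/"),
    ("Allow", "/private/press/"),
]
```

With these rules, "/private/press/kit.html" is crawlable because the Allow prefix is longer, while "/private/notes.html" stays blocked.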

Tip #4: Avoid session IDs. If you can, use fewer dynamic parameters and stay away from the parameter "id=" in URLs--Googlebot tries to stay away from things that might be session IDs.
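
If you want to audit your own URLs for this, a short Python sketch like the following lists the query parameters that are named exactly "id" and, separately, those that merely contain "id". Nothing in the tip says which kind of matching Googlebot applies, so the exact/partial split is only there to help you review your URLs; the function name and example URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def id_like_params(url):
    """Return (exact, partial) lists of query parameter names.

    exact   -- parameters named exactly "id" (case-insensitive)
    partial -- other parameters whose name contains "id"
    """
    query = urlsplit(url).query
    names = [name for name, _ in parse_qsl(query, keep_blank_values=True)]
    exact = [n for n in names if n.lower() == "id"]
    partial = [n for n in names if "id" in n.lower() and n.lower() != "id"]
    return exact, partial
```

For a URL such as "http://example.com/jobs.php?jobid=42&page=2" the first list is empty and the second contains "jobid".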

Tip #5: Make sure that you can reach every page on your site with a text browser like lynx. That's the best way to make sure that a spider can follow links to all of your pages. Site maps can be a really good way to help users and spiders get down into different parts of your site.
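
If you don't have lynx handy, a rough stand-in is to list the plain links in a page's HTML, since those are what a text browser can follow. The Python sketch below uses only the standard library and collects href values from anchor tags; links that exist only in script handlers won't show up. It is an approximation of what a simple spider sees, not a description of how Googlebot parses pages, and the names are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in the order they appear."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def plain_links(html):
    """Return the list of anchor hrefs found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links
```

Run it over each page and compare the result with your list of pages: anything that never appears in any page's output is probably hard for a spider to reach.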

Some of these tips work mainly with Googlebot, but I hope that they help. Anybody else with nuts and bolts tips for site architecture, crawling, or robots.txt--throw 'em in! :)

kamikaze Optimizer

8:42 pm on Jul 26, 2003 (gmt 0)

10+ Year Member



For tip number 4, does Google look explicitly for the "id=" parameter, or does it also use partial matching? For example, if I use a parameter like "jobid=", am I asking for trouble?

ALARM BELLS Urgent mass variable changing...

I posted about something closely related last night (I had not read this thread at the time): [webmasterworld.com...], posts 114, 116, 118, 120 and 123. I was told that I was wrong, but I see it happening.

kamikaze Optimizer

8:51 pm on Jul 26, 2003 (gmt 0)

10+ Year Member



In addition to that, I post on a forum where the posts themselves had a PR7 yesterday (the site has a PR9), and today the posts are PR1. So much for the anchor links in that sig... LOL

The URL looks like this: php?t=

Sticky me for more info.
