
Tips for Helping Google Index Dynamic Pages



2:15 am on Nov 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

4 important concepts for making sure dynamic pages are indexed:

1. All pages have a unique and descriptive title
2. The pages don't have an excessively long URL. Once you go above about 50 characters, the chance of the page being indexed declines rapidly. Over 90, almost never.
3. The page is not deep in a directory such as: domain.com/i-am-great/why/reasons/press-releases/he-is-great.php. Better: domain.com/he-is-great.php
4. There are several links to the page. In other words, you should have good site navigation.
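On points 2 and 3, if you're on Apache you can often get the short, shallow URL without moving any files. A minimal mod_rewrite sketch (the file names here are just made up to match the example above):

```apache
# .htaccess -- hypothetical example: serve a short, top-level URL
# from a script buried deep in the directory tree
RewriteEngine on

# visitor (and spider) requests domain.com/he-is-great.php,
# the deeply nested script actually handles it
RewriteRule ^he-is-great\.php$ /i-am-great/why/reasons/press-releases/he-is-great.php [L]
```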

Any other tips you would add?


2:24 am on Nov 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

This sort of goes along with what you were saying, but every &, ?, and = added to the URL seems to decrease the chances of it being indexed.


9:39 am on Nov 8, 2002 (gmt 0)

10+ Year Member

Wouldn't the best way be to camouflage the pages as static by putting the arguments in the file name? That way the SEs won't realise that the pages are dynamic.

It would probably also give URLs that are more easily understood by the normal user.



5:17 pm on Nov 8, 2002 (gmt 0)

10+ Year Member

Build a site map that allows a user to click through every page.

It worked well for my 20,000 page site.


5:25 pm on Nov 8, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

Yup - I can confirm the sitemap; it works like a charm.


5:55 pm on Nov 8, 2002 (gmt 0)

How the heck do you build a site map with 20,000 links? Is this only for the spider to see?


6:02 pm on Nov 8, 2002 (gmt 0)

WebmasterWorld Administrator rogerd is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Welcome to WebmasterWorld, Respree... I was wondering about the 20K links, too... surely a multi-page, hierarchical site map would be needed for that many links?


6:09 pm on Nov 8, 2002 (gmt 0)

10+ Year Member

If you have several variables in your page URL like:

Then try making a site map with only one variable in the URL. In the May '02 update our three-variable pages were dropped from the index (which was most of our site), so I built a site map with one-variable URLs. It was a little difficult and some info on each page was missing, but it did get the pages back into the index.

Also make sure you have NO REDIRECTS in your site navigation. Many off-the-shelf programs will redirect to different pages if variables are missing from the URL. For example, if one category of your site has no subcategories, it might redirect to the content page after looking for subcategories. If there are subcategories, the page is shown with the subcategories listed.

Good: Home -> category -> sub-category list -> content
Bad: Home -> category -> redirect -> content


12:39 pm on Nov 9, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

With ColdFusion you can get this to auto-generate for you by including query results on a dynamically generated page without any ?'s or &'s. I don't know about ASP, .NET or PHP, but you can probably do something similar.

So your sitemap would look like this:

widgets/fuzzybluewidgets/ - links to each individual widget
widgets/fuzzygreenwidgets/ - links to each individual widget
widgets/nonfuzzybluewidgets/ - links to each individual widget
widgets/nonfuzzygreenwidgets/ - links to each individual widget

All you would have to set up would be the widgets page and the subcat one down from that.
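For anyone on PHP rather than ColdFusion, the same auto-generated sitemap is only a few lines. A minimal sketch, assuming a hypothetical `widgets` table with `id`, `name` and `category` columns (all names here are illustrative):

```php
<?php
// sitemap.php -- hypothetical sketch: print every widget as a plain
// <a> link so the spider can reach each one from a single page.
// Table, column and connection details are made up for illustration.
$db = mysql_connect('localhost', 'user', 'pass');
mysql_select_db('shop', $db);

$result = mysql_query('SELECT id, name, category FROM widgets ORDER BY category, name');
while ($row = mysql_fetch_assoc($result)) {
    // one-variable URL, per the advice earlier in the thread --
    // no extra &'s or ='s for the spider to choke on
    printf("<a href=\"/widget.php?id=%d\">%s</a><br>\n",
           $row['id'], htmlspecialchars($row['name']));
}
?>
```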


1:57 am on Nov 10, 2002 (gmt 0)

I'm very interested in this as well. We used to use Infopop, mainly due to its HTML output for the search engines... we switched to vBulletin for many reasons, but with 10-20 times as many messages, little if any of the content is indexed. I tried a hack that vB'ers created to make archives available, involving 404 pages, but Google didn't bite at all.

I read that index.php?threadid=nnn should work for Google, but no love...

I just spent the morning creating a hack that would allow me to output all the content in a way that each post would show up as:


where 1 = postid... but after all that, I got stopped at post 35,000 or so because it looks like Linux won't allow more than 35,000 subdirectories.

The rewrite trick isn't working for me, which I suppose is just as well, as I read somewhere that mod_rewrite sends out a certain header and that Google can detect it...

This is an extremely annoying problem.


4:58 am on Nov 10, 2002 (gmt 0)

Also, look into Apache mod_rewrite if you are using an Apache Server.



6:04 am on Nov 10, 2002 (gmt 0)

Yah, well, the line right at the top of the page sums it up:

"The great thing about mod_rewrite is it gives you all the configurability and flexibility of Sendmail. The downside to mod_rewrite is that it gives you all the configurability and flexibility of Sendmail."

I found that if the rewrite rules didn't stop the HTTPD daemon from loading altogether, they still didn't do what I expected them to... I had similar experiences with Sendmail :)


9:21 am on Nov 10, 2002 (gmt 0)

10+ Year Member


where 1=postid...but after all that, I got stopped at post 35,000 or so because it looks like Linux won't allow more than 35,000 subdirectories..

Have you tried making URLs like domain.com/1_index.php to avoid creating thousands of directories?
I don't know if it would be better...
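If the post ID is in the filename like that, a single rewrite rule can map every such URL back to the real script. A sketch, assuming the forum script is something like showthread.php (names are illustrative):

```apache
# .htaccess -- hypothetical sketch:
# domain.com/12345_index.php  ->  showthread.php?postid=12345
RewriteEngine on
RewriteRule ^([0-9]+)_index\.php$ /showthread.php?postid=$1 [L]
```

One flat "directory" of fake static pages, no subdirectories created at all.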


12:22 pm on Nov 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

GilberZ - Stickymail me the URL of the PHP site in question. Usually I can spot something in two minutes on the front page of a site that's keeping Google from crawling it - it's usually just a silly coding error. Google right now LOVES ASP- and PHP-driven sites, if they are done right.



6:35 pm on Nov 10, 2002 (gmt 0)

Thanks G! That was a very nice offer..I sent you a sticky!

BTW, I just read an article interviewing Brett. Geeez! You wrote this forum code yourself? Awesome job Brett!

You should resell it! I'm very happy with vbulletin and have no plans to switch, but this is certainly marketable code...


11:19 pm on Nov 11, 2002 (gmt 0)

10+ Year Member

The key that I have found is to use clever coding.
Many of the sites I have seen make good use of PHP, .htaccess and $QUERY_STRING.

The key to this is having structured URLs... for example:


Of course Google sees this as a structured site with keyword-rich URLs.

The script could do a $key = explode("/", $QUERY_STRING); (following me here, PHP guys?) - this would then mean that $key[1] == 12343223, which can be used to output your content accordingly.

This isn't the best explanation, but by utilising Apache 404 error handlers (and forcing the header to "200 OK") you trick Google into fetching the big URLs (without the query strings).

You can use this as a search function (php.net utilises this for something such as http://www.php.net/explode, although they then header-redirect).

This isn't the best explanation in the world, but I've just come back from the pub :-p
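A minimal sketch of the explode() trick described above. Whether the requested path actually lands in the query string depends entirely on how your Apache ErrorDocument is set up, and the array index depends on your URL layout - treat everything here as illustrative:

```php
<?php
// 404-handler sketch: split a "structured" URL like
//   /widgets/12343223/fuzzy-blue-widget
// into its pieces and serve real content for it.

// the post above uses a bare $QUERY_STRING, which relies on
// register_globals; $_SERVER['QUERY_STRING'] is the safer spelling
$key = explode('/', $_SERVER['QUERY_STRING']);
$id  = (int) $key[1];   // adjust the index to match your URL structure

// tell the spider this is a real page, not an error --
// PHP needs the full status line to override Apache's 404
header('HTTP/1.0 200 OK');

// ...look up $id and output the page content here...
?>
```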


12:28 am on Nov 12, 2002 (gmt 0)

I've done the 404 thing but Google isn't biting... perhaps GG can confirm, but it looks like Google detects the 404 error code and therefore doesn't index it?


11:28 am on Nov 12, 2002 (gmt 0)

10+ Year Member

Did you force the header to 200 OK? If you force it to 200 OK then Google has no way of telling ( header("HTTP/1.0 200 OK"); - PHP needs the full status line, a bare header("200 OK") won't replace it ).
If you haven't, then Google will just 404tastic you and reduce your ranking for poor linkage.


7:29 pm on Nov 12, 2002 (gmt 0)

I used a standard hack... I'll have to check the code...

BTW, is there a way to tell from the browser or some other method what the header is?

Grumpus also mentioned to me in a sticky that I have some no-cache headers, which is just vBulletin stuff - could that also have an effect on Google?


7:33 pm on Nov 12, 2002 (gmt 0)

10+ Year Member

If you have something like header("Pragma: no-cache") I believe Google would not index it (I'm not sure if my header syntax is 100% correct in this case).


7:33 pm on Nov 12, 2002 (gmt 0)

WebmasterWorld Administrator jatar_k is a WebmasterWorld Top Contributor of All Time 10+ Year Member

You can use this to check headers

Server Header Checker [searchengineworld.com]
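If you have shell access, you can also look at the raw headers yourself. A quick sketch using curl (substitute your own URL; telnet to port 80 works just as well):

```shell
# -I sends a HEAD request and prints only the response headers,
# so you can see the status line and any Pragma/Cache-Control headers
curl -I 'http://www.example.com/index.php?threadid=123'

# the manual alternative:
#   telnet www.example.com 80
#   HEAD /index.php?threadid=123 HTTP/1.0
#   (blank line, then read the headers)
```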


7:51 pm on Nov 12, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

I don't believe the 'Pragma: no-cache' header will stop Google from indexing you at all.

I know at least one of my clients has this in the header of every single page (some 1k pages) and all of them are indexed.


8:04 pm on Nov 12, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

What can one do to change or manipulate the server header?


8:09 pm on Nov 12, 2002 (gmt 0)

10+ Year Member

Could anybody tell me where I can find complete explanations of the different ways to solve the problem of dynamic URLs in SEO (mod_rewrite, 404, etc.)?
Thanks a lot.


9:00 pm on Nov 12, 2002 (gmt 0)

It turns out that I have one domain pointing to another... although in most cases it doesn't do this, for some reason the hack was sending a redirect to the domain in the httpd.conf, so there was no direct backlink to those pages - only to the redirect. Could this have caused the problem?


12:57 am on Nov 13, 2002 (gmt 0)

10+ Year Member

Purely personal take on all this based on observations and things I think just make sense.

Rule 1:
Keep your URL variables in a structure so that any particular URL will always (or as often as possible) return the same content. When Google returns to the page it expects to see similar, or at least on-topic, content - not an error or 404. So "once only" URL strings are bad.

Rule 2:
Each page should have a unique URL, so session identifiers are very bad. If Google visits the links:

something.php?sessionId=465 and

and the same content is presented for each request, I *think* it will get fed up with your site pretty quickly.

Fuzzy stuff:
Short and descriptive is probably good for users, but I don't think Google cares too much (going overboard with length will probably have a negative impact, but I don't have proof). I have a site that uses URL variables in the form something.php?x=123,456,789,123 (kind of like Vignette does/did). This is meaningless to users, but Google doesn't seem to mind it at all; this site does, however, conform to rules 1 & 2.

There are ways of "hiding" URL variables; one way I have used is something.php/page/contact_us. This works by tricking the user agent into thinking that contact_us is a folder, while the web server knows that something.php is the file being requested. You then grab the query string environment variable and strip the variable name/value pairs off the end of it (in this case page=contact_us). How you do this depends on what you are using in terms of a web server and scripting language (do a search on "search safe urls php", or whatever your language is). Anyway, I think this is rapidly becoming redundant and probably not worth worrying about. Personally I don't worry about hiding the "?" but concentrate on rules 1 & 2, and things seem to be fine.

Another quick thing about that "hiding" method: the user agent thinks it's looking at the contact_us folder, so you must use absolute paths for all link and image references, which can be a pain, especially if you are trying to implement this on an existing site.
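A sketch of that hiding method in PHP. The post mentions grabbing an environment variable; on many Apache setups the trailing path arrives in PATH_INFO rather than the query string, which is what this sketch assumes (all page and template names are made up):

```php
<?php
// something.php -- requested as /something.php/page/contact_us
// Apache runs this script and puts "/page/contact_us" in PATH_INFO.
$parts = explode('/', trim($_SERVER['PATH_INFO'], '/'));
// $parts is now array('page', 'contact_us')
$page = ($parts[0] == 'page' && isset($parts[1])) ? $parts[1] : 'home';

// dispatch on the fake "folder" name -- templates are hypothetical
switch ($page) {
    case 'contact_us':
        include 'templates/contact_us.inc';
        break;
    default:
        include 'templates/home.inc';
}
?>
```

And per the caveat above: because the browser thinks it is inside /something.php/page/, every relative link and image path would resolve wrongly - hence absolute paths.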

Anyway, hope that's clear. While it's not gospel, on the sites I've been looking at for the last 6 months it seems to work well. Maybe Brett_T or GG can confirm?


8:26 am on Nov 13, 2002 (gmt 0)

10+ Year Member

Good point about absolute URLs - I always neglect to mention that issue.


5:58 pm on Nov 13, 2002 (gmt 0)

How about a <base href=""> rather than absolute URLs?


6:29 pm on Nov 13, 2002 (gmt 0)

10+ Year Member

Yeah, <base> would work... but why pass Go without collecting £200? Best to use absolute URLs, IMHO.


7:56 pm on Dec 2, 2002 (gmt 0)

10+ Year Member

I get around this on my site by using cleverly coded .htaccess files. For example, I have PHP files that operate like news.php?id=54 and tune.php?id=89, but my URLs are, for example, [djism.com...] which makes it look like there is an individual page for each item.

This is done by the following (hopefully fool-proof) code:

Options +FollowSymlinks
RewriteEngine on
RewriteRule ^news/(.*)\.php$ news.php?id=$1 [L]

In the same way, I redirect images to GD to be thumbnailed:

RewriteRule ^images/news/small_(.*)$ [gdfile].php?src_img=images/news/$1 [L]

I would not advise using the '404' method, as this could potentially return '200 OK' when a genuine 404 has been encountered. A good example of 404 usage is the PHP.net site, where it will redirect to the closest page or, if it can't find one, send the URL to the search page.

