And do you have any experience of whether G needs several crawls before it fully indexes dynamic pages?
What I'm asking is:
should I change the dynamic pages to static (item-64.html-type URLs), or just wait for another crawl and an update?
Googlebot and friends may make an educated guess as to whether a page is being dynamically generated; URIs which have "?id=XYZ" in them are a bit of a giveaway, but there are lots of ways to disguise this, mod_rewrite being one of them.
If your pages are shown in the index, IMHO it means Google intends to list them but hasn't got round to fetching data for the snippets; in Google's datacentres the snippet info comes from a different index to the one used for the search results themselves.
My $0.02? Wait a couple of weeks and see what happens before you redesign your site ;-)
Would you say that static pages are still preferable to dynamic pages?
...err, maybe I didn't make myself clear.
You can't really tell whether a page is static or dynamic.
I have servers which serve pages that appear to be static - www.example.com/keyword.html - but actually it's mod_rewrite in the background, and there's a content management system too, so after mod_rewrite's done its magic, the server gets a request for www.example.com/cms.php?search=keyword and the CMS kicks in and serves that page from a MySQL database.
As a user, or a Googlebot, you cannot tell that this is going on.
If you're designing from the ground up, then doing things this way is probably a good idea, because it's always best to abstract the technologies in use from the end users. In fact, the W3C recommend doing without extensions at all - they suggest using URIs such as www.example.com/keyword
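For what it's worth, here's a minimal .htaccess sketch of that setup - cms.php and the "search" parameter are just placeholders for whatever your CMS actually uses:

RewriteEngine On

# /keyword.html -> internal request for /cms.php?search=keyword
RewriteRule ^([a-z0-9-]+)\.html$ /cms.php?search=$1 [L,QSA]

# Or go extensionless, as the W3C suggest: /keyword -> /cms.php?search=keyword
# (skip real files so images, CSS etc. still get served directly)
# RewriteCond %{REQUEST_FILENAME} !-f
# RewriteRule ^([a-z0-9-]+)$ /cms.php?search=$1 [L,QSA]

Because the rewrite happens server-side, the user - and Googlebot - only ever sees /keyword.html.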
No-one can really tell whether Google prefers pages that appear to be static or pages that appear to be dynamic, but if I had the choice I'd rather serve my users content that is as easy to use as possible - and I'd say
www.example.com/keyword.html
beats
www.example.com/fetchpage.aspx?mode=search&num=100&query=doitrealquick&valueofpi=4
any day.
Which would you click on?
P.S.
Google and other SEs might have problems when indexing dynamic pages, and they prefer static ones:
Google Guidelines:
If you decide to use dynamic pages (i.e., the URL contains a "?" character), be aware that not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them few.
Serving static pages that actually are dynamic is another thing. I can do this with my blogging software, but I have noticed that Google somehow picks up both the dynamic and the static (via .htaccess) page. This could trigger some duplicate content filter - maybe?
If there is a query string, that URL can be considered dynamic.
Does this get all dynamic pages? No, but it gets a lot of them.
Then it could look at the file extension.
.cgi, .pl, .php, .jsp, .asp ... they all reek of being dynamic, so they go on the pile with the query-string pages as dynamic.
Does this get them all? Nope, but it sure gets one huge pile of them.
There are other methods that involve probing, but the above is enough to start the ball rolling (a rough sketch of this two-pass check is below).
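Just to make that concrete, here's a rough PHP sketch of the two-pass check - the function name and extension list are mine, not anything the engines have published:

<?php
// Hypothetical heuristic: query string first, then script-style file extensions.
function looksDynamic($url)
{
    $parts = parse_url($url);

    // Pass 1: any query string => consider the URL dynamic.
    if (!empty($parts['query'])) {
        return true;
    }

    // Pass 2: extensions that reek of being dynamic.
    $dynamicExtensions = array('cgi', 'pl', 'php', 'jsp', 'asp');
    $path = isset($parts['path']) ? $parts['path'] : '';
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));

    return in_array($ext, $dynamicExtensions, true);
}

var_dump(looksDynamic('http://www.example.com/index.php?itemid=64')); // true - query string
var_dump(looksDynamic('http://www.example.com/cms.php'));             // true - extension
var_dump(looksDynamic('http://www.example.com/item-64.html'));        // false - the rewrite hides the CMS
?>

It misses anything hidden behind a rewrite, of course, which is exactly the point made above.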
Erku,
Update frequency appears to be dependent on page PR (those pesky IBLs).
Google may get some pages once a day and some once a month.
[news.google.com...]
Google Images - dynamic
Google Groups - dynamic
Google News - dynamic
Froogle - dynamic
etc., etc., ad nauseam
Maybe it isn't so much the fact that the pages are dynamic, but rather the quality and uniqueness of the content.
Something to think about anyway.
But Google is able to tell which is which to some degree.
The rate at which the bots spider pages on your site has a large impact on how your site shows up.
Using the same script under a rewrite rule set will get pages fully indexed at a faster rate than the plain dynamic version.
This has been repeatedly mentioned on this site, but I'll add yet another example.
I made a page set static, and the static stuff was fully indexed in less than a week, whereas the dynamic pages took over 5 weeks to get less than 50% of them indexed.
There was less than 50% similarity between any two pages in the set just based on word occurrences, let alone relative word placement. So duplicate content should be a non-issue.
In addition, the pages, when generated via the scripts, all eventually became fully indexed. It just took what seemed like forever.
Every site has a different rate at which it gets loving attention by the bots.
So the only real test is to do it both ways on the same site.
It is all in the IBLs.
I think you misunderstood?
I said I have a blog that by default generates dynamic pages like this:
index.php?itemid=64
When I make them search engine friendly they look like this:
item-64.html
Now I have two pages that are identical, and if Google indexes both pages there might (?) be a duplicate content penalty.
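For what it's worth, the fix I'm planning to try is to 301 the raw dynamic URL to the clean one, so only one version should stay in the index. A rough .htaccess sketch (assuming Apache, and that index.php?itemid=N is the only dynamic form):

RewriteEngine On

# Serve /item-64.html from the blog script internally.
RewriteRule ^item-([0-9]+)\.html$ /index.php?itemid=$1 [L]

# 301 direct requests for the raw dynamic URL to the clean one.
# Matching THE_REQUEST (the original request line) avoids a redirect loop.
RewriteCond %{THE_REQUEST} \?itemid=([0-9]+)
RewriteRule ^index\.php$ /item-%1.html? [R=301,L]

With the redirect in place, Google should only ever keep the item-64.html version.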
webdude, I am saying that, other things being equal, Google seems to like the "static" version better for spidering and indexing than the one with the query string in the URL.
Others on the forum have commented on this many times.
Until I did a page set myself, I was only willing to concede that it was possible. I've seen enough now to say they are probably correct. Then again, maybe Google was just having a good bot day on the site.
It should make zero difference; however, "could have", "should have", "did have", and "will have" can be, and frequently are, different.