Google following .htm filename with query string

         

djtaverner

12:23 pm on Mar 1, 2004 (gmt 0)

10+ Year Member



Hey,

I am using a site copier to create a mirror of my .php site, converting all query-string pages into separate .htm pages. But in the process it generates anchor tags like this:

<a href="index3773.html?action=displaycat&catid=198&customheader=custom-header198">

I know Google doesn't like long query strings, but is it okay if the query string follows a .html filename?

Cheers

Dave Taverner

mlemos

3:41 am on Mar 2, 2004 (gmt 0)

10+ Year Member



You got it wrong. Google has nothing against query strings. Google will just slow down crawling your site if its pages take too long to serve, as when they are generated dynamically with slow database queries or some other heavy task.

The reason for this is that Google does not want to cause trouble for your hosting by putting it under heavy load with bursts of crawling requests.

AFAIK, all the rest of the theories regarding dynamic pages are just myths.

wanderingmind

6:58 am on Mar 2, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not a myth!

Watch out for that long query string, big boy. Query strings that long have got me into big trouble with Google.

We are currently in the process of reformatting our URLs so they contain only one = and one ?.

This is an experiment, and if it succeeds I will let you know. Our dynamic sites that use the simpler query string format by default have been indexed, but the ones with three =s in them, no way. This is the only reason we can think of. In one discussion about dynamic URLs, where someone had shown a good URL format and a bad URL format (one with three =s), Googleguy agreed that the example was right. So do not touch any URL format with three =s in it. If anyone knows of such a URL that has been indexed by Google, please mail me.

stargeek

7:02 am on Mar 2, 2004 (gmt 0)

10+ Year Member



In my experience, short dynamic URLs (two GET vars or fewer, 20 characters total on average) are spidered OK by Google, just at a slower rate (the highest I've seen was one a minute or so) than static pages.
Google reps have defined their understanding of a dynamic page by the presence of a question mark in the URL.
The only long URL I've seen Google have trouble with is one carrying PHP's session ID variable, but these links present duplicate content on different URLs, and Google should ignore them.
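
For anyone who wants to audit their own links against the rules of thumb above, here is a minimal sketch in Python (chosen only for illustration; the same logic is easy to write in PHP). The function names are made up for this example, and the checks simply count GET vars and look for a question mark. They say nothing about how Google actually decides what to crawl.

```python
from urllib.parse import urlsplit, parse_qsl


def count_get_vars(url):
    """Return the number of query-string parameters in a URL."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "a=&b=1" still counts as two variables
    return len(parse_qsl(query, keep_blank_values=True))


def is_dynamic(url):
    """Treat a URL as dynamic if it contains a question mark."""
    return "?" in url


# The link from the first post in this thread
url = "index3773.html?action=displaycat&catid=198&customheader=custom-header198"
print(count_get_vars(url), is_dynamic(url))  # prints: 3 True
```

Run over a list of hrefs from a mirrored site, this would flag every link that still carries a query string, regardless of the .html extension in front of it.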

Nipsy

7:05 am on Mar 2, 2004 (gmt 0)

10+ Year Member



I eliminated all evidence of a query string, or dynamic content, and watched Googlebot return with vigor.

BTW, use mod_rewrite, don't make a copy of your site. It is gonna kill you to maintain both.

A query string may not prevent spidering, but it certainly slows complete inclusion, and perhaps deep crawling.
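
To make the rewrite idea concrete, here is a rough sketch of the mapping a rewrite rule performs, written in Python rather than Apache configuration, so it is not mod_rewrite syntax. The /cat/<id>.html pattern and the index.php target are assumptions made up for this example; only the action and catid parameter names come from the link in the first post.

```python
import re
from typing import Optional

# Hypothetical rule: /cat/<catid>.html -> index.php?action=displaycat&catid=<catid>
RULE = re.compile(r"^/cat/(\d+)\.html$")


def rewrite(path: str) -> Optional[str]:
    """Map a static-looking path to the internal dynamic URL, or None if no rule matches."""
    match = RULE.match(path)
    if match is None:
        return None
    return "index.php?action=displaycat&catid=" + match.group(1)


print(rewrite("/cat/198.html"))  # prints: index.php?action=displaycat&catid=198
```

The visitor and the spider only ever see /cat/198.html; the query string exists only on the server side, so there is one site to maintain instead of two.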

stargeek

7:10 am on Mar 2, 2004 (gmt 0)

10+ Year Member



Using ForceType is another easy way to accomplish this, as is simply having a PHP function parse the request URI, as long as it doesn't contain a question mark.
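
As a sketch of the "parse the request URI" approach, here is one way it could look, shown in Python for illustration (a PHP version would read the request URI from the server environment instead of taking it as an argument). The /catalog script name and the key/value/key/value path layout are assumptions for this example, not anything the original site uses.

```python
from urllib.parse import urlsplit, unquote


def parse_request_uri(uri, script="/catalog"):
    """Turn /catalog/key1/value1/key2/value2 into a dict of parameters."""
    path = urlsplit(uri).path
    if not path.startswith(script + "/"):
        return {}
    # Drop the script name, split on slashes, and skip empty segments
    segments = [unquote(s) for s in path[len(script):].split("/") if s]
    # Pair segments as key, value; a trailing key with no value is dropped
    return dict(zip(segments[0::2], segments[1::2]))


print(parse_request_uri("/catalog/action/displaycat/catid/198"))
# prints: {'action': 'displaycat', 'catid': '198'}
```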

wanderingmind

7:49 am on Mar 4, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I apologise.

I just got my URLs with three variables - pretty complex by Google standards - crawled by Googlebot. Didn't think this would happen. The URL is more complex than what Googleguy said was a 'good URL'.

dhaliwal

8:46 am on Mar 4, 2004 (gmt 0)

10+ Year Member



Well, if you are making .htm pages, then why don't you take the output of each page, save it as an .htm file, and put it on the server?

It takes a full day for my website, which has 600 pages, but it's worth it if I can get traffic from that.

Google hasn't crawled my static website yet, but I am sure that whenever it does I will get better rankings.

lol

enjoy

dhaliwal