Mod_Rewrite causing Google penalty

Bot getting wiser?

         

internetheaven

2:34 pm on Jul 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So far we know that Googlebot now actively seeks out JavaScript. There are many theories on the subject, but the majority feel it is Googlebot's way of finding 'spam/black hat' techniques.

Mod_rewrite is a very popular way to make dynamic URLs look static, in order to encourage crawling by Googlebot and gain a decent PageRank.

In my mind this is deceptive practice, you're making your site look like something it's not to Googlebot. My mod_rewrite is in the PHP code and the .htaccess file, and I find that my users prefer it to the long strings of parameters in regular dynamic URLs. My question, though: could Googlebot be trained to recognise mod_rewrite, and do you think Google will see it as a black hat practice?

ExpLarry

8:55 pm on Jul 12, 2004 (gmt 0)

10+ Year Member



mod_rewrite is a server-side thingy and as such should be totally transparent to the client. I can't see any way this could be black hat, unless the mod_rewrite is used as part of a cloaking operation (i.e. different pages for the bots).
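To make the "transparent to the client" point concrete, here is a rough sketch in Python rather than real Apache configuration. The rule, the paths (/products/<id>, /product.php) and the function names are all made up for illustration. The point is only that the mapping happens on the server and does not look at who is asking.

```python
import re

# Hypothetical rule: /products/<id> is served internally by /product.php?id=<id>.
RULE = re.compile(r"^/products/(\d+)/?$")


def internal_target(request_path):
    """Return the script path the server would run for request_path."""
    match = RULE.match(request_path)
    if match:
        return "/product.php?id=" + match.group(1)
    # No rule matched: the path is served as requested.
    return request_path


def handle(request_path, user_agent):
    """Simulate one request. user_agent is deliberately ignored."""
    return {
        "status": 200,
        "url_seen_by_client": request_path,  # never changes for the client
        "served_by": internal_target(request_path),  # server-side detail only
    }
```

A browser and a bot asking for /products/42 get the same result from handle(); neither is told that product.php did the work.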

The only general problem I see is the risk of duplicate content recognition if your parameterized links return 200 and "leak out" somehow.
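One common way to limit that risk is to answer requests for the parameterized form with a 301 to the clean URL instead of a 200. This is not something the thread prescribes, just a sketch of the idea in Python; the paths and the respond() function are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def respond(request_url):
    """Return (status, redirect_location) for a URL as the client requested it.

    This applies to the original client request only, not to the internally
    rewritten path, otherwise the redirect and the rewrite would loop.
    """
    parts = urlsplit(request_url)
    if parts.path == "/product.php":
        ids = parse_qs(parts.query).get("id", [])
        if len(ids) == 1 and ids[0].isdigit():
            # A leaked parameterized link: send it to the single clean URL.
            return 301, "/products/" + ids[0]
    # Anything else is served normally.
    return 200, None
```

With this in place only /products/42 ever returns the page itself, so there is one URL per piece of content.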

trimmer80

9:03 pm on Jul 12, 2004 (gmt 0)

10+ Year Member



you're making your site look like something it's not to Googlebot.

As stated, it is a server-side process, so a proper mod_rewrite setup results in the surfer and Googlebot both seeing the same rewritten URL. The content on the site is the same. All that differs is some changed characters in the URL.
Nothing wrong with it whatsoever.
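If you want to convince yourself that surfers and bots really get the same thing, you can compare what a URL returns for two different user agents. A minimal sketch, assuming a fetch(url, user_agent) function that you supply yourself; the two example fetchers and the user-agent strings below are stand-ins, not real HTTP calls.

```python
import hashlib


def body_digest(body):
    """Hash a response body so two responses can be compared cheaply."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def looks_cloaked(fetch, url):
    """True if the body differs between a browser and a bot user agent."""
    browser_body = fetch(url, "Mozilla/5.0")
    bot_body = fetch(url, "Googlebot/2.1")
    return body_digest(browser_body) != body_digest(bot_body)


def honest_fetch(url, user_agent):
    # Stand-in for a plain rewrite: same content for every visitor.
    return "<h1>Product 42</h1>"


def cloaking_fetch(url, user_agent):
    # Stand-in for a cloaking setup: bots get a different page.
    if "Googlebot" in user_agent:
        return "<h1>Keyword stuffed page</h1>"
    return "<h1>Product 42</h1>"
```

looks_cloaked(honest_fetch, "/products/42") is False, while the cloaking variant is flagged. Pages with per-request content (timestamps, ads) would need a looser comparison than an exact hash.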

py9jmas

9:13 pm on Jul 12, 2004 (gmt 0)

10+ Year Member



In my mind this is deceptive practice, you're making your site look like something it's not to Googlebot.

No it isn't. Go and read up about the semantics of URIs. The names 'dynamic' and 'static' URLs are misnomers. They are URLs that point to resources. You can't make any assumptions about the resource from the URL. The URL may end in 'tree.jpg' but actually be an MPEG video clip of a cat, for example.
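A small Python sketch of the same point: guessing from the URL is only a guess, and what counts is the type the server declares in the response. The example.com URL, the header dictionary and both function names are made up for illustration.

```python
import mimetypes


def type_guessed_from_url(url):
    """Guess a media type from the URL's extension alone."""
    guessed, _encoding = mimetypes.guess_type(url)
    return guessed


def effective_type(url, response_headers):
    """Prefer the server's declared Content-Type over any URL-based guess."""
    declared = response_headers.get("Content-Type")
    if declared:
        # Drop parameters such as "; charset=utf-8".
        return declared.split(";")[0].strip().lower()
    return type_guessed_from_url(url)


url = "http://example.com/tree.jpg"
headers = {"Content-Type": "video/mpeg"}  # hypothetical server response
```

Here type_guessed_from_url(url) says image/jpeg while effective_type(url, headers) says video/mpeg, which is the cat video case.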