
Google refuses to follow our mod_rewrite URLs

     
9:35 pm on May 27, 2006 (gmt 0)

5+ Year Member



This is really maddening. We are trying to transfer our website with great rankings to another platform and split the products into two separate sites. No duplicates, of course.

The new platform is database-driven, with pages generated on the fly. The old one used good old static pages.

Mod_rewrite is used to re-write SE friendly URLs for the new site.

Yahoo! is indexing the SE-friendly pages fine.

A sitemap is registered with Google that lists the SE-friendly URLs, not the long cgi-bin URLs. Google Sitemaps says these SE-friendly pages are "Unreachable URLs" due to network issues. That's not true.

Robots are restricted from cgi-bin in robots.txt.
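For reference, that restriction is typically something like the following (paths are illustrative; the actual file depends on your layout):

```
User-agent: *
Disallow: /cgi-bin/
```

Note that this only blocks the literal /cgi-bin/ paths. The rewritten SE-friendly URLs are separate URLs as far as a crawler is concerned, so they are not covered by this rule.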

Google fails to follow the mod_rewrite pages. It stubbornly keeps the old static pages in its index, then drops them as we transfer each static page on the old site to a dynamic one on the new site.

We have 2 other websites with the exact same architecture and a totally different product line. These are indexed fine by Google.

Can anyone help determine why Google is being so stubborn? What can we do to overcome this? We've been working on this all year and were hoping that once Big Daddy rolled out, it would sort itself out. It hasn't.

11:07 pm on May 27, 2006 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



If there are links to the old pages, and/or the old pages still return "200 OK" then there is no reason for Google to drop them.

You'll be needing a 301 redirect from the old URLs to the new URLs. That will help things along a bit.
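For example, something along these lines in the old site's .htaccess (the filenames and domain here are placeholders; substitute your real paths):

```apache
# Hypothetical example: permanently redirect an old static page
# to its new SE-friendly URL on the new site
RewriteEngine On
RewriteRule ^old-page\.html$ http://www.example.com/new-page [R=301,L]
```

The [R=301] flag tells Googlebot the move is permanent, so the old URL's standing can be passed along to the new one instead of both competing in the index.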

Run Xenu LinkSleuth over the site and see what problems it reports too.

9:44 pm on May 28, 2006 (gmt 0)

10+ Year Member



Shouldn't mod_rewrite be invisible to the outside world? I find it theoretically impossible that Google would know whether pages are being mod_rewritten at all. Assuming you are using it the way I think you mean, where for instance www.my-domain.com/test gets mapped to www.my-domain.com/program.cgi?var=test , then that is all internal and Google should not be able to tell it from a static page. I would check your configuration and robots.txt logic.
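Using your example, the mapping rule might look roughly like this (a sketch; the real rule depends on your platform):

```apache
# Internal rewrite only -- no [R] flag, so the browser and Googlebot
# see /test while the server silently runs program.cgi?var=test
RewriteEngine On
RewriteRule ^([a-z0-9-]+)$ /program.cgi?var=$1 [L,QSA]
```

Because there is no redirect flag, the response comes back under the friendly URL with a normal 200 status, which is exactly why a crawler cannot distinguish it from a static page.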

10:53 pm on May 28, 2006 (gmt 0)

10+ Year Member



<i>Google Sitemaps says these SE-friendly pages are "Unreachable URLs" due to network issues. That's not true.</i>

Can you look at your logs and see what response Googlebot is getting from hits on your new site? Is Googlebot hitting the new site at all?
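One quick way to answer that is to pull the Googlebot lines out of the access log and tally the status codes per URL. A minimal sketch in Python (the log lines below are invented for illustration; point it at your real access log instead):

```python
import re
from collections import Counter

# Combined-log-format pattern: request path, status code, and user agent
LOG_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+)[^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"')

def googlebot_statuses(lines):
    """Count HTTP status codes returned to Googlebot, keyed by (path, status)."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if m and "Googlebot" in m.group(3):
            counts[(m.group(1), m.group(2))] += 1
    return counts

# Invented sample lines for illustration
sample = [
    '1.2.3.4 - - [29/May/2006:10:00:00 +0000] "GET /widgets HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '1.2.3.4 - - [29/May/2006:10:00:05 +0000] "GET /cgi-bin/program.cgi?var=widgets HTTP/1.1" 403 210 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '5.6.7.8 - - [29/May/2006:10:01:00 +0000] "GET /widgets HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

for (path, status), n in sorted(googlebot_statuses(sample).items()):
    print(path, status, n)
```

If Googlebot never appears at all, the problem is upstream of your server (DNS, firewall, IP blocking); if it appears but gets 403s or 500s on the friendly URLs, the server configuration is the place to look.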

2:13 am on May 29, 2006 (gmt 0)

10+ Year Member



I'm wondering if the problem we see with our AdWords account is related to yours: "SE-friendly pages are "Unreachable URLs" due to network issues"

The problem boils down to the fact that we block ranges of IPs that have caused us problems with hacking and fraud. Google employs worldwide staff and has global systems (e.g. in India), and all of our ads reject on first pass with the "unreachable URL" problem (since, over time, many IP ranges in India have been blocked from our site). Their workaround is to put a note on our account not to reject our ads because of this problem; then a US rep verifies the ad.

And, even if you do not block IP addresses, maybe your ISP does?

Is it possible that the sitemaps servers/systems are located on IP addresses which you have blocked? .. Just a thought.

Also, does anybody know where the systems handling Sitemaps reside (IPs)?

9:40 pm on May 29, 2006 (gmt 0)

5+ Year Member



ISPs would not block IP addresses, as that would go against the idea of "net neutrality", especially since the owner and end user of a given IP address can change over time.

So IP address blocking is not the best practice for defeating abusers and fraudsters, because it casts the net too wide, as you may be seeing.

 
