The new platform is database-driven, with pages generated on the fly. The old one used good old static pages.
Mod_rewrite is used to rewrite the dynamic URLs into SE-friendly URLs for the new site.
Yahoo! is indexing the SE-friendly pages fine.
A sitemap is registered with Google that lists the SE-friendly URLs, not the long cgi-bin URLs. Google Sitemaps reports these SE-friendly pages as "Unreachable URLs" due to network issues. That's not true.
Robots are restricted from cgi-bin in robots.txt.
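One thing worth ruling out is that the robots.txt rule catches more than intended. Here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt body, the example.com host and both URLs are made up for illustration; substitute your real file and paths.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt matching the description above:
# only the cgi-bin directory is off limits to robots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up URLs: one rewritten SE-friendly page, one raw cgi-bin page.
friendly_ok = is_allowed(
    ROBOTS_TXT, "Googlebot", "http://www.example.com/widgets/blue-widget.html"
)
cgi_ok = is_allowed(
    ROBOTS_TXT, "Googlebot", "http://www.example.com/cgi-bin/store.cgi?item=42"
)
print(friendly_ok, cgi_ok)  # True False
```

If the friendly URL comes back as disallowed with your real file, robots.txt is the culprit rather than the rewrite.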
GOOGLE FAILS TO FOLLOW THE MOD_REWRITE pages. It stubbornly keeps our old static pages in the index, dropping each one only as we move it from the old site to the dynamic new site, without picking up the replacement.
We have 2 other websites with exactly the same architecture and a totally different product line. These are indexed fine by Google.
Can anyone help determine why Google is being so stubborn? What can we do to overcome this? We've been working on this all year and were hoping that once Big Daddy rolled out, it would sort itself out. It hasn't.
You'll be needing a 301 redirect from the old URLs to the new URLs. That will help things along a bit.
Run Xenu's Link Sleuth over the site and see what problems it reports, too.
Can you look at your logs and see what response Googlebot is getting from hits on your new site? Is Googlebot hitting the new site at all?
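If it helps, a rough way to answer that from the raw logs is to tally the status codes served to requests whose user agent mentions Googlebot. The sketch below assumes Apache's combined log format; the sample lines and their IP addresses are invented, and since a user agent string can be faked, treat the result as a first look only.

```python
import re
from collections import Counter

# Assumes Apache "combined" log format; adjust the pattern if yours differs.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def googlebot_status_counts(lines):
    """Count HTTP status codes for requests whose user agent names Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not look like access-log entries
        agent = match.group("agent") or ""
        if "googlebot" in agent.lower():
            counts[match.group("status")] += 1
    return counts


# Invented sample lines (documentation-range IPs, not real crawler addresses).
sample = [
    '192.0.2.10 - - [10/Apr/2006:10:00:00 +0000] "GET /widgets/blue-widget.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.11 - - [10/Apr/2006:10:00:05 +0000] "GET /widgets/red-widget.html HTTP/1.1" '
    '403 310 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [10/Apr/2006:10:00:09 +0000] "GET / HTTP/1.1" '
    '200 7300 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]

counts = googlebot_status_counts(sample)
print(dict(counts))  # {'200': 1, '403': 1}
```

To run it on a real log, pass an open file instead of the sample list, e.g. googlebot_status_counts(open("access.log")), where the file name is whatever your server uses. No Googlebot hits at all, or a pile of 403s, would each point in a different direction.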
The problem boils down to the fact that we block ranges of IPs that have caused us problems with hacking and fraud. Google employs staff world-wide and has global systems (e.g. in India), and all of our ads are rejected on first pass with the "unreachable URL" problem (since much of India has been blocked from our site over time). Their workaround is to put a note on our account not to reject our ads because of this problem; a US rep then verifies the ad.
And, even if you do not block IP addresses, maybe your ISP does?
Is it possible that the sitemaps servers/systems are located on IP addresses which you have blocked? Just a thought.
Also, does anybody know where the systems handling sitemaps reside (IPs)?
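Whatever the answer on the sitemaps IPs turns out to be, you can test any crawler address you see in your logs against your own block list. A minimal sketch with Python's standard ipaddress module follows; the blocked ranges and the test addresses are placeholders from the documentation ranges, not real Google or abuser addresses.

```python
import ipaddress

# Placeholder block list in CIDR notation; replace with your real ranges.
BLOCKED_RANGES = ["198.51.100.0/24", "203.0.113.0/25"]


def matching_blocks(ip, blocked_ranges):
    """Return every blocked range that contains the given IP address."""
    addr = ipaddress.ip_address(ip)
    return [
        cidr for cidr in blocked_ranges
        if addr in ipaddress.ip_network(cidr)
    ]


# Placeholder addresses standing in for ones taken from your logs.
print(matching_blocks("203.0.113.17", BLOCKED_RANGES))   # ['203.0.113.0/25']
print(matching_blocks("203.0.113.200", BLOCKED_RANGES))  # []
```

An empty list means the address falls outside every range you block yourself, which would leave the ISP or a firewall further upstream as the next thing to check.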
So blocking IP address ranges may not be the best practice for defeating abusers and fraudsters, because it casts the net too wide, as you may be seeing.