I then decided to change the domain name to one with broader appeal (it was a regional site, and I expanded the region) and submitted it to Google. newsite.com was crawled just fine for about a month, then it all stopped cold. Well, almost. Google would only hit the index page, then leave. It would do this a handful of times a month, and it's been that way ever since.
Google still has oldsite.com cached with full descriptions. In fact, many more pages from the old URL are cached than the few that got cached from the new one before Google developed an allergic reaction to my site.
I've done everything I can think of to resolve this. I opened a second hosting account with the old domain name and set up a full 301 redirect to the new domain name. After leaving it like this for a little more than a month and noticing no difference, I shut it down. I then used Google's URL removal tool to remove the old URL. The old cache is still there, though showing as supplemental results. I used mod_rewrite to make all my URLs spider friendly. I've made sure all my HTML is valid and that my robots.txt is compliant. I got about a dozen high-quality, high-PR external links pointing at the site and some deeper pages.
The site is pure whitehat. I've never pulled any goofy black SEO tricks. I've emailed Google a couple of times, but only got back a canned automated response.
The only difference all this has made is that Google visits the site more often now - once or twice a day - but it absolutely refuses to go past the index page. The few other links in the new domain that were cached are now showing as supplemental, though the index page always has a fresh cache. :\
Other bots crawl me just fine, and I have thousands of pages cached in Yahoo and MSN. Still, my site is beginning to languish with the lack of Google love. This is seriously beginning to stress me out. I've pored over this site for several weeks now trying different fixes, but to no avail, so I finally humbled myself to post this. Any ideas on how to solve this (Googleguy, help! Anyone!) would be greatly appreciated.
Besides, isn't the sandbox an all or nothing situation? As I said, Google did crawl and index pages on the new domain for about a month, then completely stopped. What it did index turned into supplemental results. This didn't strike me as being a sandbox thing - from what I understand about it from browsing around here, anyway.
How can there be a cache if you removed the URLs? You haven't removed them.
Set up a robots.txt on the old domain to disallow everything and submit the URL of the robots file to the URL console, OR submit the URL of each page of the old domain as "an outdated link", as long as the URL returns a 404 error.
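For anyone who wants to sanity-check a "disallow everything" robots.txt before submitting it, here is a minimal Python sketch. It assumes the blanket two-line file (User-agent: * / Disallow: /) is what is meant, and it uses the standard library's urllib.robotparser rather than anything Google-specific. The helper name is_blocked and the sample page path are made up for illustration; oldsite.com is the placeholder domain from this thread.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of the old domain's robots.txt: block every crawler
# from every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` is NOT allowed to fetch `url` under this robots.txt."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    # The second path is a hypothetical example, not a real page.
    for url in ("http://oldsite.com/", "http://oldsite.com/region/page.html"):
        print(url, "blocked" if is_blocked(ROBOTS_TXT, url) else "allowed")
```

This only shows how a standards-following parser reads the file. It says nothing about how quickly any search engine acts on it.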
The 301 redirect should have worked fine. It usually does, though it typically takes several months.
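Because a 301 can take that long to pay off, it may be worth confirming up front that the old domain really answers with a permanent redirect to the new host (and not, say, a 302). A rough Python sketch using only the standard library follows. fetch_status and classify_redirect are hypothetical helper names, and oldsite.com / newsite.com are the placeholder domains from this thread.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def fetch_status(url: str) -> Tuple[int, Optional[str]]:
    """Send one HEAD request (redirects are not followed) and
    return the status code and the Location header, if any."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def classify_redirect(status: int, location: Optional[str], new_host: str) -> str:
    """Judge a response from the old domain: is it a 301 pointing at new_host?"""
    if status != 301:
        return "not permanent (status %d)" % status
    if not location:
        return "301 without a Location header"
    target_host = urlsplit(location).hostname or ""
    # Accept the bare host or its www. variant (an assumption of this sketch).
    if target_host in (new_host, "www." + new_host):
        return "ok"
    return "301 to unexpected host: " + target_host


# Example use (needs network access, so it is left commented out):
# status, location = fetch_status("http://oldsite.com/")
# print(classify_redirect(status, location, "newsite.com"))
```

Checking a few deep pages as well as the home page would show whether the redirect covers the whole site or just the index.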
As someone else said, Google could also be attributing the data to your old site and penalizing the new one if the pages from the old site are still in its index, hence the need for a 301 redirect.
In the URL removal console, the status says "Removal of oldsite.com complete". The cache is still there, though.
I dunno if the domain name previously existed or not. I'll check into it with the resource you gave, thanks. :)
I did 301 the old domain for a while, as I said, but perhaps I didn't give it long enough.
My biggest worry in all this is that my CPM banner network will drop me if I hemorrhage too many uniques from a lack of Google referrals (and I have lost quite a lot). That's why I'm just a bit panicky and not as patient as I probably should be.
Does one have to remove every individual page with the console or can whole directories be deleted? Deleting each individual page isn't an appealing thought. :\
I used the robots.txt file to drop 5 million mis-indexed pages out of the index with a simple two-line command just a few weeks ago. I would not have wanted to submit each URL one at a time.
You know, one can't help - well...I can't help - but ascribe human traits to googlebot. It hits my index page several times a day, and I'm sitting there thinking, "C'mon baby, you know you wanna go further! Daddy's got some good inbound links...yeah!" Then it just leaves and I think, "Argh! You spiteful little @#$%! Stop teasing me!"
I'm a sad little man.
I did as suggested above: Set up a new robots.txt disallowing everything on the old domain and resubmitted the site to Google. A couple of days later, indexed URLs on the new domain that were showing as supplemental began to drop from the index, and googlebot actually crawled past the index page for the first time in months.
Only a few pages were indexed - no deep crawl yet - but this is a huge improvement over what has been happening - or should I say not happening - the past few months, so hopefully the problem is on its way to being rectified.
Thanks again for everyone's input and advice. :)