It sounds like you made major content and hosting changes at the same time. My guess is that Google suspects that you sold the site and no longer trusts that the site deserves the rankings.
I see sites being upgraded/redesigned/moved/rewritten all the time and I've made - at most - 10 page changes out of 400 URLs and simply moved a UK-targeted site back to the UK.
If that's all it takes for Google to conclude that I've sold the site, then I've clearly misunderstood most of what I have read here on WebmasterWorld.
If Google has indeed drawn this (incorrect) conclusion, then what is my best course of action?
Do I undo all the changes, press on, or submit a Reconsideration Request?
|There are no canonicalization issues thanks to the new .htaccess |
Does that mean there were canonicalization issues before?
If yes, fixing that would change the way Page Rank was spread around the site, and the recalculations would be almost certain to induce ranking hiccups for a while.
My advice would be to keep your nerves steady for a few days, and keep working on things that you are confident will be good for your users.
If things don't stabilize in a few days, panic then! :)
BTW, selling a site does not automatically disrupt rankings.
I'd give it at least a week and see what happens.
I wouldn't mind, but I haven't sold the site, nor will I - ever!
Buckworks, there have never been any canonicalization errors. The .htaccess was just rewritten to 410 the 25 or so pages listed as 404 in Webmaster Tools (ancient, long-gone URLs) and to 301 dirty-widgets.html to clean-widgets.html.
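To illustrate what that rewrite amounts to, mod_alias rules along these lines in .htaccess would do it (the 410'd filename below is made up; only the dirty-widgets/clean-widgets pair comes from the post):

```apache
# Serve 410 Gone for the ancient, long-dead URLs (hypothetical example path)
Redirect gone /old-widgets-1999.html

# Permanent redirect for the renamed page
Redirect 301 /dirty-widgets.html /clean-widgets.html
```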
The funny thing is this: for the first time in about a year, I had a totally free week on holiday to work my ass off to move to a UK host, rewrite some old URLs, add some rich new content and generally clean up where needed.
I literally can't believe I wasted that week for no thanks and now this - complete removal from the first 1,000 results for EVERYTHING!
Gotta love Google :)
Well, I *think* we have just figured out the problem with Google removing the homepage.
The site is simple XHTML 1.0 Strict and until last week, clean .html pages were spat out by a custom CMS.
As this perl CMS was becoming less flexible, we decided to convert the entire site to static pages - still XHTML 1.0 Strict.
It appears that for some odd reason, the conversion resulted in EVERY page URL referenced in the internal links being appended with a "/".
This meant that it pretty much created a second version of the entire site - as well as numerous other related issues.
Having just reuploaded the entire site with every single URL corrected to refer to the absolute http://www.example.com/, the usual online tools return no errors and all the main pages even pass W3C validation!
What do you guys think I should do now?
1. Wait and see what happens when Mr Spider revisits;
2. Submit new sitemaps;
3. *Shudder* confess to this stupid mistake and beg Google's forgiveness in a Reinclusion Request?
I'd vote for #1 - wait and see.
In a backwards way, it's good to hear that you found a technical problem.
Oopsies can be fixed a lot easier than Google can be figured out!
Let us know when things get back to normal.
Thanks for the encouragement buckworks - much appreciated :)
Hopefully I'll be able to report back with good news soon!
You might want to make sure that your non-slash URLs are 301'd to your slash URLs.
My URLs are currently configured as follows:
http://www.example.com - 200
http://www.example.com/ - 200 (if I type the domain with the slash, it automatically removes the slash to give http://www.example.com)
http://www.example.com// - 301 to above
http://www.example.com/index.html - 301 to above
http://example.com - 301 to above
http://example.com/index.html - 301 to above
http://example.com/ - 301 to above
http://example.com// - 301 to above
Or should everything 301 to http://www.example.com/?
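If you did want everything to collapse onto http://www.example.com/, a typical mod_rewrite sketch looks like this (assuming Apache with mod_rewrite enabled; this is a generic pattern, not JackR's actual file):

```apache
RewriteEngine On

# Fold the bare domain onto the www host with a 301
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# Collapse /index.html onto the site root
RewriteRule ^index\.html$ http://www.example.com/ [R=301,L]
```

Note that example.com/index.html takes two hops this way (bare domain to www, then index.html to the root); a combined rule can shorten that if it matters to you.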
There is no technical difference between http://www.example.com and http://www.example.com/
Either way, browsers send a request that looks like
GET / HTTP/1.1
Note that either way browsers send the slash. Because browsers send the slash whether or not it is in the URL, I prefer http://www.example.com/ as the canonical URL.
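You can see the same thing from any HTTP client library: with or without the trailing slash, an empty path is requested as "/". A quick Python illustration:

```python
from urllib.parse import urlsplit

# With or without the trailing slash, the request path defaults to "/"
for url in ("http://www.example.com", "http://www.example.com/"):
    path = urlsplit(url).path or "/"   # an empty path is requested as "/"
    print(repr(path))                  # '/' both times
```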
I agree. I double checked and every single internal URL link is absolute to http://www.example.com/ so at least that's fine.
Day #2. Still not back in the index :(
The slash is not sent by the browser. It is supplied by the server via the DirectorySlash directive in mod_dir [httpd.apache.org]. It is on by default; in fact Apache has dire warnings about what might happen if you turn it off.
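In Apache terms, the behaviour lucy24 is describing is the mod_dir default, which can be stated explicitly:

```apache
# mod_dir redirects a request for /some/dir to /some/dir/ (on by default)
DirectorySlash On
```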
lucy, what you say is true for directories, but at the domain home page level, the browser always sends the slash. It can't make the http request without it.
Day #3. Homepage is still missing from the SERPs for any keyword search.
Starting to worry somewhat now.
I guess there's simply no way to know if Google has imposed a penalty of some kind, or indeed what for?
JackR, I once blocked Googlebot through robots.txt by mistake, and it took about a week for them to put my pages back where they'd been. I think you probably need to just give it a bit more time.
I hope you're right diberry.
Now on Day #4 and still AWOL. I'm going to draft a Reconsideration Request so it's ready for this time next week in the event that the site hasn't returned at what will then be the two week point.
I used to think it was, but I am noticing that it is an increasingly difficult target to hit.
If you check Google Webmaster Tools you can see if anyone submitted a removal request. I would also look for an errant nofollow tag in your code or a redirect loop.
No URL removal requests in Webmaster Tools - thankfully. Same for the nofollow tag.
The 301 redirect loop could potentially be the issue though. Based on this thread Homepage missing from SERPs - is there a 301 error in my .htaccess? [webmasterworld.com], what do you think?
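For what it's worth, a redirect loop is easy to reason about as pure data: follow each Location target and stop if you ever revisit a URL. A toy sketch (the URL maps here are invented, not JackR's actual redirects):

```python
def follow_redirects(redirect_map, start, limit=10):
    """Follow a chain of 301 targets; return the final URL, or None on a loop."""
    seen = set()
    url = start
    while url in redirect_map:
        if url in seen or len(seen) >= limit:
            return None  # revisited a URL (or too many hops): loop detected
        seen.add(url)
        url = redirect_map[url]
    return url

# A healthy chain resolves...
chain = {"http://example.com/": "http://www.example.com/"}
print(follow_redirects(chain, "http://example.com/"))  # http://www.example.com/

# ...but a loop never does
loop = {"/a": "/b", "/b": "/a"}
print(follow_redirects(loop, "/a"))  # None
```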
Install something like Live HTTP Headers (it's a Firefox extension) and check what your home page is actually doing.
Already done netmeg :)
Everything checks out with the correct server response codes. I'm not sure if there was a redirect loop in place briefly though.
That could well have been the issue, but I simply don't know.
@JackR, if the .htaccess rules being discussed are something you implemented, I would say that could very well be the issue.
Look at how much you are trying to rewrite there; if you overdo it, you can knock your whole site out without even trying.
Fetch your home page as google bot, what happens? Have you noticed any huge crawl error counts on your site in WMT?
No crawl errors in WMT. 389 URLs submitted, 386 currently indexed.
Fetch as Googlebot works fine too.
I've manually checked the HTTP headers for every page that has been rewritten or marked as 410, etc. All are -now- 100% accurate.
what is your content like? This may be a panda issue (maybe)
Have you checked your crawler access in WMT to just be sure nothing is wrong with robots.txt or htaccess?
What is your link graph like?
Content is very good and all URLs Fetch as Googlebot just fine. The site has no ads, and just a few outbound links - all marked as nofollow. Robots.txt is blank and the .htaccess has been discussed at length here:
I posted about the missing homepage on the Google Webmaster Groups this week, and this reply seemed to make a lot of sense:
|You mentioned moving your hosting account? I would suggest what you're seeing is Google attempting to make sure your new "home" is consistent and static, before restoring your previous position. |
I don't know how long it will take for your site to appear where it did before, but I'm confident it will eventually.
So, I'd suggest it's more a technical issue, and not a penalty.
I hope he's right, but a Reconsideration Request can't hurt either! All in all, a technical issue does seem both logical and timely.
Day #6 now - homepage still absent.
I had a quick look at the site. I'm no expert, but this is what I looked at.
Page headers seem OK on the homepage anyway.
Robots.txt gives a 2-byte file. No idea what that is; it should either have content or return a 404.
Site: example dot com gives 400+ pages indexed. So indexing doesn't seem to be a problem.
I checked a couple of spam places for the IP, seems clear.
I took a string of text from a page and searched Google on "string of text site: example dot com" to see if there were multiple pages showing the same thing. Didn't see anything specifically.
I took the same string of text, fairly long tail, and just searched Google for the term. Site came up #9.
My conclusion: You're not banned, you're either penalized or your rankings just suck. No obvious hard core tech issue - the site is resolving and seems to be working.
I don't see how the site move would've caused the problems. I'm suspicious (but not certain) that the rankings drop is unrelated to your server move.
So while I'd consider continuing to investigate technical issues, I suggest you consider focusing on SEO or content type issues in addition to the tech issues. i.e. perhaps you've been hit by panda, or took an overoptimization penalty. Unfortunately I don't have anything intelligent to offer on those topics.
Two other points. I notice you've been penalized before (you've got a page discussing it). Did you tread the line again?
Secondly, when I do a site preview, your homepage shows a cache of Nov 2. That's more recent than my site that's a PR5 :). Again, that points to Google seeing your site just fine, just not wanting to rank it.
But - I'm not getting a preview either. I'm seeing a no-preview available. I don't know if that's deliberate, or a symptom.
Thanks for taking a look wheel and for reporting your findings.
The robots.txt is a blank file. What would you suggest I place in the file in order to permit all crawling from all search engines?
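For the record, the conventional allow-everything robots.txt is just two lines (an empty Disallow value blocks nothing):

```
User-agent: *
Disallow:
```

Crawlers also treat a completely empty (or absent) robots.txt as "no restrictions", so the blank file is unlikely to be the culprit, but the explicit form removes any ambiguity.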
Now, strangely enough, I compared the dates with my previous change of host (also from the US back to the UK), and the dates largely match.
The site was indeed excluded before - back in 2008. The issues then were obvious and easy to fix - primarily related to linking and a lack of attention to detail.
Having learned my lesson then, I literally didn't make a single major change to the site for 3 years precisely because I was terrified of it being dropped again.
Zoom forward three years and the site is six years old. It's been bobbing between page one and page three for all that time and any changes have been tiny.
This time I decided to rewrite some URLs and condense and present my content better on some pages. I discussed this with a member here at WebmasterWorld and he was very gracious in offering his expert advice.
Now, it could well be that the site doesn't deserve to be ranked where it had been prior to last week. BUT - and this is the crux of the matter - there is no way it deserves to be excluded from the first 1,000 results.
Wheel: if you do a keyword search for my main keyword, which I'm sure you can guess, you'll agree that if anything my site is of a far higher quality than half of page one.
Moreover, whilst it -could- be a coincidence that the site homepage was dropped as soon as I switched hosts from the US to the UK, I'd say that's a pretty big coincidence.
The site is fully indexed as it has always been, but the homepage is simply excluded for all and any keyword searches.
I read an article recently which sums up the problem:
|With Google filters the website usually remains indexed with similar Page Rank, but SERP filtering is applied which all but removes the site for all or specific keywords. |
|Moreover, whilst it -could- be a coincidence that the site homepage was dropped as soon as I switched hosts from the US to the UK, I'd say that's a pretty big coincidence. |
It sure is. But I'm really suspicious that it is just a coincidence. Your quote in the previous post is what I think is happening.
In other words, you've got a filter, non-technical. Perhaps panda, perhaps over-optimization (heck of a pile of inner links on your home page, but that probably doesn't mean anything). But I'd be talking to someone familiar with penalties at this point instead of looking for technical solutions. And that person isn't me, I don't have any expertise on penalties.
The fact that it's been bobbing between pages 1 and 3 through the years would lead me to believe you've been riding on the edge of something too.