Forum Moderators: Robert Charlton & goodroi

Lost 95% of Google traffic overnight

         

downhiller80

2:32 am on Mar 29, 2008 (gmt 0)

10+ Year Member



OK so I'm the webmaster for a startup that launched in August 2007. The website is a company review site. We started off with a database of 120,000 companies, but users can also add new companies very easily, and add sub-listings for different branches/locations etc.

Anyway, for a few months traffic steadily grew as Google indexed these 120,000 pages, and then it carried on growing as those pages floated up the SERPs a bit. Until 3 days ago we were getting about 1500-2000 hits a day from Google. Not a massive amount I know, but a crapload better than the 100 hits a day we've had for the last 3 days...

Have we been "sandboxed"? If it's of any note, this sudden drop occurred almost exactly 6 months after I submitted the first sitemap to Google.

Personally I think the key problem is that all bar 500 of our pages (if I do a site:www.domain.com search) appear to be in the supplemental index. I don't know if that's what being "sandboxed" means though.

I fear that this is because most of our pages are effectively "empty". They only contain the company name, address, a map, phone/website etc, and an empty graph. Companies that have been rated have < a lot more content >.

Is Google basically seeing these pages as being too "thin" and killing their ranking because of it? Or is it just a sandbox issue, and either way - what should we do!?

Any advice/insight much appreciated!

Many thanks guys (and girls)

[edited by: tedster at 3:00 am (utc) on Mar. 29, 2008]

tedster

3:12 am on Mar 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The challenge in answering your question is that you are talking about bulk traffic numbers. If your stats package offers more tailored reports, it would help your analysis very much to see which particular search terms and which URLs stopped producing.

That said, your guess sounds pretty good. You are describing what Google reps have called "stubs" - and these types of pages are not what Google wants to offer to their end users. The following thread talks about comments made by Adam Lasnik of Google. Even though it's from 2006, nothing has changed in this area except that Google seems to be getting better at spotting stubs and not ranking them:

[webmasterworld.com...]

HuskyPup

3:14 am on Mar 29, 2008 (gmt 0)



Any advice/insight much appreciated!

There is a very strange Google flux occurring right now!

Do nothing, wait... if it's ANY consolation I have 15-year-old sites being hammered and I have no idea why!

Patience is required right now.

downhiller80

3:21 am on Mar 29, 2008 (gmt 0)

10+ Year Member



Cheers Husky, I've been reading about the flux and hoping that that's the problem.... :)

Two other things:

1) 3-4 weeks ago we added "noarchive" to all pages. This was after a company asked us to remove libellous comments that users had left; we felt that the Google cache doesn't benefit us, and in the case of content removal we can get stuff offline quicker without it.

Just been reading up on that and I think the consensus is that this SHOULDN'T be a problem...

2) Also about 3-4 weeks ago, we did a tie-in with a price comparison site. They do the prices, we do the ratings for their companies. This meant that overnight about 10,000 links appeared from their site to ours. The links went to a routing page on their site and then 302-redirected to our site. I think this could have triggered alarm bells at Google. I've got that site to add rel="nofollow" to all of those links, and we've submitted a reconsideration request to Google on the thinking that that might be the problem.
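To illustrate the setup (domain names here are made up, just to show the pattern):

    <!-- link on the price comparison site, now with the nofollow added: -->
    <a href="http://pricesite.example/out/widgetco" rel="nofollow">WidgetCo ratings</a>

    <!-- /out/widgetco then answers with a temporary redirect to us: -->
    HTTP/1.1 302 Found
    Location: http://review-site.example/companies/widgetco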

Basically I hope it's anything other than the stub problem! Only about 1% of our pages have any ratings/comments left by customers on them at the moment, but the other 99% of pages bring a vast percentage of our traffic in, and a lot of those people then discover what the site is about and start using it. If we had to get rid of the stubs, or live with them ranking poorly, well... it's not going to help!

idolw

10:10 am on Mar 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



how can the robot remember the page's content if you ask it to forget it?

downhiller80

10:47 am on Mar 29, 2008 (gmt 0)

10+ Year Member



I was under the impression that the "noarchive" tag simply removes the "cache" link from SERPs. I'm fairly sure it shouldn't affect ranking etc... can someone clarify?

walkman

1:02 pm on Mar 29, 2008 (gmt 0)



>> 120,000 pages

that may be a problem... depending on how unique those pages are.

downhiller80

2:22 pm on Mar 29, 2008 (gmt 0)

10+ Year Member



Well they're as unique as they can be really given the lack of content on them. What would you suggest?

tedster

5:57 pm on Mar 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I was under the impression that the "noarchive" tag simply removes the "cache" link from SERPs. I'm fairly sure it shouldn't affect ranking etc... can someone clarify?

That's exactly right - it just means that Google does not serve a cached version of the URL - something I keep in mind whenever I work with a limited-time sale price. Here's a quote from our archives, with an interesting extra detail about cloaking and scrutiny:

Googleguy:
This tag does not have any effect on ranking. Be aware that it may open your page to greater scrutiny however (the initial checks we've done show that many people use the noarchive tag to try to cloak etc.). If you're doing something like cloaking, the noarchive tag makes it look more deliberate to us.

NOARCHIVE for Googlebot [webmasterworld.com]
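For reference, the tag itself is a single line in the page's <head> - either for all robots, or targeted at Google's crawler only:

    <meta name="robots" content="noarchive">
    <!-- or, for Googlebot specifically: -->
    <meta name="googlebot" content="noarchive">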

buckworks

6:15 pm on Mar 29, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



given the lack of content on them

That's likely 99% of the problem.

Promote your site in other ways besides depending on Google, and as more users submit reviews and more pages gain real content Google's opinion will improve.

Miamacs

12:30 pm on Mar 30, 2008 (gmt 0)

10+ Year Member



10,000 pages 302-redirect to you?

and to your most valuable / content loaded / relevant pages?
( with the few ratings you actually DO have )

... whee ...

I suppose you *haven't* checked just yet whether any of those redirects (took over and/or) destroyed your listings?

Perhaps it's not that evident, but it could have broken up the integrity of your site's ratings in the index.

I know it's 2008 but reports of accidental hijacks are still pouring in... the infamous inter-domain temporary redirect is still one of the worst ideas around. Except if they ( the price site ) have these pages ( the 302-redirecting pages the links lead to ) disallowed in robots.txt. But even then there're some risks.

Have those links made direct but put them in JavaScript...
...instead of nofollowed/302-redirected.
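Something like this, say ( URLs made up ) - a script-written link still works for visitors, but crawlers of the day generally won't follow it:

    <script type="text/javascript">
      // rendered client-side, so the link passes traffic but not link equity
      document.write('<a href="http://review-site.example/companies/widgetco">WidgetCo ratings</a>');
    </script>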

[edited by: Miamacs at 12:34 pm (utc) on Mar. 30, 2008]

walkman

12:48 pm on Mar 30, 2008 (gmt 0)



>> Well they're as unique as they can be really given the lack of content on them. What would you suggest?

Content.

If they have no content, why would you expect them to rank high?

Quadrille

12:58 pm on Mar 30, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



302s are bad, except for genuinely temporary redirects. Get rid of them. That may be the problem, but if it is not, then consider this:

It's not just that the pages are 'thin' - to Google, they will be virtually identical.

If you compare code, you'll find that after the ads, navigation, logos, etc., the percentage of 'unique content' will be very small indeed; taken site-wide, you have a bad attack of sick site syndrome.

Why would Google want to index such pages?

The lesson is, if you want to take advantage of Google as a referrer, you need to give in return.

If Google is important to you, rethink your content policy:

1. Minimise code bloat - avoid excessive repetition of navigation and 'marketing' logos, slogans, etc.

2. Consider deleting 'empty' pages - and merging content to create fewer, better, pages.

3. For the future, avoid creation of further thin pages; go for quality content, not quantities of pages.

4. If you use meta descriptions / keywords - be sure these are also unique (see the sketch after this list)

5. Plus titles should be unique, of course ;)

6. Check that your page mass producer is not also chucking out millions of alternate URLs - that alone will guarantee sick site syndrome.
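For points 4 and 5, a per-company template might look something like this (the company name, sector and figures are placeholders filled from your database, so no two pages share them):

    <title>WidgetCo (Retail) reviews - OurDomain.com</title>
    <meta name="description" content="Customer ratings and reviews of WidgetCo, a retail company. 12 reviews, average rating 4.2/5.">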

Using software to substitute for human publishing appears much quicker and easier - but not only does it often produce the problems I've highlighted (and 46+ more), it often produces pages that are not human-friendly, even if the search engines survive their indigestion.

Check your 'bounce' rate - you may find most visitors are leaving rather than clicking endlessly to find significant content.

[edited by: Quadrille at 1:00 pm (utc) on Mar. 30, 2008]

tedster

1:54 pm on Mar 30, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



An excellent punch list, Quadrille!

It's interesting to me that this "purge" only now happened. Has Google tweaked something in the duplicate/thin page detection area? Maybe this is at least part of the current "flux" that many are reporting in the March SERPs Changes [webmasterworld.com] thread. It has been a few days since I saw a Wikipedia stub page in the results.

jimbeetle

5:09 pm on Mar 30, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The 302s from the price comparison site would definitely be one problem. A traffic crash "about 3-4 weeks" after the implementation would appear to be a fairly reasonable time frame for any effect (if any at all) to become apparent.

But it can also be that -- however the heck G's back end actually works -- it's just now sussed out the possible stub problem. Throw in the bit of flux going on and it makes it more than difficult to pin down any one factor.

The nofollow on the 302s might help. However, as these links have already been indexed without the nofollow, do we really know how they'll be handled going forward? One thing we do know is that G never forgets; as such I think I'd try to wash these links away. Maybe set up a "/partner" directory for the other site to link to and disallow it in robots.txt.
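Sketched out (the "/partner" name is hypothetical - whatever directory you two agree on):

    # in the OTHER site's robots.txt:
    User-agent: *
    Disallow: /partner/

    <!-- and their links to you become, e.g.: -->
    <a href="http://pricesite.example/partner/widgetco">WidgetCo ratings</a>

The goal being that Googlebot never fetches the redirecting URLs again.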

And, of course, at the same time work through Q's 6 steps to no stubs.

downhiller80

11:49 pm on Mar 30, 2008 (gmt 0)

10+ Year Member



Cheers for all of the help/advice guys!

Promote your site in other ways besides depending on Google, and as more users submit reviews and more pages gain real content Google's opinion will improve

We're working on it! :)

I suppose you *haven't* checked just yet whether any of those redirects (took over and/or) destroyed your listings?

Luckily they only link to 800 of our listings (but the most important ones, of course). A few are showing in Google under their site now, unfortunately, but only a handful. I worry the others are in the pipeline though and just not showing yet.

They were already using a "/ourdomain" folder and that is now in their robots.txt - hopefully a combination of that, the rel=nofollow attribute and some manual URL removals will fix this particular problem...

...but even then there're some risks.

Care to elaborate?

After that it's back to worrying about stubs.

Which, to be fair, we will have to ignore for a couple of weeks at least. For all we know everything will fix itself in the meantime now that we've (hopefully?) quashed the 302 issue, and, as someone else pointed out, the algorithm appears to be undergoing changes at the moment.

If they have no content, why would you expect them to rank high?

No expectations, but they WERE ranking ok, and now they're not ranking *at all* :(

Quadrille's points: (not being defensive, just clarifying :))

1) Code itself is pretty streamlined. There is a sidebar of content that appears on *every* page. If I could tell Googlebot "don't index this part" I would, but as far as I know I can't, so the solution seems to be to move the sidebar into a robots.txt-blocked iframe? (see the sketch after this list)
2) Not sure that's going to be an option unfortunately
3) Future pages only get added once someone submits a review for a company we don't currently have.
4) They are unique, but only in company name: "blah blah blah blah blah **CompanyNameHere** blah blah" - the blah's stay the same on every page...
5) Of course, currently "CompanyName (Sector): OurDomain.com"
6) only one indexable URL per page
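On point 1, the iframe workaround would look something like this (a sketch; /sidebar.html is a made-up URL on our own site that would hold the sidebar markup):

    <!-- on every page, instead of the inline sidebar: -->
    <iframe src="/sidebar.html" width="200" height="600" frameborder="0"></iframe>

    # plus, in our own robots.txt:
    User-agent: *
    Disallow: /sidebar.html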

Bounce rate: that's the only upside so far - it HAS dropped - but nowhere near enough to compensate for the traffic drop.

jimbeetle

12:18 am on Mar 31, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Luckily they only link to 800 of our listings (but the most important ones, of course). A few are showing in Google under their site now, unfortunately, but only a handful. I worry the others are in the pipeline though and just not showing yet.

Yep, you'll be seeing the others pop up in a bit.

They were already using a "/ourdomain" folder and that is now in their robots.txt - hopefully a combination of that, the rel=nofollow attribute and some manual URL removals will fix this particular problem...

Unless I'm reading this totally wrong, what's in the other site's robots.txt at this time (the /ourdomain folder "now" in their robots.txt) -- with the nofollow added after the links were already indexed -- might not do you much good.

These links have already been found and followed, the linked-to pages have been discovered and, as you have seen, the 302 "hijack" effect is starting to take hold. As we simply do not know how Google will treat these links going forward, I again think it's best that you get rid of these links completely and start anew.

AussieMike

1:25 am on Mar 31, 2008 (gmt 0)

10+ Year Member



Howdy... you are not alone. I have a similar site which also disappeared on the same date, except my problem is thought to be the -950 penalty. Is your site not -950'd? My site is 8 years old & has 650 pages, all uniquely written reviews, and was getting about 12,000 visits a day... now only a thousand or so.

phranque

7:33 am on Mar 31, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



imagine a dictionary with 1000 words defined and the rest just a form to submit a definition for each word.
a quick review by google right now makes it look like you are asking for large-scale free advertising: you want perhaps a million or more pieces of free content from your visitors before those visitors get a universally rewarding experience.

i would consider noindexing the thin pages and starting an adwords campaign to get the cheapest paid clicks you can, to help get traffic to those pages and flesh out the site.
adjust your campaigns and (no)indexing according to content.
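the noindex itself is just a meta tag emitted only on pages that have no reviews yet (a sketch - drop it once real content arrives):

    <meta name="robots" content="noindex, follow">
    <!-- "follow" keeps the page's links crawlable while the page itself stays out of the index -->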

to get some organic traffic to the thin pages perhaps you could consolidate some of the company information so you have a landing page for each sector (cf Quadrille's #2).
arguably useful, not stubby, and providing a way for company names and some keywords to get indexed.

downhiller80

11:13 am on Mar 31, 2008 (gmt 0)

10+ Year Member



Cheers phranque, some good ideas there. I like the idea of noindexing any empty pages I think... again we'll probably leave it a couple of weeks and see what (if anything) happens.

AussieMike, interesting, especially since your site is much older and a lot "fatter"... how do I determine for sure whether we have a -950 penalty? Why did you get a -950?

Again, thanks for all of the help people!

rise2it

1:10 am on Apr 1, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So, essentially, you have 120,000 pages, most of which are absolutely useless to anyone who finds them.

The small amount of info you have on there can readily be found in other places.

You think your site deserves to do well why?

(Not attacking you - just trying to get you think about what you are doing...)

Miamacs

12:29 pm on Apr 1, 2008 (gmt 0)

10+ Year Member



...but even then there're some risks.

Care to elaborate?

These links have already been found and followed, the linked-to pages have been discovered and, as you have seen, the 302 "hijack" effect is starting to take hold.

as jim said... robots.txt will only disallow further crawling of the pages that do the redirects. If they've been discovered already, or if anyone at any given time links to them ( yeah, why not - a simple right-click-copy-URL on the link will do this every time ), they'll get another check.

while linking out through disallowed 302 redirects might be safer for the source, it's not the best for the target... in my opinion at least.

...

but it seems you have troubles piling up and this might have been the last straw. your best bet is to correct every problem, one at a time.

or abandon the whole site altogether... which might not be an option.

downhiller80

12:26 am on Apr 11, 2008 (gmt 0)

10+ Year Member



After flatlining very consistently at 5% of normal traffic for 2 weeks, today traffic rose to 75%+ of what it was before, which has to be a good sign. FWIW we've changed nothing apart from putting nofollow on those 302 redirects.

downhiller80

12:29 am on Apr 11, 2008 (gmt 0)

10+ Year Member



Interestingly, if I do a site:domain.com search it's still showing only 300 pages in the main index. Maybe that's how it was before, tbh, and a lot of the things people are searching for are so obscure that it doesn't matter that they're in the supplementals - they still get found in the SERPs...