Forum Moderators: Robert Charlton & goodroi

Google's strange behaviour - could use some help

         

punisa

10:06 pm on Oct 25, 2009 (gmt 0)

10+ Year Member



Hello friends, something very odd is happening to my site.
In the last 5 days my traffic has been cut by 50% and it keeps going down : (

Small outline of my site:
- news network site with 400,000+ pages
- majority of visitors come from Google

I check Google Webmaster Tools for errors frequently. As we aggregate news from different sources, we sometimes end up with duplicate description tags. Google always warns me about these and I fix them asap.
So after seeing the drop in traffic I went to Google Webmaster Tools to see what was happening.
Imagine what I saw:
Diagnostics > Crawl errors > Not found (801)

Some of these were apparently genuine 404s from earlier, but at least 500 of these pages were returning a perfectly normal 200 OK!
Why in the world did Google label them as "Not found"?
Even when I click directly on the link - they show up as they should.

Since then most of my search results have dropped from page 1 to page 2 - hence much less traffic.
How did this happen? Did anyone experience anything similar?

I immediately phoned my hosting company, as the first thing that came to mind was that there had been some serious server downtime while Google performed the crawl. But no luck - they told me everything has been fine for months now.
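One way to check the downtime theory independently of the host is to look at what the server actually returned to Googlebot. The following is only a minimal sketch: it assumes the server writes the common Apache/Nginx "combined" log format, the sample log lines are made up, and the function name `googlebot_statuses` is hypothetical. Note that a user agent string can be spoofed, so this is a first-pass check rather than proof.

```python
import re
from collections import Counter

# Assumption: the access log uses the "combined" format, i.e.
# ip - - [date] "METHOD path protocol" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_statuses(lines):
    """Count status codes served to Googlebot and list the paths that got a 404."""
    counts = Counter()
    not_found = []
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or "googlebot" not in match.group("agent").lower():
            continue  # skip malformed lines and other visitors
        status = int(match.group("status"))
        counts[status] += 1
        if status == 404:
            not_found.append(match.group("path"))
    return counts, not_found


# Made-up sample lines, for illustration only.
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample = [
    f'66.249.66.1 - - [25/Oct/2009:10:00:00 +0000] "GET /news/1 HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [25/Oct/2009:10:00:05 +0000] "GET /news/2 HTTP/1.1" 404 312 "-" "{GOOGLEBOT}"',
    '10.0.0.7 - - [25/Oct/2009:10:00:09 +0000] "GET /news/3 HTTP/1.1" 404 312 "-" "Mozilla/5.0"',
]

counts, not_found = googlebot_statuses(sample)
print(dict(counts), not_found)
```

If the paths listed as 404 here match the URLs in the "Not found" report, the server really did answer Googlebot with a 404 at crawl time, even if the pages load fine in a browser now.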

Sorry for the lengthy message, but I'm trying to provide all the details that come to mind. Another funny and weird thing: according to the Google crawl rate graph, the crawl rate seems to have doubled in the last week (?!)

All info and help is appreciated, thanks.

dusky

1:31 am on Oct 26, 2009 (gmt 0)

10+ Year Member



Could be a number of things, such as:

- Probably, out of the 400,000 pages, only the pages which are actually full of text / content are kept in the index and shown to searchers; in other words, a filter on your site has just taken place (or NOT).

- Your site could've been getting away with some or a lot of duplicate content / pages and now only the deserving pages are kept in the index

- Maybe G* is clamping down on news aggregators and lowering their importance

- When G*bot visited, it encountered errors countless times, maybe due to repeated downtime or a DNS switch... Some hosts restart webservers or even reboot unnecessarily for new software or setups to take effect, when almost all changes can take effect without a restart

- You recently made a change to titles and/or descriptions, URLs, implemented new or different rewrite rules / redirects, G* needed to offload all old pages and get the new pages, hence the thinning of traffic. 404s could mean the old pages redirected to the new, though they still exist, your redirect rule is telling gbot they don't...

- Your site warranted a manual review, then someone fired G*bot to make sure of the number of the site's pages after a recount (block most of your pages till the new freshly indexed ones accumulate)

- You or someone else did something on your site which might have rung some alarm bells...

- A glitch on G*'s part, and all will be fine in a couple of days

- Could be good news: your old pages are getting assessed for a fresh ranking and moved to the new datacenters slowly, and the current active centers will eventually be recycled, so it may only be a few days and bingo

-........

No need to panic just yet. Make sure all is OK with your htaccess and robots.txt files, scan the site for unwanted spyware or viruses, and check for any new links posted on your site (comment spam, for example). Also make sure the pages are accessible to NON-LOGGED-IN USERS (log out and check the site). It does happen, and has happened to the best of us: someone accidentally implements a "show when logged in" check in a perl subroutine or php function and forgets that SEs need to index those pages, and non-registered visitors can't see them either....
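The robots.txt part of that checklist can be sketched with Python's standard `urllib.robotparser`. This is an illustration under assumptions, not a statement of how Googlebot itself interprets the file: the rules and URLs below are invented, and the helper name `blocked_for_googlebot` is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocked_for_googlebot(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for the agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


# Invented example: a rule that accidentally blocks the whole news section.
robots_txt = """\
User-agent: *
Disallow: /admin/
Disallow: /news/
"""

urls = [
    "http://www.example.com/news/story-1",
    "http://www.example.com/about",
]

blocked = blocked_for_googlebot(robots_txt, urls)
print(blocked)
```

Feeding it the site's real robots.txt text and a sample of the URLs reported as problematic would show quickly whether a stray Disallow rule is involved.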

punisa

10:07 am on Oct 26, 2009 (gmt 0)

10+ Year Member



Hello Dusky, thank you very much for covering this issue in such great detail.
I inspected all the possibilities you mentioned.

- first thing I did was thoroughly inspect the whole site for any actions being done without my knowledge. All seems to be in order.

- I'm very careful about meta tags, page titles etc. When I was just starting web development years ago, I saw first hand the "horrors" that can happen to your pages if you make even the tiniest changes to their meta tags or title : )
I actually never change these now, except in particular situations such as when there is a huge typo or a page is really not ranking well.

- except for some improved CSS layouts, there has been little change on the site.
Could even a small redesign make Google re-think the whole site and start indexing again?

- luckily I'm the owner and the only developer on the project (not so lucky for my working hours, lol), so I'm 99.99% sure there were no undermining actions by anyone else.

I spent a few hours this morning investigating my Google Analytics in detail and found some interesting things.
This is my traffic sources overview:

* Direct Traffic: 2.46%
* Referring Sites: 29.46%
* Search Engines: 68.07%

The "Referring Sites" traffic went down almost to zero - this is where I lost my traffic in the last few days. The main traffic sources under "Referring Sites" are not actually other sites but, again, Google - the local -my-country- Google search and Google image search.

Again, strange detail - the "Search Engines" traffic (which includes the international "Google.com") did not decrease, it is steady as always and shows no problems at all.
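To see the split more clearly than the Analytics grouping shows it, raw referrer URLs could be bucketed by Google property. This is a rough sketch only: the bucket names, the helper `classify_referrer`, and the placeholder country domain `google.xx` are all assumptions, and the image-search detection relies on the referrer coming from an `images.` host or an `/imgres` path.

```python
from urllib.parse import urlparse


def classify_referrer(referrer):
    """Bucket a referrer URL as google.com, a country Google, Google Images or other."""
    parsed = urlparse(referrer)
    host = (parsed.hostname or "").lower()
    labels = host.split(".")
    if "google" not in labels:
        return "other"
    # Assumption: image search shows up as an images.* host or an /imgres path.
    if labels[0] == "images" or parsed.path.startswith("/imgres"):
        return "google-images"
    if host in ("google.com", "www.google.com"):
        return "google.com"
    return "google-country"


# google.xx stands in for the local country domain, which the post does not name.
examples = [
    "http://www.google.com/search?q=news",
    "http://www.google.xx/search?q=news",
    "http://images.google.com/imgres?imgurl=http://www.example.com/images/large_image.jpg",
    "http://www.example.org/links.html",
]
for url in examples:
    print(classify_referrer(url), url)
```

Counting visits per bucket before and after the drop would show whether the loss sits in the country search, the image search, or both.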

Also, as you mentioned Dusky, Google can clearly tell which site is a news provider and which is just an aggregator.
I have actually followed Google's stance on this for a long time, even before starting the project.
One cannot expect to have high-ranking pages if one only serves as a middleman for real news stories. But Google had this *policy* of ranking such pages pretty high, if only for a short period of time (a day, or even less). Still, it was a great source of traffic.
Unlike many aggregating sites, I have signed contracts with all the news agencies I represent and divert most of my traffic on to them.
Thus a site like this always has very high bounce rates.
As mentioned before, I operate in a small non-English-speaking country.

After checking everything, I'm sure no tags were changed and no new redirects were made.
Now, after all these possibilities, I'm starting to go deeper into the speculative-rather-unlikely-ones, so here goes:
- once, a few days ago, I modified my htaccess and uploaded a wrong one, noticed it after 30-40 seconds and corrected my mistake. Could something like this have such a horrible effect?

- I deleted around 50 pages in 1 day, basically because they had duplicate description tags. Could this alarm Google as some shady activity? Hmm, that would be rather strange, as one would expect it to happen routinely on a site with 400,000+ pages.

- I have changed my pages layout, but made sure that all content stayed intact:
comments, news description, meta tags, title, links etc.
I DID however make one small change - I changed the images used: instead of large ones, I now use smaller ones.
Instead of using
www.example.com/images/large_image.jpg
I now use
www.example.com/images/medium_image.jpg

But the "large" one is still on the server if you type its address in. I don't know if this is maybe what Google perceived as a problem.
Then again, I see many large sites (even larger than mine) do complete redesigns, and I doubt Google would penalize them for doing so. Maybe I'm mistaken?
Then again - Google bugs me all the time that in order to decrease my bounce rate I should "experiment" with design etc : )

Just one more thing, could you *estimate* how long the "waiting time" is before one can tell whether this problem will simply go away or something went fundamentally wrong?
I've been in the business for quite some time, but never experienced this situation. I'm clueless on what to do now. Is my clock ticking, so to speak? Should I immediately "do" something before Google decides to trash my whole site, perhaps write to them about this issue?

Many thanks for reading.

- a very concerned web developer

dusky

3:31 pm on Oct 26, 2009 (gmt 0)

10+ Year Member



I am merely a board member with more or less experience than others in the SEO and SEM game, so only qualified to give an opinion based on my own experience as well as reading and observing that of others over the years. Having said that, one thing is almost certain, any deterioration in traffic is likely to have happened due to either:

1) The changes you just made. Uploading the wrong htaccess file is one likely suspect, and I'm more confident in saying so because you pointed out that Gbot increased its crawl rate after you restored the situation. That tells me they got completely new pages or URLs, then days later you restored the situation and they had to re-crawl again, BUT it was too late for the latest index. If that's the case, most people report recovery taking between a few days and 6 weeks. Some have even seen sites restored after three days when the crawl problem was fixed within three days or so; leave it much longer than that and reinstatement is likely to take longer!

2) An accumulation of algo changes from G* which re-weigh your site over the long term, probably taking into account its historical contributory value against that of other sites in your niche (likely there are more competitors now), amongst other factors (activity, saturation, voting power, content freshness and quality....). The latter does happen and is pointed out somewhere in one of their patents (Tedster or someone should know where to dig it out or elaborate more on it). If that's the case, it does not mean your site deserves its current place or should get less traffic; it just means you have to work maybe a little harder, probably by differentiating how you market your services, partnering with authority sites, promoting more etc...

Between the two scenarios, and from analyzing your situation just from what I read, I'd say it is likely to be due to 1). However, I may be, and am likely to be, entirely wrong of course, and what I said is just an opinion!

It is up to you to act, but act in a methodical manner. If you have checked and all is OK, you may need to wait a few more days and see. On the other hand, if the site cannot afford to wait and lose too much revenue, hire an SEO professional to look closer with access to the site, its design, scripts, server etc. He/she is likely to come to a better conclusion than any board member's advice alone here. If waiting a month means losing 10k in revenue, I'd spend half of that on paying a pro SEO as well as promoting and advertising the site while it is ranking low. You are not alone: hundreds of thousands of sites suffer this daily, and hundreds of multi-million dollar companies' sites have similar problems from time to time. A lot of the problems are attributed to sudden BUT ill-advised change.

dusky

4:01 pm on Oct 26, 2009 (gmt 0)

10+ Year Member



I also refer you to the thread in which you had an answer yourself [webmasterworld.com...] as an example. If and when you are confident the site has no problems after fixing everything, a site reconsideration (re-con) request may be advisable, specifying the mistakes you made and what you did to fix them, one of which is the wrong upload of the htaccess, for example. G* now responds quicker to manual review requests, and even with a reply (probably not to everyone). Many reported a partial or even a complete reinstatement within a month, if not days later.

One last thing, make sure you don't have a copy of the site in a regional country flavor, or parked domains on top of it, or so-called test sites, which can do a lot of damage if people link to them and they end up indexed. Is your situation on Y! and B*ng the same? Y! are stricter than G* and may give you a clue or at least confirm (if the site deteriorates there as well) that you've done something wrong!

punisa

6:17 pm on Oct 26, 2009 (gmt 0)

10+ Year Member



Dusky, thanks for everything. You gave me a lot of stuff to think about. I'll keep investigating my situation. If I find some crucial information regarding my problem I'll post it here. Maybe others will experience similar situations eventually.

The clumsy mistake of dropping in the wrong htaccess file was corrected within seconds (10 seconds), so I doubt it could cause all of this. Surely even ordinary server downtime lasts longer than that : )

I have internal statistics which closely follow what Gbot is doing. Google Bot is actually on my site *all the time*, every second it crawls more and more.
My new pages get indexed in circa 10 minutes or less and rank very well, almost always in the top 5 if you simply copy the title into the search engine.

My older pages seem to be dropped from page 1 to page 2.
No change in pagerank.

dusky

6:18 pm on Oct 26, 2009 (gmt 0)

10+ Year Member



In short, from my experience, if changes were made by mistake and changed back within a few days, recovery may only take between a few days and a month at most, depending on how important the site is. If the changes to the structure of a site were deliberate (and intended to last), such as changing the URLs themselves with rewrite rules, i.e. shortening dynamic URLs to more SE friendly ones, you are talking weeks to months, and in some cases over a year, for the newly indexed URLs to have the voting / PR value fully passed on.

One danger is implementing the SE friendly URL technique when the site is well known and those pages have a lot of backlinks: they will lose PR and be treated as new until their PR is passed on from the old URLs by a 301 redirect and, hopefully, other sites linking to them correct their links to the new URLs (which in most cases does not happen).
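When URLs are restructured like this, it helps to verify that every old URL answers with a permanent 301 pointing at the right new URL, rather than a 302 or a 404. Below is a minimal sketch of such an audit. The function name `audit_redirects`, the URL patterns and the canned responses are all hypothetical; the `fetch` callable is injected so the logic can be tried without touching a live server, and a real one would need to issue requests without following redirects.

```python
def audit_redirects(redirect_map, fetch):
    """Check that each old URL 301-redirects to its expected new URL.

    redirect_map maps old URL -> expected new URL.
    fetch(url) must return (status_code, location_header_or_None).
    Returns a dict of old URL -> description of the problem found.
    """
    problems = {}
    for old_url, new_url in redirect_map.items():
        status, location = fetch(old_url)
        if status != 301:
            problems[old_url] = f"expected 301, got {status}"
        elif location != new_url:
            problems[old_url] = f"301 points to {location}, expected {new_url}"
    return problems


# Canned responses standing in for a real server (illustration only).
fake_responses = {
    "http://www.example.com/news.php?id=1": (301, "http://www.example.com/news/1"),
    "http://www.example.com/news.php?id=2": (302, "http://www.example.com/news/2"),
    "http://www.example.com/news.php?id=3": (404, None),
}
redirect_map = {
    "http://www.example.com/news.php?id=1": "http://www.example.com/news/1",
    "http://www.example.com/news.php?id=2": "http://www.example.com/news/2",
    "http://www.example.com/news.php?id=3": "http://www.example.com/news/3",
}

problems = audit_redirects(redirect_map, fake_responses.get)
for url, issue in problems.items():
    print(url, "->", issue)
```

Here the second URL is flagged because a 302 is a temporary redirect, and the third because the old page simply disappears instead of pointing to its replacement.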

As for changes to page titles and descriptions: titles, in my experience, may take weeks to months to be updated, though some may change within a few days if the page is highly ranked and frequently crawled. If the site is an authority site with a lot of backlinks, the overall time is considerably shorter.

For descriptions, it varies. Again, for a well known trusted site it may be from a few days to six weeks or more; for a smaller site it may be considerably longer for most if not all pages. G* seems to look at the overall site, then at individual pages' popularity separately, so if the changed title is on the home page, which is usually (though not necessarily) the most important page, then a shorter time is likely.

For some who implement this change it may help the site rank better in the long term, but some see either slight or considerable deterioration. It's a touch and go thing!

dusky

6:31 pm on Oct 26, 2009 (gmt 0)

10+ Year Member



Perhaps, in the case of the old pages, they are linked to less and less and the link juice is drying up for them, hence the lower importance and therefore the lower rank. This is a normal process all sites go through, and many webmasters try to revive their importance by increasing readership through onsite recommendation. Some old articles are timeless in that they hold important historical information and should be linked to, for example through "similar articles" links for people to read. That alone will strengthen their visitor rate and readership; consequently, more and more people may bookmark them, link to them etc., hence a new lease of life, or in other words, link popularity!

punisa

10:34 pm on Oct 26, 2009 (gmt 0)

10+ Year Member



Well, I've decided to wait around and see what will happen in the next couple of days.

You are very right about linking, especially "inward" links. Many large news sites have a hard time keeping up interest in old articles.
I do this with a separate PHP process on the side. I break articles into words and small phrases.
I store this info in a MySQL table, so whenever a user opens an article page he also sees relevant articles regardless of the date published.
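The idea of breaking articles into words and looking up other articles that share them can be sketched as a small inverted index. This is not the PHP/MySQL implementation described above, only an illustration in Python with an in-memory dictionary standing in for the table; the stop-word list, the sample headlines and the function names are assumptions.

```python
import re
from collections import Counter, defaultdict

# Assumption: a tiny English stop-word list; a real site would use its own language.
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "and", "for", "on", "is", "by"}


def tokenize(text):
    """Lower-case the text and keep distinct words longer than two characters."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOP_WORDS}


def build_index(articles):
    """Map each word to the set of article ids containing it."""
    index = defaultdict(set)
    for article_id, text in articles.items():
        for word in tokenize(text):
            index[word].add(article_id)
    return index


def related(article_id, articles, index, limit=3):
    """Rank other articles by number of shared words, ignoring publication date."""
    scores = Counter()
    for word in tokenize(articles[article_id]):
        for other in index[word]:
            if other != article_id:
                scores[other] += 1
    # Sort by score descending, then by id, so the order is deterministic.
    return sorted(scores, key=lambda aid: (-scores[aid], aid))[:limit]


articles = {
    1: "Parliament passes new budget law",
    2: "Budget law criticised by opposition in parliament",
    3: "Football team wins national cup",
    4: "New budget for national football stadium",
}
index = build_index(articles)
print(related(1, articles, index))
```

Because the lookup ignores dates entirely, an old article keeps surfacing next to new ones as long as they share vocabulary, which is the internal-linking effect discussed above.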

Writing a good article can make it spread quickly. When I have time to write a decent article I always try to do my own research, eventually coming up with brand new content.

The only problem I have with content creation is actually my overall target market - it's country specific and thus the reach is rather small (small Eastern European country).
Therefore I'm planning to create a separate project that will be in the English language.

dusky

11:17 pm on Oct 26, 2009 (gmt 0)

10+ Year Member



"Therefore I'm planning to create a separate project that will be in the English language."

That will certainly help, even though you'll have much more competition. Content in English would attract readers from English-speaking countries regardless of the site's location. With that, of course, comes the possibility of English-speaking webmasters backlinking, readers bookmarking and recommending etc..

punisa

1:10 am on Oct 27, 2009 (gmt 0)

10+ Year Member



I agree, there is a whole paradigm shift when preparing content for the English or the domestic market :D

A little update on my problem:

- all pages where design was not changed remained at their positions (usually page 1)

- only "news" pages were affected (on which I made design changes)

- apart from the CSS changes, one *radical* change I made was to replace ALL images with smaller versions. Instead of www.example.com/images/large_image.jpg
I now use
www.example.com/images/medium_image.jpg

- I'm not 100% sure but I might have changed this:
<meta name="Description" content="description.." />
into:
<meta name="description" content="description.." />
Doubt that could do any harm?

- my pages are all in Google's index, none disappeared, when I do site:example.com for my site, all pages seem to be there

- when I copy/paste a certain news page TITLE tag into Google it usually shows up

- still can't figure out why there is no traffic then (?), as all pages are there.

- GOOGLE ANALYTICS shows drastic fall for google.(mycountry), but google.com is steady

If I could only figure out what triggered this, I could finally go to sleep : P

But you know what the real bummer is, Dusky? The redesign I did turned out to be great, and the actual numbers show it - bounce rate went down 10%, pages per user increased and - get this - my google adsense revenue DOUBLED : O
What a crazy situation my friend... imagine that. You lose approximately 40-60% of your visitors and double your income at the same time.
Definitely one for the books this is.

aakk9999

1:32 am on Oct 27, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Reading this thread and what you did, I think you might be right that the reason for the lost traffic is the replaced images. Even though the large image is still on the server, it is orphaned, i.e. no page links to it any more. From the experience I have had with "orphaned pages", Google tends to drop these after a while. Perhaps it is the same with images too.
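Whether an image is orphaned in this sense can be checked mechanically: collect every `img` source referenced by the pages and compare against the image files on the server. The sketch below uses Python's standard `html.parser`; the page markup, the file list and the names `ImageCollector` and `orphaned_images` are made up for illustration, and it only looks at `img` tags, not CSS backgrounds or plain links.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageCollector(HTMLParser):
    """Collect the URL paths of all <img src="..."> references in a page."""

    def __init__(self):
        super().__init__()
        self.sources = set()

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.add(urlparse(src).path)


def orphaned_images(pages_html, image_paths):
    """Return image paths that exist on the server but are referenced by no page."""
    referenced = set()
    for html in pages_html:
        collector = ImageCollector()  # fresh parser per page
        collector.feed(html)
        referenced |= collector.sources
    return sorted(set(image_paths) - referenced)


# Illustrative data: the redesigned page only references the medium image.
pages = [
    '<html><body><img src="http://www.example.com/images/medium_image.jpg" alt="story" /></body></html>',
]
files_on_server = ["/images/large_image.jpg", "/images/medium_image.jpg"]

print(orphaned_images(pages, files_on_server))
```

Any path this reports is an image that visitors and crawlers can only reach by typing its address directly.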

Have you tried doing an image search to see if your new images are ranking?

The second possible reason is the behaviour of users themselves when searching for images. I do not know the pixel size of your big or medium images, but if someone is searching for a nice image of something, smaller images may get fewer clicks. For example, if I am searching for an image and the offered images are 200 x 150 and 400 x 300, I am more likely to click on the 400 x 300 image.

Can you put the big images back and see if your traffic returns?

dusky

1:35 am on Oct 27, 2009 (gmt 0)

10+ Year Member



<meta name="description" content="description.." /> is better and more acceptable if you use an XHTML Transitional doctype, where all lower case is the convention. But I believe all search engines forgive it even if done differently, and it certainly would not attract a penalty or a filter.
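As an illustration of why the capitalisation should not matter, here is a sketch of a parser that compares the `name` value case-insensitively, so that both spellings yield the same description. It uses Python's standard `html.parser` and says nothing about how Google's own parser is written; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class DescriptionFinder(HTMLParser):
    """Find the first meta description, comparing the name value case-insensitively."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        # The parser lower-cases attribute names but not values, so normalise here.
        if (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content")


def find_description(html):
    finder = DescriptionFinder()
    finder.feed(html)
    return finder.description


old_tag = '<meta name="Description" content="description.." />'
new_tag = '<meta name="description" content="description.." />'
print(find_description(old_tag) == find_description(new_tag))
```

A consumer written this way sees no difference between the old and the new tag, which fits the view that the change is harmless.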

Well, your revenue has doubled, so keep it that way. It's quality that matters, and probably because most of the pages have gone supplemental now, G* made sure other pages rank better, who knows. One thing that comes to mind is a country algo reshuffle, in which case things may be restored soon. Another point: if you set your site's geographic target in WMT to the US or another English-speaking country when the language used on the site is East European, that could cause a problem, even though I would think the setting would be ignored, as long as the site does not have a regional TLD extension that is!

Most here would be singing praises in your predicament instead of worrying and losing sleep!

Definitely one for the books this is, you say, and I agree!