Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 97 message thread spans 4 pages.
Dropped from Google - a checklist to find out why.
Let all the sites dropped fill this checklist so we can narrow it down.
HostingDirectory




msg:710783
 3:54 pm on May 29, 2005 (gmt 0)

We all know a lot of sites have been dropped from Google but we don't know why. Perhaps this update has a long way to go, perhaps it has not. If it is finished we need to find out why we got dropped from the index.
I have assembled a checklist that I feel covers all the angles. If owners of dropped sites fill in the checklist, we might see a pattern emerge. Then we can work out what we might need to change to get back in.
My checklist is below; for some of the points I explain why they might be a factor.

1) Site size
It is reasonable to believe that a homepage that is too large will have too many people click away, so it may lose the relevancy needed to show in the top results.
2) Outbound links
How many do you have on your homepage?
3) Inbound links
How many does your site have?
4) AdSense
Since it may connect innocent sites with scraper sites, do you use it?
5) Content updated regularly?
Some sites do not update their content much because they offer tools rather than info, but Google may consider sites with rare content updates to be poor quality and drop their positions.
6) AdWords
Do you use paid advertising like AdWords? Maybe losing some places will make you pay more, or perhaps Google protects its paying clients?
7) Age of site?
How old is your site? Perhaps older sites are likely to be better because they survived, so Google keeps them listed high?
8) Use of nofollow tags on forums?
If you offer forums or blogs, do you use the nofollow tag? Maybe we need to stop bad sites linking inside our sites?
9) Location of host server
Maybe our host location plays a part in how high we rank for certain users?
10) Dedicated or shared hosts?
Are we being punished for what other sites do in a shared hosting environment?
11) Redirects?
Do you use any kind of redirects that Google may be having trouble with?
12) Scraper sites linking to you / content theft?
Do you have lots of scraper sites suddenly linking to you or using parts of your content?
13) Are you listed in dmoz?
Perhaps Google pays more respect to dmoz-listed sites?
14) Listed in Yahoo directory?
Perhaps Google doesn't want Yahoo directory pages to be listed high, or maybe it does prefer them listed high?
15) RSS feeds on site?
Using RSS feeds might be causing some kind of duplicate content penalty?
16) PageRank (before it disappeared)
What was your PageRank? Maybe a high PageRank makes a site immune to any penalties?
17) Extra domains pointing to main domain?
Do you have other domains pointing to your main domain that might be causing problems in Google's eyes?
18) Search engine friendly archives producing same content on different URLs inside site?
Some forums like vBulletin have a search engine archive that produces the same content with a static HTML URL. Maybe this might be picked up as duplicate content?
19) Did you bother taking LSI into consideration with on-page content?
Basically it seems Google is now using Latent Semantic Indexing in search results, so a search for zoo trips may look at page content and realise that zoo, wildlife and trips are related. So search results could give you wildlife trips for the term zoo trips.

It's a long list, but if we all fill it in, we could then put the results in Excel and compare them, and maybe see a pattern that all dropped sites share. Then we can test that pattern against sites that did not get dropped.

Might be useful.
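
For anyone who would rather script the comparison than use a spreadsheet, here is a minimal Python sketch of the idea. The field names and the three example records are hypothetical placeholders, not real answers from any site; the function simply reports answers given by every dropped site and by none of the sites that were not dropped. With a handful of sites, a shared answer is only a hint, not proof.

```python
from collections import Counter

# Hypothetical responses: one dict per site, keyed by checklist item.
# These values are illustrative placeholders, not real survey data.
responses = [
    {"dropped": True,  "adsense": "yes", "hosting": "shared",    "scrapers": "yes"},
    {"dropped": True,  "adsense": "yes", "hosting": "dedicated", "scrapers": "yes"},
    {"dropped": False, "adsense": "no",  "hosting": "shared",    "scrapers": "no"},
]


def tally(records, dropped):
    """Count each (question, answer) pair among sites with the given status."""
    counts = Counter()
    for site in records:
        if site["dropped"] == dropped:
            for question, answer in site.items():
                if question != "dropped":
                    counts[(question, answer)] += 1
    return counts


def shared_answers(records):
    """Return answers given by every dropped site and by no surviving site."""
    dropped = tally(records, True)
    kept = tally(records, False)
    n_dropped = sum(1 for site in records if site["dropped"])
    return sorted(
        pair for pair, count in dropped.items()
        if count == n_dropped and kept[pair] == 0
    )


print(shared_answers(responses))
```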

 

HostingDirectory




msg:710784
 4:23 pm on May 29, 2005 (gmt 0)

One more thing I forgot:

20) Do you use any of the following words on your homepage - under construction, updating, re-design, upgrading, etc.
Perhaps websites doing an update of some kind are seen by Google as not good quality until they have finished updating, so it lists them lower? A little bit of AI coming into play, perhaps?

HostingDirectory




msg:710785
 5:02 pm on May 29, 2005 (gmt 0)

One last thing that has come to my attention,

21) Do you use more than 1 way to link back to your homepage from every other page in your site?
My site has received a big drop, and I use an image map on my logo to link back to my homepage. The image map is sliced, so it is in several parts, each with an image map link back to the homepage. This means that each page technically has several links back to the homepage. Maybe that is a problem now?

chopin2256




msg:710786
 7:43 pm on May 29, 2005 (gmt 0)

1) Site size
1000 pages, homepage not that large and loads quickly.

2) Outbound links
Maybe about 100

3) Inbound links
How many does your site have?
3 Dmoz links, 1 musicmoz link. Running the program "link popularity check" shows about 2000 links from Alltheweb, Altavista, Google, Teoma, MSN, and Yahoo combined.

4) AdSense
Yes

5) Content updated regularly?
Updated often.

6) AdWords
No.

7) Age of site?
Young site, only 10 months old.

8) Use of nofollow tags on forums?
No forum.

9) Location of host server
Atlanta, Georgia.

10) Dedicated or shared hosts?
Shared.

11) Redirects?
No.

12) Scraper sites linking to you / content theft?
Yes, so many scraper sites since I was in many top positions before Google dropped me.

13) Are you listed in dmoz?
Yes I have 3 links in Dmoz. Obviously Google didn't care.

14) Listed in Yahoo directory?
No Yahoo directory listing.

15) RSS feeds on site?
No.

16) PageRank (before it disappeared)
PageRank 4.

17) Extra domains pointing to main domain?
I only own one domain. If you mean other bad neighborhood sites linking to me, scraper sites link to me.

18) Search engine friendly archives producing same content on different URLs inside site?
Haven't noticed anything like this.

19) Did you bother taking LSI into consideration with on-page content?
Not really.

20) Do you use any of the following words on your homepage - under construction, updating, re-design, upgrading, etc.
Never

21) Do you use more than 1 way to link back to your homepage from every other page in your site?
Only one link back to homepage on every page.

helleborine




msg:710787
 10:19 pm on May 29, 2005 (gmt 0)

1) Site size
450 pages, index page about 60K.
2) Outbound links
<10 on index - <50 over the entire site.
3) Inbound links
>300 when Google struck me out, now <20 on index link:command 83, 400-600 for other SEs.
4) AdSense
Yes
5) Content updated regularly?
Yes
6) AdWords
No.
7) Age of site?
18 months old.
8) Use of nofollow tags on forums?
No. Forum on different host and domain.
9) Location of host server
Utah
10) Dedicated or shared hosts?
Shared.
11) Redirects?
No.
12) Scraper sites linking to you / content theft?
Yes, both.
13) Are you listed in dmoz?
Yes.
14) Listed in Yahoo directory?
No Yahoo directory listing.
15) RSS feeds on site?
No.
16) PageRank (before it disappeared)
PageRank 4.
17) Extra domains pointing to main domain?
No.
18) Search engine friendly archives producing same content on different URLs inside site?
?
19) Did you bother taking LSI into consideration with on-page content?
No.
20) Do you use any of the following words on your homepage - under construction, updating, re-design, upgrading, etc.
"Update"
21) Do you use more than 1 way to link back to your homepage from every other page in your site?
One image link, one text link.

wiseapple




msg:710788
 10:50 am on May 30, 2005 (gmt 0)

1) Site size
Around 20,000 pages.
2) Outbound links
<25 over the entire site.
3) Inbound links
2800 inbound using link command.
4) AdSense
Yes
5) Content updated regularly?
Yes
6) AdWords
Yes
7) Age of site?
5 Years
8) Use of nofollow tags on forums?
No.
9) Location of host server
Atlanta
10) Dedicated or shared hosts?
Dedicated
11) Redirects?
No
12) Scraper sites linking to you / content theft?
Yes. Probably around 75,000.
13) Are you listed in dmoz?
Yes
14) Listed in Yahoo directory?
Yes
15) RSS feeds on site?
Yes
16) PageRank (before it disappeared)
PageRank 5
17) Extra domains pointing to main domain?
No
18) Search engine friendly archives producing same content on different URLs inside site?
No
19) Did you bother taking LSI into consideration with on-page content?
No.
20) Do you use any of the following words on your homepage - under construction, updating, re-design, upgrading, etc.
"Update"
No
21) Do you use more than 1 way to link back to your homepage from every other page in your site?
Two Text Links

MHes




msg:710789
 1:01 pm on May 30, 2005 (gmt 0)

>I have assembled a checklist that I feel covers all the angles.

Oh come on! You have barely scratched the surface. Let's put in another 1000 questions, then consider sector and ultimately rotating algos every month. That should keep us all occupied for the next ten years. We could come up with the answers for this month.... ten years too late. Even then a scraper site with hardly any links in, massive affiliate content and broken links out will still be number 1.

Getting involved with this type of detail is a waste of time and the quality of your questions is dubious. e.g. 'Scraper sites linking to you' - Is that good or bad? There is no way you will ever know, and either way there is nothing sensible you can do about it.

Google has shut the door on enabling us to make any meaningful ranking conclusions.

What do you mean by 'dropped by Google'? Are you totally out of the index or just at the bottom of the serps?

There is only one question to ask: What does Google want?

Assuming Google wants the best and most relevant results, all you can do is try and make a site that is indexable and useful. The rest is up to Google and they change their mind about the answer to this question all the time. Questions you have asked could be bad news today and good news tomorrow. Any test sample you get will be too small to make a conclusion.

Links to your site soften the blow if you are dropped from Google and are all one can aim for. Being honest about the quality of your site is another factor. Most sites that drop are rubbish sites. Perhaps 5% of dropped sites are good, but they are not ringing the right bells. The chances of you finding out why are remote, so you just have to move on and accept it.

I know it's harsh, but it's the reality of SEO today. You will have more chance of good rankings by making a good site than by wasting time making checklists. If news feeds, for example, are good for your user, then Google may agree and you get good rankings. Whatever their decision, you will never know for sure. Beyond that, good links in (like dmoz or Yahoo) will always help.

The overriding fact in my opinion is that very few things get penalised.... they get ignored. They will get ignored because of the combination of other factors, making the analysis so complex that it is not worth doing.... and then tomorrow it changes.

LostOne




msg:710790
 1:05 pm on May 30, 2005 (gmt 0)

12) Scraper sites linking to you / content theft?
Yes. Probably around 75,000.

Wow, you're kidding, aren't you? That's amazing! It should be a darned crime!

MHes




msg:710791
 3:25 pm on May 30, 2005 (gmt 0)

>Probably around 75,000.

and it could be the best source of traffic you have in the long term.

2create




msg:710792
 3:50 pm on May 30, 2005 (gmt 0)

I couldn't agree with MHes more. <Author removed> wrote a book about SEO dying off... and how it will soon be dead (if it isn't already). We can talk about this till we're blue in the face, and no one is ever going to know for sure unless they talk to the Google engineers themselves. Heck, they might not even know! lol It's all speculation from here on out. The days of SEO are coming to an end. Build site. Build content. Get links and hope for the best!

[edited by: tedster at 7:06 pm (utc) on Nov. 3, 2006]

wiseapple




msg:710793
 3:54 pm on May 30, 2005 (gmt 0)

The scrapers provide no traffic.

Imagine this - if Google counted all 75,000 as backlinks... I am sure we would dominate the serps for a great number of pages. Maybe this is the reason we are pushed so far down in the serps. If we were allowed to run free, anything we posted would automatically rank high.

Everything collapsed for us after Feb. 2nd. We saw an even greater decrease in ranking after this update. Basically, Altavista and Ask are now providing more traffic than Google. We used to get around 15,000 referrals a day from Google. We are now down to about 300. The only saving grace is that traffic from Yahoo, MSN, Ask, and even Altavista has increased. We still rank properly on these search engines.

Not only do we have the problem with 75,000 scraper pages pointing at us - but we also have the problem where Google thinks we have four times as many pages as we actually have. We have around 20,000 pages on our site. Google thinks we have 80,000. I do not know where it has come up with the other 60,000 or so from...

Just one story in many...



nileshkurhade




msg:710794
 3:55 pm on May 30, 2005 (gmt 0)

The effort in this thread and one previous thread seems positive; there is every chance that something good might come out of this collective effort. With the mods' permission, somebody needs to host a form and generate some sort of reports and charts. Just a suggestion though!

benevolent001




msg:710795
 3:59 pm on May 30, 2005 (gmt 0)

4) AdSense
Since it may connect innocent sites with scraper sites, do you use it?

Can you please clarify what you mean by this?
In simple terms, by scraper site do you mean one that just has keywords on one side, along with Google ads and some links?

HostingDirectory




msg:710796
 6:01 pm on May 30, 2005 (gmt 0)

Can you please clarify what you mean by this?
In simple terms, by scraper site do you mean one that just has keywords on one side, along with Google ads and some links?

I mean scraper sites placing your link and some content from your site in a directory-style fashion, along with other links and content from other sites, placing ads around them and making copies of the same type over and over... basically thousands of the same rubbish-style directories. Some even use bots to take all your code, remove the images and then make it invisible, trying to emulate your search positions from a density point of view.
Google wants shot of them but has made the mistake of shooting the original website that had nothing to do with it.
If you want to check, try searching Google for your domain name (e.g. mywidgets) or your company name, and see how many of these illegal sites have sprung up.

Personally I have over 30,000 that Google has indexed and many more on other search engines. But what can I possibly do about it?

It's sad, but it seems all the sites dropped in the serps have one thing in common: scraper sites.

benevolent001




msg:710797
 6:30 pm on May 30, 2005 (gmt 0)

If you want to check, try searching Google for your domain name (e.g. mywidgets) or your company name, and see how many of these illegal sites have sprung up.

What should I write in the search bar?

HostingDirectory




msg:710798
 6:49 pm on May 30, 2005 (gmt 0)

Go to www.google.com, not the search bar, then type in your domain name as text only, without spaces:

yoursite instead of your site or yoursite.com.

Then have a look at the number of sites that come up. You may need to scroll deep into the list to find scraper sites, if you have any.

wiseapple




msg:710799
 7:11 pm on May 30, 2005 (gmt 0)

I would guess that scraper sites have now invalidated the complete Google algo. Here is my best guess:

- As a site rises in the ranks, at either Google or Yahoo, scrapers will pick off the top ten or twenty results to include in their pages. This is done automatically over thousands and thousands of keywords. A scraper can generate hundreds of thousands of pages in a day.

- Now the issue is that a site will gain thousands of backlinks in a very short period of time. This will trigger the penalty for having too many backlinks. Therefore, the site that was once ranked high will fall due to the penalty. I am not sure how long the penalty will last. However, the process will be repeated as the site rises and the scraper once again picks off the top ten or twenty from the serp.


This invalidates the complete Google algo, since it uses links to determine popularity.

They say that incoming links cannot damage a site - how about if a site gains 50,000 backlinks in one month? What happens then? A scraper site is not going after a few pages. I have monitored scraper sites with over 250,000 pages indexed in Google covering every conceivable topic.

caveman




msg:710800
 7:26 pm on May 30, 2005 (gmt 0)

Agree with the general gist of MHes' post.

First, you have to be far more clear and specific about the nature of the problem. Were you dropped completely? Not indexed? Or did you just get hurt badly in the rankings. All pages? Just some pages?

Then you can start to sort out the possible causes. There are lots of general categories of potential problems. And within the categories there are often many different items to be investigated and/or set right.

All in, there are literally hundreds of potential issues (maybe thousands) that could be affecting your sites. The original post of this thread, as MHes notes, barely scratches the surface and lacks specificity even then.

There are lots of useful threads in WW about problems/solutions. Then there are all of the algo white papers and resources that must be read if one is really to embark on a search for the true problems any given site may be suffering from.

lostinfrance




msg:710801
 7:53 pm on May 30, 2005 (gmt 0)

Hi,
I've also been dropped. I'd just made it to the number 1 position for a few important keywords and now I'm not even sure I'm in the index.
I can't be found in any search results, but very strangely, if I go to a Google ranking page, every other click it tells me I'm number 1, and in between it says I'm not found anywhere and may not be indexed.

1, site size 50 or so pages,
2, outbound links, 35
3, inbound links, 50
4, adsense no,
5, content updated daily/weekly
6, adwords yes - signed up a month ago - then by coincidence reached the number 1 positions,
7, new site 4 months old.
8, don't use no follow tags - use index follow 7 days
9, eastern europe?
10, shared hosting,
11, I only use 404 redirected to home page on some pages that are no longer
12, scraper sites / content theft - I'm new to web sites, this is my first, so not sure what these are,
13, dmoz listed - yes
14, yahoo - rank is still good (a few minutes ago anyway)
15, rss feeds - yes I use 1 for news headlines, it updates twice a day,
16, pagerank was 3
17, extra domains pointing - kind of on a testing site
18, Search engine friendly archives producing same content on different urls inside site? - not sure about this my main site is static html with a php search, my forum is xoops with phpbb module.
19, LSI - probably not as I've never heard of it.

MHes




msg:710802
 8:23 pm on May 30, 2005 (gmt 0)

>I mean scraper sites placing your link and some content from your site in a directory-style fashion..

Arr, so you mean Google, Msn, Yahoo etc.

g1smd




msg:710803
 8:52 pm on May 30, 2005 (gmt 0)

>> We have around 20,000 pages on our site. Google thinks we have 80,000. I do not know where it has come up with the other 60,000 or so from... <<

A month ago I reported that adding a 301 redirect and adding a trailing / to all links had sorted out the listings for a site. Only the non-www URLs with a trailing / were then shown in the listings. It had taken about 6 weeks for Google to drop the other three versions of the URLs.

An update to that: about two weeks ago, Google suddenly added all of those URLs back in. Only the non-www URLs with a trailing / are fully indexed with title and description. All the other versions of the URL (www with /, www without /, and non-www without /) are URL-only listings.

On the first page of a site: search it says there will be 615 entries. By the time you get to the last page of the search there are only 445 pages listed. The site really only has 118 pages.

Google has messed up their handling of 301 redirects. They have NOT fixed their handling of 302 redirects.
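
To see how the four URL versions described above can inflate a listing count, here is a minimal Python sketch that collapses www/non-www and slash/no-slash variants to one form. It is only an illustration: example.com is a placeholder, the choice of non-www with a trailing / as the preferred form follows the setup in this post, and the rule "a last path segment without a dot is a folder" is an assumption that will not hold for every site.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical(url):
    """Map a URL to one preferred form: non-www host, trailing / on folders.

    Assumption: a path whose last segment contains no dot is a folder.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    if last_segment and "." not in last_segment:
        path += "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


def count_distinct_pages(listed_urls):
    """Count how many real pages a list of indexed URLs collapses to."""
    return len({canonical(u) for u in listed_urls})


# Four listed versions of the same (hypothetical) folder URL.
listed = [
    "http://example.com/widgets/",
    "http://example.com/widgets",
    "http://www.example.com/widgets/",
    "http://www.example.com/widgets",
]
print(len(listed), "listed URLs ->", count_distinct_pages(listed), "page")
```

Running a list of URLs copied from a site: search through `count_distinct_pages` gives a rough idea of how much of an inflated count is down to these variants alone.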

wiseapple




msg:710804
 10:46 pm on May 30, 2005 (gmt 0)

I will try adding the trailing slash (/) to all links to see if this straightens out the problem with 20,000 - vs. - 80,000 pages. I am willing to try anything at this point in time.

g1smd




msg:710805
 12:18 am on May 31, 2005 (gmt 0)

The trailing / is only needed where you are linking to an index page inside a folder. You end the link with the folder name followed by a trailing / on the URL, and omit the filename of the actual index page itself.

danny




msg:710806
 12:43 am on May 31, 2005 (gmt 0)

The scraper explanation fits my site. It's had good rankings and high PR (reached 8 briefly a couple of years ago), and as a result seems to have been targeted a lot by scrapers.

Also, my site has lots of Open Directory entries, and the OD data is used by lots of scraper/spammer sites trying to bulk up their pages.

Reid




msg:710807
 2:06 am on May 31, 2005 (gmt 0)

There is one very important step that you missed and it should be #1 on the list.

1) Are you indexed properly by Google?
Meaning: has Googlebot crawled your site and listed each page properly, without any 'strange' listings?

If the answer is no then how can you think about rank and traffic?

I see it time and time again. site:droppedsite.com is a mess.

1. 404s listed (pages no longer exist)
This seems to cause a lot of URL-only listings and related pages going supplemental, hence crawling gets halted.

2. Old outdated cache. If half the site was cached a month ago and the other half was cached a year ago, this also means crawling problems. Something is preventing Googlebot from getting those pages.

3. Invalid URLs listed or little messages beside the listing like 'unknown file format'. This can be caused by server setup or some invalid code somewhere. Basically Googlebot is being confused.

4. Only homepage listed and the cache is the 'under construction' page from when you first launched your site. Googlebot can't crawl your site for some reason.

5. Duplicate listings. Same page listed under 2 different URLs. Not good. This is also a bot problem and completely under the control of the webmaster to fix.

Here is how it works.
1. Googlebot crawls your site.
2. Googlebot makes the listing(s) for each page of your site.
3. The Google search engine pulls listings from this database for SERPs.

If the database Google has of your site is messed up then how can the SERPs not be messed up?

The site: command is the most valuable tool to see if Googlebot is able to list your site properly. This is the picture Google has of your site. If something is wrong in there then you can talk about links and PR till the cows come home and it won't do you a bit of good.
Concerning PR, wouldn't the site: command also reveal what Google thinks of the overall quality of your site?

Sweet Cognac




msg:710808
 2:57 am on May 31, 2005 (gmt 0)

Great information, Reid. Also, I noticed from my stats that if Googlebot comes to crawl the site and runs into a broken link, it will stop crawling and leave.

helleborine




msg:710809
 3:02 am on May 31, 2005 (gmt 0)

I did the site: command and I noticed something I had not noticed before.

My non-www and www index page both show up.

I checked the PR for both pages.

The non-www page has PR 4 and 71 OR 83 backlinks.

The www page has PR 2 to 4 and 0 OR 83 backlinks. With 0 backlinks, PR is 2, with 83 backlinks, PR is 4.

I am unable to do a 301 re-direct. In any event, is this a sign that Google is "working on it?"

I need some hope.

Reid




msg:710810
 4:27 am on May 31, 2005 (gmt 0)

if Googlebot comes to crawl the site and runs into a broken link, it will stop crawling and leave.

Permanent 404s don't seem to affect Yahoo and MSN; I suppose they just drop them. But Googlebot cache control is different: it will keep coming and asking, but it won't crawl past that 404.
This causes uncrawled pages to lose the description, leaving URL only. And I'm not sure why, but some other related(?) pages will not lose their description but will go supplemental.

Also, helleborine said:
"I can't do a 301"

Use a server header checker and see what you get when requesting the different versions of a page.
If it is a 302 then you need a different host.

I'm using a virtual host and I do get a 302 on the non-www version of my site. But they use CGIs, and the 'actual location' of my site is not www.mysite. It works fine for me though; they must be blocking robots from following the 302 on my host. But your host may not be.
I have .htaccess too (but not to 301 www vs non-www)
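
For anyone without a header checker handy, a rough Python sketch of the same check follows. It sends a HEAD request with the standard library's `http.client`, which does not follow redirects, so the raw status code and `Location` header are visible. The function names and example.com hostnames are placeholders of mine, and the labels only cover the status codes discussed in this thread.

```python
import http.client


def fetch_status(host, path="/", timeout=10):
    """Send one HEAD request; return (status, Location) without following redirects."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe(status):
    """Label the status codes discussed in this thread."""
    labels = {
        200: "OK: page served directly",
        301: "permanent redirect",
        302: "temporary redirect",
        404: "not found",
    }
    return labels.get(status, "other")


# Example use (needs network access; replace the placeholder hostnames):
# for host in ("example.com", "www.example.com"):
#     status, location = fetch_status(host)
#     print(host, status, describe(status), location)
```

Comparing the output for the www and non-www hostnames shows whether one version redirects to the other and with which status code.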

MHes




msg:710811
 7:48 am on May 31, 2005 (gmt 0)

>...and as a result seems to have been targeted a lot by scrapers

Any site doing well will attract scrapers. It's like saying, 'my site attracted a lot of visitors, so is this the reason I have now dropped in the serps?'

Scraper sites may replace your position if you drop but they are very unlikely to be the cause of the drop. Pages where your content appears as a description will probably be low-value pages in Google's eyes, having already been identified as 'links' pages with limited merit via Hilltop.

danny




msg:710812
 9:57 am on May 31, 2005 (gmt 0)

MHes wrote:
Scraper sites may replace your position if you drop but they are very unlikely to be the cause of the drop.

Well, if you can think of another explanation for pages on my site no longer ranking anywhere I'd like to hear it.

My current thinking is that it's either scraper sites linking to me (which I can't do anything about), AdSense and alternate ads (which I've removed), or that my internal link structure is "bad" (unlikely but possible). I've been through all the other things people have mentioned - www/non-www, duplicate sites, shared IP address, etc. - and none of them fit.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved