Penalized by Google

Reputable site completely booted from the index

         

stanley07

9:13 pm on May 6, 2004 (gmt 0)

10+ Year Member



Until recently we had a top 7,000 site (as ranked by Alexa) and relied on Google searches for 70% of our traffic. Our site has 80,000 pages of unique content consisting of links to articles found elsewhere on the web. We generate summaries by visiting each link and extracting a meaningful excerpt. Each page has distinct content and we don't use any SEO strategies -- just put the pages out there and people visit. We rely on Adsense for the bulk of our revenue.

On April 2, 2004 we noticed a major drop in traffic from Google. We found that our index page no longer appeared at the top of Google results when you entered our domain name "mysite" even though it has occupied the top spot since 1998. We appeared to have been hit with a penalty. Since then it has only gotten worse -- for the past 5 days we have had 0 referrals from Google!

With our large number of pages could Google be mistaking us for a spam site? What can we do to find out more? We tried contacting Google but got no response, even though an email to Adsense was returned immediately saying we needed to contact the help address. Does anyone have any suggestions? Our traffic has completely died.

Thank you.

SlowMove

12:09 am on May 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



stanley07, why don't you try an experiment to see if the drop in traffic was all done naturally with the algo? If you have 80,000 pages, there's nothing to editing 20 or 30 pages, getting some links pointing to them, and waiting for Google to spider and index them. If the modifications can get Google to send you traffic, it might give you some clues as to how to rewrite your scripts. If not, I'd start thinking about putting pages on a new site.

SlowMove

12:22 am on May 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Another possibility: if you have 80,000 links, are you finding them in such a way that you can be sure none of them are pointing to "bad neighborhoods" or link farms?

SlowMove

1:34 am on May 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I was thinking about building a directory after spidering and scraping. I don't see the difference between building a directory with scraped content and doing what the search engines do with content after spidering a site. Is the difference in the number of words that are displayed in a blurb under the link?

hutcheson

1:44 am on May 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, this is much more like a search engine than a directory. (It doesn't matter how the search results are saved, of course -- that's just an implementation detail. I've created several static "search result" pages for small databases.)

And just as Yahoo doesn't list every ODP category in the corresponding Yahoo one [and vice versa], search engines don't list every result from some other search engine in their own results.

What WOULD be useful is for directories to contain links to search engines, and vice versa -- complementing value, not competing links. And you'll find that is what they do.

So in thinking of this as a directory, which it is not, and not as a search engine, you'll naturally fall into the trap of being disappointed when the competing search engine doesn't appreciate being spammed by the competition.

Stefan

1:55 am on May 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've been avoiding this thread because I didn't want to pile on poor Stanley, but Slowmove, you'll have to excuse me... I spent a couple of hours this morning finding out how to break my site out of a "directory" that was framing my content in frames you couldn't see, (meaning all my 250 pages of content with their URL on the front). Jim, bless his soul, in the Apache forum gave me the Javascript to do it.

Slowmove, why don't you scrap the idea of a directory... I'm biting my tongue here, not wanting to tell you what I honestly think of the idea.

Google, what the hell is with the spammy stolen-content sites that you are presenting as serps? Honest to god, this morning while trying to learn Javascript, to get my site out of the URL of some thieving b*stards who had NO content on their site, just links, I couldn't find any help using Google because I found mostly "directories". It is a total joke. The internet is rapidly becoming useless because of this crap.

If you don't have any content to put on a website, you have no reason to have a website!

SlowMove

2:10 am on May 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>I couldn't find any help using Google because I found mostly "directories". It is a total joke. The internet is rapidly becoming useless because of this crap.

Maybe we are talking about spam here. I think that Google is sending traffic to the directories because there are far worse spammers that create hubs of sites that are set up to look like useful neighborhoods. I don't think it's easy for a search engine to filter out all categories of spam at the same time.

Stefan

2:19 am on May 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Slowmove, all due respect, man, but it's all spam to me. These supposed directories are primarily pure dross.

Don't do it Slowmove, don't add another directory to the junk-pile... I'll donate some content if you would like to start a new, real, info website. Would you like some slightly used speleo field-notes? I'll toss in a map or two and a few cool jpg's...

Stef

SlowMove

2:44 am on May 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



lol. Maybe RSS is where it's at.

paybacksa

2:59 am on May 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Do a Google search on something and step back a few feet from the monitor screen and SERP. Take a broad view....

What an ugly page. Left side all techie with truncated lines, XXk file size, "cached", and the like... and the first N results probably spammy directories with long hyphenated web URLs and cryptic summaries, at best.

And the right side... crystal clear creative. Brilliant in simplicity, structured, white space in between ads, no techie jargon, calls to action, friendly URLs.

It is no wonder adwords is so profitable... the simple Google search engine has been sacrificed to the ad server gods... it is no more.

wanna_learn

6:54 am on May 12, 2004 (gmt 0)

10+ Year Member



:-(

I noticed that my site, which was performing very well for 100s of regional keywords, dropped out of the Google cache 3 days back.

Today the whole site has disappeared from Google: NOTHING in cache, NOTHING indexed, NO referrals.

Please help... I have a drowning feeling. Google has been most rude this time.

DaveAtIFG

10:59 pm on May 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Most of the copyright related posts that were made in this thread have been moved to the Content, Writing and Copyright forum at [webmasterworld.com...] and I invite you to continue the discussion there.

Let's keep the focus of this discussion on stanley07 and his questions. OK? ;)

Marcia

12:58 am on May 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>was thinking about building a directory after spidering and scraping.

That's exactly what those Zeus directories were doing back when they were still plaguing people with spidering you couldn't prevent.

>>I don't see the difference between building a directory with scraped content and doing what the search engines do with content after spidering a site. Is the difference in the number of words that are displayed in a blurb under the link?

One major difference is that you can exclude the search engine with robots.txt and have your site removed if you choose to. Not so with those "directories" - whose purpose for the most part is *not* to provide a resource for millions of people a day to find relevant sites.

Another factor is that you can have your text removed from the Google cache if you choose - when it's on your own site. Not so when the cache has your text on someone else's page. That is not a good thing at all, so there's more to the violation of a person's rights than what might first appear.

A member recently has been trying to have highly sensitive information removed - and cannot, because it is NOT his site. So he is a victim, having personal data exposed in public completely against his wishes.

Macro

10:19 am on May 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Marcia, you make excellent points.

Stan, I'm sorry, but you keep thinking of your site as a victim in this. Step back a second.

My site has about 1000 pages of content. Genuine content, unique articles and test results on a certain type of widget. Content that has cost us tens of thousands of dollars worth of research and time to put together.

Say I own a few businesses like that. All of them earn money off Adsense. Then you come along. You set up your so-called directory with 100% of your content taken from my sites (OK, only a few lines of text from each of my pages.... but 100% of your content is from my site). And you serve Adsense.

Surely, some of the visitors who are reaching you were actually looking for my site? You've been able to attract them only because of the text you've copied from me. But they've clicked an *bay ad on your site and gone away. They'll never come to me and never click an ad on my site.

And you think you're hard done by?

Let's not confuse spam with search engines and directories. You may have spent a lot of money developing something but if that something is a method of scraping websites and organising that scraped content... then mate, it's still spammy sites. You keep trying to justify it NOT being spam by saying it existed before Adsense. It doesn't prove anything other than it wasn't originally designed to exploit Adsense... but it's doing a pretty good job of that now.

If I spend a lot of money to secure my house from thieves and you spend a lot of money developing a long straw to suck out the entire contents of my booze cabinet through the air ventilation system, that doesn't make your straw something useful to society in general. Sure, some people - like the ones you sell that booze to - may write in and say thanks.

Honour robots.txt, provide a way for people to remove their listing, then call yourself a service. Till then the only service you are providing is a stanley07 money making service... and it's at "my" expense.
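
For anyone wondering what "honour robots.txt" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The bot name `ExampleSummaryBot`, the robots.txt body and the example.com URLs are all made up for illustration; this is not a description of how stanley07's scripts work, only one way a summarising crawler could check for permission before touching a page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical crawler name and robots.txt body, for illustration only.
USER_AGENT = "ExampleSummaryBot"
ROBOTS_TXT = """\
User-agent: ExampleSummaryBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse the text of a robots.txt file into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_summarise(parser, url, user_agent=USER_AGENT):
    """Return True only if robots.txt allows this user agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
page = "http://www.example.com/articles/widget-test.html"

# The site owner has shut ExampleSummaryBot out of the whole site.
print(may_summarise(parser, page))
# Any other bot falls under the "*" rules and may fetch this page.
print(may_summarise(parser, page, "SomeOtherBot"))
```

In a real crawler the robots.txt text would be downloaded from each site before any page is requested, and a page that fails the check would simply be skipped.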

BeeDeeDubbleU

10:44 am on May 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A directory by any other name ...

>>Again I can't stress enough that we are not a directory site. People are quick to characterize our site as such without even seeing our site.

Stanley it doesn't matter how you describe your site. It still sounds like a directory to me. I appreciate that you may use cutting edge technology to build the content but you still source it from other people's original sites.

I am afraid that my sympathies are with Macro here. I consider directories (call them what you like) to be parasitic. If someone is searching for my service I want them to find my site not some directory that just revamps Google's results.

I am aware that some people actually like directories but they should be kept in a category of their own. There should be some method of separating them from the main results so that people can choose to ignore them in favour of original content.

SlowMove

12:42 pm on May 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sites that use RSS feeds are probably a better alternative to scraping without permission. There are a lot of similarities. Both types of sites often have small snippets of text. They both provide links to useful pages with complete articles or other useful information.

Portals that use RSS feeds seem to win out when it comes to providing some really useful content. If used correctly, there is no taking of information without permission, and nobody seems to be saying that sites like topix, which I believe is an RSS portal, are spam sites.
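
As a rough sketch of the RSS approach, assuming a site publishes an RSS 2.0 feed: the publisher decides which title, link and blurb go into the feed, and the portal reads only those. The feed below is invented for illustration, and `read_items` is a hypothetical helper name, not part of any particular portal's code.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 feed, for illustration only.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Widget News</title>
    <link>http://www.example.com/</link>
    <description>Widget test results</description>
    <item>
      <title>Blue widget test</title>
      <link>http://www.example.com/blue-widget.html</link>
      <description>We tested the blue widget for a month.</description>
    </item>
    <item>
      <title>Red widget test</title>
      <link>http://www.example.com/red-widget.html</link>
      <description>The red widget held up better than expected.</description>
    </item>
  </channel>
</rss>
"""


def read_items(feed_xml):
    """Return the title, link and publisher-written blurb for each feed item."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.findall("./channel/item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "summary": item.findtext("description", default=""),
        })
    return items


for entry in read_items(SAMPLE_FEED):
    print(entry["title"], "->", entry["link"])
```

The point of the design is that nothing is extracted from the article pages themselves; the snippet shown is whatever the publisher chose to put in the feed.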

Mikkel Svendsen

1:19 pm on May 16, 2004 (gmt 0)

10+ Year Member



I think one of the problems with discussions like this is that there is no consensus on the term "spam". Personally, I don't think it is a very good term to describe search engine spam because it has really nothing to do with other kinds of spam, such as e-mail spam. In lack of anything better I personally prefer the term Danny Sullivan used to use: "Spamdexing" (spamming indexes).

I think it's important to point out that search engine spam is not (and will most likely never be) illegal and is most often not evil. It is just a term to describe content and websites that search engines determine (editorially or algorithmically) they do not want in their index. Nothing more, nothing less. The editorial or algorithmic rules can (and do) change any time the search engines wish, and they do not have to tell us, or anyone else, about it.

So, saying that a website could be determined "spammy" by search engines is not the same as saying it's bad, evil or illegal - just that search engines might refuse it in their index or de-rank it.

From a searcher and a search engine point of view I find very little reason to have directories in search results. You can of course find a few rare occasions where I am not right but in general I think I would be happier as a searcher without them. I think one of the reasons they are not all (or mostly) wiped out today is the ongoing focus on having the largest index (junk counts as much as good stuff when counting the index size!). Also, it is not always easy to accurately determine, algorithmically, if a website is a directory.

One way to avoid the “directory death” is by adding unique content such as user comments, ratings, editorial reviews etc. If the directory adds value to the content it is pointing to I personally begin to see the value of having it in search results.

Macro

5:29 pm on May 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Mikkel, I see your point exactly. The act of spamming search engines itself may not be illegal. But put a good lawyer on to it and he'll probably find several illegal acts in stanley's use of "my" content. For example he may well argue that using 1000 snippets of information from my site is not fair use. (There's a spin-off thread on the copyright issues). He could find other illegal acts including unauthorised use of my trademarked terms etc. So, I'm sorry, but the it's-not-illegal argument doesn't work with me.

I may be happy for Google to post snippets of my content, and my trademark. But if I request them they'll take it off. In fact, I could block them putting it on in the first place by using a robots.txt. If he's taking it without permission, using it at his sole discretion to - contrary to Google's use - deny me traffic, I'm sure I can make out a case that he's acting illegally.

Heck, a competitor put up a page on his site called Macro-trademark.htm. I served him a cease and desist and got him to remove it. What's stanley doing that's different?

Mikkel Svendsen

8:16 pm on May 16, 2004 (gmt 0)

10+ Year Member



Macro, I agree with you but I find it important that the two issues are separated: Search engine spam does not have to be illegal to be determined spam by the engines and just because something is illegal does not automatically turn it into spam. It's two different issues.

paybacksa

2:31 pm on May 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You must be careful when you encourage the big SEs to collude on issues which are not based in law, and which involve economic sanctions against the public.

If G and Y collude on some standards for banning actions which are not illegal but merely impart an economic advantage to the more clever players (think gray SEO), a good lawyer can make the case for racketeering -- especially if G and Y make money as a result.

hutcheson

5:40 pm on May 17, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



paybacksa, when you start accusing other people of colluding -- without any evidence whatsoever -- you'd better be looking to the law on your own account.

Yahoo and Google independently received reports of abuse. (We know where at least one of each came from.) The abuse was of such a heinous kind that any search engine will have been constrained to put it on its list of proscribed activities. And Yahoo and Google both took the action that anyone concerned with integrity of search results would take. (Just as Yahoo and the ODP both refuse to list affiliate doorway sites -- although they wouldn't deign to collude on ANYTHING with us.)

WhenU, by the way, has for several weeks gloried in one of the larger subcategories in the ODP "Allegedly Unethical Firms" category. Is the ODP colluding with webmasterworld to expose unethical firms?

Antivirus programs right and left are busily recoding to detect and remove WhenU. Collusion?

Is it collusion if lots of different state Attorneys General investigate the same company for violations of privacy laws related to the illegal capture of information beyond what was described in the company's privacy policy?

Or is it just possibly in everyone's best interest to expose unethical and criminal behavior? Well, almost everyone. The criminal himself usually protests.
