|Does Google Ban or Filter Web Directories?|
I think the subject is worth a thread of its own. It's only a suspicion so far. Yet I don't see that dmoz, Yahoo, or any other major web directory was banned, filtered, or dropped to PR0 as my web directory was. I tried to check it in Alexa (powered by Google) and I see some results from my site. Apparently, Alexa brings old results from Google, but the weird thing is that Alexa itself has PR0 now. But that's another story!
If you run a web directory, feel free to post your experience here.
Can someone post a few site URLs whose PR and backlinks were dropped, and that have not been changed or recoded since Google's hatchet man got on the job at the end of July? There are a few things I want to compare.
|Once again I suggest not touching your sites, as we know nothing yet. If ODP was the problem, then how did all the major portals survive? What? Is Google being a racist? There is absolutely nothing in their TOS against using ODP feeds, and ODP itself, and it's well respected by Google, is happy to provide these feeds and also provides a link to those using its content! |
It's just in case. I just don't want to risk anything, to be honest. I just want to get back in ASAP, as I can't cover hosting bills with this trickle of visitors.
|Once again I suggest not touching your sites, as we know nothing yet. |
You should know by now exactly why your site has been hit – if you don't, you are in denial.
|There is absolutely nothing in their TOS against using ODP feeds, and ODP itself, and it's well respected by Google, is happy to provide these feeds and also provides a link to those using its content! |
The key words in the above are "it's well respected". CNN's website is also well respected by many SEs, along with 100's of other portals. That doesn't mean that if I make a copy of their site I should expect to get the same recognition as the original site.
Google has made it quite clear what it thinks:
Read [google.com...] starting at and continuing down from Quality Guidelines - Specific recommendations: including "Don't create multiple pages, subdomains, or domains with substantially duplicate content" and "These quality guidelines cover the most common forms of deceptive or manipulative behavior, but Google may respond negatively to other misleading practices not listed here" also "It's not safe to assume that just because a specific deceptive technique isn't included on this page, Google approves of it."
Then read [webmasterworld.com...] msg#4 " I have a strong hunch that we're going to be taking a closer look at sites that are just scraper sites, or throwing up a copy of the ODP with no value added."
I'm not sure what else you are waiting for? Ian is making the right decision in my opinion. If the majority of your content (count the pages) is from a feed or duplicate content you are probably suffering right now. If you are not…great!
I would not ask for reinclusion if the majority of your content is duplicate. Even if they manually unbanned you in all reality you will be hit by the algo again. How many times do you think they will manually reinclude your site that is duplicated content?
I'm not trying to come-off as you may think. It doesn't bother me what tactic, content, or method anyone uses to try to rank their site – it's their business. I am simply amazed at people that are blinded by their own tactics and then complain when they don't work out as planned.
brianbear no url dropping is allowed for individual sites anywhere in this forum(s).
Sorry about that, Contractor. Reading between the lines, then, would you say a hand-edited directory that uses a search engine's XML feed for sponsored links gets caught in the trap?
You mentioned this, "Most of these share the same IP and/or go to the same datacenter.".
I am not saying you are incorrect in your findings, I haven't access to the list of sites you looked at.
I can however state that Google has a major problem handling sites on shared IP addresses.
In another recent thread, someone's client got a call claiming they had hijacked another company's home page.
The caller got the hijack direction wrong (oops) and in fact there were at least 10 domains (different owners except for 2 domains, which btw also got split). These domains had pages from each other and embedded links pointing to other sites' pages.
However the sites were not crosslinked when you looked under the hood.
|If the majority of your content (count the pages) is from a feed or duplicate content you are probably suffering right now. If you are not…great! |
While I consider the ODP portion only a very small part of my site, I guess if you just count pages it is actually the majority of the site, like it probably will be for 95% of the sites with ODP sections.
Google doesn't (sorry, didn't) index my ODP pages, but Yahoo does, and I have 500,000+ pages listed from the ODP section. I probably ought to exclude the spider from indexing that section to save a little bandwidth ;o) Even with that many pages indexed, the indexed ODP pages are not where my traffic is coming from.
|Reading between the lines, then, would you say a hand-edited directory that uses a search engine's XML feed for sponsored links gets caught in the trap? |
Personally I can't say what you should do. Whatever works for you works. If you can throw up a feed and make money on it that's fine. If you get hit down the road only you can decide if it was worth it or not. The thing is you shouldn't complain when/if you do.
What would I do? If I had a script on part of a site that used a PPC plug-in for search results or returned info from a feed I would not allow it to be crawled. I would try to attract visitors and rank with other original content and then steer them into using the part of the site that has those features.
Those that use substantial duplicate content (ODP, scraped, newsgroups, feeds) or multiple domains containing basically the same content as 100's of other sites need to rethink their strategy imho. Again, if it works, ride it out as long as you can. Also, I keep hearing from people stating that it doesn't bother Yahoo a bit. This thread and the complaints are about being dropped from Google – not Yahoo. I can search for almost anything on Yahoo and see complete duplicate content/sites taking at least the whole 1st page of results. It's irrelevant as the topic is about what Google has decided to do. If I had duplicate sites taking up the top two pages of Yahoo I would simply block Googlebot from those domains.
Can I ask a simple question (rhetorical) from those including duplicate content (ODP, scraped, newsgroups, feeds) or multiple domains containing basically same content?
Why did you use the feed on your site?
Why did you use the ODP on your site?
Why did you use the newsgroups on your site?
Why did you use the scraped results on your site?
Why did you use the same content on multiple sites?
I can answer the above. Because it was fast/easy..period. If you had to build out all of that content by hand (even if you simply had to retype it or copy/paste) it wouldn't be on your site no matter how useful it is to the user. Again, I'm not judging anyone, but your common sense should tell you that if it works at all, it won't for long.
|I am not saying you are incorrect in your findings, I haven't access to the list of sites you looked at. I can however state that Google has a major problem handling sites on shared IP addresses. |
Yes, these were all controlled or setup by the same person/people.
What bugs me the most is that there are two sorts of people: one saying "Google is evil, I can't find my site anywhere, they must be broken" and the other saying "You're banned? You must have been duplicating content!"
For the first sort, I am sure they will be out of business for good; if you throw your failure onto someone else's shoulders, that is the first step to complete failure. For the second sort of people, I wish they would get banned from Google; then and only then would they know that there is something wrong with Google.
Personally speaking, running a major web directory, I had two choices. One was to leave some of my directory categories empty and then be accused of duplicating content and running AdSense on pages with no useful content (in this case no content at all). The other was to fill these categories with old data from ODP, then proceed with accepting user submissions while my editors hunt for new professional websites.
I really don't care what people think of my business model; it's my business model. And if my complaint were personal, this thread wouldn't be that popular, I guess. There is something going on with Google that is more major than any of their updates (Florida, Allegra, Bourbon, etc.).
Finally, my brother had his own theory: "What if Google was hacked or modified by an insider or even an outsider?" I gave him a sarcastic look for a few seconds, then thought to myself, "Well, the Google homepage was hacked not so long ago. The AdSense login page went out of service for hours. A few days ago, the AdWords login page went out of service too. There are no official statements from Google or their spokesman GG regarding this web directory filtering issue or any of the other issues I mentioned. So why do we assume that the Google empire is immune from being intruded upon?"
Forgive me for stating my thoughts on public :)
"Do you just sit around and complain about how unfair the whole thing is or do you find ways to adapt to the new reality and become profitable again?"
That's a load of crap. It's that nefarious "it's all on you" kind of subtle doublespeak that seems true on the surface but doesn't hold any practical value. Business doesn't work like that. Projects [and choices in life] are evaluated under risk-adjusted expected value. What you have said is "profitability is possible again, it's all on you to achieve it", not the truth, which is "profitability *may* be possible again, but risk of channel destruction reduces the expected future value of that profitability". For a business, a project's value, whether through the website, Google, or any other idea, is the sum of the returns with reasonable risk over time. Google has shown, through its *unaccountability* (i.e. lack of reasonable recourse), that the risk of doing business online or with them is too high to continue doing it. You don't know when they will next decide to "punish" you arbitrarily, so any earnings online have a high risk and low reliability about them. That would be true even on a new domain.
Your comments overall have been good, and I'm not trying to insult you, I'm just also trying to point out that legit businesses invest a lot of time and effort and expense into branding themselves, or their domain, and moving to a new one simply is not possible. For a person to suggest it is, is to suggest that they don't have a lot invested in their domain, and are *fly by night* operations. I'm not saying you are, just that that kind of mobility suggests low investment, not a real business, and more closely a scammer.
I am saying that Google is irresponsible because they do not have any reasonable recourse for legitimate operations. And also, all of the stuff about Google being "free" so they can do what they want is a bunch of nonsense as well. Nothing is free; people have to invest lots of time and effort. At this point, I prefer to pay for my #1 position rather than wait for the day they decide, with their idiotic technology, that my business is some kind of directory or fly-by-night scraper.
We are working on a press release to let the media know about this, entitled "Google misses the point". Right now it is just us, but we'd like to include/mention several other high quality, *clear value* to the *general public*, mostly distinct sites who have been hit hard/removed by this incident. Google's shares have been going down, and I'm sure we can make them more widely aware of the problem by risking them financially. Maybe we can get them to add some positive recourse, or means by which mistakes can be corrected in a timely manner.
i know it breaks some anonymity, but if you want, sticky me your website/company.
oh yeah, i also agree with moftary -- all our content is ours [we are *not* a link site/directory] and under high demand, and still we got removed. We were #1 in most results, nice place in DMOZ, around since 2000, and 300,000 people link to us. to widely blanket everyone as either a scammer, dupe, or lazy is annoying.
[edited by: Pre_Emptor at 4:17 pm (utc) on Aug. 2, 2005]
Looks like I'm late to the party, but here's my situation:
I run several web sites, the oldest has been around since 1995, another since 1998 and yet another since 2003. All of these three pre-date AdSense and have unique content.
In 2003 (again, before AdSense launched) I added a section on each site for ODP listings. This was done both as a service to visitors and to hopefully get more pages indexed, bringing in more traffic. The ODP section on each site was the same, except for the site layout and theme which was naturally different for each site. Since there are hundreds of thousands of pages in the ODP, this formed the bulk of the sites' pages by default.
When AdSense came out I placed it on each of the three sites. I also created some additional sites, some with ODP, others without.
Since Spring 2004 my Google traffic has diminished, but Y/MSN has risen. Before the plunge a few days ago, Google was less than 5% of the referrals on my biggest site. So I hardly feel this latest purge. But obviously I would like to get re-listed. And I'm not going to drop the ODP sections since they still bring in traffic and revenue from other sources.
None of the sites are "scraped". The ODP RDF dump was used, and content is delivered via a MySQL database. The ODP copyright notice is on all pages, which meets the usage requirements. The simple fact that I use ODP on my sites doesn't mean I'm a scraper, especially since each site can stand on its own content without ODP pages.
The ODP encourages others to use its data, which is why they publish the RDF dump and even have a section in their directory for sites using ODP data.
Of course, it's Google's decision to keep sites or drop them. I just think that a better course of action would have been to:
1) Publicly announce (via WW and elsewhere) that they would be taking steps to eliminate ODP clones.
2) Only remove the sections of sites that contained ODP data, leaving the rest of the sites intact.
3) Determine which sites used ODP data only, and which only used ODP data to "seed" their directory, perhaps keeping the latter indexed.
This is a sorry state of affairs indeed. I'm getting very tired of the constant Google upheavals.
JohnKelly, I think your three points say it all.
If ODP is against Google's TOS, they must announce it clearly. Being a search engine that indexes our sites with no prior consent from us does not give them the right to drop certain sites from their index according to a factor that does not apply to other sites. But I am not an attorney!
I don't think including ODP listings can be the reason for being banned.
Google Directory is a COPY of ODP, so it has no unique content at all... Will they ban Google Directory too?
My banned directory has nothing to do with ODP; it is absolutely unique.
It has no doorways, scams, or any other bad practice. There is not, in my honest opinion, any excuse for banning it.
It is so unique that the way it works is patented in Europe and in the process of being patented in the US. We spent thousands of dollars getting legal rights and filing patents, and of course we will continue with the project, with or without Big Brother.
I agree that ODP listings are not the only reason to be banned. I have 2 sites, SiteA and Site-B. On SiteA there was only a home page very similar to that of Site-B and an Xcent based directory. Site-B has thousands of pages plus an ODP scripted clone. SiteA was banned but not Site-B. Fortunately most of my traffic comes from Site-B. I was planning on building up the directory on SiteA, but I probably will not put much effort into that now. I just hope that Site-B does not get banned.
Although still banned, I receive one referral from Google search every day. Needless to say, when I copy/paste the exact URL I don't find anything related to my website. I am pulling my hair out.
Has anyone ever considered that Google may be eliminating potential competition for years to come? Let's face it and be honest: most of our directories are good enough to stand on their own without Google. It's a simple saying: "don't put all your eggs in one basket". I see Google rapidly losing favour with webmasters and online marketing companies; the trend will shift to MSN or Yahoo, and Google will possibly be playing catch-up. This one will be interesting to watch. We ourselves are at PR0 in Google with no backlinks, and we do not use ODP data. Look at my face: do I look worried? NO. The big G can go sit and rotate!
One last thought: maybe Google is going to take over/buy out the ODP. Now I am raving, lol.
|Personally speaking, running a major web directory, I had two choices. One was to leave some of my directory categories empty and then be accused of duplicating content and running AdSense on pages with no useful content (in this case no content at all). The other was to fill these categories with old data from ODP, then proceed with accepting user submissions while my editors hunt for new professional websites. |
Couldn't you just use robots.txt to block Googlebot from empty directories or pages that Google understandably considers worthless to its users?
|I really don't care what people are thinking of my business model, it's my business model. |
You have a right to pursue your business model, but Google has the right to pursue its business model, too. And if that means Google needs to purge its index of empty directory pages, duplicate content, etc., wouldn't it make sense for you to use tools such as robots.txt that will let you pursue your business model without running afoul of Google's (at least, if you want to rely on Google for traffic)?
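As a rough illustration of the robots.txt suggestion, here is a minimal Python sketch. The `/odp/` path and the example.com URLs are hypothetical, not taken from any site discussed in this thread. It uses the standard library's `urllib.robotparser` to check that the rules keep Googlebot out of that one section while leaving other crawlers (Yahoo's Slurp is used as the example) unrestricted.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of an assumed /odp/ section
# while leaving every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /odp/

User-agent: *
Disallow:
"""


def can_fetch(agent: str, url: str) -> bool:
    """Return True if the rules above allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "Slurp"):
        for path in ("/odp/Arts/", "/articles/about.html"):
            url = "http://www.example.com" + path
            print(agent, path, can_fetch(agent, url))
```

The point of the split is that the rest of the site stays crawlable by Google, and the ODP section stays crawlable by everyone else.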
Are some of these posts for real? There seems to be a lot of reaching here with things like "Do you see this?", "I see it too?", "Oh yeah that must be it!", "Yeah Google must have been hacked and they went after my site!"
I mean really... Let's lay it all out on the table, people, with some credible evidence. If any of you have spent any time here at WebmasterWorld, you would certainly be able to look at your site and know where you have gone wrong.
The Contractor has it right on the nose, if anyone has really read his posts. So let's put this to bed!
There is no quick way to the top, solid-original-content-rules! Do it right or get out of the business - or at least stop pointing fingers.
In closing, I personally would like to see more meat in this topic supporting where Google has gone wrong. Everything I see here confirms what GoogleGuy said would be going on.
|Finally, my brother had his own theory: "What if Google was hacked or modified by an insider or even an outsider?" I gave him a sarcastic look for a few seconds, then thought to myself, "Well, the Google homepage was hacked not so long ago. The AdSense login page went out of service for hours. A few days ago, the AdWords login page went out of service too. There are no official statements from Google or their spokesman GG regarding this web directory filtering issue or any of the other issues I mentioned. So why do we assume that the Google empire is immune from being intruded upon?" |
Yep, them being hacked is a much more likely scenario than you being penalized for your ODP clone. I would contact them immediately!
|Couldn't you just use robots.txt to block Googlebot from empty directories or pages that Google understandably considers worthless to its users? |
You mean manually feeding robots.txt with thousands of pages with no content? It's not even close to possible to do that automatically. Besides, your thought is again based on the ban/ODP relation, which has not been proven.
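On the question of doing this automatically: if the directory already lives in a database, the list of empty categories can be turned into robots.txt lines by a script. A minimal sketch follows, assuming a hypothetical mapping of category URL paths to listing counts (the paths and numbers are made up for illustration). It relies on the `$` end-of-URL marker, which Googlebot honours, so that an empty parent category does not also block its non-empty subcategories.

```python
def build_robots_txt(listing_counts: dict[str, int], agent: str = "Googlebot") -> str:
    """Return robots.txt text that disallows every category page with no listings."""
    lines = [f"User-agent: {agent}"]
    for path in sorted(listing_counts):
        if listing_counts[path] == 0:
            # "$" anchors the rule to this exact URL, so non-empty
            # subcategories under the same prefix stay crawlable.
            lines.append(f"Disallow: {path}$")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical category paths and listing counts.
    counts = {
        "/dir/Arts/": 12,
        "/dir/Arts/Mime/": 0,
        "/dir/Science/": 0,
        "/dir/Science/Physics/": 4,
    }
    print(build_robots_txt(counts), end="")
```

Whether blocking those pages would have any effect on a ban is a separate question that this sketch does not address.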
|but Google has the right to pursue its business model, too. |
Don't quote some of my words and forget the others, please. I said clearly that if it's against their TOS they aren't free and we are forced to comply with it as long as we are interested in Google referrals. But if it's not against their TOS, then no they are free, and that's not even a business model.
If we are going to blah-blah about these issues, the arguments won't end. For example, have I asked Google to index my website? I don't remember using their submit-URL form! So they indexed it with no prior consent and they are breaking my TOS! Put them in robots.txt, you say? Well, why haven't they put my robot, which used to index them, in their robots.txt to exclude it? They just state in their TOS that it's forbidden to scrape their SERPs.
I can simply modify my TOS to "indexing our site with no written consent will lead to a one hundred dollar penalty according to the laws of the courts of Cairo, Egypt", right? TOS arguments would never end; a TOS should be clear, and theirs is clear IMO. If using ODP isn't mentioned there, then it's legal.
C'mon let's get real and stick with our experiences regarding this directory issue.
|I mean really... Let's lay it all out on the table, people, with some credible evidence. If any of you have spent any time here at WebmasterWorld, you would certainly be able to look at your site and know where you have gone wrong. |
We really do try, but every once in a while someone makes it personal and off topic. If you, I, and they have been penalized for "something", we would go to our sites and fix it. 33 pages so far and not even a logical assumption about that "something" (my silly assumptions included).
|In closing, I personally would like to see more meat in this topic supporting where Google has gone wrong. Everything I see here confirms what GoogleGuy said would be going on. |
Please elaborate on what GG has said that has anything to do with this issue.
Can anyone here tell me the difference between a clone site and a site that is scraped from another?
[edited by: Slone at 6:53 pm (utc) on Aug. 2, 2005]
|Please elaborate on what GG has said that has anything to do with this issue. |
Have you not read anything?
Please read this thread [webmasterworld.com...] msg#4
edited: I'm not posting to this thread any more. If people need a hand delivered personalized letter informing them how to build a website....they can talk to themselves and wait for Google to deliver the message to them personally. Good luck and don't stay up too late waiting for that letter…
Great, this is your first useful post, the contractor.
|or throwing up a copy of the ODP with no value added. |
Assuming that Google PR is the value added by Google Directory to ODP, my directory uses ODP seed data dating back to May 2004, along with thousands of unique submissions.
- Yahoo 579 661
- MSN 32 32
- Unknown search engines 22 22
- AltaVista 20 20
- Overture 12 12
- AllTheWeb 5 5
- Google 3 3
- InfoSpace 2 2
- Excite 1 1
- Web.de 1 1
Those are my AWStats figures for 1-2 August. Explanations?
And yes, my site is still banned.
And may I also add "msg#16".
This should tell you enough about where Google is going.
|Can anyone here tell me the difference between a clone site and a site that is scraped from another? |
Define that and I think you may have your answer.
|edited: I'm not posting to this thread any more. If people need a hand delivered personalized letter informing them how to build a website....they can talk to themselves and wait for Google to deliver the message to them personally. Good luck and don't stay up too late waiting for that letter… |
I hear ya...
|Great, this is your first useful post, the contractor. |
OK, I have to respond
You answered my question with that remark. I have posted about that thread (http://www.webmasterworld.com/forum30/30530.htm msg#4) almost 300 posts ago and also on several other of my posts in this thread. I have also linked to that thread and quoted from it throughout my posts. Which proves my point. You haven't read a thing. Please go back and read msg#43, #88, and #303 to name a few...geez
If you weren't so involved with your conspiracy theories you might have saved everyone a bunch of time...
I think the real problem here is lack of recourse. It's obvious that Google, despite all their obnoxious posturing in every market, has no idea what a particular "type" of webpage "looks like" or what its job is. I think people are right that they are trying to remove directories, but they have no idea what constitutes a directory. We look something like a directory, and something like a scraper, with 32,000+ pages, but we are neither -- all of our content is completely unique and owned by us.
They trust their technology so much that they have abandoned reason. It is because they have so much arrogant faith in their technology that they won't properly support people, or have a "fallback" solution for mistakes that the technology makes.
It is because they are becoming confused about what "google" does that they are eliminating directories. Directories are around because they are (generally) human edited pointers to content that are more credible than search results. Google is a *keyword search* engine, but now they think they want to be a directory too -- where THEY are the credible resource of what sites are included or not. But their technology cannot possibly take the place of the humans who make their own directories pointing to credible content. Directories are *important*.
Anyway, if they were a responsible, accountable company, what they would do is test-apply all of the index changes and then email all of the adversely affected (i.e. banned) people before the change is applied (they surely have all the email addresses listed on your web pages).
I think the real problem here is people have gotten away with reproducing content for so long, they think it is their right to reproduce it, then SEO it, then rank for it.
Really, how could another DMOZ add anything to the web, except Google ads?
How could search results that anyone can find in a major SE add anything to the web, except clutter?
Recourse? Put yourself in Google's shoes for a minute. You have 8,058,044,651 pages to choose from. Do a couple of hundred thousand matter that much, or are there more than enough to go around? Maybe the number of pages to choose from is too big for most people to grasp.
And the reality of hand reviewing the index? Priceless
8 billion pages: reviewing 100 pages per hour * 8 hours per day * 1,000 reviewers = 800,000 pages per day; * 365 = 292,000,000 pages per year; so 8,058,044,651 / 292,000,000 = about 27.6 years to get all the way through the index once.
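The estimate is easy to re-run with different inputs. A small Python sketch using the figures quoted in the post; like the post, it assumes reviewers work all 365 days of the year, and the function and constant names are only illustrative.

```python
INDEX_PAGES = 8_058_044_651  # index size quoted in the post
PAGES_PER_HOUR = 100         # pages one reviewer checks per hour
HOURS_PER_DAY = 8
REVIEWERS = 1_000
DAYS_PER_YEAR = 365          # assumes no weekends or holidays off


def years_to_review(index_pages: int = INDEX_PAGES):
    """Return (pages per day, pages per year, years for one full pass)."""
    pages_per_day = PAGES_PER_HOUR * HOURS_PER_DAY * REVIEWERS
    pages_per_year = pages_per_day * DAYS_PER_YEAR
    return pages_per_day, pages_per_year, index_pages / pages_per_year


if __name__ == "__main__":
    per_day, per_year, years = years_to_review()
    print(f"{per_day:,} pages/day, {per_year:,} pages/year, {years:.2f} years")
```

With a five-day work week the figure would be correspondingly longer, which only strengthens the point being made.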
jd01, it's Google's right to ban/filter all sites that contain ODP. But that includes other websites that use ODP, like Excite, Lycos, and Alexa, as well as Google Directory. I have said that so many times, and I have also said that they should update their TOS and webmaster guidelines accordingly.