Forum Moderators: Robert Charlton & goodroi
If you run a web directory, feel free to post your experience here.
Sorry, I was wrong... actually, it is not googlebot, it is Mediapartners-Google.
Frankly I had my doubts. That makes more sense now.
Are you sure the problem is with the term "directory"?
Just a theory, refer to msg #45 in this thread.
Briefly, two of the five directories I have got banned. The only similarity between the two, and at the same time the only difference between them and the other three unbanned ones, is the term "directory" in the meta keywords tag.
Again, whether my theory is true or not, it doesn't make a difference, since gbot stopped dropping by, so there is no way for it to index the updates. But then again, what is there to lose?
Sites with a large number of outbound links in a list format (eg like Directories and Scrapers) - I would have thought that this includes links going through a redirect/cgi bin - G must be smart enough to work that out.
Maybe that's the case.
Sites with content virtually identical to another site - eg Datafeed sites with virtually no unique content, or Newsgroups with no unique content (very very thin pages)
Not my site. My site is simply a web directory, with unique submissions, and a meta search engine.
ODP clones
Google directory itself is an ODP clone. Also, I don't see excite, alexa or other major portals that clone ODP being banned.
BTW, I have asked this question a hundred times with no answer. Is it just me, or is alexa really PR0?
I don't know why, but I feel that alexa being PR0 is connected to this thread's subject.
[edited by: moftary at 12:34 pm (utc) on July 29, 2005]
Someone mentioned a manual ban. If that was the case, why did it happen on one day (July 28) on many sites?
I was mentioning it. It could also be a semi-automatic ban. When looking at this and other threads:
This could be caused by an off-line spam detection spider, quality spider or whatever you want to call it, which spiders suspected sites, checks the site according to internal rules (linking, duplicate content, etc.) and then pushes the BAN button if the site lacks a certain quality level. This is a totally different approach from the current Google SE algorithm, which has tunable parameters that influence the position in the SERPS, but not the existence in the index.
Here's Googleguy's post, in the interest of accuracy. So we are to believe that GG does not understand what a scraper site is? Why would he have any trouble targeting the correct sites? They have a left-based menu which contains the keywords. Each menu item is a hyperlink to a page which contains "scraped results" which are snippets from websites, usually culled using the Google API. All AdSense Scraper Sites (A.S.S.) have 4 Google Adsense ads directly above the fold.
If Google does not like FindWhat SERPS in their SERPS, then use a FILTER. This would not bring a death penalty to the domain.
So, in reality, it appears that Google went after websites that are not Adsense Scrapers. Why the policy change? What policy? What do you do when you've been executed? Is there life after death? There are more questions than answers in this update. And you know what goes "up" when Google updates? Their revenue.
If what we have done is considered "no value added" then I am lost. We still rank high on yahoo and msn; only google dropped us completely.
And one more thing. I have another similar website from 2001 which has been totally abandoned by us since we launched the new site in 2002. Yet this website has ranked #1 for a very competitive term for the last 3 years! No matter what happened, it was not affected by search engine fluctuations at all! And, just to mention, I ftp to this site about once in 4-5 months just to do minor changes... (15 minutes work).
I have not seen any of the sites that have been hit so I can't really comment on why they have. I will give you my long-winded approach on what I would do and not do when building a niche directory. These are my opinions only!
Here are my do's and don'ts :
Do not build a niche directory on a topic that does not interest you or is built solely on high$ keywords. They are too much work to build correctly and if you are not interested in the topic they will become stale or crap, as you really don't care.
Seed the site yourself with sites you value. Do not scrape results from other directories or SE's. Set forth a clear set of guidelines for titles and descriptions and stick to them. Never let the submitter control the titles/descriptions or you end up with titles like keyword keyword keyword keyword and descriptions to match. Screams low quality. Do not let users modify listings or that is all you will spend your time on.
If you use an off-the-shelf script to maintain your directory (which you probably should), make sure you change all paths etc. so as not to identify yourself with that script.
Do not have one template that runs your whole site with the only thing changing being the call tags for titles, categories etc. Build the categories with at least a descriptive paragraph or so of text on the page which describes the category and the sites/listings a visitor may find in that category.
Have clear submission guidelines. Do not include sites you don't feel comfortable with whether they are paid inclusion or not.
Do not make a reciprocal link directory. It's a dead-end road imho.
Do not use all that ratings crap… the only ones who will use that are the site owners and it further identifies you as an automated directory whether you are or not.
Do link out without hiding behind some script/counter. You should not be afraid to link directly to any site that you have included. If you are, then that site shouldn't be in there.
Do build your site for visitors/users. If it's built for users, all the navigation, descriptive text, etc. will be in place for search engines.
Do not build for quantity, but instead for quality. I would rather have a 200 page quality directory that takes 6 months to build than a 50,000 page directory that is generated in 7 minutes with crap. Guess what, so would users…
If you are building a directory with only listings you have a tough road ahead of you to differentiate yourself from what's already been done many times before. I would build out content based upon the topic to supplement the listings.
Do not rent, sell, or buy ROS links.
Do not create a "shell" directory. By this I mean, don't create a bunch of categories that are empty. If you have nothing to put in a category, don't create it. They should be created only as needed.
Do not think of PR – ever! It will come eventually.
Do not include text that mentions PR, search engines, rankings, links, etc. unless that is the topic of your directory.
Do not build a network of related directories that are on the same topic.
Well, that's all I can think of off the top of my head. A directory is nothing more/less than a way to organize content, information, and resources. It's much like your folder structure you create on your own computer system to organize emails, documents etc. If you are one that can't organize your own information, you probably won't do very well categorizing an online directory.
[google.com...]
or this - if you are hitting a different dc. Although it does look like all dcs at the mo.
[66.102.9.104...]
Other things are happening at Google at the moment - so I can't be sure if we are talking about a ban or not for the sites being discussed in this thread.
Dust is still in the air.
[edited by: Dayo_UK at 1:37 pm (utc) on July 29, 2005]
Does anyone know for certain if a 301 redirect from a banned site would 'taint' the directed-to site?
Sure it would. A 301 tells Google or any other search engine that the domain/page has been moved permanently to the redirected site/page. It is no different from when you 301 a domain/page that has backlinks/PR: those will transfer too.
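For anyone unsure what a 301 actually looks like on the wire, here is a minimal sketch in Python. It only shows the mechanics (status code 301 plus a Location header); it says nothing about whether a ban follows the redirect, which nobody in this thread has confirmed. The destination `new-site.example` and the helper names `PermanentRedirect`, `fetch_status` and `demo` are made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical destination domain, used only for this illustration.
NEW_SITE = "http://new-site.example"


class PermanentRedirect(BaseHTTPRequestHandler):
    """Answer every GET with a 301 pointing at the same path on NEW_SITE."""

    def do_GET(self):
        self.send_response(301)  # 301 = Moved Permanently
        self.send_header("Location", NEW_SITE + self.path)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_status(host, port, path="/"):
    """Return (status, Location header) without following the redirect."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def demo():
    """Start the redirecting server on a free local port and query it once."""
    server = HTTPServer(("127.0.0.1", 0), PermanentRedirect)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_status("127.0.0.1", server.server_address[1], "/page.html")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

A crawler that requests the old URL sees the 301 and the new Location, which is how it learns that the page has moved for good.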
# Sites with a large number of outbound links in a list format (eg like Directories and Scrapers) - I would have thought that this includes links going through a redirect/cgi bin - G must be smart enough to work that out.
# Sites with content virtually identical to another site - eg Datafeed sites with virtually no unique content, or Newsgroups with no unique content (very very thin pages)
# ODP clones
Ok - some of the side effects of the above - normal directories will get hit (even ones with unique user submitted listings - the user submitting probably does not vary the text too much between directories), as well as sites which have a large number of seemingly outbound links as page content.
Dayo_UK - this is the most logical theory I've seen put forth in this discussion.
Let me put a second vote in for the following:
- large number of scraped/dup outbound links - including links being redirected, or opened up in frames
- sites with content virtually identical to another site
- sites with little modified content, consistent template, only switching out keywords
Let me say that the following are NOT the problem:
- purely directory sites (do a google search and find millions of directories still online)
- recip link directories (again, google it). If recip directories and link pages were being kicked off... practically the entire internet would have disappeared.
Does anyone have additions to, or problems with this list?
What are the pain thresholds for banning one site with these parameters, but not another?
I guess this signals the end of an era - the other sites in the test searches I run are all corporate mega-sites, spending millions on advertising and SEO. I have seen other people post this - but the era of the smaller, 3 or 4 person (or less) business sites - relevant sites prospering, which is what the public searches for in organic SERPs (or PPC) - is ending.
If I wanted to go to a corporate mega-site, I would just use the company name; if I want to find a unique site, I use a search engine - but now the megasites are ruling the search engines as well.
Good luck to all, I am e-mailing Google with a re-inclusion request, but unless they do a manual site inspection, I think my requests will go unanswered : (
1. It has links to it called "mytopic directory" (2 words). It ranks in the first 5 for that key phrase.
2. The title of the page is "mytopic directory and resources."
3. All incoming links have "directory" as the link text.
4. It is very Yahoo-like with categories and sub categories. Users submit their sites and are included after a review (to make sure it is on-topic). They choose their own categories and their categories are changed if there is a better category, much like DMOZ.
5. The whole thing runs on a template and database with all metatags, pages/ directories and other info being loaded.
6. I ask for a reciprocal link, but it is not necessary to get into the directory.
7. I link directly out.
Things that do not buck what is being discussed would be...
1. The scripts that run the directory were written by me - no off the shelf stuff.
2. No adsense on the site.
3. No cgi-bin or those type directories.
4. No ratings.
5. I do not rent, sell, or buy ROS links.
But, I am keeping a close eye on my access log (see: tail -n) and I am getting a couple of Google referrals every now and then, like 5 times per hour. They come from other sites though, like google.de. I go to the referring page and am nowhere to be found.
I am also seeing Googlebot looking at some pages, but not often. For example:
66.249.65.78 - - [29/Jul/2005:12:41:09 -0400] "GET /forums/forum-6.html HTTP/1.1" 200 79634 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
Note that it is right now 12:54:00 on my server so it wasn't long ago since Google looked. I am going to convince myself that this is a good sign so that I can sleep tonight.
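If you would rather count these hits than eyeball them, here is a rough sketch in Python for sorting access log lines into Googlebot visits, Mediapartners-Google (AdSense crawler) visits, and referrals from Google search pages. It assumes the Apache "combined" log format shown in the line above; the names `LOG_PATTERN`, `classify` and `summarize` are made up for this example. It only reads the user-agent and referrer strings, which anyone can fake, so treat the counts as a hint and not as proof.

```python
import re
from collections import Counter

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Referrer that looks like a Google search page on any country domain.
GOOGLE_REFERER = re.compile(r'^https?://(www\.)?google\.[a-z.]+/')


def classify(line):
    """Label one log line by crawler user-agent or Google referrer."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return "unparsed"
    agent = match.group("agent")
    if "Mediapartners-Google" in agent:
        return "adsense_crawler"
    if "Googlebot" in agent:
        return "googlebot"
    if GOOGLE_REFERER.match(match.group("referer")):
        return "google_referral"
    return "other"


def summarize(lines):
    """Count how many lines fall into each label."""
    return Counter(classify(line) for line in lines)


if __name__ == "__main__":
    sample = (
        '66.249.65.78 - - [29/Jul/2005:12:41:09 -0400] '
        '"GET /forums/forum-6.html HTTP/1.1" 200 79634 "-" '
        '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
    )
    print(classify(sample))  # googlebot
```

Feeding it the whole log, e.g. `summarize(open("access.log"))` with your own log path, gives a quick tally of how often each kind of visit shows up.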
I sincerely wish all clean webmasters the best of luck in getting through this. I have never had anything like this happen in the years I've been on google.