If you run a web directory, feel free to post your experience here.
"Our spiders regularly crawl the web to rebuild our index, but keeping tabs on eight billion pages is tough work, and they may miss a few". They go on to say that if your site is well-linked it should be picked back up.
I sent several emails regarding scraper sites that had stolen my content. After several templated responses about contacting the webmaster of the site (as if it were possible!), I eventually got threatened by them (Google) saying that I'd better have a water-tight case and demanded an extensive list of documents and signed waivers before they would even listen to me about removing the offending sites.
I don't know if I can paste the exact wording here but take it from me - they couldn't give a rat's a** about scrapers or copyright infringement unless the offending site doesn't have Adsense on it.
I can believe this to be true with sites that have few incoming links, but I have never seen it happen to any of my sites that are heavily linked.
Well, to a degree you can see Google's standpoint - yes, there should be a water-tight case for removing sites at the request of us webmasters. It is up to Google to decide whether sites breach the quality guidelines, and if they feel they don't, then a DMCA complaint etc. must be necessary for the removal of the site.
However, Google might have moved the goalposts for quality control, and a lot of sites can then suddenly get affected by a "flip of the switch".
However, I am still open-minded about what is happening at the moment - obviously, if a site has been totally removed from the SERPs (no results for a site:example.com check), I would be very worried.
But there does seem to be some evidence of an index rebuild going on, starting around mid-July. So at the same time as these sites being totally removed we have the situation where Google might be revamping the index - whether the removal of these sites is a side effect of this or of the scraper fallout I am not sure.
Have you got total removal, e.g. nothing for a site:www.example.com check?
Did you ask G if you had a penalty?
Also retracing back to your Adsense point.
What you need to remember about MediaPartners bot is that it visits the page after someone has visited it.
Quick example. If your site is indexed in G you should get relevant ads. If your site is not indexed, then the first time someone visits it shows irrelevant ads; the MediaPartners bot then visits, and the page will show relevant ads for a period of time (weeks, months?). If no-one visits the page again, irrelevant ads will reappear.
So you may find that pages which are not often visited get irrelevant ads.
IMO - I am not an Adsense guru though.
(...) (Google) saying that I'd better have a water-tight case and demanded an extensive list of documents and signed waivers before they would even listen to me about removing the offending sites.
Although not a real friendly response, I think Google's attitude was the right one. If it was so easy to just send an email to Google to get a site removed from the SERPs, then they would be bombed by emails from webmasters accusing other webmasters of copyright violations and asking for immediate removal of those sites from the index. You can always file a DMCA complaint, which Google has to act on, but that requires a document with your signature. That will block at least 95% of all bogus complaints.
My experience so far is that the only reason for removal of a site from the SERPs is a clear violation of the webmaster rules. Cloaking sites and those that violated Google copyrights/trademarks were the fastest to disappear.
Bad linking strategies are mentioned in the webmaster rules and maybe they turned a control knob a little higher resulting in sudden removal of several sites.
Well, to a degree you can see Google's standpoint
I wasn't making a complaint, I was simply suggesting that those posting messages about how easily they have been removed for "no good reason" may not have had enough interaction with the actual Google complaints process to make such a claim.
I have never, ever had a page/site removed from an index where the reason why wasn't clear to me - so obviously I find it very hard to believe that sites are removed despite having done "nothing wrong". I have lost rankings due to duplicate content issues, 301/302 issues but have never been banned from an index without a cause - even if that cause was accidental like the time I put a black background in <body bg=> but put the white writing code in a styles.css file.
I know, sorry - just trying to explain how a flip of the switch can lead to a load of sites being banned.
>>Bad linking strategies are mentioned in the webmaster rules and maybe they turned a control knob a little higher resulting in sudden removal of several sites.
Of the sites I have seen that is the only thing that I have not really looked at - the linking structure.
I have seen some people mention an auto page generator (which I have never heard of) - but surely certain link programs and page generators leave footprints which Google might find, leading to the tightening of a filter.
High Traffic + High Inbound = Higher chance to get banned
It is really confusing why the more traffic and inbound links a directory-style website has, the higher its chance of getting banned.
This is the inverse of the properties of MOST scraper sites.
Did someone invert the "if ... then BAN" statement in the algo? :)
High Traffic + High Inbound = Higher chance to get banned
The site of mine that got banned had high inbound links, but it is one of the lowest traffic sites that I have.
Funny how nearly everyone that got hit comments that they have multiple similar sites..... I guess this blows the "satellite site" approach to pieces.
I for one am not sure it's about DMOZ, as my three unbanned directories use DMOZ. The only similarity between the two banned directories is the term "directory" in the meta keywords tag, but then again it's been shown that many other directories - not mine - have this and were not touched.
I won't tire myself out asking why, what or how; even if I knew the answers, there is no chance of getting anything fixed now that Googlebot has stopped spidering/indexing the banned sites.
The only thing I can do is pray for a Google rep to reply to my email enquiry. Which leads me to a question: where is GG?
I just hope things work out, but a site: command still yields NO results. No reply from Google yet, but it's Saturday now so I expect any replies on Monday, if I get any.
I have not seen this myself, but the reports of this auto-generated thing being on some of the sites that have been hit would fit.
These types of programs, like the directory software and certain link programs, leave footprints - if the software was being used primarily to build pseudo directories, then sites which carry the footprints of that software may get hit. Even if you are a genuine directory.
Although, as I posted earlier, even genuine directories are limited in the level of unique content they can achieve - when people submit to directories they don't change the title and description of their site that much.
[edited by: Dayo_UK at 3:22 pm (utc) on July 30, 2005]
joined:Oct 27, 2001
When you enquire about your adwords account, you will get a personal reply in hours. When you enquire about your adsense account, you get a personal reply in days. When you enquire about your website, you should pray to get a reply.
You're comparing apples and oranges and apricots. Even if the AdWords, AdSense, and Google Search teams weren't different departments, wouldn't you expect paying customers to get priority treatment?
Er - it just makes perfect sense in this not so perfect world too.
Although Google may prioritise the reports very highly, they may not prioritise the reply as highly. After all, they are interested in making sure they have the best SERPs.
- The page is _still_ cached!
- The toolbar PR is greyed.
- Site: command yields 0 results.
- Googlebot visits are here and there, but nothing special. I can't tell if it's the news bot or the web bot, they look the same. It has looked at a few non-news-article pages.
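For anyone wanting to check crawler visits like the ones listed above, here is a rough sketch that tallies Google crawler hits in an access log. It assumes the log is in the common "combined" format with the user agent as the last quoted field, and the sample lines are made up for illustration. It only separates the AdSense crawler (Mediapartners-Google) from Googlebot; if the news and web bots really send the same user agent, this will not tell them apart.

```python
import re
from collections import Counter

# User-agent substrings to look for. Each log line is counted at most once.
CRAWLER_TOKENS = ("Mediapartners-Google", "Googlebot")

# Assumption: combined log format, where the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

# Made-up sample lines (documentation IP addresses, hypothetical paths).
SAMPLE_LOG = [
    '192.0.2.1 - - [30/Jul/2005:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.2 - - [30/Jul/2005:10:05:00 +0000] "GET /page.html HTTP/1.1" 200 900 "-" "Mediapartners-Google/2.1"',
    '192.0.2.3 - - [30/Jul/2005:10:06:00 +0000] "GET /page.html HTTP/1.1" 200 900 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.1 - - [30/Jul/2005:11:00:00 +0000] "GET /about.html HTTP/1.1" 200 700 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]


def tally_google_crawlers(log_lines):
    """Count hits per Google crawler token found in the user-agent field."""
    counts = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # line does not end with a quoted field; skip it
        user_agent = match.group(1)
        for token in CRAWLER_TOKENS:
            if token in user_agent:
                counts[token] += 1
                break
    return counts


if __name__ == "__main__":
    for token, hits in tally_google_crawlers(SAMPLE_LOG).items():
        print(token, hits)
```

To run it on a real log, pass an open file object instead of SAMPLE_LOG, since the function accepts any iterable of lines.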
That's strange - just checking the sites that I know of and they do still have a cache, and not just of the homepage. Is this normal for a ban? (Looks like a recently crawled cache too.)
Could be just cache database out of line.
Well, another weird thing about caches is this: I disallowed a certain page from being indexed via robots.txt last year, and an old version of the page, from March 2004, is still cached. Strange.
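One possible explanation, offered as a guess: a robots.txt Disallow rule tells crawlers not to fetch a page, but it may not by itself remove a copy that was cached before the rule was added. The sketch below only checks whether a rule really blocks a given crawler, using Python's standard urllib.robotparser. The robots.txt content and the /old-page.html path are hypothetical, not taken from the site discussed here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the disallowed path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-page.html
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/old-page.html", "/index.html"):
        url = "http://www.example.com" + path
        print(path, is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```

If the rule does block the crawler and the old cache still shows, the cached copy probably predates the rule rather than indicating that the rule is being ignored.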
I'm anxious to see what comes of this. My living is on the line.
They are also very interested in Adsense/Adwords revenue... much more so than search results I would expect. Makes me wonder what their shareholders might see in their next earnings report if this dropped site thing is as big as I think it is.
Why would dropped directory or scraper sites have a negative effect on AdSense/AdWords revenues?
If mysite.com drops in the SERPs, yoursite.com or anothersite.com will move up to take its place. And if Google is correct in its assumption that yoursite.com or anothersite.com is more relevant to the user than mysite.com was, then its AdSense clicks will probably convert better and will yield more revenue for Google because of smaller "smart pricing" discounts to advertisers.
Now, it's possible that some of the dropped sites that have been discussed here are high-quality sites that don't have duplicate content, don't use automated page-generation or link-submission scripts, and will be missed by users if they aren't in the SERPs. If that's the case, and if the problem is widespread enough to be noticed by users, then presumably Google will identify the collateral damage and try to fix it (just as it's done with other types of sites in other updates). But the process may take time: Google doesn't move at breakneck speed, and a number of us watched our sites or individual pages go missing for months.
I understand what you are saying and agree that removing scraper sites is in the best interest of the web. But I have been in contact with a couple of webmasters and have seen some very high quality sites that have been dropped, mine included.