By far the most likely *initial* test they use is the domain creation date (followed by certain signature tests to validate that first result). However, I am told that some registrars frequently change this date, for example when their databases change. In other cases, I am told that this date can be masked. If that is true, then Google have a LOT to answer for.
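For the curious, the raw lookup itself is trivial. Here is a minimal sketch in modern Python, assuming the plain WHOIS protocol (a query string sent over TCP port 43); the registry server name and the "Creation Date:" field label are assumptions, and the fact that both vary by registry is exactly the weakness described above:

    import socket

    def whois_creation_date(domain, server="whois.crsnic.net"):
        # WHOIS: send the query plus CRLF over TCP port 43, read until EOF.
        with socket.create_connection((server, 43), timeout=10) as sock:
            sock.sendall((domain + "\r\n").encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        text = b"".join(chunks).decode("utf-8", "replace")
        for line in text.splitlines():
            # The field label varies by registry -- the very problem noted above.
            if line.strip().lower().startswith("creation date:"):
                return line.split(":", 1)[1].strip()
        return None  # masked or missing: nothing for a filter to key on

    print(whois_creation_date("example.com"))

If the registrar resets or withholds that one field, an automated filter keyed on it has nothing to work with.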
In an effort to "improve the user experience", Google is going to hurt a lot of innocent webmasters, who provide both the content and the revenue on which its future success depends. I think it is in ALL of our interests to expose ways to DEFEAT Google's new expired-domain catcher and teach them that trying to automate this process, rather than dedicating human resources to it, is NOT a viable long-term option for them.
Google's power allows it to hurt the business of anybody it sees fit, without any explanation whatsoever. Our power as SEOs has always been in keeping search engines honest by exposing and using their own unfair tactics against them.
Lol!
You're right. SEOs are heroes.
And email spammers force email providers to constantly improve their spam filters - they must be noble people too. ;)
>I think it is in ALL of our interests to expose ways to DEFEAT Google's new expired-domain catcher and teach them that trying to automate this process, rather than dedicating human resources to it, is NOT a viable long-term option for them.
It's not in my interests to defeat anyone.
I'm not conceited enough to think that I have much to teach a legion of PhDs.
You're saying that human intervention is better than automation? Even considering the size and scope of the web?
To be frank, Artful, I think if you are looking to rally the masses against Google, then you came to the wrong place.
I can't help thinking that you have been burned somewhere along the line by the big G and are now somehow looking to retaliate.
You have some valid points in your argument - it is completely unfair that innocent sites get accidentally penalised - but an SEO Call to Arms isn't going to help that.
An age-old argument that is repeated here is that you don't have to use Google. There are other ways to build a business. If you don't like how Google works, then don't use it.
My 2 c! :)
Scott
Our power as SEOs has always been in keeping search engines honest by exposing and using their own unfair tactics against them
SEOs and their tactics and/or techniques are of no concern to Google users, who just want to find what they are looking for... and I don't believe many go looking for expired domains that have been resurrected under another company's name.
That's pure rubbish! (politely) ;)
As such, Google will protect those users as best it can -- that's why the engine is the most used. Protecting what matters!
I can't help thinking that you have been burned somewhere along the line by the big G and are now somehow looking to retaliate.
Nope, not me. And I can't help thinking you benefited because some of your competition got knocked off by the recent expired-domain changes and you are sitting pretty this update - and to hell with the innocent webmasters who got burnt.
[Shhhhh : Let's not ask Google to change anything. I'm top 5 for my pet keywords now, right?]
To admins : there are two copies of this post now. One I think was moved from the Toolbar forum where it was originally posted. Can someone please correct this?
And I only run free information sites, so I don't have much in the way of competition! :)
I have seen a 700% increase in traffic over 10 weeks though - partly from Google but mostly from my own marketing efforts.
Although, it might be all those poor innocent SEOs getting penalised that helped boost my traffic - who knows? ;)
Scott
If you have a site that is so important, then start PAYING for exposure. There are plenty of ways to do that. It is the ONLY way for a site that absolutely needs exposure to be guaranteed it.
"Innocent sites"? Come on now.. Google ranks pages according to how they think the index best suits their searchers, so they keep coming back and see Adwords on the side. Whether they are innocent or guilty in the end does not matter - what matters is whether searchers are given good results. We are just fodder for that, and thats perfectly understandable as we dont pay. We can withdraw our pages from Google but there are plenty who will gladly take our place.
Bottom line: if a certain filter actually reduces the value of the index for the people who search with it, Google WILL fix it or refine it - not because of a bunch of webmasters on the rampage.
We are just fodder for that, and that's perfectly understandable as we don't pay
That is completely false. Google doesn't pay to crawl and index all of the webmaster content that its entire survival depends on either. Google is simply an indexer of all the hard work the millions of webmasters put into their sites. It doesn't create one scrap of content. Why isn't it paying US for providing interesting content for its users? People don't go to Google to see Adwords - they go there to see our content and NOTHING else.
Whether they are innocent or guilty in the end does not matter
Nobody is preaching the holy gospel of ethical search engine behavior here. But we each have our power - Google has theirs and webmasters have ours and if one side takes advantage of its power, they usually end up paying a price for that.
That is completely false. Google doesn't pay to crawl and index all of the webmaster content that its entire survival depends on either. Google is simply an indexer of all the hard work the millions of webmasters put into their sites. It doesn't create one scrap of content. Why isn't it paying US for providing interesting content for its users? People don't go to Google to see Adwords - they go there to see our content and NOTHING else.
You are completely right -- BAN Google from your site and show them who's boss (a little tit for tat), right! ;)
You have just as much control as Google does (as does every other webmaster) over who gets into your site, who can use your site, and who can spider your site; you can even let your own domain expire, and even stop the new owner from having access to you and your visitors.
NFFC has - and I will never, ever forget that, NFFC.
Boy how I love this game!
Why isn't it paying US for providing interesting content for its users?
I have sites and content I don't want Google to index. For those, I use robots.txt.
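For anyone following along, the mechanism is exactly that simple - two lines in a robots.txt at the root of the site:

    # Keep Googlebot out of the whole site; other spiders are unaffected.
    User-agent: Googlebot
    Disallow: /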
I have sites and content I do want Google to index -- and I am pleased and happy that it does -- that does me a great favor.
The bottom line is that I can choose if I want to allow Google to index my content; and Google can choose whether it wants to or not. Google has no right to index my content, and I have no right to insist that it does -- or that it does in any particular way.
That strikes me as a fair bargain based on an exchange of favors; no money need change hands in either direction.
Artful, we have seen this argument so many times ever since the first days of search engines. 5 years and more of this argument...
You can block content from Google with a simple robots.txt if you don't want them using your content. Then you can promote your site other ways. Yes, you can start a grassroots boycott. Good luck in getting all the millions of site owners to join, especially those well ranked!
Good luck in getting all the millions of site owners to join, especially those well ranked!
For every 10 sites in the top 10 results for any given keyword, there are usually thousands who aren't. I'm not asking anyone to join anything. The far larger silent majority of people who read this forum but don't post will each make their own individual decisions based on circumstances.
The problem is that webmaster satisfaction with Google is constant over time. At any moment for a SERP, the 10 webmasters on page 1 think that Google is doing a fine job. ;) What Google worries about is if the overall searcher satisfaction goes down.
That is correct!
And typically the thousands who are not are the ones who have not put in any time learning what Google likes to see on a site, or who have tried to take shortcuts and paid for it in the SERPs. They just expect to be at the top and get upset when they are not.
Granted, there are those who game the system, and I can only hope as each dance passes, Google will drop more and more of those so that Webmasters who work hard, follow the Google guidelines will be rewarded with free Google traffic.
Good luck on a call to arms!
You're right! (IMO)
The percentage of webmasters getting anything in return for Google using their copy is very, very small. I think it is Google's responsibility to create a level playing field for webmasters, and they are not willing to throw any money at the problems because webmasters are powerless to do anything about it.
Or are they? I don't think a webmaster union has much mileage, but the power of the media might.
I suspect that Google's owners are looking to capitalize on the strength of their brand by listing on the stock market. They are generally pretty quick to fix any problems that make the newspapers, but I think they will be even more sensitive if they are looking for a stock market listing.
So instead of positive articles in the Wall Street Journal and Financial Times, we might see headlines like "Webmasters concerned about Google Spam" and "Are Google really bothered about Spam?"
Any journos out there?
I personally think that it is a good thing that Google cares as much as they do about webmasters. Yeah, innocents are going to get hurt during the 3 months that it takes Google to sort this out. But innocents have been getting hurt by the expired-domain spamming too.
Google is not taking away anything that it *owes* to those sites. In fact GoogleGuy has been saying on this board for a long time that buying expired domains is unsafe.
Google's top priority is the searcher. And there are very few cases where the SERPs are negatively affected for the *searcher*. The majority of expired-domain issues arise in competitive commercial areas, and the definition of competitive is that there are lots of other options. If the searcher has sufficient quality options, then Google has done its job right.
There are very few sites whose removal would negatively affect the average quality of the SERPs.
No argument about Google's service to the searcher, although I think their top priority is profit for themselves. Google delivers top-quality results on the whole.
When I mentioned a 'level playing field' I was referring to the removal of cheats, but without impact on innocent parties. If they use filters that affect innocent sites, then they should have support staff to put this right quickly.
There are many types of spam that have been beating the filters; they should deal with these sites manually whilst writing better filters. It just seems to me that they're not interested.
If you run a business that depends on Google traffic, you had better have a backup plan that is good for several months without that traffic. Google could just as easily decide to give a boost to any page that contains the word "balderdash" in orange text, and if you compete against a few of those sites, it could take you several months to figure it out and get back in the game.
Yes, it sucks that "innocent" sites are taking a 2-3 month hit. But remember, the majority of these sites also got an unfair boost.
Google could probably have handled it better, but it is *their* index, and they have to do what *they* think is right.
If I got knocked out, I would beg and plead just like everyone else, but I would try very hard to work *with* Google, instead of *against* them, to try and resolve the problems.
If I got knocked out, I would beg and plead just like everyone else, but I would try very hard to work *with* Google, instead of *against* them, to try and resolve the problems.
I don't see Google ever discussing with webmasters the options for dealing with expired-domain spam. Instead, they send their representative to smooth things over on this webmaster board after the damage is done. Human nature and corporate profit will teach them that their priorities for webmasters should be just as important as their priorities for the surfer. How did Google become popular in the first place? Word-of-mouth effects among users certainly played an important part. But the weight of discussion and promotion of Google among webmasters on the wider Internet must surely have played at least an equal part. Isn't it interesting that Altavista's steep decline and losses to Google happened at a time of greatest webmaster dissatisfaction with its results?
Addressing specifically the topic of expired-domain filters: I am confident that if the record creation date of the domain could be tampered with, Google's expired-domain filters would be useless. Anyone have a different view?
I'm not even advocating a human-only policy. A combination of filters and human effort should do it. Factor in the disincentive to spammers when spam is dealt with quickly, and I think you have a solution.
Anyway, where are the filters? You're a programmer; how difficult is it to detect an all-white gif on a white background? Not very, I bet. That would be a start.
We're not talking about how many web sites there are; we're talking about the ones that spam. We've heard GoogleGuy say there aren't many spam reports - surely that's more of an indication of the problem than the number of websites.
So, you report every case of spam that you come across? They do not want to catch just those sites that are reported, they want to catch all the sites that participate.
His particular comment was about spam reports *in the early stages of this update* before the update was moved to www.
Anyway, where are the filters? You're a programmer; how difficult is it to detect an all-white gif on a white background? Not very, I bet.
You must be in management or marketing.
Detecting it would not be that hard. It would require writing some custom checks into a version of Mozilla. It would probably take something under three programmer-years to make a version of Moz that would be able to trigger spam alerts on a given page that it renders. And that would cover almost all the common, non-cloaking methods of spamming. It would have to run a bunch of tests and comparisons on each character it outputs.
The problem with the plan is that *every* page would have to be rendered with all its component parts, and resized multiple times, before you could even get close to accurate results. How many computers do you think it would take to run all those checks on even a small percentage of the sites? Google currently ignores all the "extras" that go into a site - images, JS, CSS, etc. All those files would now have to be brought in and processed.
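To be fair, the narrow test asked about above really is the easy half. A minimal sketch in Python, assuming the Pillow imaging library (the near-white threshold of 250 is an illustrative assumption):

    from PIL import Image

    def is_effectively_blank(path, threshold=250):
        # Flatten to 8-bit grayscale and find the darkest pixel; if even
        # the darkest pixel is near-white, the gif is invisible on white.
        img = Image.open(path).convert("L")
        darkest, _brightest = img.getextrema()
        return darkest >= threshold

The hard half is the part described above: you only know the image is actually invisible after rendering the page and working out what colour sits behind it.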
If each system could process 100,000 pages a day, which isn't very likely, then covering a web of roughly three billion pages in a month would take over 1,000 systems just running this code full time. At 10,000 pages a day, it would require 10,000 systems.
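Worked through, with the three-billion-page figure (an assumption, roughly the size of Google's index at the time):

    # Machines needed to re-render the whole web in a 30-day month.
    PAGES = 3_000_000_000
    for per_machine_per_day in (100_000, 10_000):
        machines = PAGES // (per_machine_per_day * 30)
        print(per_machine_per_day, "pages/day/machine ->", machines, "machines")
    # 100000 pages/day/machine -> 1000 machines
    # 10000 pages/day/machine -> 10000 machines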
And with all this, there will still be false positives, and lots of stuff would still slip through.
There are ways to reduce the load, by just concentrating on sites that come up in the top of the serps, but that brings up other problems.
It can be done, it should be done, and it probably will be done. But deciding that it's easy to do right is wrong.
I'm not sure what data Google could be using. Perhaps they have access to more than what is provided in a WHOIS record? And how would you change the registration date anyway?
I'm not sure yet, but in the past I have seen WHOIS information on some domains that does not include this data. Other things I will be trying, in an effort to prove how pointless this expired-domain automation is, are duplication of the previous site's content (with permission), duplication of the WHOIS information (with permission), and duplication of the name servers used by the previous site.
I will publish my results on this forum.
Manual processing does not need to be done internet-wide. There are enough webmasters out there competing for the most valuable keywords who play by the rules and will report those who don't. This is simple to implement. Google simply needs to provide the resources to do this.
Or else, I guarantee, we will be back to an expired-domain free-for-all on Google in no time. What's it to be, Google?
I'm pretty sure Google's algo to detect expired-domain spam will be more complex and "artful" (due to its resources, the size of the testing sample available to it, and the relationships it can discern with other data it has) than what one person can come up with.
I do wonder how important it is, given the small number of sites/webmasters it affects directly compared to the potential "good" it can do for index relevancy for users as a whole.
However I admire your enthusiasm and look forward to seeing the results of your analysis here.
But if you do find a bug in Google's filter or a way to circumvent it, be sure to publish it here. Google will want to read that post carefully.
If you're right that it will take 3 man-years and a whole lot more hardware, then I was right - it's not that hard.
Alternatively they could negotiate a licensing deal with Microsoft to use the software that they have already written.
Once that's in place, cloaked sites wouldn't be a problem either.
You say it probably will be done; well, surely it would have been done by now if that were the case?
Artful,
Good luck with your project! I guess, though, that Google may have a special deal with the domain authorities; this would be mutually beneficial, as the demand for expired domains is causing all sorts of problems for them.
If you're right that it will take 3 man-years and a whole lot more hardware, then I was right - it's not that hard.
But don't make the typical manager/marketing mistake of thinking that 36 months / 36 programmers = 1 month. One good hacker will take 3 years, if he is excused from all meetings. Three good hackers take 1 1/2 years. Add 7 normal programmers to those 3 hackers and it is back up to 2 years. 36 programmers would take several years to generate garbage that would not work.
Once that's in place, cloaked sites wouldn't be a problem either.
Really? How do you get that? It is a totally unrelated issue.
It would require Googlebot doing its normal crawl, and another crawler crawling the entire web, including all the parts that Googlebot misses (images, CSS, JS), and doing it in a way that is not identifiable as being from Google - or, for that matter, even as being a spider. They would need millions of different IPs scattered over all sorts of different ISPs to keep from being traced.
This is getting into very questionable behavior that I doubt very much that Google will be willing to participate in.
You say it probably will be done; well, surely it would have been done by now if that were the case?
I am certain some of it is underway. But you have to remember that Google has to actually ship product monthly. Hunting spam is only a sideline to that main product.
And setting up an extra 10,000+ servers, building the datacenters, paying for the maintenance, power and bandwidth is getting into the hundred million dollar range.
Why use the term man-month for an estimate if it's totally useless? Being a programmer, you probably think one guy doing the job in 3 years is the only solution. If I were playing the role of manager, I would get a senior programmer to design a solution, with the requirements that the job should be done as quickly as possible, that the number of programmers used was not an issue, and that quality should not be compromised. I'd also investigate the option of using software that is already written.
As for cloaking, well, I'd go for the millions of IPs and the non-standard spider behaviour solution - that's pretty simple, given that the above-mentioned software is available. I wouldn't be going to the effort you suggest of doing two crawls, though, as I wouldn't be interested in whether somebody is cloaking or not, just in indexing the pages. I don't have a problem with this type of behaviour; the usual rules for spiders could be observed - just don't use a known user agent or IP address.
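For what such a fetch would look like in practice, here is a minimal sketch in modern Python using only the standard library; the browser-style User-Agent string is purely illustrative:

    import urllib.request

    def fetch(url):
        # Nothing in this request announces a search engine: an ordinary
        # browser User-Agent, issued from whatever IP the machine has.
        req = urllib.request.Request(
            url,
            headers={"User-Agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"},
        )
        with urllib.request.urlopen(req, timeout=15) as resp:
            return resp.read()

    page = fetch("http://example.com/")

Respecting robots.txt while doing this is still possible - the politeness rules don't depend on the name the spider announces.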