| 5:55 am on Mar 8, 2003 (gmt 0)|
Sounds like an awesome filter. Now if ODP could figure out a way to pick up on that same filter and drop those domains without any editor intervention, it would be really sweet.
Only issue might be someone who lets the domain lapse, has to pay a price to get it back and loses all the links in the process.
| 7:06 am on Mar 8, 2003 (gmt 0)|
1. What precisely is meant by 'expired' in this context? Most people in the domain name industry use it to mean 'past the original expiry date', but others use it to mean actually 'deleted'.
2. If a webmaster accidentally lets a domain name get deleted, and then re-registers it, will they lose all the PR, which could have taken years to garner? This happens to many, many people. It seems extremely unfair to punish them all.
| 10:04 am on Mar 8, 2003 (gmt 0)|
Hmmm.... some of this worries me GoogleGuy. Could your opening gambit not be interpreted as:
"Hey, we are going to fiddle with our spam filters. Yup, we are really going to screw a stack of quality sites when we do that, so you know what - we will provide an email address to try to mitigate some of the most grotesque errors"
In other words, you are going to net a whole bunch of dolphins, and provide a net-knife for some of them to escape with.
That's how it COULD be read.
Now... tell me you are going to be VERY VERY careful with those filters. You DON'T need drastic surgery, so don't go throwing the baby out with the bath water!
| 10:42 am on Mar 8, 2003 (gmt 0)|
I may be reading this wrong so please correct me.
1) Google reduces effort in manually dealing with spam reports because they want a scalable solution (GoogleGuy reported yesterday's spam reports being countable on one hand).
2) Google implements a filter which will most likely result in innocent parties being affected.
3) How will these innocent parties get their PR back? A non-scalable solution of responding to emails?
If spam reports are so few then why can't they be manually handled whilst a filter solution is investigated?
| 11:23 am on Mar 8, 2003 (gmt 0)|
Skibum, it strikes me that ODP editors *could* use this if the effect is to PR0 sites, simply by using the Google Directory and starting at the bottom (when everything has settled down).
| 12:36 pm on Mar 8, 2003 (gmt 0)|
Dafthead.... you ain't so daft after all. That's exactly how I read it, aligned with the fact that under the new 'scalable non-scalable' system innocent parties are badly hit.
| 4:23 pm on Mar 8, 2003 (gmt 0)|
Napoleon and dafthead, we're always very careful with our filters, and test them a lot before we roll them out. I think I said it earlier, but we do check for all the common cases (e.g. a webmaster accidentally allows a domain to expire and then registers it again). It takes longer to implement scalable solutions because by definition that means that you don't want to have to worry about innocent parties being affected and needing to write to Google.
| 4:42 pm on Mar 8, 2003 (gmt 0)|
I'm no expert on how domains expire and get re-registered, but in the two cases where I've registered a domain it would appear that the email address is the most important entry. Renewal notices are sent to this address.
What would stop somebody re-registering a domain with the old owner's name and address etc. but using their own email? Surely this would be hard to detect if your filters allow for expiry and renewal by the same person?
| 5:06 pm on Mar 8, 2003 (gmt 0)|
There's lots of signals to draw on--I wouldn't worry about that. :) Mainly I wanted to mention expired domains to explain why people might see some differences in PR and the number of reported backlinks this month.
| 6:46 pm on Mar 8, 2003 (gmt 0)|
GG, I know you don't like giving this kind of information away, but are you actually registering and discounting individual links, or just applying a PageRank modifier?
| 6:59 pm on Mar 8, 2003 (gmt 0)|
I really appreciate the additional communication that we are seeing lately. It seems like someone at the old googleplex has decided that it's okay to give out a bit more information about how things operate, as long as they do not negatively influence the integrity of the algo.
Having worked on many quality software teams, and too many that weren't, I have no trouble believing that you all are able to cover all the common cases before implementing. I'm certain within a few months of bug reports, you will have this particular piece of the algo almost perfect.
I have a couple of suggestions that you might want to consider. You may have already implemented them, or you may have already decided against them. But as free suggestions, they are worth every penny you paid.
The suggestions mostly have to do with the problem of someone buying an expired domain widget.com with the legitimate goal of selling widgets. They will have trouble getting "new" links from many of the places that would link to the site, because those sites "already link to them".
One possibility is to reinstate the link after a certain amount of time. Possibly something like letting them start trickling back in after 6 months, lowest PR links first. This will give the linking site owners more time to realize that there has been a change.
Put together a filter that compares the words on the old site with the new site and comes up with a probability that it is on a similar topic. This one could be done, but I'm not sure if it's worth the effort unless you already have someone putting effort into any sort of Bayesian analysis for theming.
When a linking page has had some significant changes, specifically, having several links removed, then allow that link to count again. That way, if I am linking to widget.com and if I go through and check my links to delete broken ones, I am sort of validating that I still intend to give that link to widget.com. On the other hand, a list of links that only gets added to might never get verified.
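To make the word-comparison suggestion above a bit more concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not anything Google is known to do: the function names `word_counts` and `topic_similarity` are made up for this example, and it uses plain cosine similarity over word counts rather than a real Bayesian model.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lower-case the text and count its words (a deliberately crude tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def topic_similarity(old_text, new_text):
    """Cosine similarity of the two word-count vectors.

    Returns 0.0 when the pages share no words and 1.0 when they use
    the same words in the same proportions. Hypothetical helper, not
    a known part of any search engine.
    """
    old, new = word_counts(old_text), word_counts(new_text)
    dot = sum(old[w] * new[w] for w in old.keys() & new.keys())
    norm = math.sqrt(sum(c * c for c in old.values())) * math.sqrt(
        sum(c * c for c in new.values())
    )
    return dot / norm if norm else 0.0
```

A score near 1.0 would suggest the new widget.com is still about widgets, so the old links arguably still make sense; a score near 0.0 would suggest a change of topic. Where to draw the line is a judgment call that this sketch does not try to answer.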
Sometime it might pay for google to document the process that they go through when implementing one of the more simple spam filters, so that people can see the process that goes into making sure that you do not catch the dolphins. Certainly not one of the major spam filters, and you don't need to give all the information, but enough so that we can point to it and show people that you really do try to avoid killing innocent bystanders.
On another note, my biggest wish for Google by next fall would be for Google to either have a way for webmasters to request that they update their entry in their DNS cache, or tell webmasters how to have their hosts transfer the DNS so that Google will catch it. I'm hoping to exceed my bandwidth limit in about 6 months, and it would be nice to not have to keep my old site up for 3 months just to be safe.
| 7:07 pm on Mar 8, 2003 (gmt 0)|
|So every link has a "date-create-tag" behind it. |
I suppose that will also help find Fresh stuff and adjust for bias against old content not receiving recent links.
Does this mean that there is now value based on the age of a link (assuming it was added after any expiration type thingy)? I have a site that, immediately after the update, I suspected had fallen (from 15 to 23) because of weight given in the algo to the age of a site. I purchased the site in July, had it indexed with substantially new content in November, and it had steadily moved up as I added links and content, exactly as you would expect. Google shows my backlinks this update increased from 28 to 36, with similar increases below the threshold. Most of these have the keyword phrase in the title, as it is the name of the company.
The only backlinks to the previous site were from DMOZ and Yahoo and I had both of those changed to reflect the new content. It went from a site about a regional wedding book to a regional wedding directory. Of the several sites that moved up, the only backlinks added were from my directory :(. The only reason I can think of for the fall is that age of a site is now important somehow.
| 7:30 pm on Mar 8, 2003 (gmt 0)|
I appreciate your feedback GG. I spend a lot of time worrying about these matters, because like others I have spent a lot of time building sites/portals which I think are worthwhile and contribute significantly to their respective topics from an intellectual perspective, not just financial. They rank highly through merit.
My main concern though has been with link patterns rather than domain expiry/re-use. I used to be very free and easy with link exchange, as I felt that the web should be open (karma, etc). Now though I am slightly paranoid, as some of these exchanges could come back to haunt me (who knows).
I am 100% confident that this would never happen with a manual filter program, as a human would see that the sites are obviously valid and of quality. With auto-filters? I'm not so sure, hence my concern.... and that of others in similar boats.
I do sincerely hope that you are careful in this area. Not just self preservation (although naturally some), but because people far less able to recover than me would also suffer unreasonably through an over-zealous approach.
I have to say though that the previous track record of Google in these matters has been genuinely superb. I do hope you maintain that approach and standard.
| 8:03 pm on Mar 8, 2003 (gmt 0)|
Here is another concern. Doesn't (or didn't) netsol delete a domain from the registry when they are transferring ownership? I seem to remember someone suing them because their domain was legally purchased while it was in this limbo of having been deleted but not yet re-registered. I am reasonably sure they have this fixed now, but how would the filter apply to domains that went through this while it was happening?
| 8:32 pm on Mar 8, 2003 (gmt 0)|
My domain has been banned for well over a year with no possible explanation except that it was once owned by a company selling in a completely different topic area (same keyword; means two completely different things... the keyword is present in domain name and our company name).
We've never used artificial link popularity, hidden text, cloaking or used any other method of SEO other than just stating what we sell and taking orders. We have built up hundreds of voluntary free links over the last couple years.
I've always figured our banning had to do with the prior owner of the domain. Will this change give me a clean slate and a new chance to be listed?
| 8:34 pm on Mar 8, 2003 (gmt 0)|
jbrausch, e-mail Google explaining the situation. Can't hurt to try.
| 8:51 pm on Mar 8, 2003 (gmt 0)|
Thank you. I have sent them an email about every 3 or 4 months (maybe 5 now all together including the one I just sent). No response yet... Of course, I never really expected one. I just hoped that one day it would show up again.
I hate to throw away our company name recognition and start over with another site, but we have considered it. I don't think that is what Google intended when they put these multi-year bans on domains. If the intention was to ban a particularly bad offender for life, then this expired domain change is very welcome to limit the friendly fire death sentences.
| 9:19 am on Mar 9, 2003 (gmt 0)|
It would seem it would make sense to filter transfer in ownership with drastic changes in content regardless of whether the domain expires, especially if the domain were offline for several months, would it not?
| 10:25 am on Mar 9, 2003 (gmt 0)|
I see this as being a flaw for penalizing expired domain names because I'm always lazy and forget to renew my domains so sometimes there is a week period where a domain is expired then I go re-register it. :D
| 10:38 am on Mar 9, 2003 (gmt 0)|
My understanding of what GoogleGuy has said is that their filters can handle this situation - you won't be penalized.
It's not clear whether this leaves a loophole for bypassing the filters though. It may be possible to buy expired domains and register them in the original owner's name.
It definitely sounds like the filter is on the whole a big improvement though.
| 10:44 am on Mar 9, 2003 (gmt 0)|
MyWifeSays, so do you think the algo used will take into account whether the content of the site changes dramatically during the transition period? I.e. bob's domain expires and eric buys it and puts up his own site, then it will be re-ranked. But what if eric puts up a copy of bob's site (archive.org ;)? Will it hold the rank?
| 11:01 am on Mar 9, 2003 (gmt 0)|
I would imagine you are dealing with an algorithm that gives a score of how likely it is that the domain is being used by a new owner. Content change would probably be part of the algo.
Registration details, DNS server and site ip address and time offline would probably be taken into account too.
The algo isn't going to get it right all the time, but as long as they err on the cautious side it should be a big improvement.
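Purely as a guess at what such a scoring algorithm might look like, here is a rough sketch in Python that combines the signals mentioned above. Everything in it is an assumption made for illustration: the `Snapshot` fields, the weights, the 90-day cap and the 0.7 threshold are invented, and nothing here reflects how Google actually does it.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """What we know about a domain at one point in time (hypothetical fields)."""
    registrant: str
    dns_server: str
    ip_address: str


def new_owner_score(before, after, content_similarity, days_offline):
    """Return a score from 0.0 to 1.0; higher means 'more likely a new owner'.

    content_similarity is 0.0-1.0 (1.0 = same content as before).
    All weights below are made-up illustration values.
    """
    score = 0.0
    if before.registrant != after.registrant:
        score += 0.30  # registration details changed
    if before.dns_server != after.dns_server:
        score += 0.15  # DNS server changed
    if before.ip_address != after.ip_address:
        score += 0.15  # site moved to another IP address
    score += 0.25 * (1.0 - content_similarity)  # content changed
    score += 0.15 * min(days_offline / 90, 1.0)  # time offline, capped at 90 days
    return score


def looks_like_new_owner(score, threshold=0.7):
    """A high threshold errs on the cautious side: when in doubt, leave the site alone."""
    return score >= threshold
```

With these made-up numbers, a webmaster who forgets to renew for a week and comes back with the same details and the same content scores close to zero, while a domain that returns after six months with a new registrant, new hosting and unrelated content scores close to 1.0.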
| 5:17 pm on Mar 9, 2003 (gmt 0)|
No, now the index is settling down I can see errors in the expired domain algorithm. Sites with links added *after* re-registration are *still* sitting at PR0 even though they should be higher with the number of new inbounds.
This is either a bug in the algorithm, or the sites *are* being penalised. I suspect it's the former, that Google is unable to accurately discriminate between the old links and the new ones.
| 6:40 pm on Mar 9, 2003 (gmt 0)|
yes... it is obvious they cannot tell the difference between new links and old links. It would be much easier to just add a penalty and be done with it. Perhaps some new links this month would work for next month?
| 9:57 pm on Apr 20, 2003 (gmt 0)|
"GG, I know you don't like giving this kind of information away, but are you actually registering and discounting individual links, or just applying a PageRank modifier?"
I think this is a very important question.
GoogleGuy (or maybe someone else?), could you give any explanation please?
| 10:38 pm on Apr 20, 2003 (gmt 0)|
It seems that the filter doesn't even work btw: I see lots of expired domains using the old PR to spam the index :-(
It's so annoying to find out that my competitors are using these tricks to get high rankings in the SERPs ... but I still hope that the filter will work better next update.