Just because a particular site has a lower PR than the PR9 one doesn't mean that it should be penalized further or fall prey to other problems; it might just be newer. So the strong get stronger and the weak (or new) get weaker?
This is exactly my point:
>> However, in most cases the pages still do just fine, except in the cases where the site the page belongs to most likely has been penalized.
Meaning not all. What if, say, my site is new?
>> The valuable clues come from analyzing why some pages from these redirect pages "get hijacked" while other pages, subject to the same redirect code, do not.
'While other pages, subject to the same redirect code, do not.'
The answer to this is what I mentioned earlier...
>> Since one of the heuristics to pick a canonical site was to take PageRank into account.
Loose English translation, 'one of the determining factors of attributing origination of content was PageRank...'
Logical conclusion: The site with the higher PageRank had a better chance of the content being attributed as theirs.
How can you gain anything valuable from knowing a site that has a higher PageRank will rank higher?
Yes, you are correct it was not (please note past tense) hurting high PageRank sites, but there is nothing you can learn from that, because there was an error in the algo...
What can you possibly learn except:
1. Have an old site.
2. Have higher PageRank than the hijacker, so you get credit for your content.
Exactly, Justin. Hear, hear!
"What can you possibly learn except:
1. Have an old site.
2. Have higher PageRank than the hijacker, so you get credit for your content."
They certainly don't hurt, but neither of those guarantees anything at all. The problem is not consistent. The only consistent thing is Google calls links "pages". As long as they do that, problems of many kinds will occur.
I have looked at many instances of pages showing in site views that did not belong to that site and where clicking on the link provided by Google sent you, in most cases, to the site. Please note the use of the weasel word "most", as in NOT all cases.
In most cases the affected site's placement in the SERPs varies depending on whether &filter=0 or &filter=1 was appended. With the filter set to 0 the page appeared where it used to in the SERPs; with the filter set to 1 it appeared much further down. Please note the use of the weasel word "most".
A classic example of this was when the home page of a site was hit. Japanese posted an example that got zinged by the mods, but not before I got a copy of the page saved.
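For anyone who wants to repeat the &filter=0 / &filter=1 comparison on their own queries, here is a minimal Python sketch that builds the two search URLs. The helper name google_search_url and the example query are made up for illustration; filter is the parameter mentioned above and q is the usual query parameter.

```python
from urllib.parse import urlencode


def google_search_url(query, filter_on):
    """Build a search URL with the duplicate filter switched on (1) or off (0)."""
    params = {"q": query, "filter": "1" if filter_on else "0"}
    return "http://www.google.com/search?" + urlencode(params)


# Hypothetical query; substitute your own domain or keyword phrase.
for filter_on in (False, True):
    print(google_search_url("site:example.com", filter_on))
```

Open both URLs and compare where the page sits in each result list.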
Now for terminology:
I fully agree that this isn't a real, true "hijacking". Once again, note the use of weasel words.
It is more of a domain poisoning. However, the effect is to lower the target site's position in the SERPs and, in the extreme, to take down the site, using Google's (and others') various automated facilities to help the process along.
I wonder if Jane_Doe would care to sticky her sites' domain names to folks with the required scripts, a number of up and regularly spidered throwaway domains, and the time to play the game?
I'm certain that she would like to participate in verifying that Google is 100% correct in its handling of things since their last sets of changes.
Now why were the changes needed?
I don't own the sites I work on, so I can't say, "Oh, here, hit www.example.com with everything in the book."
Let's find out why some sites that have had their hijackers/302s removed from their site:yourdomain.com results, and now seem to be free of the Google bug, still don't get Googlebot visits or reappear in the index.
There must have been some kind of filter at work, because those 302s had the same cache as the original front page of a site. But why is that filter still active, or is it a filter?
My guess would be, some sites may actually be still suffering a 'spam' penalty due to the wrongful attribution of originality, while G was using the old method of determining the origination of content.
Keep in mind the following example is of the previous method.
Take a site with PR3 being 'hijacked' by a site with PR5. All things being equal, the PR5 would have received credit for the content, while the PR3 would have been penalized for the content.
This highlights the necessity for the change in heuristics (to accommodate a more accurate protocol) regarding interpretation of content origination.
Unfortunately, even after the implementation of the improved method, there is no way for G to know of, or review, all the sites that may have been penalized in this manner...
In summary: Unless it is brought to their attention, the penalty will still be enforced.
(The time and effort of arbitrarily checking all dup. content penalties, or other penalties which could be involved, would prohibit any feasibility.)
This seems to make logical sense, given the recommendation of GG, to make use of the 'reinclusion request' for sites that have fallen out of the index or been improperly penalized.
Maybe GG could comment on whether this summary is possible or likely?
Some did try to send an email to Google about reinclusion, with no luck. And about a PR5 hijacking a PR3: that is not quite the way it always was. My site was PR6 and the hijacker (not intentionally on their part) was low PR, if any, because they also had a meta tag that did not allow Google to spider their site. But anyway, they replaced my site when I searched for www.mydomain.com in Google, and then of course there were all the other 302 links.
I'm really not a dumb webmaster, but I'm at the end of my rope in this matter. I think I will give it 2 months, then I will change domain and IP. It's just sad that a 4-year-old site has to be dropped because Google wanted to add 4 million more "sites".
Please, keep in mind my previous example was just a simple single example of what I believe contributed to bringing about the change. There are more examples that come to mind, but I was in 'fodder avoidance' mode.
I have used the 'reinclusion request' and had great effect. Within 48 hours everything relating to my site went 'supplemental' and then within another 48 hours, my pages started coming back, updated and unduplicated, the way they should have been.
When I sent my request, I did point out that I was already in the index, but wondered (very politely) whether, since 'my exact situation' was occurring, I had been penalized. I further asked whether, since the penalty I believed to be affecting my site (302 dup) was not any of my doing, they could look into fully reincluding my site.
Curiously, I did not receive a 'form' or 'canned' response. (Actually, except for noticing the SERP changes for my site, which was too coincidental to be an accident, I have received no communication from G.)
BTW Zeus, I would not gather, or imply, that you are a dumb webmaster... EVER. We all do the best we can with what we have. I truly hope your reinclusion is sooner, rather than later.
Well, I went through my site about 4 days ago and cleaned it up as well as possible, and found out that I had white text in a blue table on a white background. Could that have caused the penalty on my site?
Does Google just see the white on white and not the blue background in the table? But my site is 5 years old and had top rankings for years with that same white text on the blue background in the table.
That is the only problem I could find? But I submitted a reinclusion request with no luck. I wish Google would just tell me what the problem is so I don't have to keep guessing. This bites!
OK, this thread is about 302 redirects and Google's handling of them.
How to ID a 302?
It will appear with your title, your snippet, your page in the cache, and someone else's URL. It may appear in a site:yoursite search, or it may be filtered so that you don't see it.
Try allinurl:yoursite. Any listing with
your snippet
a cache of your page
someone else's URL
is a threat.
Upon further investigation it may be a page with a META refresh pointing at your site, or it may be a script. It could be a page containing an Amazon-type snapshot of your page with a 302 redirect on the same page pointing at you.
These types of listings will poison a site's PR and slowly drag it into Google's abyss.
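Once you have fetched a suspicious URL yourself, the two cases described above (a 302 pointing at you, or a META refresh pointing at you) can be told apart with something like the Python sketch below. This is only an illustration: classify_listing, points_at and the regular expression are my own assumptions, the HTML matching is deliberately crude, and a script-based redirect will not be caught.

```python
import re
from urllib.parse import urlsplit

# Crude pattern for <meta http-equiv="refresh" content="0;url=...">.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*'
    r'content=["\']?\s*\d+\s*;\s*url=([^"\'>\s]+)',
    re.IGNORECASE,
)


def points_at(url, my_host):
    """True if url's host is my_host or a subdomain of it."""
    host = (urlsplit(url).hostname or "").lower()
    my_host = my_host.lower()
    return host == my_host or host.endswith("." + my_host)


def classify_listing(status, location, body, my_host):
    """Classify one fetched response from the suspicious URL."""
    if status == 302 and location and points_at(location, my_host):
        return "302 redirect"
    match = META_REFRESH.search(body or "")
    if match and points_at(match.group(1), my_host):
        return "meta refresh"
    return "no redirect to your site found"
```

Feed it the status code, the Location header and the page body you got back from the other site's URL, plus your own host name.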
What do I do if I find a 302 like this?
1. Put a META tag on the page that is the target of the 302 link:
<META name="googlebot" content="noindex,nofollow">
2. Submit the URL of the hijacker's 302 link (the one pointing at the page you just added the META tag to) into Google's URL removal tool. (Nuke it.)
3. Remove the META tag from the target page so that it can be crawled again.
ALL these steps are critical. Do it exactly as described.
This is NOT the answer for every website having trouble with Google ranking. This is the answer for websites that HAVE a 302 hijack situation.
I have many inbound links with tracking parameters that I use to track ads/links/partners. I get the parameter, plant a cookie and then redirect to the home page. I was using (PHP) for the redirect:
which returns a 302. This worked fine for many years. Now I've fallen from #1 to nowhere to be seen.
I've changed my redirects to:
header("HTTP/1.1 301 Moved Permanently");
which now returns a 301. Hopefully problem solved.
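For comparison, here is the same idea (read the tracking parameter, plant a cookie, answer with a 301 instead of a 302) as a minimal Python WSGI sketch. It is not the poster's code: the parameter name src, the cookie name and HOME_URL are all made-up placeholders.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs

HOME_URL = "http://www.example.com/"  # hypothetical home page


def tracking_redirect_app(environ, start_response):
    """Store the tracking parameter in a cookie, then redirect permanently."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    headers = [("Location", HOME_URL), ("Content-Length", "0")]

    source = params.get("src", [""])[0]  # "src" is an assumed parameter name
    if source:
        cookie = SimpleCookie()
        cookie["src"] = source
        cookie["src"]["path"] = "/"
        headers.append(("Set-Cookie", cookie["src"].OutputString()))

    # An explicit 301 status line; without it a bare Location redirect
    # is typically sent as a 302.
    start_response("301 Moved Permanently", headers)
    return [b""]
```

Whatever the language, the point is the same: set the 301 status explicitly and check the response headers afterwards to confirm what is really being sent.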
Any ideas on how long before (if ever...) I regain my position in the SERPs?
When is GoogleGuy getting back to us? In his last message, #105 on Apr 18th, he said: “I have to duck out right now, but I'll try to stop by and give more details as things are more fully deployed.”
He told us to report spam penalties caused by the 302 redirecting at: [google.com...] , and to include "Reinclusion Request" in the subject line.
How will I know if a Spam Penalty has been removed or not? When I contacted Google using the above method, I just got an automated email response back that explained how to determine whether a site is currently included in the index.
[edited by: engine at 8:38 am (utc) on May 4, 2005]
[edit reason] formatting [/edit]
>> I have many inbound links with tracking parameters that I use to track ads/links/partners. I get the parameter, plant a cookie and then redirect to the home page. I was using (PHP) for the redirect:
How have these links shown up in the SERPs?
The problem with 302s was that when someone linked to you (using certain criteria), Google would treat the redirect as a page and assign that page the PR of the target. Even the SERP placement of the target too, I suspect.
When did your site go down?
They did a 'fix' (we think) about 2 weeks ago. So if the 'fix' did it you may want to ponder that a bit. Because that's when 301's started 'acting strange', see the various 301 threads on WW.
What do you do when your Google request to remove the 302 redirect is denied?
In fact, the hijacker knows people go to the site, so now they have even made a full sales page out of this page! Of course the cache is different; it hasn't moved since September. The new sales page is definitely about 2 weeks old.
I also see my rankings in allinanchor are doing really well, but of course my rankings have dropped out of sight.
Still super frustrated without answers.
Reid, those URLs with tracking parameters don't show up in the SERPs for regular keywords... only for 'site' and 'allinurl' results.
It seemed to happen near the end of March.
I didn't realize that 301s might also pose a problem.
In order to remove the 302 the target must return a 404.
I heard some were successful using the META tag or robots.txt, but I had the same problem.
I just disabled the site (for a min or 2) when I did it. It's up to you, but if you can make that 302 URL return a 404 (by making the target a 404), the removal tool will blast it.
Be very careful though.
If you use
Disallow: / in your robots.txt
and submit your robots.txt URL, the removal tool will remove your entire site for 6 months. It's a knife on steroids.
Just remember: submit the hijacker's URL into the tool after you cause that URL to return a 404 (by making the target return a 404).
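Before submitting anything to the removal tool, it may help to confirm what the URLs really return. A rough Python sketch, assuming plain http or https URLs: fetch_status and target_returns_404 are hypothetical helper names, and the request deliberately does not follow redirects, so a 302 shows up as a 302.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Return (status code, Location header) for url without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # HEAD asks for headers only; some servers answer HEAD differently
        # from GET, so treat the result as a hint rather than proof.
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def target_returns_404(target_url):
    """True if the target page currently answers with a 404."""
    status, _location = fetch_status(target_url)
    return status == 404
```

Run fetch_status on the other site's redirect URL to see where it points, and target_returns_404 on your own page right before you submit, so you know the 404 is really in place (and remember to bring the page back afterwards).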
speda1 - end of March - that would be about the time Google started tweaking things?
Since then it's hard to say exactly what a 301 or 302 will do in Google (it's top secret).
The 302s were stealing PR from the target (before the tweak); not sure what they do now.
You should read through some of those 301 threads.
These began to occur shortly after the 'tweak'.
Has anyone who submitted re-inclusion requests the way GG suggested had any news? Seems like we got screwed again...
No news or action for me. Sent an email, got a canned response. As of this morning, ranking #227 for my own unique site name and showing as supplemental... continues to drop a couple of spots each day.
All white hat site too, now nearly 11 months old. Yahoo loves the site and the funny thing is I use Adsense. I now get about 8 times more referrals from ASK than Google, even though ASK has fewer pages in its index (450) versus Google (830).
Yup, so did I, and got a canned response. No news from GoogleGuy yet... :( He said he would look into it, yet nothing.
We are awaiting results too after our reinclusion request for many pages lost with a loss of huge traffic Feb 3. Much spidering by Gbot since then but traffic still down 90+%. Google and certainly GG are not to blame here - rather the increasing number of blackhat SEO firms who force them to spend enormous money and time simply removing crap.
"URLs with spaces are indexed with "%20" but the removal tool doesn't recognize these URLs, presumably because it cannot find the pages."
Put them in robots.txt, 50 at a time. Of course, triple-check it for
Walkman - checking for Disallow: / is a GREAT idea. I got caught in that trap, but I must say I have given up on Google.
Zeus - what "trap" do you mean?
Caution to all:
The following line in yoursite.com/robots.txt will completely remove your site and all pages from Google's index (as well as the other search engines):
Disallow: /
I mean, I also once forgot the Disallow: /
so be careful.
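One way to do that triple-check before uploading or submitting a robots.txt is to run it through Python's standard robots.txt parser. This is just a sketch; blocks_whole_site is a made-up helper name, and it only tests whether the root path is blocked for the given crawler.

```python
from urllib.robotparser import RobotFileParser


def blocks_whole_site(robots_txt, user_agent="Googlebot"):
    """True if this robots.txt text forbids user_agent from fetching '/'."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, "/")


dangerous = "User-agent: *\nDisallow: /\n"
harmless = "User-agent: *\nDisallow: /private/\n"

print(blocks_whole_site(dangerous))  # True: the whole site is blocked
print(blocks_whole_site(harmless))   # False: only /private/ is blocked
```

If the first call ever prints True for the file you are about to submit, stop and fix it first.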
No words on the reinclusion request here either.
We did have a reply when we wrote in about the hijacker pages though.
No word either. Was contacted for more information about other problems we were having. No reply yet on that of course (none expected). Shortly after the reinclusion request we saw some heavy crawling, but as of late it hasn't been quite so heavy. We've seen about 400 pages return, but we're waiting on the home page to return to the index and waiting for a few thousand more pages to get crawled.
To those expecting a word: no reply doesn't mean that they haven't received your request, or even acted on it. I think it's great that GG joined the thread and offered specific advice and help, but to expect a reply might be asking too much. GoogleGuy rules for doing this (even though we disagree on a few things ;)). I have screwed my site up so badly trying to "fix it" that it's not even funny. Just re-did the entire thing from scratch, allowed Google back in, and let's hope.
Today GB has asked for 300+ pages (so far), all are 404s so far since I changed the URL format, but eventually the new pages will kick in.
Why do you think it's asking too much to expect a reply?
"Why do you think it's asking too much to expect a reply?"
Because they aren't set up that way, to go back and forth.
Why was I penalized?
A: because of XXXX
But I did it because of this, it isn't so bad, is not fair, blah blah...
This way they avoid all that. If they need more info, they'll post here or ask you via e-mail.
Don't get me wrong, it would be great, but it's too much and counter-productive for them to deal with.
I agree, GoogleGuy owes us nothing. Anything he can do is a plus. What I do hope is that Google figures this thing out so that other webmasters don't have to sweat this out. I don't make my living this way. I do feel badly for all those that are suffering because of this - those that really depend on the revenues to feed their families.
All I have is a bunch of sweat equity in my site and a wife that thinks I'm nuts. Did you ever try to explain this to an outsider?
I started this whole thing because people at work tell me that writing is a strength of mine. I'm asked to write on all kinds of topics, so why not write for myself?
When I was looking for my first job all those years ago, someone gave me a gift - The Psychology of Winning. A couple of things stuck with me all these years and one had to do with watching Television...
When you watch TV, you are watching entertainers working. They are getting rich because so many people are willing to sit idly by and watch them. Why invest in them, when you could invest in yourself?
I try to put all my time to good use, whether it be with my family or working on this project. I'd rather talk to a person than stare at the boob tube. I love to learn and the TV just doesn't do it for me.