Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 467 message thread spans 16 pages; this is page 4.
Google's 302 Redirect Problem
ciml

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 28742 posted 4:17 pm on Mar 25, 2005 (gmt 0)

(Continuing from Google's response to 302 Hijacking [webmasterworld.com] and 302 Redirects continues to be an issue [webmasterworld.com])

Sometimes, an HTTP status 302 redirect or an HTML META refresh causes Google to replace the redirect's destination URL with the redirect URL. The word "hijack" is commonly used to describe this problem, but redirects and refreshes are often implemented for click counting, and in some cases lead to a webmaster "hijacking" his or her own URLs.

Normally in these cases, a search for cache:[destination URL] in Google shows "This is G o o g l e's cache of [redirect URL]" and oftentimes site:[destination domain] lists the redirect URL as one of the pages in the domain.

Also link:[redirect URL] will show links to the destination URL, but this can happen for reasons other than "hijacking".

Searching Google for the destination URL will show the title and description from the destination URL, but the title will normally link to the redirect URL.
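
To make the two mechanisms concrete, here is a minimal Python sketch that labels a response as an HTTP redirect or an HTML META refresh. The helper name `describe_redirect` and the regular expression are illustrative assumptions, not anything Google or any library provides, and the pattern only covers the common `content="N; url=..."` form.

```python
import re

# Hypothetical pattern for illustration: finds <meta http-equiv="refresh" content="N; url=...">.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*'
    r'content=["\']?\s*\d+\s*;\s*url=([^"\'>\s]+)',
    re.IGNORECASE,
)

def describe_redirect(status, headers, body=""):
    """Return (kind, target) for a response, or (None, None) if it does not redirect."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    if status in (301, 302) and "location" in lowered:
        kind = "permanent (301)" if status == 301 else "temporary (302)"
        return kind, lowered["location"]
    # No HTTP-level redirect: look for a META refresh in the HTML instead.
    match = META_REFRESH.search(body)
    if match:
        return "meta refresh", match.group(1)
    return None, None
```

In the terms used above, the URL that returned the response is the "redirect URL" and the returned target is the "destination URL".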

There has been much discussion on the topic, as can be seen from the links below.

How to Remove Hijacker Page Using Google Removal Tool [webmasterworld.com]
Google's response to 302 Hijacking [webmasterworld.com]
302 Redirects continues to be an issue [webmasterworld.com]
Hijackers & 302 Redirects [webmasterworld.com]
Solutions to 302 Hijacking [webmasterworld.com]
302 Redirects to/from Alexa? [webmasterworld.com]
The Redirect Problem - What Have You Tried? [webmasterworld.com]
I've been hijacked, what to do now? [webmasterworld.com]
The meta refresh bug and the URL removal tool [webmasterworld.com]
Dealing with hijacked sites [webmasterworld.com]
Are these two "bugs" related? [webmasterworld.com]
site:www.example.com Brings Up Other Domains [webmasterworld.com]
Incorrect URLs and Mirror URLs [webmasterworld.com]
302's - Page Jacking Revisited [webmasterworld.com]
Dupe content checker - 302's - Page Jacking - Meta Refreshes [webmasterworld.com]
Can site with a meta refresh hurt our ranking? [webmasterworld.com]
Google's response to: Redirected URL [webmasterworld.com]
Is there a new filter? [webmasterworld.com]
What about those redirects, copies and mirrors? [webmasterworld.com]
PR 7 - 0 and Address Nightmare [webmasterworld.com]
Meta Refresh leads to ... Replacement of the target URL! [webmasterworld.com]
302 redirects showing ultimate domain [webmasterworld.com]
Strange result in allinurl [webmasterworld.com]
Domain name mixup [webmasterworld.com]
Using redirects [webmasterworld.com]
redesigns, redirects, & google -- oh my [webmasterworld.com]
Not sure but I think it is Page Jacking [webmasterworld.com]
Duplicate content - a google bug? [webmasterworld.com]
How to nuke your opposition on Google? [webmasterworld.com] (January 2002 - when Google's treatment of redirects and META refreshes was worse than it is now)

Hijacked website [webmasterworld.com]
Serious help needed: Is there a rewrite solution to 302 hijackings? [webmasterworld.com]
How do you stop meta refresh hijackers? [webmasterworld.com]
Page hijacking: Beta can't handle simple redirects [webmasterworld.com] (MSN)

302 Hijacking solution [webmasterworld.com] (Supporters' Forum)
Location: versus hijacking [webmasterworld.com] (Supporters' Forum)
A way to end PageJacking? [webmasterworld.com] (Supporters' Forum)
Just got google-jacked [webmasterworld.com] (Supporters' Forum)
Our company Lisiting is being redirected [webmasterworld.com]

This thread is for further discussion of problems due to Google's 'canonicalisation' of URLs, when faced with HTTP redirects and HTML META refreshes. Note that each new idea for Google or webmasters to solve or help with this problem should be posted once to the Google 302 Redirect Ideas [webmasterworld.com] thread.

<Extra links added from the excellent post by Claus [webmasterworld.com]. Extra link added thanks to crobb305.>

[edited by: ciml at 11:45 am (utc) on Mar. 28, 2005]

 

theBear

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 1:46 pm on Apr 18, 2005 (gmt 0)

larry,

I can confirm that the "jacker" urls within a site view are no longer showing.

I had a sticky from a fellow member I was working with. He went looking for the 302s I had stickied to him that were showing up as part of his site.

I also confirmed that the leeches attached to one of our sites no longer show up in a site: search.

And a certain Drudge no longer has any attached to his site. In fact, I looked at 15 sites that I knew had leeches, and they were all gone.

Now, is the problem fixed?

I don't know. It could just be hidden.

Atticus



 
Msg#: 28742 posted 5:31 pm on Apr 18, 2005 (gmt 0)

Sounds like G is willing to hand edit the index when it comes to sweeping dirt under the rug.

I'll be happy to recant when and if my G referrals start topping AltaVista referrals again.

theBear

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 5:48 pm on Apr 18, 2005 (gmt 0)

Atticus,

LOL, nothing like putting it in their face.

I shall have to do some searches a bit later. There were whole piles of sites that I was aware of that got hit.

I stopped after looking at 15 of them, so I don't think we had a hand edit done.

Although I did tell Google how to clean up the mess, I don't think they would pay a clerk to sit at a computer and hand-delete that many entries.

Not when one PhD programmer type could do far more damage automagically, if you get my drift.

claus

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 8:10 pm on Apr 18, 2005 (gmt 0)

As far as i'm concerned, i now consider this particular issue solved wrt. Google. Unless it's all just temporary, that is. I had a strong feeling it was going to take a long time, and it seems that was right. Now we just need MSN to solve it as well.

If anyone from GOOG should read this: Thanks a lot :)

Now it will be interesting to see at which rate the sites that were hit will surface again :)

steveb

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 28742 posted 8:44 pm on Apr 18, 2005 (gmt 0)

Hmm, I'm certainly not going to consider the issue "solved" until previously hijacked sites return to ranking somewhere in the ballpark of where they were pre-hijack.

My ex-hijacked site does now appear 11th for a search for its site name, better than not at all, but not first like it did for four years. Other example searches include going from 21st to out of the top 1000, and from 1st to sixtieth.

They may be handling the technical part now, but the damage to their index continues. In the above site's case, it appears this four year old site is being treated as if sandboxed, that is, brand new after the hijacking URLs were removed.

Dayo_UK

10+ Year Member



 
Msg#: 28742 posted 9:07 pm on Apr 18, 2005 (gmt 0)

steveb raises a point I am concerned with also.

If Google do sort the problems with the index, will previously established sites be sandboxed?

6 Visitors from Google for one of my sites today :( - Lycos brought more visitors.

Lorel

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 9:42 pm on Apr 18, 2005 (gmt 0)


>> I also confirmed that the leeches attached to one of our sites also no longer show up in a site: search.
>> Now is the problem fixed?

That would be a big relief!

However I spotted something else today while trying to figure out what exactly is happening on a site I manage which I will call "target site".

I was checking the ranking of the site name for the target site with a tool that gives the Google ranking for keywords, and found a PHP redirect to the site in the results instead of the target site's site name, and it's ranking #62 in Google.

I checked the Site: and Allinurl: commands and found nothing there.

But the site, while it is fully indexed and has been since early January, and has at least 50 quality links, is not ranking for its major keywords in Google, and particularly not for its site name (though it is doing fine in Yahoo), and the site is almost 6 months old. By now it should be ranking #1 for at least the site name. Hence my suspicion that there is a redirect affecting this site.

I checked out the redirecting site, and while there is a real URL on the page to the target site, it is NOT an active link. Yet there are other links for comments and votes, and for the site name of this site, which all contain a PHP redirect. So I'm wondering why they purposely added the real URL. Possibly to make it APPEAR legit to the unknowing, or for Google to pick up the URL?

So now, along with the above discussion, I'm hoping Google has indeed removed the redirects from its index, BUT what if it has just removed the evidence?

I was able to find that site with the redirect by searching Google for the site name and looking for #62 in the results. It's still in the results with a redirect pointing to the target site, but it's not in the site: search.

Emmett

5+ Year Member



 
Msg#: 28742 posted 9:49 pm on Apr 18, 2005 (gmt 0)

Just checked today and sure enough, there are no more 302s associated with any of my sites that I can find :D

As far as traffic goes, my most affected site has been doing better than before since the latter part of March.

The newer site is either still sandboxed or not optimized correctly. I get a handful of referrals each day, but my main traffic comes from links there, so it's no big deal.

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 28742 posted 11:01 pm on Apr 18, 2005 (gmt 0)

OK, so site:www.dmoz.org now gives zero results (was 20 000 a few days ago, all of which were 302 redirects or scrapers).

What about this then?

A site:dmoz.org search says there are 11 million results, but you can't get beyond 950 results however hard you might try to do so.

What does that 11 million figure represent anyway? There are only 600 000 categories, and 600 000 category charters, and 60 000 editor profiles, and a few hundred guidelines and informational pages to index, making only about 1.2 million real pages.

larryhatch

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 11:07 pm on Apr 18, 2005 (gmt 0)

It looks like it might just be a simple patch.

All they had to do is disallow non-mysite.net URLs from
the results of searches for site:mysite.net.

That's way faster and easier than actually discrediting 302 hijacks.
One indication: site:mysite.net indicates 153 pages.
I get to about 147, and it stopped listing them, saying "similar results were not shown."
I clicked to see the full list. I STILL got 147, and
the "similar results" option disappeared.

What and where are those last 6 links?
That's about how many phony 302's I had previously.

The proof will be in the SERPs, but not in my case.
302s didn't affect me that badly once I got some kicked out. -Larry

larryhatch

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 11:28 pm on Apr 18, 2005 (gmt 0)

Here's one, completely new to me:

Doing a site:mysite check, I found 4 of my pages with title only, no description.

I did a Copyscape check on those looking for scrapers.

One page wasn't scraped really, just my anchor-text and a snippet,
but get this: the hypertext LINK reads:
<a href="/go/jjj.yneelungpu.arg"> #*$!x Map: Eastern Hemisphere</a>

jjj.yneelungpu.arg?

I put that into the address bar and of course there was no such URL or TLD.

The 'linking' site had similar baby-talk URLs for other sites besides mine.

I'm used to the /scrapings.php/site#123 type ripoffs, but what is this?

What are the mechanics of the "/go" part of such a hyperlink? -Larry

claus

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 11:34 pm on Apr 18, 2005 (gmt 0)

Okay, i might have been a little too early with the optimism, so perhaps i should not call it solved yet. It's a very clear indication that something is being done actively, though.

I agree 100% that "the proof will be in the pudding [webmasterworld.com]" so let's see some of those sites come back before we jump for joy.

If this "data wash" is anything like a real update (it should be similar regarding the amount of data to update) then it will take time as batches run and data is shifted between DCs and such. To make the sites come back, a real update is probably needed on the cleaned data as well, so it might still take a while.

claus

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 11:37 pm on Apr 18, 2005 (gmt 0)

>> jjj.yneelungpu.arg?

It's just a file name. It can be anything, even a PHP script. It's probably an ID field in a database using letters instead of numbers (for whatever reason).

(eg. a rewrite of "go.php?jjj&yneelungpu&arg")
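
A side observation, offered as a guess rather than anything confirmed in this thread: "jjj" is exactly what "www" becomes under ROT13 (each letter rotated 13 places), so the path after /go/ may be a ROT13-obfuscated hostname rather than a database ID. A minimal Python sketch for testing that guess, using the standard library's `rot_13` codec; the wrapper name `rot13` is just for illustration:

```python
import codecs

def rot13(text):
    """Rotate each ASCII letter 13 places; digits, dots and slashes are left unchanged."""
    return codecs.decode(text, "rot_13")

# A made-up example: "www.example.com" encodes to "jjj.rknzcyr.pbz" and back.
print(rot13("jjj.rknzcyr.pbz"))  # www.example.com

# The string from the link quoted above; run it to see whether a hostname comes out.
print(rot13("jjj.yneelungpu.arg"))
```

ROT13 is its own inverse, so the same function both encodes and decodes.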

[edited by: claus at 11:49 pm (utc) on April 18, 2005]

theBear

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 11:48 pm on Apr 18, 2005 (gmt 0)

Claus,

I agree something is being done.

What follows is highly maybe could have or could happen but we don't know or may never know. In other words take it with a dump truck load of salt.

However getting the baby out of the drain pipe where it was tossed isn't a simple matter.

Sites may have been split because of this, that means that PR probably took hits and sites downstream got a kick in the head as well.

Then folks who said I'll wait got classified as spammers and then those sites started losing pages.

Then there were all those new 301's that while they prevented one form of site cancer may have tripped the new link addition filters.

So damned if you do damned if you don't.

GoogleGuy

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 28742 posted 11:59 pm on Apr 18, 2005 (gmt 0)

We've been steadily improving our heuristics for 302s based on the feedback that you've sent us. There have been two recent changes that I know of. We changed things so that site: won't return results from other sites in the supplemental results. We are also changing some of the core heuristics for the results for 302s. I believe that most of these changes are out, but there may be a few more in the pipeline.

Note that for inurl: and allinurl: searches, results from other sites are perfectly valid. So if you own yoursite.com and do a search allinurl:www.yoursite.com, it's a completely valid result to get a url from www.someothersite.com/resources?url=www.yoursite.com, for example. That's how inurl: and allinurl: are supposed to work--they match all docs with the requested terms in the url, not just docs on www.yoursite.com. That doesn't imply any problem/hijacking/issue; just that someone else had your domain name in their url.
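
The distinction can be sketched in a few lines of Python. This is a deliberate simplification and not how Google implements either operator: `matches_site` and `matches_allinurl` are hypothetical names, and real allinurl: matching works on tokenised terms rather than plain substrings.

```python
from urllib.parse import urlsplit

def matches_site(url, domain):
    """True when the URL's host is the domain itself or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)

def matches_allinurl(url, terms):
    """True when every term appears somewhere in the URL, whether in the host, path or query."""
    lowered = url.lower()
    return all(term.lower() in lowered for term in terms)

own = "http://www.yoursite.com/page.html"
other = "http://www.someothersite.com/resources?url=www.yoursite.com"

print(matches_site(own, "yoursite.com"), matches_site(other, "yoursite.com"))
print(matches_allinurl(own, ["www.yoursite.com"]), matches_allinurl(other, ["www.yoursite.com"]))
```

Under these assumptions the other site's URL fails the site: test but passes the allinurl: test, which is the behaviour described above.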

Thank you for the feedback that people have given us about 302s. I'd be interested to hear if anyone sees a result where site:yoursite.com returns urls from domains other than yoursite.com. You might want to wait another few days before checking though, to give things time to get fully out. I have to duck out right now, but I'll try to stop by and give more details as things are more fully deployed.

larryhatch

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 12:18 am on Apr 19, 2005 (gmt 0)

Hello GG:

Thanks for a fast long awaited response.

It's good that other sites are removed from the site: search,
but it also kills our main way of finding malicious 302 redirects.

"We are also changing some of the core heuristics for the results for 302s."

That, I think, is the burning question!

Will the heuristics changes prevent a malicious site from scoring for my original content?
Will these changes pass PR (etc.) thru to the actual pages with the content?
Will the 302-jackers be derated if not penalized?

Since there remain legitimate uses for the 302 redirect,
is there a simple way to only credit those redirected to the same domain?
That alone might solve most of the problem. -Larry
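
As a rough illustration of that "same domain only" idea, here is a minimal Python sketch. It compares exact hostnames, which is stricter than "same domain" (www.example.com and example.com would count as different), and the function name is hypothetical. Nothing here reflects how any search engine actually decides.

```python
from urllib.parse import urljoin, urlsplit

def is_same_host_redirect(source_url, location):
    """True when a redirect's target stays on the same host as the redirecting URL."""
    # The Location value may be relative, so resolve it against the source first.
    target = urljoin(source_url, location)
    source_host = (urlsplit(source_url).hostname or "").lower()
    target_host = (urlsplit(target).hostname or "").lower()
    return source_host != "" and source_host == target_host
```

A click-counting 302 within one site would pass this check, while a 302 from someone else's go.php script to your page would not.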

Marval

10+ Year Member



 
Msg#: 28742 posted 12:24 am on Apr 19, 2005 (gmt 0)

I agree about the core changes, and also agree that taking away the ability to see those other sites with the site: command might not be the best solution, as many of us have used them to Google's benefit as well as our own to remove people that were scraping.

I also noticed that the algo for the site: command may have a little glitch: you get the total number of results (pages for the site) on the first page (e.g. 125 pages), and that number reduces itself as you go deeper in the results, losing 1 or 2 pages from the total at the top for each 10 results. I've tried it and replicated it across a few domains, but of course others may not see the same thing.

I was also wondering why the &filter=0 filter wording was changed - seems not to have the same effect anymore as it was originally designed and talked about here a little over a year ago - seems that Google still has a similar filter working but it's not accessible using that command? It was very useful in seeing if Google considered something from your site a duplicate result - again helping us find scrapers and report them.

g1smd

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 28742 posted 12:31 am on Apr 19, 2005 (gmt 0)

>> We changed things so that site: won't return results from other sites in the supplemental results. <<

Why only in supplemental results? What about normal results too?

Not showing the sites in the search results is not the same as not having the rogue URLs in the database. Not the same by a very long way.

Much of the stuff that has been seen in recent results should not have even made it into your database.

Why can't this be fixed by going back to how things were a few years ago? And, dare I mention the logic that Yahoo applies to 301 and 302 redirects, and the different way that they treat onsite and offsite redirects now?

As for the search I mentioned above, is it really true that out of 1.2 million pages, Google only has 950 pages indexed for the ODP now (but says there are 11 million when you first look)? This filtering of the results served, rather than cleaning of the internal dataset, seems to have some flaws perhaps?

larryhatch

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 12:58 am on Apr 19, 2005 (gmt 0)

(copied from a dictionary site. -LH)

HEURISTICS - This describes a set of rules developed to attempt to solve problems when a specific algorithm cannot be designed.
For example, if the problem is "When do you eat food?", if you answer, "When I'm hungry" then you would have to eat immediately every single time you were hungry.
Instead, we follow heuristics to determine when to eat by gauging our hunger level, the situation we are in, and our ability to get food. As you can imagine, heuristics are very important for solving artificial intelligence problems.

GoogleGuy

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 28742 posted 1:06 am on Apr 19, 2005 (gmt 0)

g1smd, the changes had already been applied for the main results. The web has changed over time, but url canonicalization is definitely on our radar now--contacting at google.com/support with the title of "canonicalpage" will make it to engineers who read the reports and suggestions.

larryhatch, I believe the answers are yes, I'm not sure given the current heuristics, and yes. Marval, if someone is doing 302s to your site, you might be able to find redirecters by looking in your server logs for unusual referrers. I'll ask about filter=0. There's been some index changes lately, but I hadn't heard about any changes with filter=0.
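
For anyone wanting to try the server-log suggestion, here is a minimal Python sketch that counts referrer hosts outside your own domain. It assumes the common "combined" access log format with the referrer in the second quoted field; the function name and the sample lines are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the quoted request, the status, the size, and the quoted referrer of a combined-format line.
LOG_LINE = re.compile(r'"[^"]*" (\d{3}) \S+ "([^"]*)"')

def external_referrer_hosts(lines, own_domain):
    """Count referrer hostnames that are not own_domain or one of its subdomains."""
    own_domain = own_domain.lower()
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # not a combined-format line
        host = (urlsplit(match.group(2)).hostname or "").lower()
        if not host:
            continue  # "-" or an empty referrer
        if host == own_domain or host.endswith("." + own_domain):
            continue  # internal navigation
        counts[host] += 1
    return counts

sample = [
    '10.0.0.1 - - [18/Apr/2005:23:59:00 +0000] "GET / HTTP/1.1" 200 512 "http://www.someothersite.com/go.php?id=1" "Mozilla/4.0"',
    '10.0.0.2 - - [18/Apr/2005:23:59:05 +0000] "GET /a.html HTTP/1.1" 200 512 "http://www.yoursite.com/" "Mozilla/4.0"',
    '10.0.0.3 - - [18/Apr/2005:23:59:09 +0000] "GET /b.html HTTP/1.1" 200 512 "-" "Mozilla/4.0"',
]
for host, hits in external_referrer_hosts(sample, "yoursite.com").most_common():
    print(host, hits)
```

The output is only a list of candidates: search engines and ordinary linking sites will show up too, so each unfamiliar host still needs checking by hand.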

theBear

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 1:15 am on Apr 19, 2005 (gmt 0)

GoogleGuy,

I was looking at a site today where, as near as I could tell, every page was marked supplemental. Up until recently this site had a large number of 302 leeches within its site: view.

It doesn't look like the owner of that site stands a chance.

Assuming (and I know what that does) that the problem was "duplicate content due to the 302's", how long will it take for his site to recover? Will it recover?

Marval

10+ Year Member



 
Msg#: 28742 posted 1:50 am on Apr 19, 2005 (gmt 0)

GG - I've been using that term in msgs to the engineers for almost 6 months now, trying to provide some information that seems to have been "missed" with an algo tweak performed back in August last year. I do get a lot of "thanks, we'll forward the msg" replies, but it seems to be a black hole. I'm not asking for specific or even "personal" feedback, but it would certainly be nice to at least hear that some of the information we are providing is making it to the right ears.

crobb305

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 28742 posted 1:54 am on Apr 19, 2005 (gmt 0)

Googleguy,

The unrelated urls that were once showing in site: search are gone. But, how long will dup penalties last? Are we still looking at 90 day penalties from this point on? As Dayo_UK says:
>> steveb raises a point I am concerned with also.
>> If Google do sort the problems with the index will previously established sites be sandboxed.

Are provisions being made to allow sites to return to the serps after being penalized for problems beyond their control? :)

C

GoogleGuy

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 28742 posted 3:02 am on Apr 19, 2005 (gmt 0)

crobb305, in many of the cases that I've examined, a spam penalty comes first. That spam penalty causes the PageRank of a site to decrease. Since one of the heuristics to pick a canonical site was to take PageRank into account, the declining PageRank of a site was usually the root cause of the problem. That's what happened with your site, crobb305. So the right way forward for people who still have problems is to send us a reinclusion report.

steveb

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 28742 posted 3:09 am on Apr 19, 2005 (gmt 0)

I don't know about usually, but I do know that part of the problem is that Google apparently believes "one of the heuristics to pick a canonical site was to take PageRank into account" when that is seldom the case. The key problem with every instance of hijacking I've seen is pagerank is ignored in judging the canonical page.

crobb305

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 28742 posted 3:20 am on Apr 19, 2005 (gmt 0)

GG, thanks for the info. Odd thing is, my PageRank dropped to Zero last September, but back to PR7 in December (at least based on toolbar). If declining page rank is the root of the problem, would one expect a reversal of spam penalty as PageRank returns?

If the home page is still indexed with title and description, do you still need to do a reinclusion request? Seems they will say "your site is already indexed".

C

arubicus

10+ Year Member



 
Msg#: 28742 posted 3:34 am on Apr 19, 2005 (gmt 0)

GG - I am curious to know what you at Google think of search results lately. Many searchers and webmasters I know have been complaining about results being of lower quality.

Many, many, many webmasters have been adversely affected by recent updates also. Many sites have gone supplemental or are even losing pages in bulk. Is this a short-term side effect of the recent changes and 302 fixes, or should those webmasters look to their own sites as the cause of the problem? Don't want to try and fix something that isn't broken.

I am also curious, regarding old 301 redirects, why new content is being cached/displayed under the old redirected URL and not the new one. Could this cause any problems?

howiejs

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 28742 posted 3:35 am on Apr 19, 2005 (gmt 0)

Silly question "reinclusion report" -- is this the same as Add Url?

My story:
Redirect dropped 2k pages - was showing the other site - now returning no results

(I had the redirect removed)

What do I do?

arubicus

10+ Year Member



 
Msg#: 28742 posted 3:36 am on Apr 19, 2005 (gmt 0)

GG - BTW thank you so much for a long awaited response. These explanations will be of great help to us all.

Nosmada

10+ Year Member



 
Msg#: 28742 posted 4:01 am on Apr 19, 2005 (gmt 0)

As soon as I saw a massive drop in my traffic that was the first day that inurl: showed that I had been hijacked by 302 redirects. My pr was rising. So what you are saying GG doesn't make sense? I used the URL remover tool but I have done this with other sites and it doesn't seem to help me. If I was using site: I wouldn't have found the 302 hijackers?

As soon as a site gets to around 6000 plus page views it gets hijacked and killed? Nothing can ever do well in this environment and maybe this is the idea?

joeduck

10+ Year Member



 
Msg#: 28742 posted 4:16 am on Apr 19, 2005 (gmt 0)

GG thanks for input - very helpful.

Regarding canonical page identification:

Our site is very large and spread over several domains and we've had serious canonical problems recently (we think we have fixed them with 301s)

Should we consolidate under 1 domain to make it easier to be spidered correctly?
