
Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 713 message thread spans 24 pages; this is page 23.
302 Redirects continues to be an issue
japanese




msg:748407
 6:23 pm on Feb 27, 2005 (gmt 0)

recent related threads:
[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]



It is now 100% certain that any site can destroy low to midrange PageRank sites by causing Googlebot to snap up a 302 redirect via scripts such as PHP, ASP and CGI etc., supported by an unseen, randomly generated meta refresh page pointing to an unsuspecting site. The encroaching site in many cases actually writes your website's URL with a 302 redirect inside their server. This is a flagrant violation of copyright and a manipulation of search engine robots, geared to exploit and destroy websites and to artificially inflate the ranking of the offending sites.

Many unethical webmasters and site owners are already creating thousands of TEMPLATED (ready to go) SKYSCRAPER sites fed by affiliate companies' immense databases. These companies, which have your website info within their databases, feed your page snippets, without your permission, to vast numbers of the skyscraper sites. A carefully adjusted variant of a PHP-based redirection script, which causes a 302 redirect to your site and includes an affiliate click checker, goes to work. What is very sneaky is the randomly generated meta refresh page, which can only be detected with a good header interrogation tool.
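
For anyone wondering what a "header interrogation tool" boils down to, here is a minimal Python sketch. It asks for a URL without following redirects and reports whether the answer is a 301/302 with a Location header or a page carrying a meta refresh. The function names, the User-Agent string and the regular expression are assumptions made for illustration, not any particular tool's behaviour.

```python
import http.client
import re
from urllib.parse import urlsplit

# Matches <meta http-equiv="refresh" content="0; url=..."> in a page body.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*'
    r'content=["\']?\s*\d+\s*;\s*url=([^"\'>\s]+)',
    re.IGNORECASE,
)


def describe_response(status, location=None, body=""):
    """Summarise how a response sends a visitor (or a crawler) elsewhere."""
    if status in (301, 302, 303, 307) and location:
        kind = "permanent" if status == 301 else "temporary"
        return f"{status} {kind} redirect to {location}"
    match = META_REFRESH.search(body)
    if match:
        return f"{status} page with meta refresh to {match.group(1)}"
    return f"{status} no redirect detected"


def fetch_headers(url):
    """Request a URL WITHOUT following redirects and describe the result."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # Hypothetical User-Agent; a script may answer differently per agent.
        conn.request("GET", path, headers={"User-Agent": "header-check-sketch"})
        resp = conn.getresponse()
        # Only the first 64 KB is read; a meta refresh sits in the <head>.
        body = resp.read(65536).decode("latin-1", errors="replace")
        return describe_response(resp.status, resp.getheader("Location"), body)
    finally:
        conn.close()
```

Calling fetch_headers() on a suspect link shows the raw status line a crawler would see, rather than the page your browser ends up on after it has silently followed the redirect.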

Googlebot and MSNBot follow these PHP scripts to either an internal sub-domain containing the 302 redirect or server-side, and “BANG”, down goes your site if it has a PageRank below the offending site. Your index page is crippled because Googlebot and MSNBot now consider your home page at best a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that Google does not reveal all links pointing to your site and takes a couple of months to update, and thus an INURL:YOURSITE.COM search will not be of much help in tracing it for a long time. Note that these scripts mostly apply your URL stripped, or without the WWW, making detection harder. This also causes Googlebot to generate another URL listing for your site that can be seen as duplicate content. A 301 redirect resolves at least the short URL problem, relieving Google from deciding which of the two URLs of your site to index higher, more often the one with the higher linked PageRank.

Your only hope is that your PageRank is higher than the offending site's. This alone is no guarantee, because the offending site will have targeted many higher PageRank sites within its system on the off chance that it strips at least one of the targets. This is further applied by hundreds of other hidden 301 permanent redirects to PageRank 7 or above sites, again in the hope of stripping a high PageRank site. This would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big-name affiliates are involved in this scam; they know it is going on, and Google AdWords is probably the main target for revenue. Though I am sure Google does not approve of its AdSense program being used in such a manner.

Many such offending sites have no e-mail contact and hidden WHOIS and no telephone number. Even if you were to contact them, you will find in most cases that the owner or webmaster cannot remove your links at their site because the feeds are by affiliate databases.

There is no point in contacting GOOGLE or MSN because this problem has been around for at least 9 months; only now it is escalating at an alarming rate. All sites of PageRank 5 or below are susceptible; if your site is 3 or 4 then be very alarmed. A skyscraper site need only create child page linking to get PageRank 4 or 5, without the need to strip other sites.

Caution: trying to exclude via robots.txt will not help, because these scripts are able to change almost daily.

Trying to remove a link through google that looks like
new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F will result in your entire website being removed from Google’s index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within this timeline.

I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages, including sub-domains, within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLs. So, in essence, a programme in perpetual motion creating millions of 302 redirects so long as it stays on. As every page is a unique URL, the script will hopefully continue to create and bombard a site that generates dynamic pages with PHP, ASP or CGI redirecting scripts. A SKYSCRAPER site that is fed can have its server totally occupied by a single efficient spider that requests pages in split seconds continually throughout the day and week.

If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via Google's URL removal tool. You only need a few seconds of a 404 or a 403 from the offending site for Google’s URL console to detect what it needs: either the site or the damaging link.

I hope I have been informative and have helped anybody with a hijacked site whose natural revenue has been unfairly affected. Also note that your site may never regain its rank, even after the removal of the offending links. Talking to offending site owners often results in their denial that they are causing problems; they say that they are only counting outbound clicks. And they seem reluctant to remove your links....Yeah, pull the other one.

[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]

 

Reid




msg:749067
 1:07 pm on Mar 17, 2005 (gmt 0)

Mike D do a google search site:w*w.yoursite
if you see your homepage listed there with the proper title and description but a different URL
then that is a hijack (not always on purpose) there shouldn't be ANY foreign URL's in there.
Feb 28 isn't all that far off especially since google has been dancing but it is a bit lagging.
Keep an eye on that site: listing in case something pops up. That is the list of pages google has
of your domain.

Safaridude - google is all too obviously aware of what's going on, it's in the news, it is a difficult
problem to deal with. What is happening is people are linking to other sites with a 302 redirect, which means
(in the future, use the url of this page to find that page), so google IS following the standard for user agents;
however, they shouldn't in this case.

Zeus, Nov 3 is probably too early since raw log files are usually discarded after six months, but check your server;
it will be dated 20041103.gz or something like that. If it's there, download the Nov 4 one: each day the raw log
file of the previous day is created. Unzip it with winzip (or equivalent) and open it with wordpad (not notepad).
If it's there I can help you decipher it - it looks daunting but it's really very simple.
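
If scrolling through the whole file in wordpad is too much, a few lines of Python can read the .gz directly and show only the Googlebot visits. This is just a sketch: the function name is made up, the file name in the usage note is a guess at the naming described above, and it assumes the log has one request per line with the user agent somewhere on that line.

```python
import gzip


def googlebot_lines(log_path, marker="Googlebot"):
    """Yield lines from a gzip-compressed access log that mention the marker."""
    # latin-1 never fails to decode, which suits logs of unknown encoding.
    with gzip.open(log_path, "rt", encoding="latin-1", errors="replace") as log:
        for line in log:
            if marker in line:
                yield line.rstrip("\n")
```

Something like `for line in googlebot_lines("20041104.gz"): print(line)` would then list each Googlebot request for that day, with the page requested and the status code your server returned.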

edited to fix a date error

hereforinfo




msg:749068
 1:30 pm on Mar 17, 2005 (gmt 0)

Japanese

thanks.. "the legitimate sites index page was absent in the result, most probably penalized by google" this clearly has happened to my site

i appreciate all the mod rewrites posted BUT how does this help a site owner who has no idea how to implement it..

wonderful examples snipped and googleguy nowhere in sight... i was reading this post hoping to be able to send someone my site example.. im no seo but its a shame when someone cant find a site typing in a 5 word search with the exact company name.. how is someone to know that www.hijacking.com//cgi-bin/datacgi/database.cgi?file=LinkManager&form=HitOut&&record=33... - 9k - is my site.. Japanese started this thread and has now left the building.. what a shame

kaled




msg:749069
 2:45 pm on Mar 17, 2005 (gmt 0)

Has anybody tried to file a DMCA with Google concerning a page hijacking?

If Google attributes their cached page incorrectly, then you would have a good case.

Kaled.

Safaridude




msg:749070
 3:07 pm on Mar 17, 2005 (gmt 0)

Wouldn't the company in DMCA contravention be google itself in this case, because they are the ones attributing the cache in the index in that way, and not the site with the link on it, even though that may be the intention of the link on the site?

That would open up a whole new can of worms.

WebFusion




msg:749071
 3:10 pm on Mar 17, 2005 (gmt 0)

Let me ask this to those who are more familiar with the technical requirements of the net...

What negative effect would simply eliminating the following of a 302 redirect by Google have? In other words, as opposed to working a technical fix, what would happen if they simply stopped indexing/following 302 redirects? Would it cause widespread damage to the net?

idoc




msg:749072
 3:54 pm on Mar 17, 2005 (gmt 0)

"what would happen if they simply stopped indexing/following 302 redirects? Would it cause widespread damage to the net?"

Therein lies the dilemma... what if all those redirect tracking adlinks stopped being indexed? It will never happen. The reason this hijack is happening to begin with is partly dependent on necessitating the indexing of url's (such as these) that aren't real web pages to begin with. What I mean by that is...the url only exists as part of a redirect url.

I might post something on this later in the apache forum. I have spent some time the past couple days looking to how maybe to immunize for this by customizing apache. There are some folks in that forum way ahead of me on apache.

Chard




msg:749073
 4:28 pm on Mar 17, 2005 (gmt 0)

MikeD,
I don't think your site's old cache date is a sign of anything ominous - or if it is, I'm in trouble as well!
My site's Google cache has been all over the place lately, including showing no date on the serps display, even when the cached page has a very recent date on it. I think my cached page showed 28th Feb a few days ago, 8th March yesterday (16th) and 16th March today. Other sites around me seem to show pretty much the same, so it's either Google messing about or everyone's getting hijacked!

blend27




msg:749074
 4:39 pm on Mar 17, 2005 (gmt 0)

--- dependent on necessitating the indexing of url's --

IDOC
You're right on this one: if Google removed all 302s they would simply lose the battle of the "I got more pages indexed than others" media war, and the perception of "We are big, with lots of DATA" would simply hurt them big time.

it is no secret by this time that:
Example:
http:...../PRIVACY.html
https:...../PRIVACY.html
http:...../privacy.html
https:...../privacy.html
https:...../Privacy.html
http:...../Privacy.html

These count as 6 PAGES in the Google index, since there is a CACHED copy of each page in the index. So imagine a dynamic site that has 1000 products with the WRONG CASE in a URL variable: 2000 pages indexed. What are the penalties for that? Duplicate content?
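
To see how many of your own URLs are really the same page under different spellings, something along these lines might help. It is only a sketch in Python, and it rests on one big assumption, flagged in the code: that your server treats paths case-insensitively, so the upper and lower case versions really are one document. The function names and the www.example.com host are invented for the example.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def canonical_key(url):
    """Collapse scheme and letter case so variants of one page share a key.

    Assumption: the server treats paths case-insensitively, so
    /PRIVACY.html and /privacy.html really are the same document.
    """
    parts = urlsplit(url)
    return (parts.netloc.lower(), parts.path.lower(), parts.query)


def group_variants(urls):
    """Return only the pages that are reachable under more than one spelling."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_key(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}
```

Feeding it the six PRIVACY.html variants from the list above (with a real host filled in) gives a single group of six, which is the set you would want to collapse to one URL with redirects.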

[edited by: blend27 at 4:49 pm (utc) on Mar. 17, 2005]

Dayo_UK




msg:749075
 4:42 pm on Mar 17, 2005 (gmt 0)

I am a bit confused whether I am suffering from a Hijack or not - my cache date is from 16th Feb - but internal pages are much more recent.

Also, a check on domain.com is showing PR0 while www.domain.com shows PR5 - I have therefore redirected non-www to www to see what happens.
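
For anyone who wants to see the logic of a non-www to www redirect spelled out, here is a rough Python sketch of the decision. On Apache this would normally be done in the server config rather than in a script; the host name and function name here are placeholders, not anything from a real setup.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # hypothetical; use your own www host


def canonical_redirect(url, canonical_host=CANONICAL_HOST):
    """Return (301, target) if url is on the bare domain, otherwise None."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == canonical_host:
        return None  # already on the preferred host, nothing to do
    if "www." + host == canonical_host:
        target = urlunsplit(
            (parts.scheme, canonical_host, parts.path or "/", parts.query, parts.fragment)
        )
        # 301 (permanent), not 302, so the two hosts are treated as one site.
        return 301, target
    return None  # some other host entirely; leave it alone
```

The point of using a 301 here is that it tells the crawler the www version is the one permanent address, so the PR0 and PR5 listings should not be treated as two separate sites.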

claus




msg:749076
 4:56 pm on Mar 17, 2005 (gmt 0)

Reid, you could probably make the redirect from "/" to "/index.html" work - try posting it in the apache forum to get more specific response :-)

theBear




msg:749077
 5:12 pm on Mar 17, 2005 (gmt 0)

Google can find more than 8 billion pages no problem at all.

However it need not manufacture any.

I can see it now: a Google programmer goes to work for a large bank and implements a transfer by copying money and putting it into someone's account.

Yep the auditors would just love it along with the OCC, FICA, various state agencies, treasury folk, and the FBI as it creates plenty of employment.

In this case the receiving account holder would probably soil themselves while trying to get their account closed and located elsewhere.

However I don't think the company president would like to have to restate results, etc... etc ...

Most places I know of would roll heads really fast.

mikeD




msg:749078
 5:44 pm on Mar 17, 2005 (gmt 0)

cheers for the replies guys, just getting a little paranoid. I know plenty of ppl who have been hit. It's not very reassuring.

I am a bit confused whether I am suffering from a Hijack or not - my cache date is from 16th Feb - but internal pages are much more recent.

i am seeing this which is rather odd

londoh




msg:749079
 6:12 pm on Mar 17, 2005 (gmt 0)

Has anybody tried to file a DMCA with Google concerning a page hijacking

Yes, I got redirected pages taken out of the index by doing this. It's my content - and neither google nor the scraping site has any right to pretend otherwise.

The site was MIA. It took about 2 or 3 weeks for g to remove the offending pages and about another 6 weeks for the site to reappear exactly where it should be.

But whether or not it works, I think it's worth hitting them with as many DMCAs as possible on the basis that somebody there might (possibly) take notice

thanks to japanese

Reid




msg:749080
 6:23 pm on Mar 17, 2005 (gmt 0)

appreciate all the mod rewrites posted BUT how does this help a site owner who has no idea how to implement it..

We are still working on it but have not come up with a viable solution yet.
Believe me if we find one it will be the news of the year on webmaster world.

Thanks claus, this idea has a chance - better check with jdMorgan in the apache forum - that guy knows his stuff.

The cache date means nothing... googledance is a good sign .. shows they are working on it. claus stated earlier that he's seen some 302 pages disappear, sounds like good news.

Again here is the test to see if you are a victim of 302 hijacking bug.
search google for
site:w*w.yoursite

This is a list of pages from your domain that are in the Google index. There should be no other urls in there. A hijacking page will typically show up in there (usually duplicating your homepage) with your title and description but a url from another domain.
ALL urls in there should be your own, without variations.
Usually it will be a dynamic url (one with a ? in it).
It could be an appended one (w*w.somesite.yoursite.com).
Whatever it is, it will be obvious to you that it is not yours. It may appear as just a link with no title or description; if it is not one of YOUR urls it does not belong there - google is associating it with your domain.
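
If the site: listing is long, the check described above can be done mechanically once the result urls are copied into a list. A minimal Python sketch, assuming you paste the urls in by hand (it does not query Google); the function name and the example hosts are invented.

```python
from urllib.parse import urlsplit


def foreign_urls(result_urls, own_hosts):
    """Return the urls whose host is not one of your own."""
    own = {host.lower() for host in own_hosts}
    flagged = []
    for url in result_urls:
        # Search listings often omit the scheme; add one so urlsplit finds the host.
        candidate = url if "://" in url else "http://" + url
        host = (urlsplit(candidate).hostname or "").lower()
        if host not in own:
            flagged.append(url)
    return flagged
```

Anything this returns is a url that the listing associates with your domain but that is not served from a host you named, which is exactly the kind of entry worth a closer look.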

Lorel




msg:749081
 6:49 pm on Mar 17, 2005 (gmt 0)

I have mentioned this before but nobody seems to find it an issue. I have noticed that some of these "innocent" Google bug "redirects" have the same shared IP as the victim. The redirecting site has a 302 linking from an old site to their new site and somehow it gets applied to another person using the same hosting co under the same shared IP #.

has anyone researched this?

Lorel




msg:749082
 7:06 pm on Mar 17, 2005 (gmt 0)

Hi Reid,

this is old news--I can't get caught up with you guys--go take a holiday :o)


Hey japanese is that your page about 302 hijacking at Loris web?

A page that talks all about how to detect the various methods of 302 hijacks and what to do about it.

It is a very comprehensive page, showing various hijack methods and how to detect them, the solution however is along the lines of whois search and contacting hosts for TOS violations.

It's my own page. I have 25+ clients, several of which have been affected by hijackers, including myself, and I wrote that partly out of experience and from reading these threads and other research.

I wrote it simple enough for a newbie to understand. I'm not a programmer so I don't get into that part of the problem although I do link to sites that do.

Submitting to sites and later finding your site hijacked and or in a frame even happens to myself, even with all my cautions about avoiding bad links on another page I authored, and with finding links for 25 clients it happens often--too often.

I wish google would fix this. I spend several hours a day researching this matter or chasing bad links instead of earning a living.

*edited spelling

[edited by: Lorel at 7:08 pm (utc) on Mar. 17, 2005]

kaled




msg:749083
 7:06 pm on Mar 17, 2005 (gmt 0)

Whilst it may be useful to people to make themselves feel better, trying to work around this problem is probably akin to trying to fix a flying saucer with chewing gum and solder.

Clearly, there is a large random factor at play (or a factor that is simply not understood). When I checked a few days ago, there were three clone sites each trying to hijack one of my pages, but it's fine. That's been the case for several months.

I'm all in favour of experimentation, but if a workaround is found, it will require a great deal of luck.

Kaled.

Safaridude




msg:749084
 7:24 pm on Mar 17, 2005 (gmt 0)

Londoh

When you filed the DMCA did google get the offending sites to remove their links to your site or did they just clear the cache and point googlebot in the right direction?

twist




msg:749085
 7:47 pm on Mar 17, 2005 (gmt 0)

While reading through some apache stuff on something totally unrelated I came across this,

Time-Dependent Rewriting

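RewriteEngine on
# Between 07:00 and 19:00 serve the daytime copy; [L] stops processing here.
RewriteCond %{TIME_HOUR}%{TIME_MIN} >0700
RewriteCond %{TIME_HOUR}%{TIME_MIN} <1900
RewriteRule ^foo\.html$ foo.day.html [L]
# At any other time fall through to the night-time copy.
RewriteRule ^foo\.html$ foo.night.html [L]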

How about instead of just night and day you set your page to have minor changes every hour or two. Actually create 12 or 24 copies of your homepage and change minor things in the coding and content. Not so much that any casual visitor would notice but enough that google might not see it as duplicate content. For example, replacing a few tables with divs and vice versa.

Not a fix by any means but until a fix is found it could be something to try.

Jim_at_SFE




msg:749086
 7:57 pm on Mar 17, 2005 (gmt 0)

The assumption that these redirect pages are being created maliciously may be wrong, at least in many cases. I just looked to see what comes up with a site:mydomain.com search and found a number of legitimate sites listing my pages with their URLs, all ending with something like "redirect.php?id=256" or "go.php?id=576", etc. One of them is a high school site, and another seems to be a labor union organization. There must be widely available php scripts or broader CMS software packages that are using 302 redirects as a way of tracking click-thrus on links. There have long been cgi scripts for tracking click-thrus on links but I think they were written with "location=url" to do the redirect, not a 302, and search engines did not interpret those links as new pages.
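
For readers who have never looked inside one of these click trackers, the core of it is probably no more than the following. This is a guess at the general shape, written as a Python sketch rather than a copy of any real "go.php" or CMS script; the function name, the id-to-URL table and the counter are all made up.

```python
def track_click(link_id, links, counts, status=302):
    """Record a click and return (status, headers) for the redirect response."""
    target = links.get(link_id)
    if target is None:
        return 404, {}  # unknown id: nothing to count, nowhere to send
    counts[link_id] = counts.get(link_id, 0) + 1
    # The Location header is what sends the visitor (or a crawler) onward.
    return status, {"Location": target}
```

In this sketch the counting happens before the response is built, so it does not depend on which status code is sent. The status is left as a parameter for that reason: the script's author picks it, often without a second thought.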

Jim

Webdetective




msg:749087
 8:36 pm on Mar 17, 2005 (gmt 0)

I only just learned about 302 hijacks today. Somebody pointed out to me that a php script I have been using on my site was creating links that look something like:

http:URL....../go.php?id=aHR0cDovL3d3dy5zcHllcXVpcG1lbnRndWlkZS5jb20vcGVvcGx
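
The id in a link like that looks like base64: "aHR0cDov" is how base64 encodes "http:/", so the target address is most likely just encoded into the parameter. A small Python sketch for decoding such an id, assuming it is plain (URL-safe) base64 of the target URL and nothing more; the function name is made up and the example uses a placeholder address, not the truncated one above.

```python
import base64


def decode_link_id(link_id):
    """Decode a base64 id parameter, tolerating stripped '=' padding."""
    # base64 input must be a multiple of 4 characters; restore any padding.
    padded = link_id + "=" * (-len(link_id) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")


# Round trip with a placeholder URL, padding stripped as it often is in links.
example_id = base64.urlsafe_b64encode(b"http://www.example.com/").decode().rstrip("=")
print(decode_link_id(example_id))
```

This is handy when an inurl: search will not find your domain because the script hides it inside the encoded id.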

He discovered 2 of my links that look like this when he did a Google search inurl:www......com for his site. I was completely unaware my script was doing this until he brought it to my attention.

I corrected the problem and apologized to him. I also thanked him for bringing this to my attention.

crobb305




msg:749088
 8:54 pm on Mar 17, 2005 (gmt 0)

Webdetective

Redirects showing in an inurl: search prove nothing. Are those redirects showing when you search site:mysite.com? The site: command should show only the pages that Google thinks are truly part of your site. If any of those unrelated urls are showing, then you know there is a problem.

The inurl: search will show any page that has the search phrase in its url. It says nothing about hijacking. If you are redirecting to him, it could be harmless. But, I suppose it's better to not risk it since Google has become increasingly incompetent and has proven its inability to rank sites. Original content sites are getting replaced by 302s, tracker2s, scrapers and directories.

twist




msg:749089
 9:03 pm on Mar 17, 2005 (gmt 0)

This thread is approaching almost 700 posts and I'm only guessing, but I would have to say about 500 are people asking others to explain the problem or asking if they are being hijacked. About 100 posts are from people who have been hijacked. About 50 posts are from people trying to explain the problem, and a handful are from people trying to figure out how to deal with the problem.

Any chance someone wants to start a new thread that starts with a complete and detailed description of the problem and then only allow people to post their ideas or possible fixes to the problem. A person looking for solutions on how to deal with this problem would be lost looking through this monstrous thread.

zeus




msg:749090
 9:19 pm on Mar 17, 2005 (gmt 0)

Twist - There is NO solution for this, google has to fix the way googlebot is indexing redirecting links.

You can remove some of the hijackers with the google remove tool, but it's not a sure thing.

Atticus




msg:749091
 9:47 pm on Mar 17, 2005 (gmt 0)

Greetings;

I operate about a half dozen related, loosely interlinked web sites which have been page one on Google for years for various competitive and non-competitive phrases. Last summer things got real goofy -- pages which had ranked well fell down or out, only to return to high listings sometime later, then drop out again...you know the drill.

I couldn't figure out why Google loved me, then hated me, then loved me. But this 302/duplicate content issue seems to make a lot of sense for these otherwise unexplainable yo-yo like SERPS. I am quite convinced that this issue is responsible for my traffic being down about 2 million page views per month.

Here's the thing -- I have found only 5 URLs in Google that were "hijacked" versions of my pages. Seems like a lot of damage for 5 lousy pages. I found zero hijacked pages for my biggest site which is nevertheless suffering from the G loves me/hates me yo-yo syndrome. So I suspect that there may be additional false URLs somewhere in the system that are screwing me over dupe content wise, but which don't show for any of the "how to find the hijacker" searches discussed here. Seems reasonable considering G's inability to list proper backlinks for the past year or two.

So what do you all think? Is it possible that some sites are affected by this problem even if the owner finds very few or even no "hijacked" pages via methods listed here?

twist




msg:749092
 10:03 pm on Mar 17, 2005 (gmt 0)

Twist - There is NO solution for this, google has to fix the way googlebot is indexing redirecting links.

You're saying there is no solution, protection, precautionary measures, nothing a person can do to make it harder or more difficult for people to hijack your page? I find it hard to accept as a (wanna-be) programmer, that every possible angle on dealing with this problem on our side has already been tried.

It feels like jim (aka jdMorgan) is clark kent (aka Superman) in superman 2, the (webmaster)world is in trouble and he is off with lois lane somewhere completely unaware of the problem. The hijackers have taken over and we need jdman to save us.

For those that don't know, jdMorgan is the moderator for the apache forum and a programming GOD!

theBear




msg:749093
 11:01 pm on Mar 17, 2005 (gmt 0)

Atticus,

Welcome.

Now since those injected pages are scripts and Google thinks they are part of your site what happens if:

<a href="www.example.com/badsitewithevilthings/">site1</a>
<a href="www.example.com/badsitewithbadthings/">site2</a>

etc... etc .... gets presented to Google when its bot visits?

It is a script under the control of another party in most of these cases (it doesn't have to be however).

This is an arbitrary executable code injection bug.

Look past the dup content issue for a moment.

Atticus




msg:749094
 11:08 pm on Mar 17, 2005 (gmt 0)

Hi thebear,

Completely lost me with the code injection thing and looking past dupe content -- can you restate, please?

theBear




msg:749095
 11:14 pm on Mar 17, 2005 (gmt 0)

If Google thinks that page [the injected script (executable code)] is part of your site, then when Googlebot runs (executes the code in) the script it can be presented with whatever the "jacker" wishes it to.

Atticus




msg:749096
 11:18 pm on Mar 17, 2005 (gmt 0)

thebear -- so you are saying that since G thinks these pages are my pages, if these pages link to bad neighborhoods then G thinks I'm linking to bad neighborhoods?

While digging deeper into one of the hijacker sites I did see some questionable adult content...

claus




msg:749097
 11:18 pm on Mar 17, 2005 (gmt 0)

>> protection, precautionary measures

Twist, see msg #218 - i summed up some suggestions there. Also, there's this new thread on one method of removing a redirect script from google (you can also return a 404):

[webmasterworld.com...]

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved