
Google SEO News and Discussion Forum

302 Redirects continues to be an issue
japanese




msg:748407
 6:23 pm on Feb 27, 2005 (gmt 0)

recent related threads:
[webmasterworld.com...]
[webmasterworld.com...]
[webmasterworld.com...]



It is now 100% certain that any site can destroy low- to mid-PageRank sites by causing Googlebot to snap up a 302 redirect via scripts (PHP, ASP, CGI, etc.) backed by an unseen, randomly generated meta-refresh page pointing at an unsuspecting site. In many cases the encroaching site actually writes your website's URL into a 302 redirect on its own server. This is a flagrant violation of copyright and a manipulation of search-engine robots, geared to exploit and destroy websites and to artificially inflate the ranking of the offending sites.

Many unethical webmasters and site owners are already creating thousands of templated (ready-to-go) "skyscraper" sites fed by affiliate companies' immense databases. These companies, which have your website's info in their databases, feed your page snippets, without your permission, to vast numbers of the skyscraper sites. A carefully adjusted PHP-based redirection script that issues a 302 redirect to your site, with an affiliate click-checker included in the script, then goes to work. What is very sneaky is the randomly generated meta-refresh page, which can only be detected with a good header-interrogation tool.
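A header-interrogation check of the kind described above can be sketched in Python (a hypothetical helper, not a tool from this thread): it fetches a URL once, without following redirects, and reports the status code, any Location target, and whether the body carries a meta refresh.

```python
import http.client
import re
from urllib.parse import urlsplit

# Case-insensitive test for a meta-refresh tag in the response body.
META_REFRESH = re.compile(rb'<meta[^>]+http-equiv=["\']?refresh', re.I)

def analyze(status, location, body):
    """Classify one raw response: status, Location header, meta refresh."""
    meta = bool(META_REFRESH.search(body))
    return {
        "status": status,
        "location": location,
        "meta_refresh": meta,
        "suspicious": status in (301, 302) or meta,
    }

def interrogate(url):
    """Fetch `url` exactly once, without following redirects."""
    parts = urlsplit(url)
    path = (parts.path or "/") + ("?" + parts.query if parts.query else "")
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    conn.request("GET", path, headers={"User-Agent": "header-check/0.1"})
    resp = conn.getresponse()
    return analyze(resp.status, resp.getheader("Location"), resp.read())
```

Run against a redirect script of the sort described here, `interrogate()` would show a 301/302 with a Location header, while a meta-refresh page shows a 200 whose body trips the `meta_refresh` flag.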

Googlebot and MSNBot follow these PHP scripts to either an internal sub-domain containing the 302 redirect or to the server side, and "BANG", down goes your site if its PageRank is below the offending site's. Your index page is crippled, because Googlebot and MSNBot now consider your home page, at best, a supplemental page of the offending site. The offending site's URL that contains your URL is indexed as belonging to the offending site. The offending site knows that Google does not reveal all links pointing to your site and takes a couple of months to update, so an inurl:yoursite.com search will not be much help in tracing it for a long time. Note that these scripts apply your URL mostly stripped, or without the www, making detection harder. This also causes Googlebot to generate another URL listing for your site that can be seen as duplicate content. A 301 redirect resolves at least the short-URL problem, sparing Google from deciding which of your site's two URLs to index higher (usually the one with the higher-linked PageRank).

Your only hope is that your PageRank is higher than the offending site's. Even this is no guarantee, because the offending site will have targeted many higher-PageRank sites within its system on the off-chance that it strips at least one of them. This is reinforced by hundreds of other hidden 301 permanent redirects to PageRank 7-or-above sites, again in the hope of stripping a high-PageRank site, which would then empower their scripts to hijack more efficiently. Sadly, supposedly ethical big-name affiliates are involved in this scam; they know it is going on, and Google AdWords is probably the main target of revenue. Though I am sure Google does not approve of its AdSense program being used in such a manner.

Many such offending sites have no e-mail contact and hidden WHOIS and no telephone number. Even if you were to contact them, you will find in most cases that the owner or webmaster cannot remove your links at their site because the feeds are by affiliate databases.

There is no point in contacting Google or MSN, because this problem has been around for at least 9 months; only now is it escalating at an alarming rate. All sites of PageRank 5 or below are susceptible; if your site is a 3 or 4, be very alarmed. A skyscraper site need only create child-page linking to reach PageRank 4 or 5, without needing to strip other sites.

Caution: trying to exclude them via robots.txt will not help, because these scripts can change almost daily.

Trying to remove a link through Google that looks like
new.searc**verywhere.co.uk/goto.php?path=yoursite.com%2F will result in your entire website being removed from Google's index for an indefinite period of time, at least 90 days, and you cannot get re-indexed within this timeline.

I am working on an automated 302 REBOUND SCRIPT to trace and counteract an offending site. This script will spider and detect all pages, including sub-domains, within an offending site and blast all of its pages, including dynamic pages, with a 302 or 301 redirect. Hopefully it will detect the feeding database and blast it with as many 302 redirects as it contains URLs. So, in essence, a programme in perpetual motion, creating millions of 302 redirects for as long as it stays on. As every page is a unique URL, the script will hopefully continue to create and bombard a site that generates dynamic pages via PHP, ASP, or CGI redirecting scripts. A skyscraper site that is fed this way can have its server totally occupied by a single efficient spider that requests pages in split seconds, continually, throughout the day and week.

If the repeatedly spidered site is depleted of its bandwidth, it may then be possible to remove it via Google's URL removal tool. You only need a few seconds of 404 or 403 from the offending site for Google's URL console to detect what it needs: either the site or the damaging link.

I hope I have been informative and of help to anybody with a hijacked site whose natural revenue has been unfairly treated. Also note that your site may never regain its rank, even after the removal of the offending links. Talking to offending site owners often results in their denying that they are causing problems and saying that they are only counting outbound clicks. And they seem reluctant to remove your links... Yeah, pull the other one.

[edited by: Brett_Tabke at 9:49 pm (utc) on Mar. 16, 2005]

 

twist




msg:748977
 1:36 am on Mar 16, 2005 (gmt 0)

Alright, looking around at some random htaccess examples I noticed one possible(?) solution although I wouldn't know where to begin to create the code for it.

Rough example (don't actually use this, anybody):

RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteCond %{If no TIME_MIN at end of url}
RewriteRule {Append TIME_MIN to end of url}

RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteCond %{If TIME_MIN on url == current TIME_MIN}
RewriteRule ^(.*)$ [example.com...] [R=permanent,L]

Then remove the appended TIME_MIN in a php script.

You could of course set it for 5 or 10 seconds instead of a full minute.

boredguru




msg:748978
 1:45 am on Mar 16, 2005 (gmt 0)

RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteCond %{If no TIME_MIN at end of url}
RewriteRule {Append TIME_MIN to end of url}

RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteCond %{If TIME_MIN on url == current TIME_MIN}
RewriteRule ^(.*)$ [example.com...] [R=permanent,L]

Then remove the appended TIME_MIN in a php script.

You could of course set it for 5 or 10 seconds instead of a full minute.

Here is the problem that I see with this:
How do you plan to remove the appended time in a PHP script? Let's suppose the script removes the time from the URL and redirects (it will have to; there's no other way to remove the appended time) to the original URL. The code in the htaccess again catches it without the time, appends the time, and sends it to the PHP script, which again... you get the drift.

Maybe I missed something. If I'm wrong, I love getting corrected.

Trawler




msg:748979
 1:48 am on Mar 16, 2005 (gmt 0)

Boredguru:

But... and this is a big but... what if Gbot does not go "yippee, one more new URL"? It already knows that the redirected URL exists in its index. It just, by default, assigns that URL to the hijacker's URL without doing a fetch.
_____________

Sorry to shoot that down, but:

Gbot does fetch new data at the target and immediately indexes it under the "302'ed" URL with an updated cache date. It is routine.

boredguru




msg:748980
 1:52 am on Mar 16, 2005 (gmt 0)

If that is true Trawler and you are a beautiful babe, then I love you! Else thanks.

PS: Been writing lots of if loops lately

twist




msg:748981
 2:36 am on Mar 16, 2005 (gmt 0)

How do you plan to remove the appended time in a PHP script? Let's suppose the script removes the time from the URL and redirects (it will have to; there's no other way to remove the appended time) to the original URL. The code in the htaccess again catches it without the time, appends the time, and sends it to the PHP script, which again... you get the drift.

How about this for an approach,

RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteCond %{THE_REQUEST} !^[A-Z]{3,9}\ /.*googletime.*$
RewriteCond %{THE_REQUEST} !^[A-Z]{3,9}\ /.*\?.*$
RewriteRule {Append "?googletime={TIME_MIN}" to url} [L]

RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteCond %{THE_REQUEST} !^[A-Z]{3,9}\ /.*googletime.*$
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*\?.*$
RewriteRule {Append "&googletime={TIME_MIN}" to url} [L]

RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteCond %{If TIME_MIN on url == current TIME_MIN}
RewriteRule ^(.*)$ [example.com...] [R=permanent,L]

In php just check for variable and pass it along from page to page,

if (!empty($_GET['googletime'])) { /* pass it along so Googlebot won't get stuck in a loop again */ }

Once again, I have no idea if it will work, but maybe it will spark an idea in someone smarter than I am who can create something that will.
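The approach above can be sketched in Python (a hypothetical model of the rewrite logic, not working mod_rewrite): the first Googlebot request gets the minute-stamp appended, a stamped request within the same minute gets the 301, and a stale stamp simply serves the page, so the loop described a few posts up never forms.

```python
from urllib.parse import parse_qs, urlsplit

def handle(url, user_agent, current_min):
    """Model one pass through the rewrite layer.

    Returns ("serve", url), ("append", stamped_url), or ("redirect-301", url).
    """
    if "Googlebot" not in user_agent:
        return ("serve", url)                     # ordinary visitors untouched
    query = urlsplit(url).query
    qs = parse_qs(query)
    if "googletime" not in qs:
        sep = "&" if query else "?"               # the two THE_REQUEST cases
        return ("append", url + sep + "googletime=" + str(current_min))
    if qs["googletime"][0] == str(current_min):
        # Stamp is still current: issue the 301 to the clean canonical URL.
        clean = "&".join(f"{k}={v[0]}" for k, v in qs.items()
                         if k != "googletime")
        base = url.split("?")[0]
        return ("redirect-301", base + ("?" + clean if clean else ""))
    return ("serve", url)                         # stale stamp: no rewrite loop
```

The stale-stamp branch is the piece that breaks the cycle: once the minute has rolled over, the stamped request is served normally instead of being rewritten again.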

Reid




msg:748982
 2:38 am on Mar 16, 2005 (gmt 0)

OK, here are some detailed results of my homepage problem, where a#*$!y is a bogus name for the directory (which is owned by Tucows, by the way, according to WHOIS).

Here is the SERP from site:mysite (this is my homepage)
MY TITLE
MY TITLE. Snippet from page.............
www.axxxy.com/cgi/axxxy/go.cgi?id=175653 - 7k - Supplemental Result - Cached - Similar pages

The cache is an older page from Nov 1 (very different since then); I've changed it like 4 times since then, and Google is not updating this cache even though it has been crawling my site every day.

They removed my link but have not associated anything else with this id# when you click it you get a 404 page on their site.

I ran it through a server header checker result:
Domain [axxxy.com...]
IP Address www.axxxy.com/cgi/axxxy/go.cgi?id=175653
Server Location
Host Name
Server Type Apache/1.3.33 Sun Cobalt (Unix) Chili!Soft-ASP/3.6.2 mod_ssl/2.8.22 OpenSSL/0.9.7e PHP/4.3.10 mod_auth_pam_external/0.1 FrontPage/4.0.4.3 mod_perl/1.29

I ran it through a page header checker result:
Page [axxxy.com...]
Response 302 Found
Last Modified No data returned
Content Type text/html; charset=iso-8859-1
Last Cached (Google) 1 Nov 2004 12:26:10 GMT

I go to the Google URL removal tool and get this message:
remove URL, checked 'anything associated with this url'
response:
The web server has given us a redirection but has not provided a final destination. The "Location: " HTTP header is missing.

tried again but checked 'cached version only' this time.
same response.

Reid




msg:748983
 2:48 am on Mar 16, 2005 (gmt 0)

I'm not even sure what I want this directory to do.
It seems they have left the page associated with my id# intact but have removed my URL from it.
If they point it at a 404, won't Google then see my homepage as a 404?

Reid




msg:748984
 2:53 am on Mar 16, 2005 (gmt 0)

Oh yeah, my real homepage also appears in site:mysite, but further down the list - looks all good (updated March 8, 2005).

edit in - another thing...
Since the last Google dance (last week) all my descriptions have changed from snippets to my actual META descriptions - even on the few 'supplemental results' - except this bogus one, which is still showing a snippet from the cached page.
The link was removed after the descriptions changed.

Another thing... when I click 'similar pages' on my bogus homepage link, it shows 28 directories with the one in question at #1. When I click 'similar pages' on THAT link... exact same results.

Reid




msg:748985
 3:44 am on Mar 16, 2005 (gmt 0)

Sorry, boredguru - access logs from Nov 2004 are gone (they begin on Nov 15), gone to never-never land.

When they removed that link my traffic almost doubled instantly - getting double the amount of robots too.

coincidence or related?

jk3210




msg:748986
 3:49 am on Mar 16, 2005 (gmt 0)

When you type in the original url (http://www.a#*$!y.com/cgi/a#*$!y/go.cgi?id=175653 ) into your browser's address window does your page come up?

If it returns a 404, then you can delete it via the url console. If it returns your page, you're out of luck.

Reid




msg:748987
 4:04 am on Mar 16, 2005 (gmt 0)

When I type that URL into my browser (IE6) I get the IE error page "The page cannot be displayed"

"Cannot find server or DNS Error
Internet Explorer "

Reid




msg:748988
 5:14 am on Mar 16, 2005 (gmt 0)

Interesting - very interesting.
OK, I did site:a#*$!y.com 175653 (that's my id#)

I get 2 pages

A#*$!Y - Search Engine Directory
A#*$!Y SEARCH engine, Domain Name Search, Whois World Wide Search and Free URL Submit.
www.axxxy.com/cgi/axxxy/ reviews.cgi?id=175653&cid=1096 - 22k - Supplemental Result - Cached - Similar pages

The link itself is dead like the other, but the cache is a voting page with a thumbnail of my current homepage, up to date - almost a framed copy of Alexa.

The other one - same thing (exact same title and description, just like thousands of others, only different ids) but a different directory, with a picture of my homepage. It's even got that Alexa graph thing ('not in the top 100'). It's not voting; it's info about my link.

Here's the real kicker:
on both of those pages, MY page title (which is an active link) points to the same URL as the link in site:mysite.

blend27




msg:748989
 5:50 am on Mar 16, 2005 (gmt 0)

I don't speak PHP, nor Jane or Billy. Should I read more? Or go back to the HTML Forum?

[webmasterworld.com...]

walkman




msg:748990
 6:08 am on Mar 16, 2005 (gmt 0)

so after 580+ posts:

we can't do anything about it. Google and MSN have to step up to the plate and fix this.

Reid




msg:748991
 6:10 am on Mar 16, 2005 (gmt 0)

read more blend - it's not all php. go back few pages to get the picture.

here is something else very very interesting about a#*$!y.com

I found six documents all exactly the same but with one difference each ponts to some poor joe that they must hate.

here is what it looks like in site:a#*$!y.com

[code]
302 Found
Found. The document has moved here.
www.example.com/cgi/example/go.cgi?id=91523 - 1k - Cached - Similar pages
[/code]

When you go to the page it just says "the document has moved here". The word "here" is a real link to some poor schmuck. Actually the link goes to 'page cannot be displayed' in IE, but the cached copy is viewable.

[edited by: rogerd at 5:02 pm (utc) on Mar. 16, 2005]
[edit reason] examplified [/edit]

blend27




msg:748992
 6:12 am on Mar 16, 2005 (gmt 0)

--- we can't do anything about it ---

Yes we can: register 2 domain names - g-jokes dot com and so on -
make jokes, have big companies sponsor space on the site, pay for the hosting with it, and most important, make people feel better about what they are best at.

The POWER of CAN DO

edit: [g-jokes dot com] already taken...

blend27




msg:748993
 6:40 am on Mar 16, 2005 (gmt 0)

It's very simple.

I have more than $80,000 in inventory of widgets, purchased within the last 14 months, relying on traffic from just BIG G$. If they don't fix the ALGO, I will spend my planned budget for this year on ads elsewhere.

Karl Marx said a long time ago: CAPITALIZE ON SOMEONE ELSE'S EFFORTS.

That simple, really.

Reid




msg:748994
 6:43 am on Mar 16, 2005 (gmt 0)

Nothing wrong with Overture

blend27




msg:748995
 7:23 am on Mar 16, 2005 (gmt 0)

Reid -- don't go there, I mean the topic. I am a retailer; all of them are the same.

claus




msg:748996
 10:44 am on Mar 16, 2005 (gmt 0)

>> The web server has given us a redirection but has not provided a final destination. The "Location: " HTTP header is missing.

Reid, I think they have removed the URL from the database, so their script still works, but it no longer has a target URL to send the visitor off to. Of course, if you send the visitor "out into nothing", then there is nothing to delete. However, if it points to "nothing", then it does not point to your page :)

The script URL should return a 404 for it to be deleted. It will not be your homepage URL that is a 404; it will be the script URL.

boredguru, in other news i heard that you nailed it :)

How do you know whether, when Gbot visits, it thinks it is fetching your domainname.com, or thinks it is fetching hijacker.com/url.php?domainname.com?

I'm sorry i overlooked this, there's just too many threads and posts.

I don't know this - the 302 script by itself sends a referrer which is the page the script is on, not the exact script url. It only does this when accessed from the page via a click on a link, not when the script URL is accessed directly.

It's easy to show: Just put up a php/asp/cgi page that does nothing but display the referrer. Then set up a 302 redirect from some other page to this page (by means of a script). Click the URL on "the other page" and look at the referrer string the first page prints out. Then, try to enter the script link directly in the browser.

Add to this that Googlebot does not send referrer information when it fetches your pages, so there is no way we can know if it's going straight to your URL or if it's going via a redirect script.
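The experiment claus describes can be sketched as a tiny WSGI app in Python (paths and names are illustrative, not from the thread): one path issues the 302, the other echoes whatever Referer header arrives, so you can compare a clicked-through request with a directly entered one.

```python
def app(environ, start_response):
    """Minimal WSGI app: /jump 302-redirects to /show; /show echoes Referer."""
    if environ.get("PATH_INFO") == "/jump":
        start_response("302 Found", [("Location", "/show")])
        return [b""]
    referer = environ.get("HTTP_REFERER", "(no referrer sent)")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [referer.encode()]
```

Click through the redirect and /show prints the page the link was on; paste the /show URL straight into the browser and it prints "(no referrer sent)" - which is the direct-access case described above.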

[edited by: claus at 10:45 am (utc) on Mar. 16, 2005]

kaled




msg:748997
 10:44 am on Mar 16, 2005 (gmt 0)

But... and this is a big but... what if Gbot does not go "yippee, one more new URL"? It already knows that the redirected URL exists in its index. It just, by default, assigns that URL to the hijacker's URL without doing a fetch.

That seems highly likely to me. I see no reason whatsoever why a robot should follow redirects immediately. They don't follow links immediately, they simply build a database of pages to get later. Redirects, especially ones to other domains, may well be treated the same way.

It should be possible to check the logs of sites where redirects are used to find the answer. However, bearing in mind the potential to send robots into infinite loops, I doubt redirects are followed immediately.

Kaled.

rob_casino




msg:748998
 12:42 pm on Mar 16, 2005 (gmt 0)

Maybe I'm missing something obvious or perhaps I've misunderstood something, but there's something I don't understand.

So... people are using new (throwaway) domains to replace the existing domains of genuine sites in the SERPs. To do this, they need to show a stronger site (PR-wise?) than the original? If so, how can the blackhats do that with brand new, weak, non-popular domains?

Sorry if I'm being ultra-dim here.

theBear




msg:748999
 1:38 pm on Mar 16, 2005 (gmt 0)

rob_casino,

They don't have to use the throwaway domains for the site receiving the jacked traffic, although they can.

What you are missing is that the script allows the jacker to do any number of things.

The jacker installs the script on a throwaway domain and places anchors on the new site's home page pointing to his script, with several high-value SERP results.

The jacker sends the throwaway domain several inbound medium-PR links, thus starting the bots off to look at the wonderful new domain. This places the entries in the SE.

Now maybe this first round serves only to trip Google's duplicate-content filter for just the targeted pages, or to sit dormant until needed.

It does its job and the jacker is happy, because his other, clean site now has that place in the SERPs.

But on the off-chance that isn't enough, he can change how the now-implanted page operates by changing a database entry on his throwaway domain.

The new action is to take advantage of relative addressing and of information likely to be available, allowing more duplicate content to be inserted into Google's index - this time using Googlebot to walk the target site, duplicating all possible pages. Now Google has several copies of most pages of your site, starting with your home page. Ask all kinds of folks on this forum, or me; I just spent a bit of time dealing with that part of this. This part can be done without a script, but hey, it saves setting up links by hand every time.

If this still doesn't work for them, they can resort to changing how that script works again.

This time, to point links to bad places. Remember, that is a script that will be run when Googlebot hits it.

There are "rumored" bad "places" on the net; Google might decide you are a bad site too, and down you go. Or the script inserts indications that you run a site that isn't suitable for younger surfers.

Remember, what you see when you visit that page may or may not be what Googlebot visits. It is a script that runs.

For the jacker to derive profit, he needs only to remove you from placing for the terms he is interested in, and he has many ways to do this.

And then we have the case where the object is just to see what it can do.

And we have the case where a newbie website owner gets a copy of the script from the internet, installs it, and doesn't even know all of its abilities - and the script has a nice administrative backend.

Now the jacker is using the newbie's site.

rob_casino




msg:749000
 2:30 pm on Mar 16, 2005 (gmt 0)

Thank you, bear - much appreciated.

boredguru




msg:749001
 2:31 pm on Mar 16, 2005 (gmt 0)

I finally got the script working.
What it does is give one 301 redirect to itself and then a 200 OK. Then again, when Gbot fetches, it will give a 301, then again a 200 - no matter the time difference between fetches. The reason I have kept it alternating is the previously stated reasons, which I shall post now.
Unfortunately this script will only work with PHP + MySQL, and even more unfortunately, this whole algorithm will only work on sites that use a database and a server-side language. Sites with static HTML pages and no database cannot benefit from this (but I think there is a workaround, which will take 2 days to a week if it is successful).

Firstly

1) You will need to create a table in your database with these two fields.
i)
a) fieldname = url
b) fieldtype = varchar
c) fieldlength = 255
d) fieldattributes = unique
ii)
a) fieldname = value
b) fieldtype = int
c) fieldlength = 2
d) fieldattributes = none.

2) Now to the code part

$url2 = $_SERVER["REQUEST_URI"];
$ua = $_SERVER["HTTP_USER_AGENT"];
$ip = $_SERVER["REMOTE_ADDR"];
$url = 'http://www.example.com' . $url2;
// Serve the normal page to anyone who is not Googlebot by user-agent and
// not on a Google IP. (The IP tests must be joined with &&: with || the
// bracketed part is always true and the check collapses to the UA test.)
if (strpos($ua, "Googlebot/2.1") === false
    && strpos($ip, "216.239.") === false
    && strpos($ip, "66.249.") === false)
// During the test phase I used this condition instead - that IP belongs to
// searchengineworld.com, so its header checker was treated like Googlebot:
// if (strpos($ip, "69.36.190.") === false)
{
    // your normal code goes here -
    // the normal code for your page to display.
}
else
{
    $dbh = mysql_connect('localhost', 'username', 'password');
    mysql_select_db('database name');
    $sql = mysql_query("SELECT value FROM noredir WHERE url='$url'");
    $value = mysql_result($sql, 0);
    if ($value) {
        if ($value == 1) {
            $value = $value + 1;
            mysql_query("UPDATE noredir SET value='$value' WHERE url='$url'");
            header("HTTP/1.1 301 Moved Permanently");
            header("Location: " . $url);
            header("Connection: close");
        }
        elseif ($value == 2) {
            $value = $value - 1;
            mysql_query("UPDATE noredir SET value='$value' WHERE url='$url'");
            // your normal code goes here (second copy - see note below)
        }
    }
    else {
        mysql_query("INSERT INTO noredir (url, value) VALUES ('$url', '2')");
        header("HTTP/1.1 301 Moved Permanently");
        header("Location: " . $url);
        header("Connection: close");
    }
}

You can check that the headers alternate between 301 and 200 at [searchengineworld.com...] for the URL [example.com...]. If lots of people are checking the header simultaneously, you might not see it alternate every time, because user 1 checks it and gets a 301, user 2 immediately or simultaneously checks it and gets a 200, and when user 1 checks again he gets a 301.

Note that your normal code has to be added twice but is only executed once: the first copy, in the non-Googlebot branch, runs for all ordinary visitors; the second copy runs only when it is Googlebot and the conditions are met for Gbot to get a 200 response.

3) Now to the algorithm
i) Check if the visitor is Googlebot.
ii) If not, serve the normal page with a 200.
iii) If yes, check the stored value for this URL.
iv) If a value is present:
a) if value = 1, then
value = value + 1
301 permanent redirect to the same page
b) if value = 2, then
value = value - 1
execute your normal code and serve your normal page.
v) If no value is present:
take the current URL ($url) and insert it into the url field in the database, with 2 as the corresponding value for the value field
301 redirect to the same page
vi) Exit.

Every time the value is 1 it redirects, after changing the value to 2. And every time the value is 2 it serves the normal page, after changing the value to 1.
So 1 = 301 and
2 = 200.
That is why, while creating a new row for a new URL, you have to set the value to 2, as the redirection has already happened.

As you can see, the advantage is that you don't have to worry about how many pages you have. Just inserting it into your PHP file will add new URLs dynamically, so you don't have to enter each URL in the database and set a value yourself.
But I think using it only on your homepage is enough.
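The alternating logic can be modeled in Python (a sketch only; a dict stands in for the `noredir` MySQL table): odd Googlebot fetches get the 301 back to the same URL, even fetches get the 200, and ordinary visitors always get the page.

```python
# The `noredir` table from the post, as a plain dict: url -> value (1 or 2).
noredir = {}

def respond(url, is_googlebot):
    """Return the HTTP status this scheme would send for one fetch of `url`."""
    if not is_googlebot:
        return 200                    # ordinary visitors always get the page
    value = noredir.get(url)
    if value is None:
        noredir[url] = 2              # new URL: record it with value 2...
        return 301                    # ...because the redirect happens now
    if value == 1:
        noredir[url] = 2
        return 301                    # value 1 -> 301 to the same URL
    noredir[url] = 1
    return 200                        # value 2 -> serve the normal page
```

Successive Googlebot fetches therefore alternate 301, 200, 301, 200..., exactly the pattern described above.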

Now to the FAQ

Q1) Why so many continuous redirections? Why not do it just once and leave it?
A1) Because how do you know whether, when Gbot visits, it thinks it is fetching your domainname.com, or thinks it is fetching hijacker.com/url.php?domainname.com?

When you redirect it, Gbot could really have come asking for yourdomain.com, but the next time (more likely the next day) it could be asking for the hijacker.com webpage, which it thinks has moved to your homepage.

And as your homepage will be visited more often than some page three levels deep on your hijacker's site, we would be pretty lucky to catch the bot at the right time and make it think that the hijacker's page has moved permanently.

Now how it will work is

Day 1: Gbot asks for yourdomain.com. You redirect it once that day to yourdomain.com. No harm done today, but no gain either.
Day 2: Gbot asks for yourdomain.com. You redirect it once that day to yourdomain.com. No harm done today, but no gain either.
Day 3: ditto
Day 4: ditto
Day 5: ditto
Day 6: Gbot asks for yourdomain.com thinking it is fetching hijacker.com/url.php?url=yourdomain.com. Today no harm is done, but lots of good.

Q2) How much will it slow down my server?
A2) Not much (and only in the eyes of Gbot), as long as you don't plan on implementing this on all your thousands of pages. Gbot will request a page mostly once a day (at most twice for the same page), so the load depends on the number of pages you implement it on. If you implement it on all 10,000 pages (an example) of your site, you will face a server slowdown noticeable to everyone when Gbot comes on a full crawl. It's mostly the homepage that gets hijacked, so implement it on your homepage alone. Otherwise it's your call, depending on the resources you have.

Q3) I don't use PHP/Perl/ASP etc. or a database. Can I implement this technique?
A3) You can't. But I have been thinking of a way. Give me 2 days to a week; I'll try to work something up, as lots of static sites face this problem.

Q4) I use PHP and dynamic URLs. Is there anything I have to add if I am implementing it site-wide?
A4) No. Even though you might have 1000 URLs generated by a single file, it will work, as this keeps track of the URLs, not of how many times the file is called.
e.g. yourdomain.com/forum/viewtopic.php?id=34
and yourdomain.com/forum/viewtopic.php?id=55 will be recognised as different URLs even though the call is to the same file, viewtopic.php.

Q5) Will you take responsibility for the code and any liabilities that occur due to our use of your code?
A5) Nope. We are all free-thinking individuals who have the right to use what is offered, or not. The script is free to use, implement, copy and change (add anything else you might want, except taking credit for the idea).

Q6) Will it change my Google rankings?
A6) Only G knows (God? Google? You select!). But this script is more of a preventive step. Lots of people have had their hijacker's link removed, yet G still shows the hijacker's link in its index with an almost 3-month-old cache. If your hijacker's cache of your site is recent and Gbot is still crawling you, then you have a good fighting chance.

Q7) Have you tried it?
A7) Thank G (again, you select), my site was never successfully hijacked, though I had lots of 302 links from all over the digital wilderness pointed at me. G removed them. But I have been running this script on my homepage for the last 12 hours, and I will report if anything diabolic happens to my site.

Q8) What is the theory this is based on?
A8) The theory that G respects HTTP protocols a lot. If it can respect a 302 to the letter and misjudge your page to be someone else's, then it must respect a 301 just as much: disowning all previous URLs for that page and taking the new URL as the new destination for that page (which is the same URL, by the way).

Q9) Any presumptions?
A9) Yes, a few.
1) G is good. All this is a glitch arising from the HTTP protocols, not a purposeful effort by G to target your page. The glitch could be because G gives more weight to educational and government sites, where it is pretty common for a resource to be temporarily relocated somewhere else. This might seem odd to mom-and-pop stores on the net and small businesses, but it is quite common there. G has, after all, come from the same terrain (edu), so it probably knows how things could be affected if it changed its ways and ignored the protocols.
2) The HTTP protocol is not quite clear about a temporary redirect followed by a permanent redirect. That is, if site A 302-redirects to site B, which 301-redirects to site C:
Case 1) (favourable) All previous URLs are dropped and only site C is taken to be the de facto site for that content.
Case 2) (unfavourable) Content for site B can be found only at site C, so update the URL for site B; and, here is the kicker, content for site A is at site C, but don't update the URL, as site A has only temporarily stored the content at site C.
We will never know how G will react when faced with this, though I think it is case 1, as the protocol states:
The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use the returned URI/URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible. This response is cacheable unless indicated otherwise.
emphasis mine[1]

Q10) We are an SEO firm. Can we use this script for our clients?
A10) Yup, go ahead and get your clients back in the SERPs. But don't fleece them!
And you agree that I have no liabilities if you use it - actually, using it means you agree there are no liabilities.

Q11For how long should i have it
A1115 days to a months time should be enough i guess to catch Gbot when it comes asking for the hijackers url. But i will be keeping it for some more time to see if anything else happens

Q12 Can't think of any more. If you have any, ask, and I know lots of more experienced members here can answer you to your satisfaction.

[1][edited by: rogerd at 5:07 pm (utc) on Mar. 16, 2005]
[edit reason] examplified [/edit]

theBear




msg:749002
 3:58 pm on Mar 16, 2005 (gmt 0)

boredguru,

I agree with your defense setup.

Now, using a little bit of LWP magic and/or file reading, it might be extended to help the static-page folks as well.

Provided PHP and MySQL are installed on the server.

In the spots where you say "code goes here", the code that went there would be file reads of the requested page by the PHP script, along with an echo of the read content.

This is what I was calling the one-shot 301 last evening, when I posted about site hardening.

Of course, site design can play havoc with this. Trust me, I work on a site that has evolved since 1998, and a lot of the old is still running (but not the same way).
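A minimal sketch of that one-shot-301 hardening idea, written in Python rather than the PHP the poster had in mind. The canonical host name, the handler shape, and the dict standing in for disk reads are all assumptions for illustration: any request arriving under a non-canonical host gets a permanent redirect to the one true URL, and only the canonical host gets the static content read and echoed back.

```python
# Hypothetical sketch of "one-shot 301" site hardening: collapse every
# host variant (no-www, hijacker vhost, etc.) onto one canonical host.
# CANONICAL_HOST and the handler signature are invented for this sketch.

CANONICAL_HOST = "www.example.com"

def handle_request(host, path, read_file):
    """Return a (status, headers, body) triple for one request.

    `read_file` abstracts the disk read so the sketch stays testable;
    in the PHP setup described above this would be a file read of the
    requested static page plus an echo of its content.
    """
    if host != CANONICAL_HOST:
        # Non-canonical host: answer with a permanent redirect so
        # crawlers consolidate everything onto the canonical URL.
        return 301, {"Location": "http://%s%s" % (CANONICAL_HOST, path)}, ""
    # Canonical host: read the static page and echo it.
    return 200, {"Content-Type": "text/html"}, read_file(path)

# Example: serve a fake one-page site from a dict instead of disk.
pages = {"/index.html": "<html>home</html>"}
print(handle_request("example.com", "/index.html", pages.get))
print(handle_request("www.example.com", "/index.html", pages.get))
```

The first request is redirected to http://www.example.com/index.html with a 301; the second is served directly with a 200.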

Lorel




msg:749003
 6:24 pm on Mar 16, 2005 (gmt 0)

I'm trying to reach the end of this thread (3 more pages were posted since I started reading about an hour ago--sigh!) so please excuse if this is "old" news:

Persist by contacting another similar site until you find a suitable volunteer. But like I suggested, do not accept a refusal easily; what right do they have to deny us this opportunity to exercise our ability to bring their website to its knees? They also have the reassurance from Google that nobody can affect their ranking.

Hmmm, this sounds just like the response I got from Alexa when I asked them to remove all their 302 redirects to my sites:


If you're referring to the fact that we redirect before the site leaves Alexa.com, but still deliver the visitors to the site in question, it is our right to use redirects to track where people go on our site and that
behavior will not be changed.

japanese




msg:749004
 6:58 pm on Mar 16, 2005 (gmt 0)

Lorel,

Alexa and similar sites that automatically add a redirect to their search results are a time bomb waiting to explode.

"I bet no webmaster here would accept googlebot following that redirect"

It could herald the end of their site in google's index.

Any volunteers? It is only a link from Alexa's redirecting system. It may even help your inbound-link count.

Reid




msg:749005
 7:04 pm on Mar 16, 2005 (gmt 0)

You guys are losing me with that code, etc., but keep up the good work.

We have caused quite a stir this time: 302 is the buzzword in the SEO world right now.

That a#*$!y.com site I had trouble with - their server is down today.
I found they are doing the exact same thing to one of my clients, so I was going to try a new method I learned from the other thread on this forum, "not all 302's are hijackers".

Here's the trick:
set up a noindex tag on the page being hijacked and then use the URL removal tool to remove the offending link.
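For reference, the noindex tag mentioned in the trick is the standard robots meta tag, placed in the hijacked page's head (a sketch only; the removal step itself happens in Google's URL removal tool, not in the markup):

```html
<!-- Tells compliant crawlers not to index this page. Added
     temporarily so the URL removal tool will drop the hijacked
     listing, then taken out again once the listing is gone. -->
<meta name="robots" content="noindex">
```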

I was going to try this on my client's site, but the offending server is down - probably not for long, though.

japanese




msg:749006
 7:09 pm on Mar 16, 2005 (gmt 0)

THE ARROGANCE OF ALEXA

They refuse to remove their 302 redirect to my sites.

Despite my sending them evidence that they have rank-popularity pages with crawlable redirect links that will cause googlebot to create duplicate content.

kila_m




msg:749007
 7:25 pm on Mar 16, 2005 (gmt 0)

Well, at least some of the press have picked up on it. Anyone want to post it on Slashdot as well?

[theinquirer.net...]



WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved