Banning part of a URL

Site has 302 redirects to my pages

larryhatch

10:25 am on Apr 21, 2005 (gmt 0)


One guy has two sites with bogus 302 redirects to several of my pages.

The redirects all have the same form, as shown in my access_log files:
www.badguy.com/websites/site123.htm ..
where 123, 124, 125 .. nnn refer to my pages.

I have verified the 302s by doing header checks on badguy's URLs.

I am considering banning ALL visits where the referrer contains 'websites/site'
(without the quotes of course.)

1) I am hoping that Google etc. will ALSO be banned from my content
and therefore not credit it to badguy.com. IS THIS TRUE?

2) If so, I need to know the exact syntax for the .htaccess file, start to finish.
This part really scares me. One character out of place and the whole site goes blooey!

Can anyone help me with this? - Larry

larryhatch

8:54 pm on Apr 21, 2005 (gmt 0)


Anyone?

sitz

1:00 am on Apr 22, 2005 (gmt 0)


1) I /think/ so, but I Am Not A Search Engine Guy (IANASEG); someone else want to chime in on this one?

2) This can be done with SetEnvIf [httpd.apache.org] (I tend to prefer SetEnvIfNoCase [httpd.apache.org], just to be safe).

As far as making the site go blooey, this is why you test things in a safe place (such as in a .htaccess file in an unused directory, or on a test installation of Apache). That's one of the many reasons I run Apache on my laptop. =)
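
For reference, here is a minimal sketch of what a referrer-based block could look like in .htaccess. It assumes an Apache 1.3/2.0-style setup with mod_setenvif enabled, the classic Order/Allow/Deny access directives, and an AllowOverride setting that permits FileInfo and Limit. The variable name bad_referer is only an illustrative label, not something from Larry's configuration.

```apache
# Flag any request whose Referer header contains "websites/site"
# (SetEnvIfNoCase makes the match case-insensitive).
SetEnvIfNoCase Referer "websites/site" bad_referer

# Allow everyone, then deny the flagged requests (they receive 403 Forbidden).
Order Allow,Deny
Allow from all
Deny from env=bad_referer
```

A request that arrives with no Referer header never sets the variable, so it is allowed through untouched.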

jdMorgan

4:51 am on Apr 22, 2005 (gmt 0)


> I am hoping that Google etc. will ALSO be banned from my content

Google and other search engines do not provide referrers when they spider your site. So, as mentioned before, you cannot get rid of these hijacker links using mod_rewrite: a rule that tests the referrer never matches a spider's request.
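
To make that concrete, a referrer-based block written with mod_rewrite might look like the sketch below, assuming mod_rewrite is enabled and permitted in .htaccess. The condition tests the Referer header, so a request that sends none, as the spiders do, does not match and is served normally.

```apache
RewriteEngine On
# Match only when the Referer header contains "websites/site" (NC = case-insensitive).
RewriteCond %{HTTP_REFERER} websites/site [NC]
# Leave the URL unchanged ("-") and answer with 403 Forbidden ([F]).
RewriteRule .* - [F]
```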

Jim

larryhatch

6:43 am on Apr 22, 2005 (gmt 0)


Thanks much Jim.

I was afraid of something like that, but couldn't be sure.
There was just TOO much to digest and remember.

If I ban www.badguy.com/(redirect) then all I'm doing is
disallowing the few referrals he does send me. -Larry

larryhatch

6:47 am on Apr 22, 2005 (gmt 0)


Jim: One last vain hope here:

What happens when Google spiders BADGUY's site?

They go from one page to another, and pretty soon to HIS pages
which 302 to MY pages.
What happens when they call up the page and the content is ZERO?

Wouldn't that take credit for my content away from the badguy's phony pages?

That's all I was hoping for in the first place.
Wouldn't THAT work? -Larry

jd01

5:32 pm on Apr 22, 2005 (gmt 0)


Larry,

I wanted to do the same thing, so I asked about it a little while ago... Here's how it works.

Mr. G-bot visits b-guy's site, then finds the links to your site...

But the guys over at G don't want you to know where G-bot came from (you might 'track' him), so they do not send referer headers... (and they think we're paranoid, lol)

So, when G-bot visits b-guy's site and either logs the links or follows them to yours... You never know which link was used... No referer.

Effectively, what blocking access by referer does is block any people from entering through those links, but since G-bot does not send referer headers... Your Pages Are Still There.

Hope this helps.

Justin

jdMorgan

6:58 pm on Apr 22, 2005 (gmt 0)


> they think we're paranoid

Nobody's paranoid... They don't provide a referrer for the simple reason that --if you've done your SEO homework right-- they are following many links from many pages to your page. So, providing a referrer would be meaningless.

The same is true of a caching proxy setup like AOL uses. They cache your page based on requests from many users clicking links on many other pages, using many different user-agents (browsers). So, AOL's caching proxies (and most others) do not provide either a referrer or a user-agent for their cache-validation requests.

The basic problem here is that no matter what you serve in response to somebody's 302, Google will ascribe that 302'ed URL to your site. There are current indications that Google is rolling out a fix, or has already done so. I'd suggest watching to see if this helps, rather than pursuing fixes on your server that won't work. The fact that fixes on the "victim" server won't work is what makes this problem so awful.

Jim