Google SEO News and Discussion Forum

    
A new method to steal PR?
faulty link that works
Lorel

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3037400 posted 3:57 pm on Aug 7, 2006 (gmt 0)

I found a site linking to a client's site I manage in this manner.

[ClientExample.com...]

It looks like it should be a faulty link that doesn't work, but it redirects to my client's site.

Here is another on the same page (without the slash), and they both work:

http://www.example.com?Hijacker.com

The server header checker shows this as a 200, normal link. All other outgoing links on the page appear to be normal.

Does the "?" mean this is run by software that runs a search?

 

jenkers

10+ Year Member



 
Msg#: 3037400 posted 11:08 pm on Aug 7, 2006 (gmt 0)

I just found something very similar (posted a spam report to G immediately).

The page in question has a URL in the format

http://www.example.com/nnnnnnn/querykeyword1,querykeyword2.php

where nnnnnnn is a string of numbers. Note that the keywords are separated in the URL by a comma.

When you open the page, there is no content on it apart from a context-focused Overture ad.

If you look at the source of the page, large snippets of text from the websites ranking high in the SERPs for that query are jumbled and hidden in an iframe.

Today this site jumped in at position 1 for this particular query, which I watch.

Nikke

10+ Year Member



 
Msg#: 3037400 posted 11:24 pm on Aug 7, 2006 (gmt 0)

Lorel,
Why do you think it is any kind of hijacking? I have used this method in the past when placing true links to partner sites that want a real link and an easy way to count incoming visits.

Since just about any server accepts an empty variable after a question mark, the link will work and return a 200. Nothing strange there.

However, it's next to useless as far as PR goes, and not used as much these days. It can still be practical for counting purposes where a site owner doesn't have full access to referral logs. Ask your client if they have had any previous collaboration with the site linking to them.
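
To illustrate the point about the query string, here is a minimal sketch in Python (not the language used elsewhere in this thread) of how a URL such as the one Lorel posted is split. The helper name `describe` is made up for the example. It only shows that everything after the "?" is treated as a parameter with an empty value, which is why a static page still answers with a 200.

```python
from urllib.parse import urlsplit, parse_qs

def describe(url):
    """Split a URL into host, path, and query-string parameters."""
    parts = urlsplit(url)
    # keep_blank_values=True keeps parameters that have a name but no value.
    params = parse_qs(parts.query, keep_blank_values=True)
    return parts.netloc, parts.path, params

host, path, params = describe("http://www.example.com?Hijacker.com")
print(host)    # www.example.com
print(params)  # {'Hijacker.com': ['']}
```

The host is unchanged by the trailing text, so the request lands on the linked site's own page; the extra text is only meaningful if a script on that site chooses to read it.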

Lorel

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3037400 posted 1:59 am on Aug 8, 2006 (gmt 0)

I consider any aberration of a normal link suspect of wrongdoing until proven innocent. I'm not a programmer, so I'm not aware of what one can do to a link. But thanks for the info re this not passing PR. I suspected as much.

KenB

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3037400 posted 2:42 am on Aug 8, 2006 (gmt 0)

I keep a separate 404 log and see all kinds of goofy stuff at the end of URLs on a regular basis. Oftentimes it is caused by bad link generation (e.g. .html> instead of .html">). I see this so much that I actually have a whole series of .htaccess instructions to get people on their way to the correct spot. Here are some of the rewrite rules I use:

# Trailing junk after ".html" (stray punctuation or fragments of text)
RewriteRule ^([a-zA-Z0-9].*)\.html\.(.*) /$1.html [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)\.html>(.*) /$1.html [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)\.html-(.*) /$1.html [R=301,L]
RewriteRule ^([0-9a-zA-Z].*)\.html([0-9a-zA-Z].*) /$1.html [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)\.html(.)Enviro(.*) /$1.html [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)\.html(.)E /$1.html [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)\.html(.)20E /$1.html [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)\.html(.)20 /$1.html [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)\.html(.)$ /$1.html [R=301,L]
# Leaked "target=" attribute fragments
RewriteRule ^([a-zA-Z0-9].*)/(.)target= /$1/ [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)\.html(.)target= /$1.html [R=301,L]
# Old default document and shortened or truncated extensions
RewriteRule ^([a-zA-Z0-9].*)/default\.htm$ /$1/ [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)\.htm$ /$1.html [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)\.ht$ /$1.html [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)\.h$ /$1.html [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)\.$ /$1 [R=301,L]
RewriteRule ^([a-zA-Z0-9].*)&gt$ /$1 [R=301,L]
# Stray ampersands
RewriteRule ^&(.*) / [R=301,L]
RewriteRule ^([0-9a-zA-Z].*)/&(.*) /$1/ [R=301,L]
RewriteRule ^([0-9a-zA-Z].*)\.html&(.*) /$1.html [R=301,L]
RewriteRule ^([0-9a-zA-Z].*)\.htm&(.*) /$1.html [R=301,L]

Since all of my URLs start with a number or a letter, I use "[0-9a-zA-Z]" at the beginning of my regular expressions to prevent any funny stuff. I haven't yet come up with a good instruction to strip off query strings, although I'd really like to.

jdMorgan

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 3037400 posted 3:11 am on Aug 8, 2006 (gmt 0)

KenB,

> I haven't yet come up with a good instruction to strip off query strings...

If you add a "?" to the end of your substitution URL, any query string on the request will be cleared.

Demonstrating that, along with a generic rule you might use to replace the first ten of your ".html<plus-more>" rules:

RewriteRule ^([a-z0-9].*)\.html.+$ http://www.example.com/$1.html? [NC,R=301,L]

Note that [NC] makes the compare case-insensitive, and is more efficient than using [A-Za-z]. Also note that on many servers, the substitution will be required to be a canonical URL as shown here and below.
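
A rough way to see which requests the generic rule catches is to try the same pattern outside Apache. The sketch below is in Python rather than mod_rewrite, and the function name `redirect_target` is invented for the example. It assumes the per-directory (.htaccess) behaviour where the leading slash is already stripped before matching, and it uses `re.IGNORECASE` as a stand-in for [NC].

```python
import re

# Python stand-in for the generic rule; re.IGNORECASE plays the role of [NC].
PATTERN = re.compile(r"^([a-z0-9].*)\.html.+$", re.IGNORECASE)

def redirect_target(path):
    """Return the cleaned URL for a malformed path, or None if no redirect applies."""
    match = PATTERN.match(path)
    if match is None:
        return None
    # The "?" at the end of the real substitution only tells mod_rewrite to
    # clear the query string, so it is not part of the resulting URL.
    return "http://www.example.com/" + match.group(1) + ".html"

print(redirect_target("page.html>"))  # http://www.example.com/page.html
print(redirect_target("page.html"))   # None
```

A correct URL ending in ".html" is left alone because the pattern requires at least one extra character after the extension.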

Lorel,

The "?" and characters following the URL are a query string, and won't do anything unless your site is dynamic, and the page-generation script that you use accepts that query string and processes it in some way to affect the page that it produces.

Otherwise, the only negative effect is that it produces a second URL by which your page can be accessed, thus creating a minor duplicate-content annoyance.

If you're on Apache, and your site is entirely static, you can remove the query string using this general-case RewriteRule:

RewriteCond %{QUERY_STRING} .
RewriteRule (.*) http://www.example.com/$1? [R=301,L]

Jim
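
For readers who want to check the logic of the query-stripping rule before deploying it, the same decision can be sketched in Python. This is only an illustration: `strip_query` is a made-up name, and unlike the rule above it keeps whatever host was requested instead of forcing www.example.com.

```python
from urllib.parse import urlsplit, urlunsplit

def strip_query(url):
    """Return the URL to 301-redirect to, or None when there is no query string."""
    parts = urlsplit(url)
    # Mirrors "RewriteCond %{QUERY_STRING} .": act only on a non-empty query.
    if not parts.query:
        return None
    # Rebuild the URL with the query component left empty.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))

print(strip_query("http://www.example.com/page.html?Hijacker.com"))
# http://www.example.com/page.html
```

Returning None for a URL without a query string matters, because redirecting unconditionally would send every clean request into a redirect loop.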

jomaxx

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 3037400 posted 3:21 am on Aug 8, 2006 (gmt 0)

It's not the best way to link, but it's valid and I think it could pass PR. It's a matter of Google recognizing that the two forms of the URL are functionally identical, which their algorithms can figure out even if it doesn't happen immediately.

KenB

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3037400 posted 4:32 am on Aug 8, 2006 (gmt 0)

jdMorgan,

Plus one on your cleaner .htaccess fix.

Thanks, it worked like a charm.

I suspect I'll have to nuke the bogus query strings via a 301 redirect using PHP.

KenB

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3037400 posted 5:14 am on Aug 8, 2006 (gmt 0)

Here's some quick PHP code I threw together that strips the query string off of a request. I exempted a contact form, as it is common for me to feed query strings to the contact form for things like prefilled subjects. The code should be one of the first things PHP processes for a page request and must come before anything is output to the browser.

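// REQUEST_URI includes the query string, so parse it and test the path alone.
$strURL="http://".$_SERVER['HTTP_HOST'].$_SERVER['REQUEST_URI'];
$arrayURL=parse_url($strURL);
// Exempt the contact form, which legitimately receives query strings.
if($arrayURL['path']!="/email.html"){
// isset() covers requests where parse_url() returns no 'query' key.
if(isset($arrayURL['query']) && $arrayURL['query']!=""){
header("HTTP/1.1 301 Moved Permanently");
// The Location header needs an absolute URL, including the scheme.
header("Location: http://".$_SERVER['HTTP_HOST'].$arrayURL['path']);
exit();
}
}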
