
Google SEO News and Discussion Forum

    
Mysterious query string showing for a site: operator - begins with ?ArdSI
FranticFish

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4055046 posted 11:21 am on Jan 6, 2010 (gmt 0)

site:example.com for a client shows www.example.com/?ArdSI=2a2f396967c7706d5d172ac64e0a16e0

Searching Google for the query string shows thousands of results for different sites (root files, internal pages both dynamic and static, PDFs) but gives no clue as to where it comes from. Searching for just "ArdSI" gives no clue either.

Any ideas?

 

TheMadScientist

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4055046 posted 7:39 pm on Jan 6, 2010 (gmt 0)

Sure: someone linked to the page using the query_string you noted, and your site does not remove invalid query_strings or serve a 404 for them, so the location returns a 200 OK and is considered a valid location... Personally, I strip all query_strings from sites except those matching a valid pattern, and I serve a 404 using PHP if a query_string matches the valid pattern but does not produce a result other than the original page. Most sites will show the content of the page when you use an invalid query_string, which makes it easy for someone else to create duplicates of your content.
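To make the idea concrete, here is a minimal sketch in Python rather than PHP. It is not the code actually used on any site in this thread: the `page=<digits>` pattern, the `EXISTING_PAGES` set and the `handle` function are all hypothetical names chosen for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical: the only legitimate query string on this site is "page=<digits>".
VALID_QUERY = re.compile(r"page=(\d+)")
# Hypothetical stand-in for a database lookup of pages that really exist.
EXISTING_PAGES = {1, 2, 3}


def handle(request_target):
    """Return (status, location) for a request target such as '/list?page=2'."""
    parts = urlsplit(request_target)
    if not parts.query:
        # No query string at all: serve the resource as usual.
        return 200, parts.path
    match = VALID_QUERY.fullmatch(parts.query)
    if match is None:
        # Unknown query string: redirect to the same path without it.
        return 301, parts.path
    if int(match.group(1)) not in EXISTING_PAGES:
        # Valid pattern, but nothing behind it: say so with a 404.
        return 404, None
    return 200, request_target
```

With these assumptions, `handle("/?ArdSI=2a2f396967c7706d5d172ac64e0a16e0")` gives a 301 to `/`, while `handle("/list?page=99")` gives a 404.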

Trying to find the link is probably not worth the time since you just need to redirect and remove the query_string to fix any issue. I wouldn't worry about where it came from as much as how to get rid of it and keep it from happening again.

FranticFish

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4055046 posted 9:23 pm on Jan 6, 2010 (gmt 0)

The thing is, this EXACT query string is showing 585,000 results. I have never seen this before on any site in years; I find it for the first time today.

The only result in Google I can find for it (other than other sites where it is in the url) is this thread.

Searching for ArdSI reveals nothing.

Is this a Google glitch? Some other search site or aggregator? A scraper?

TheMadScientist

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4055046 posted 9:31 pm on Jan 6, 2010 (gmt 0)

I'll let someone else try to address what specifically is causing this query_string to be used. All I can add is that SEs have been known to use 'non-existent' query_strings to determine how sites handle missing dynamic pages, and what you are seeing could somehow be a result of that...

Personally, I don't worry as much about the why as I do about how to fix it and keep it from happening again, because I feel it's a better use of my time... The preceding is only one of the possible causes, and I don't know whether that's what it is or not.

FranticFish

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4055046 posted 1:05 pm on Jan 7, 2010 (gmt 0)

Thanks. On dynamic sites I have 404 and 301 traps for invalid / expired parameters.

As this is a static site and the first occurrence, I've just added the query string to robots.txt; then I can use the WMT URL removal tool.
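For anyone wanting to check a rule like this before relying on it, here is a rough sketch using Python's standard `urllib.robotparser`. The exact robots.txt line is not shown in this thread, so the `Disallow: /?ArdSI=` rule below is an assumption, and `blocked` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule: the thread does not show the exact line that was added.
ROBOTS_TXT = """\
User-agent: *
Disallow: /?ArdSI=
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def blocked(url):
    """Return True if the rules above disallow crawling url for any user agent."""
    return not parser.can_fetch("*", url)
```

Python's parser does plain prefix matching, so under this assumed rule only the home page with that query string is blocked; the same parameter on `/page.html` would not be covered.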

teokolo

5+ Year Member



 
Msg#: 4055046 posted 3:07 pm on Jan 7, 2010 (gmt 0)

I had a weird (but funny) url indexed:
site.com/?q=saveusfromberlusconi

Try a google search inurl:?q=saveusfromberlusconi

TheMadScientist

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4055046 posted 7:11 pm on Jan 7, 2010 (gmt 0)

"As this is a static site and the first occurrence, I've just added the query string to robots.txt; then I can use the WMT URL removal tool."

Should work fine if it's just this one, but personally, I would recommend using mod_rewrite and redirecting all requests with query_strings to the same location without a query_string so it doesn't keep happening...

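RewriteEngine on
# Only act on requests that carry a non-empty query string
RewriteCond %{QUERY_STRING} ^.+
# Redirect to the same path; the trailing ? drops the query string
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]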

Adding a ? with nothing after it removes the query_string.
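For readers who don't use Apache, the same decision can be sketched in Python. This is only an illustration of the logic, not a replacement for the rule above, and `strip_query` is a hypothetical name. One difference to be aware of: the rule above always redirects to www.example.com, whereas this sketch keeps whatever host was requested.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the 301 target for url, or None if no redirect is needed."""
    parts = urlsplit(url)
    if not parts.query:
        # Nothing after the ?, so the request is served as-is.
        return None
    # Rebuild the URL with an empty query string (and no fragment).
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

A request for `http://www.example.com/page.html?x=1` would be redirected to `http://www.example.com/page.html`, and a request without a query string is left alone.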

FranticFish

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 4055046 posted 1:03 pm on Jan 8, 2010 (gmt 0)

Thanks for the code, I'll keep it handy. I only manage about six or seven sites at the moment and this is so far an isolated occurrence.

I'm not so sure I like the idea of 301-ing anything anyone might throw at a url. I mean, that htaccess effectively says "Yes, this is part of my site". I'd rather serve a 404 which says "Not me".

TheMadScientist

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 4055046 posted 8:02 pm on Jan 8, 2010 (gmt 0)

"I'm not so sure I like the idea of 301-ing anything anyone might throw at a url."

All you're doing is removing any query_string they might put there. Not redirecting any requested location. IMO it's actually good practice to not allow non-existent query_strings to be sent.

"I'd rather serve a 404 which says 'Not me'."

Then you'll have to change your site to dynamic rather than static and use a scripting language to serve the 404... Try typing ?anything=whatever-you-want on any of your URLs and you'll see what I mean... A query_string is technically not part of the location requested, but rather information passed to the script at the location requested, which means no query_string on your site will serve a 404 if the location requested (/page.html) from your server is a valid resource.

The code doesn't do anything to change the location /page.html; it just removes the query_string. A request for /missing-page.html will still return a 404, just like you want. By not removing the query_string you are actually sending a stronger signal that the location with the query_string is part of your site than you are by removing it: if you don't remove it, the visitor receives a 200 OK for a request with any query_string, which means it is part of your site...
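A toy model of how a static server picks its status code may help here. It is a sketch only: the `FILES` set and the `static_status` function are hypothetical, and a real server does much more than this.

```python
from urllib.parse import urlsplit

# Hypothetical set of static resources that exist on the server.
FILES = {"/", "/page.html"}


def static_status(request_target):
    """Return the status a plain static server would give for the request."""
    # Only the path is used to locate the file; the query string is ignored.
    path = urlsplit(request_target).path
    return 200 if path in FILES else 404
```

In this model `/page.html?anything=whatever-you-want` returns 200 because `/page.html` exists, while `/missing-page.html` returns 404 with or without a query string.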

Personally, I'd rather show visitors (including search bots) the information they were looking for at the location requested with a query_string by serving them the resource (page) without the query_string than show them a 404 because someone linked to a page and included a query_string, especially when I know exactly what page they were looking for...

