

This: ?cat=1 is appearing on end of domain name

what is causing this

         

Lorel

4:29 pm on Sep 16, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I was using a program that tracks links and noticed ?cat=1 appearing on the end of the domain name of a site I manage. It repeats with =4, =10, etc., so it's showing multiple copies of the home page (only ?cat=1 shows up for the home page in Google, but multiple links show up in the link program). All the pages are in HTML, but only the home page appears to be affected. Can someone tell me what might be causing this? There is no database on the site and it's not a blog.

I use relative links in the menu, so the link to the index page is set up like this:
<a href="/">

Here is the only reference to this site in the htaccess file.

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
#

keyplyr

9:38 pm on Sep 16, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Many utilities append a parameter to the end of a web address in order to track or manage it.

Your page(s) may be in a directory, a shopping site, a link schema, etc. A bot may be validating the outgoing link or simply following links found by crawling a site where your link appears.

There's nothing inherently malicious about a parameter being used. You'd need to investigate further to make that distinction. Look up the IP address of the agent and find out who they are.

phranque

10:31 pm on Sep 16, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



you should probably use a 301 redirect for those requests.

https://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewriterule
By default, the query string is passed through unchanged. You can, however, create URLs in the substitution string containing a query string part. Simply use a question mark inside the substitution string to indicate that the following text should be re-injected into the query string. When you want to erase an existing query string, end the substitution string with just a question mark. To combine new and old query strings, use the [QSA] flag.



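# Redirect any request that carries a query string to the same path with the query string removed
RewriteCond %{QUERY_STRING} .
RewriteRule (.*) http://www.example.com/$1? [R=301,L]

# Redirect any non-canonical hostname to www.example.com, also dropping the query string
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1? [R=301,L]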
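
A narrower variant may be safer if a query string is ever needed elsewhere on the site. The following is only a sketch, assuming the rules live in the document-root .htaccess with RewriteEngine On already set, and that the stray parameter is always cat= followed by digits, as described in the first post. It redirects only home page requests and leaves every other query string alone.

```apache
# Sketch only: strip a numeric "cat" parameter, and only on the home page.
# Assumption: document-root .htaccess, mod_rewrite enabled, RewriteEngine On.
# In .htaccess context the home page is matched by the empty path (^$);
# widen the pattern to ^(index\.html)?$ if the index file is also requested by name.
RewriteCond %{QUERY_STRING} ^cat=[0-9]+$
# The trailing "?" in the substitution drops the query string from the redirect target.
RewriteRule ^$ http://www.example.com/? [R=301,L]
```

With this in place, a request for http://www.example.com/?cat=4 should receive a 301 to http://www.example.com/, while a request such as /page.html?foo=bar is left untouched. On Apache 2.4 and later, the [QSD] flag can be used instead of the trailing question mark.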

whitespace

12:36 am on Sep 17, 2016 (gmt 0)

10+ Year Member Top Contributors Of The Month



Can someone tell me what might be causing this?


Does the "link program" not tell you from where these URLs are being linked (internal / external)? Presumably this "cat" URL parameter means nothing and is not used by your site? Otherwise it's a bit tricky for anyone to say what might be causing this, other than that it must be linked to from somewhere (which isn't really an Apache thing).

To prevent search engines picking this up, you can use a rel="canonical" link element in the HEAD section (or HTTP response header) to "suggest" the canonical URL. Or, if this param is wholly invalid then redirect, as phranque suggests. (But do you not use a query string anywhere on your site?)
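
For the HTTP response header route, something along these lines might work. It is a rough sketch, assuming mod_headers is enabled, that Header directives are permitted in .htaccess, and that the home page file is named index.html (the file name is an assumption, it isn't stated in the thread). The HTML equivalent would be a link element with rel="canonical" and href="http://www.example.com/" in the HEAD section.

```apache
# Sketch only: send a canonical hint for the home page as an HTTP Link header.
# Assumptions: mod_headers is loaded and the index file is named index.html.
<IfModule mod_headers.c>
    <Files "index.html">
        # Every response served from index.html, including /?cat=1, suggests
        # http://www.example.com/ as the canonical URL.
        Header set Link "<http://www.example.com/>; rel=\"canonical\""
    </Files>
</IfModule>
```

Either form is only a suggestion to search engines. Unlike the 301, it does not stop the ?cat= URLs from being requested or linked.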

Lorel

7:18 pm on Sep 18, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There is no shopping cart or database on the site, so there shouldn't be a query string. The page didn't have a canonical tag (it's due to be redesigned soon), so I'll add one and see if that takes care of the problem.

Thanks everyone.