
Forum Moderators: rogerd & travelin cat


Wordpress blog resolving with url extension

     

Gemini23

10:44 pm on Mar 21, 2013 (gmt 0)

5+ Year Member



Can anyone advise how to stop the following from happening, and what is causing it?

I was checking for duplicate content on Google and noticed that one particular page on my blog has several variations in Google's organic serps.

Basically the results show EXACTLY the same wordpress blog post url BUT with the following type of extension AFTER the usual url...

/?p=bxprvxvf

and I notice that if I type in various different letters the url still resolves... all well and good BUT these urls are in Google's search results.

So.. why is it happening and how to stop it?

lorax

12:42 am on Mar 22, 2013 (gmt 0)

WebmasterWorld Senior Member lorax is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



That looks like Google indexed the page(s) before you set permalinks and has kept them because they still resolve. If the pages resolve to the new permalink structure then put a redirect plugin in place and redirect the old URLs where you want with a 301.

phranque

2:02 pm on Mar 22, 2013 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



what lorax said.
in addition, make sure you aren't internally linking to these types of non-canonical urls.

lucy24

8:24 pm on Mar 22, 2013 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



There's a third possibility. This happens with or without query strings. Periodically you'll see the googlebot asking for something like

:: shuffling papers ::

/paintings/tundra/lenkoljgkuhz.html

The search engine is testing whether your site returns a 404 when the page genuinely doesn't exist. In the case of php pages, the fix almost always lies in the php itself. If the parameter itself has meaning, the script has to return a 404 when it points to a nonexistent page-- like a post whose number puts it somewhere in the summer of 2016.

If the parameter has no meaning-- or if you don't use it at all, but someone else does-- tell search engines to ignore it.

Gemini23

9:36 pm on Mar 22, 2013 (gmt 0)

5+ Year Member



I am double checking when permalinks were added...

what I have also noted is that if I ADD "?p=xyzetc" the url stays the same as without...

Gemini23

9:39 pm on Mar 22, 2013 (gmt 0)

5+ Year Member



It seems that Categories and Pages allow the added "?p=xyzetc" and the previous url and content remains.. but if I add it to a blog post.. it redirects to the actual blog url... not sure how to resolve that...

phranque

10:10 pm on Mar 22, 2013 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



if you can't figure out how to fix this with a WP config change and if all parameter strings on category and page urls are non-canonical, you could do this using mod_rewrite.
(assuming you can recognize category/page url patterns and assuming you're hosted on apache)

lucy24

12:32 am on Mar 23, 2013 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



if I ADD "?p=xyzetc" the url stays the same as without...

I don't think this means what you intended it to mean. Can you clarify?

Gemini23

12:53 am on Mar 23, 2013 (gmt 0)

5+ Year Member



Often difficult without being able to put in the url..

but..

greenwidgetxyz.com/blog/index.php/page/2/

will display the content for that page... all is fine..

if I add the extra letters to the url... the content remains displaying.. and it doesn't matter what letters I add... I have only added 3-4 different letters but it doesn't seem to matter which ones..

ie
greenwidgetxyz.com/blog/index.php/page/2/?p=xyzetc

the content remains displaying as if nothing has happened to the url...

HOWEVER... if I do this to a blog post rather than a page... it refreshes to the correct url and removes the added character string..

ie.. greenwidgetxyz.com/blog/index.php/123/test/?p=xyzetc

and ONLY this will display.. in the url..

greenwidgetxyz.com/blog/index.php/123/test/

None of the above affects what displays in the actual content... I am just concerned that these extra urls are getting indexed.. and potentially giving duplicate content issues...

lucy24

3:59 am on Mar 23, 2013 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Got it. If WordPress receives a request for something it has been told to look out for, it processes the request and issues redirects to keep the URL looking pretty. But if it receives a request for something it doesn't expect-- like, say, an URL bearing parameters that originated in a search engine's fevered imagination-- then it doesn't know what to do. It processes the familiar part and ignores the rest.

So then you're back with the two-part solution. One part is to tell the googlebot to ignore certain parameters. There will be a list on the "parameters" page of Google Webmaster Tools (wmt). The other part is to forcibly redirect any request that contains a query string. Your RewriteRule will need a preceding condition looking at THE_REQUEST so you're only redirecting external queries, not internal ones.

But you will have to be very careful in your htaccess file. It already contains things WordPress put there to make the whole thing work. Make sure your additions are in the right place, where they won't conflict with anything that is already there.
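
For illustration, here is a minimal sketch of the redirect half. It assumes Apache with mod_rewrite, a blog under /blog/, and that every query string on a paginated /blog/index.php/page/N/ URL is non-canonical. The path pattern is only a guess based on the example URLs in this thread, so adjust it to your own structure and test it on a copy first.

```apache
# Sketch only, not a drop-in fix.
# Assumed placement: the same .htaccess that holds the WordPress rules,
# ABOVE the "# BEGIN WordPress" line so it runs before WordPress takes over.
RewriteEngine On

# THE_REQUEST is the raw request line sent by the client, for example:
#   GET /blog/index.php/page/2/?p=xyzetc HTTP/1.1
# Matching against it means internal rewrites never trigger the redirect.
# The /blog/index.php/page/N/ pattern is an assumption taken from the thread.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s(/blog/index\.php/page/[0-9]+/?)\?

# %1 is the path captured above; the trailing "?" drops the query string.
RewriteRule ^ %1? [R=301,L]
```

With this in place, a request for /blog/index.php/page/2/?p=xyzetc should get a 301 to /blog/index.php/page/2/. If some parameters on those URLs are legitimate (search, for example), narrow the condition to the parameters you want stripped rather than matching any query string.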

And as long as you're in there, you can probably do some cleaning up. Your average CMS htaccess file is an unholy mess ;)
 
