
GWT Reporting Duplicate titles & metas? Where are they coming from?

     
9:48 pm on Sep 18, 2012 (gmt 0)

Full Member from US 

joined:Feb 24, 2011
posts: 251
votes: 0


I went into my GWT account today and normally I might have 2-3 HTML optimization suggestions from Google in there.

Today I have over 1,000.

My site is in WordPress, and has been for 1.5 years, so I'm not sure what I might be doing wrong. Why are there errors now?

I'll use "example-page" instead of my actual page name.

This is what I am seeing:

/example-page/1345455629000/
/example-page/1345547746000/
/example-page/1345557128000/
/example-page/1345560196000/
/example-page/1345560437000/
/example-page/1345677539000/
/example-page/1345696266000/
/example-page/1345803012000/
/example-page/1345803037000/
/example-page/1345803063000/
/example-page/1345885749000/
/example-page/

Every one of the URLs that Google reports leads back to the original, actual page. But where are they coming from?
12:56 am on Sept 19, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10542
votes: 8


you might be serving the same content at those urls and at one or more other urls.
1:06 am on Sept 19, 2012 (gmt 0)

Full Member from US 

joined:Feb 24, 2011
posts: 251
votes: 0


I know the same content is being served. That's the issue.
How are the pages with the numbers on the end being generated? Somehow they are being generated, and Google is calling them duplicate content when there is really only the original post.
1:10 am on Sept 19, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10542
votes: 8


what does the duplicate url path look like?
normally the non-canonical url gets internally rewritten to index.php and the script issues a 301 external redirect in response.
7:16 am on Sept 19, 2012 (gmt 0)

New User

joined:Aug 21, 2012
posts: 18
votes: 0


What do you see in GWT, when you navigate to Configuration -> URL Parameters?
9:00 am on Sept 19, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12696
votes: 244


Unfortunately these aren't parameters. They're part of the URL.
/example-page/1345455629000/
/example-page/1345547746000/
/example-page/1345557128000/

For comparison purposes, the present page is

http://www.example.com/google/4497098.htm

As it were. And in fact if you search these very forums for anything specialized enough to yield only a page or two of results, you'll find half a dozen variant names leading to the identical thread. But I don't think the People Up Top are worried ;)
1:27 pm on Sept 19, 2012 (gmt 0)

Preferred Member

10+ Year Member

joined:Jan 3, 2006
posts: 612
votes: 0


I'm seeing the same thing since the last crawl spike, along with a lot of weird paths that never existed and a series of page-not-found errors. Maybe something went wrong?
1:36 pm on Sept 19, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Those numbers look like datestamps. Is this related to sessions in some way?
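
One quick way to test the datestamp guess is to read each numeric segment as milliseconds since the Unix epoch. This is only a sketch: the function name is made up for illustration, and the millisecond interpretation is an assumption based on the 13-digit length. Under that assumption the first URL in the opening post decodes to 20 August 2012, about a month before this thread started, which makes the datestamp reading look plausible.

```python
from datetime import datetime, timezone


def decode_ms_timestamp(segment: str) -> datetime:
    """Interpret a numeric URL segment as milliseconds since the Unix epoch (UTC)."""
    seconds, millis = divmod(int(segment), 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=millis * 1000)


# Two of the trailing segments quoted in the opening post.
for segment in ("1345455629000", "1345885749000"):
    print(segment, "->", decode_ms_timestamp(segment).isoformat())
```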
1:43 pm on Sept 19, 2012 (gmt 0)

Full Member from US 

joined:Feb 24, 2011
posts: 251
votes: 0


@MinostheNinth

This is what is in the box at Configuration -> URL Parameters:

c month day week
3:21 pm on Sept 19, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 5+ Year Member

joined:Mar 9, 2010
posts:1806
votes: 9


Aren't you using canonical urls? Unfortunately, in WordPress you can append any number to the end of the URL, as in your example, and it will return the same content as /example-page/.

The best way to handle this is by using canonical urls on your posts.

There are millions of WordPress blogs that suffer from this. Googlebot is probably discovering them through their buggy javascript crawler. Do you see any referrers for these links in your logs? If not, it is surely the result of their js crawler.
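
To check the logs for referrers, something along these lines could work. It is a rough sketch that assumes the common "combined" access log format and treats any path ending in a 13-digit segment as one of the padded URLs. The function name and the two sample log lines are invented for illustration, so adjust the pattern to whatever your server actually writes.

```python
import re

# Assumption: Apache/Nginx "combined" log format. Change this if yours differs.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)
# A path ending in a 13-digit segment, like /example-page/1345455629000/
PADDED = re.compile(r"/\d{13}/?$")


def padded_requests(lines):
    """Yield (path, referrer, user agent) for requests to padded URLs."""
    for line in lines:
        match = LOG_LINE.search(line)
        if match and PADDED.search(match.group("path")):
            yield match.group("path"), match.group("referrer"), match.group("agent")


# Invented sample lines, only here to show the expected shape of the input.
sample = [
    '203.0.113.5 - - [19/Sep/2012:01:06:00 +0000] "GET /example-page/1345455629000/ HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '203.0.113.9 - - [19/Sep/2012:01:07:00 +0000] "GET /example-page/ HTTP/1.1" 200 5120 "http://www.example.com/" "Mozilla/5.0"',
]

for path, referrer, agent in padded_requests(sample):
    print(path, referrer, agent)
```

A referrer of "-" on every hit means no page was named as the source of the request.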
4:48 pm on Sept 19, 2012 (gmt 0)

Full Member from US 

joined:Feb 24, 2011
posts: 251
votes: 0


@indyank

Hopefully I'm not a total doofus - trying to understand what you're saying.

Do you mean that any site links on my site should have the FULL url when linking?
Like it should be: http://www.example.com/page1/
and NOT just: /page1/
7:18 pm on Sept 19, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


Just my personal opinion, but I don't think that's exactly what indyank meant. It seems to me that he's talking about the canonical link element in the <head> section. That's a very reasonable approach to this kind of WordPress trouble, and there are canonical tag plug-ins available to help with the job.

<link rel="canonical" href="http://example.com/page1/">

That way, search engines that read the canonical link will know that, no matter what exact URL they requested, the URL to be indexed is shown in the <head>. There are over 40 possible canonical problems [webmasterworld.com] and many of them can be challenging to deal with depending on your hosting.

The URL listed in a canonical link is not 100% binding, technically - but Google does take it as a very strong suggestion. For that reason it is best to deal with the potential canonical errors [webmasterworld.com] directly on your server whenever you can.

Before you go live with a canonical link, it's good to double check to be sure you aren't creating any canonical disasters [webmasterworld.com].
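
As a way of double checking, a small script can report which canonical URL a page actually declares. The sketch below uses only the Python standard library. The class and function names are made up for illustration, and the page is passed in as a string, so fetching the live HTML is left to you.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


page = (
    '<html><head><link rel="canonical" href="http://example.com/page1/">'
    "</head><body></body></html>"
)
print(find_canonical(page))
```

If the padded URLs and the real URL all report the same canonical, the link element is doing its job.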
7:26 pm on Sept 28, 2012 (gmt 0)

Full Member from US 

joined:Feb 24, 2011
posts: 251
votes: 0


I don't know if I can post a link here to Google forums - but my original issue is something others are experiencing, and it seems to be a conflict between Google and Disqus or something. My errors have skyrocketed to 7500+ and are growing daily. In case someone else has this issue, I thought this might help, since it was none of what was suggested here by those who commented.
[productforums.google.com]
4:04 am on Sept 29, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


A link to an official communication from Google staff is fine - and this one comes from John Mueller, who was hands-on in SEO before he took the job at Google. He's been an excellent advocate for webmasters and a good communicator on the Google forums.

According to John, commenting on an explosion of 404 errors, it looks like this is the bottom line:

It does look like we're picking up something funny via JavaScript there. We're looking into what can be done in this particular case. In the meantime, keep in mind that 404 errors of URLs that are invalid...are not something that would affect your site's indexing or ranking...


This is a bit different from the opening post, however, because the server IS resolving the links rather than sending a 404 Not Found. While the source of those URLs may be the same JavaScript crawling problem, the fact that your server is resolving those URLs is something you should address. It increases your site's vulnerability to both accidental errors and malicious attacks.

I'd look into what kind of add-on characters can be appended to a valid URL and have the new URL actually resolve. Then take steps to either correct the bad configuration, or add a line in .htaccess to at least NOT return a 200 OK for those artificially padded URLs.
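
The matching logic behind such a rule can be sketched in Python before committing it to .htaccess. This is not an .htaccess line, just an illustration of which paths would be caught. The 10-digit threshold and the function name are assumptions. The threshold is there because, as far as I know, WordPress uses short trailing numbers for paginated posts (/example-page/2/), and those should keep working.

```python
import re

# Hypothetical rule: a trailing all-digit segment of 10 or more digits is padding.
PADDING = re.compile(r"^(?P<base>/.*?/)\d{10,}/?$")


def redirect_target(path: str):
    """Return the clean path to answer with a 301, or None if the path is not padded."""
    match = PADDING.match(path)
    return match.group("base") if match else None


for path in ("/example-page/1345455629000/", "/example-page/", "/example-page/2/"):
    print(path, "->", redirect_target(path))
```

Whether the padded URLs should get a 301 to the clean path or a plain 404 is a judgment call. Either one avoids the 200 OK.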
 
