You might be serving the same content at those URLs and at one or more other URLs.
I know the same content is being served. That's the issue.
How are the pages with the numbers on the end being generated? Somehow they are being generated and Google is calling them duplicate content when there is really only the original post?
What does the duplicate URL path look like?
Normally the non-canonical URL gets internally rewritten to index.php, and the script issues a 301 external redirect in response.
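To make that redirect behaviour concrete, here is a rough Python sketch of the decision a front controller could make. This is not WordPress's actual code (which is PHP); the function name `resolve`, the regex, and the sample paths are all assumptions made up for illustration, and it assumes the stray suffix is a single numeric path segment.

```python
import re
from typing import Optional, Set, Tuple

# Hypothetical pattern: a known path followed by one extra numeric segment,
# e.g. /example-page/1339456789/ (with or without the trailing slash).
_TRAILING_NUMBER = re.compile(r"^(?P<base>/.+?)/\d+/?$")


def resolve(path: str, known_paths: Set[str]) -> Tuple[int, Optional[str]]:
    """Return (status code, Location header or None) for a requested path."""
    if path in known_paths:
        # The canonical URL itself: serve it normally.
        return 200, None
    match = _TRAILING_NUMBER.match(path)
    if match:
        candidate = match.group("base") + "/"
        if candidate in known_paths:
            # Padded variant of a real page: send a permanent redirect.
            return 301, candidate
    # Anything else does not exist.
    return 404, None


if __name__ == "__main__":
    pages = {"/example-page/"}
    for requested in ("/example-page/", "/example-page/1339456789/", "/nope/"):
        print(requested, resolve(requested, pages))
```

The point of the sketch is only that the padded URL should end in a 301 (or a 404), never in a second 200 copy of the same post.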
What do you see in GWT, when you navigate to Configuration -> URL Parameters?
Unfortunately these aren't parameters. They're part of the URL.
For comparison purposes, the present page is
As it were. And in fact if you search these very forums for anything specialized enough to yield only a page or two of results, you'll find half a dozen variant names leading to the identical thread. But I don't think the People Up Top are worried ;)
I'm seeing the same thing, plus a lot of weird paths that never existed and a series of page-not-found errors since the last crawl spike. Maybe something went wrong?
Those numbers look like datestamps. Is this related to sessions in some way?
This is what is in the box at Configuration -> URL Parameters:
c month day week
Aren't you using canonical URLs? Unfortunately, in WordPress you can append any number to the end of a URL, as in your example, and it will return the same content as /example-page/.
The best way to handle this is by using canonical URLs on your posts.
Hopefully I'm not a total doofus - just trying to understand what you're saying.
Do you mean that any site links on my site should have the FULL url when linking?
Like it should be: http://www.example.com/page1/
and NOT just: /page1/
Just my personal opinion, but I don't think that's exactly what indyank meant. It seems to me that he's talking about the canonical link element in the <head> section. That's a very reasonable approach to this kind of WordPress trouble, and there are canonical tag plug-ins available to help with the job.
<link rel="canonical" href="http://example.com/page1/">
That way, search engines that read the canonical link will know that, no matter what exact URL they requested, the URL to be indexed is shown in the <head>. There are over 40 possible canonical problems [webmasterworld.com] and many of them can be challenging to deal with depending on your hosting.
The URL listed in a canonical link is not 100% binding, technically - but Google does take it as a very strong suggestion. For that reason it is best to deal with the potential canonical errors [webmasterworld.com] directly on your server whenever you can.
Before you go live with a canonical link, it's good to double check to be sure you aren't creating any canonical disasters [webmasterworld.com].
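One way to do that double check is to pull the canonical link out of the rendered HTML and look at what it actually says. Below is a minimal Python sketch using only the standard library; the class and function names (`CanonicalFinder`, `find_canonicals`) are made up for this example, and it only reads markup you hand it rather than fetching anything.

```python
from html.parser import HTMLParser
from typing import List


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> element seen."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens; compare case-insensitively.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_canonicals(html: str) -> List[str]:
    """Return all canonical hrefs found in the given HTML, in document order."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.hrefs


if __name__ == "__main__":
    page = '<head><link rel="canonical" href="http://example.com/page1/"></head>'
    print(find_canonicals(page))
```

A healthy page should yield exactly one entry, and that entry should be the clean URL, not the padded one that was requested. Zero entries, or more than one, is worth a closer look before going live.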
I don't know if I can post a link here to the Google forums - but my original issue is something others are experiencing, and it seems to be an issue with Google and Disqus conflicting or something. My errors have skyrocketed to 7500+ and keep growing daily. In case someone else has this issue, I thought this might help, since it was none of what was suggested here by those who commented.
A link to an official communication from Google staff is fine - and this one comes from John Mueller, who was hands-on in SEO before he took the job at Google. He's been an excellent advocate for webmasters and a good communicator on the Google forums.
According to John, commenting on an explosion of 404 errors, it looks like this is the bottom line:
I'd look into what kind of add-on characters can be appended to a valid URL and have the new URL actually resolve. Then take steps to either correct the bad configuration, or add a line in .htaccess to at least NOT return a 200 OK for those artificially padded URLs.
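For anyone who wants to test that on their own site, here is a hedged Python sketch of such a check. The suffix list and the function names (`padded_variants`, `audit`, `head_status`) are assumptions for illustration only; swap in whatever add-on characters actually show up in your own error reports. The status lookup is passed in as a function so the logic can be tried without touching a live server.

```python
import http.client
from typing import Callable, Iterable, List
from urllib.parse import urlsplit

# Hypothetical padding to try; replace with what you see in your crawl errors.
DEFAULT_SUFFIXES = ("1", "12345", "1339456789")


def padded_variants(url: str, suffixes: Iterable[str] = DEFAULT_SUFFIXES) -> List[str]:
    """Build URLs with an extra path segment appended to a valid URL."""
    base = url.rstrip("/")
    return [f"{base}/{suffix}/" for suffix in suffixes]


def audit(url: str, fetch_status: Callable[[str], int],
          suffixes: Iterable[str] = DEFAULT_SUFFIXES) -> List[str]:
    """Return the padded variants that wrongly answer 200 OK."""
    return [v for v in padded_variants(url, suffixes) if fetch_status(v) == 200]


def head_status(url: str, timeout: float = 10.0) -> int:
    """Send one HEAD request and return the raw status, without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        return conn.getresponse().status
    finally:
        conn.close()


if __name__ == "__main__":
    # Offline demo with a fake server that serves every URL as 200 OK.
    print(audit("http://www.example.com/page1/", lambda _url: 200))
```

Against a real site you would call `audit(url, head_status)`. An empty list means none of the tested padded URLs answered 200; anything it does return is a URL that resolves when it shouldn't.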