I read an interesting article by Yoast on duplicate content https://yoast.com/articles/duplicate-content/
Ages ago I kicked this thread off [webmasterworld.com...] but to be honest my developers and SEOs have handled it well.
My new team, however, seems unfamiliar with the quick upfront set of tasks needed to settle things in.
On multiple sites that we have launched with separate ccTLDs and new URLs, we are battling both internal site duplication and erroneous cross-linking that should not exist. There are all sorts of duplicate entry paths: currency variants and language variants; URLs without a trailing "/" alongside the intended "/" versions; old pages and URLs not properly 301'd or 404'd. The instances seem to pile on, and it's hurting our traffic from Google.
To give a sense of the extent, WMT shows 500k+ links from one site to another.
Is there an exhaustive and generic checklist that developers should have on hand to deal with this on WordPress through the dev stages? As much as I love plugin creators, I don't feel comfortable shelling out $$s for what should be reasonably anticipated developer administration.
Thoughts on the checklist, or reference to other sources, and ways to help?
"Exhaustive" and "generic" don't really go together. :) As soon as you select a theme you've diverged from a standard, and what comes into play is driven by the specifics of how WordPress is configured and by the theme and plugin selections.
For SEO issues: You can probably take Brett's SEO Checklist [webmasterworld.com...] and modify it to suit what you need. The peculiarities of WordPress have more to do with permalinks and how WordPress shares/makes content available. A bit of research on the options here and you could put together a checklist on what you want.
For WordPress-specific info, consult the Codex. There is plenty of good info in there.
The major difference in moving a site to WP that is not easily apparent is that WP by default offers many ways to find your content. Each post or page is duplicated in several virtual directories: /archive/ /category/ /tags/ and there are other options that can be active. A URL such as "http://example.com/small-blue-widgets" can also be viewed at "http://example.com/tags/small/small-blue-widgets" or "http://example.com/widgets/small-blue-widgets" besides its original URL. Each one of those URLs exists whether or not you use them or link to them, so it is important to select the permalink taxonomy you will use globally and noindex the others. Links in the sidebar need to be selective too, because they are multiplied on a massive scale across all those alternative URL taxonomies. The general settings you choose before you start adding URLs are very important, because changing them at a later date means a lot of internal redirecting.
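If you have a crawl export or a list of URLs from your logs, a quick way to see how far the duplication has spread is to group the URLs by their final slug. What follows is only a rough Python sketch: the function names (slug_of, find_duplicate_paths) and the sample URL list are made up for illustration, and it assumes the post slug is always the last path segment, which will not hold for every permalink structure.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def slug_of(url: str) -> str:
    """Return the last non-empty path segment, ignoring a trailing slash."""
    path = urlsplit(url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


def find_duplicate_paths(urls):
    """Group URLs by slug and keep only slugs reachable at more than one path."""
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        # Treat "/foo" and "/foo/" as the same path for grouping purposes.
        groups[slug_of(url)].add((parts.netloc, parts.path.rstrip("/")))
    return {slug: sorted(paths) for slug, paths in groups.items() if len(paths) > 1}


# Hypothetical crawl output, reusing the example URLs from this post.
crawled = [
    "http://example.com/small-blue-widgets",
    "http://example.com/small-blue-widgets/",
    "http://example.com/tags/small/small-blue-widgets",
    "http://example.com/widgets/small-blue-widgets",
    "http://example.com/about",
]

duplicates = find_duplicate_paths(crawled)
for slug, paths in duplicates.items():
    print(slug, "->", [host + path for host, path in paths])
```

Treat the result as a list of candidates to review, not a verdict: a tag archive and a post that happen to share a slug would be grouped together too. The paths you decide not to keep are the ones to noindex.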
See lucy24's excellent description: [webmasterworld.com...] and carefully read and digest what the second post (the 1st response to the OP) says about redirecting and rewriting.
You can't use htaccess to control this URL behavior within WP; it is all in the settings and handled internally. I use Yoast's plugin to handle the noindexing and rewrites, though many others handle it another way. The reason I use the plugin is that it also limits what gets into your sitemaps. You don't want noindexed content submitted in sitemaps.
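Whichever plugin or method you use, it is worth checking that nothing noindexed has leaked into the sitemap. Here is a minimal Python sketch of that check. It assumes a standard sitemaps.org urlset file and that you already have a set of noindexed URLs from a crawl; the function names and sample data are hypothetical, and it does not fetch anything or follow sitemap index files.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org <urlset> files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str):
    """Return every <loc> value found under <url> in a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def noindexed_in_sitemap(xml_text: str, noindexed):
    """List sitemap URLs that are also in the noindexed set."""
    return [url for url in sitemap_urls(xml_text) if url in noindexed]


# Hypothetical sitemap content and noindex list, for illustration only.
sample_sitemap = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/small-blue-widgets</loc></url>
  <url><loc>http://example.com/tags/small/small-blue-widgets</loc></url>
</urlset>
"""
noindexed = {"http://example.com/tags/small/small-blue-widgets"}

leaks = noindexed_in_sitemap(sample_sitemap, noindexed)
print(leaks)
```

An empty list is what you want to see. Anything that shows up is a URL you are asking Google to crawl while also telling it not to index.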
Proactive work before beginning the process should be designed to minimize the reactive work to where it is a minor inconvenience. The size of the project and scope of the changes would be factors in an estimate of this type. It is kind of an open question though.