1) How is the page being picked up if the site isn't being spidered?
2) Why are the pages being dropped from the index?
3) How can I stop this from happening?
Would very much appreciate clarity on this as it's driving me crazy and I can't find any relevant info anywhere.
many thanks
A site map may help.
But there may be other issues: meta tags, <TITLE>, duplicate content, code bloat ...
Is this an isolated problem or does the site have any other problems?
The site ranks well for category pages, but it does tend to have a problem with product pages being indexed. There is an XML sitemap that is auto-updated whenever a new product page is produced. I would say that 100% of category pages are indexed and about 10% of product pages. Product pages rank for a few days and then disappear, even if they do appear in the sitemap - and if they don't put a link off the homepage, they don't seem to get indexed at all.
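For what it's worth, the auto-generated entries in the sitemap look roughly like this (the URL and date here are just made-up examples following the product URL pattern, not the real thing):

--------------------------------------------------
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <loc>http://www.domain.co.uk/directory/product-product.asp</loc>
    <lastmod>2006-06-01</lastmod>
  </url>
</urlset>
--------------------------------------------------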
There is one odd thing that springs to mind: they have a URL rewrite for all product pages to make them SEO-friendly. This is custom written rather than an off-the-shelf solution - do you think this might have something to do with it?
In terms of on-the-page stuff, if anything I would say they are in danger of being over-optimised (although the category pages follow the same SEO practices as the product pages and those seem to do OK).
With regard to duplicate content - not an issue other than the URL rewrite, and I'm pretty sure it is only picking up the "re-written" URL and not the original.
Can't think of anything else that might help!?
cheers
I don't think your automated renaming and re-sitemapping should be a problem, so long as it works, but I'd be concerned about the site navigation being too dynamic. Good navigation needs stability and HTML links; adding new internal links is just fine, but links changing isn't!
Another big issue for many automated sites is having only a small amount of unique content per page, combined with heavy 'shared content'.
Let Xenu be your friend, too.
I have just spoken to the client - it's not a rewrite as such. The URLs are rewritten within the database using a VB script: it takes the name of the product, hyphenates it, and produces an ASP page, so the URL is www.domain.co.uk/directory/product-product.asp.
He did say whilst answering this question that the page that produces the product HTML relies heavily on server-side includes.
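To give an idea of what the script is doing, the hyphenating step is roughly along these lines (this is my own sketch, not his actual code - the function name and example product are invented):

--------------------------------------------------
' Rough sketch only: turn a product name into a hyphenated
' SEO-friendly file name for the product page.
Function MakeProductUrl(productName)
    Dim s, re
    s = LCase(Trim(productName))
    Set re = New RegExp
    re.Global = True
    ' strip anything that isn't a letter, number or space
    re.Pattern = "[^a-z0-9 ]"
    s = re.Replace(s, "")
    ' collapse runs of spaces and swap them for hyphens
    re.Pattern = " +"
    s = re.Replace(s, "-")
    MakeProductUrl = "/directory/" & s & ".asp"
End Function

' e.g. MakeProductUrl("Acme Widget 2000")
' returns /directory/acme-widget-2000.asp
--------------------------------------------------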
I explain it here:
[webmasterworld.com...]
In summary, when the pages are listed in the index they have full PR value; when sandboxed they have a grey toolbar. I really think this is some sort of new-page sandbox filter on large established sites.
I did notice that over on Google Groups there were a few posts from a week ago where one or two people complained about specific URLs being de-indexed, but when I check those URLs today with a site: command, I find them. So I'm hoping it's just temporary, as the new June thread here is reporting a Google dance.
If you look at the page code as the SEs see it (not as you make it!), and there's five centimetres of content and two metres of 'shared content' (two inches and two yards!), maybe with duplicated meta tags, then you have a common form of sick site.
His site has about 950 pages and his products change regularly - however, when I do a site: search I find
3510 pages indexed - only 230 of these are NOT in supplemental.
Only about 10% of his product pages are showing as being in the index.
The question is - due to such a large percentage of his site being in supplemental - should I remove it?
I have found some duplicate content (not much though - about 20 pages at a guess) and I'm guessing I should remove those - but what about the rest of it? Some of them are things like view-basket pages and the like, which we have now blocked access to via the robots.txt file - but should we also remove these from the index?
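In case it matters, the robots.txt block is just along these lines (the paths here are made up - the real ones are different):

--------------------------------------------------
User-agent: *
Disallow: /basket.asp
Disallow: /checkout.asp
--------------------------------------------------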
What I'm getting at is: Google likes big sites, yeah? What if getting rid of all these pages makes Google realise it's a smaller site than it thinks it is?
Is there any benefit to removing these pages from supplemental (apart from the dup content, where I assume there is)?
cheers guys
What I'm getting at is: Google likes big sites, yeah? What if getting rid of all these pages makes Google realise it's a smaller site than it thinks it is?
I really don't think Google has any preference on that.
It's usually best to think of Google as going page by page, so look at the pages in supplemental on their merits; there may be more than one reason. But there's always a reason - you need to know what's sick about your site to avoid future problems.
Think about how to interlink product pages based on category, subcategory, or some sort of product grouping.
When you link to a different product from the current page, use different link text that describes that product.
In other words, let's say you have a widget named:
--------------------------------------------------
Purple Hat with orange feathers.
Link to that page from products that have everything to do with:
.Orange purples in hats
.Hats that fade from orange to purple with feathers
.Purpled feathers in oranges
--------------------------------------------------
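If it helps, here's a rough classic ASP sketch of the idea - pull a few products from the same category and use each product's own name as the anchor text (the table, field and variable names are just placeholders, not anyone's real schema):

--------------------------------------------------
<%
' Sketch only: list a handful of related products from the same
' category, each linked with anchor text that describes it.
' currentCategoryId / currentProductId would come from the page.
Dim conn, rs, sql
Set conn = Server.CreateObject("ADODB.Connection")
conn.Open Application("ConnString")   ' placeholder connection string

sql = "SELECT TOP 5 ProductName, ProductUrl FROM Products " & _
      "WHERE CategoryID = " & CLng(currentCategoryId) & _
      " AND ProductID <> " & CLng(currentProductId)
Set rs = conn.Execute(sql)

Do While Not rs.EOF
    Response.Write "<a href=""" & rs("ProductUrl") & """>" & _
                   Server.HTMLEncode(rs("ProductName")) & "</a><br>"
    rs.MoveNext
Loop

rs.Close
conn.Close
%>
--------------------------------------------------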
Widget Properties
Standard:
Name: Big Red Bold Widget
Dimension: Bold
Size: Big, 12 Feet
Color: Red
Creative:
This fantastic widget of red color has bold dimensions in big 12 Foot Size
With Size of 12 feet and in red color with bold dimensions has this widget fantastic looks.
--------------------------------------------
Also, if you could build meta tags dynamically that take the relationship between the products into account, that makes it even better.
The only thing that you have to do is make sure it looks natural to a human.
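Something along these lines for the description tag, for example (again just a sketch - the variable names are invented and would come from the product record):

--------------------------------------------------
<%
' Sketch: build the meta description out of the product's own
' properties plus the category it sits in, so every page gets
' a different but related description.
Dim metaDesc
metaDesc = productName & " - " & productColor & ", " & productSize & _
           ". One of the products in our " & categoryName & " range."
%>
<meta name="description" content="<%= Server.HTMLEncode(metaDesc) %>">
--------------------------------------------------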
Mix but Match..
Oh Boy I could go on and on with this...
P.S. The only thing that I don't get is why Google shows the old meta description and title of the page, yet when I click on Cached it shows the new copy that went in place 3 days ago.