

Internal Links Webmaster Tools missing entire section of site

         

martinmartin

7:12 pm on Jul 9, 2010 (gmt 0)

10+ Year Member



Our e-commerce music downloads website has approximately 250k pages that are mysteriously missing from Google's Webmaster Tools data. The missing pages all sit under our /ARTIST folder. This has been going on for one year, and we are completely stumped. What could possibly be the problem?

1. There are no /ARTIST pages indexed in the Google index.

2. Webmaster Tools doesn't show any evidence that it knows about the /ARTIST section of the site. W.T. Internal Links shows no /ARTIST pages and no links to /ARTIST pages from other pages on the site.

3. Webmaster Tools / Sitemaps hasn't indexed any of the /ARTIST URLs. The /ARTIST-related sitemaps we submit regularly are /album_sitemap1 through 3, containing 135,000 URLs, and /artist_sitemap1 through 5, containing 225,000 URLs. W.T. shows them as downloaded and submitted, but reports zero of their URLs in the web index. Google does index our /LABEL URLs.

4. In W.T. under Site Performance, it never shows a URL from the /ARTIST folder.
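
A quick way to confirm what the submitted sitemaps actually contain is to tally their URLs by top-level folder. The following is only a minimal sketch: the sample XML, the example.com host, and the function name urls_by_section are illustrative assumptions, not the real sitemap files.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlsplit

# Namespace used by the sitemaps.org 0.9 protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, for illustration only.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/artist/NAMEOFARTIST/</loc></url>
  <url><loc>http://www.example.com/artist/NAMEOFARTIST/release/1-NAMEOFRELEASE</loc></url>
  <url><loc>http://www.example.com/label/NAMEOFLABEL/</loc></url>
</urlset>
"""


def urls_by_section(sitemap_xml):
    """Count sitemap URLs by their first path segment, e.g. '/artist'."""
    root = ET.fromstring(sitemap_xml)
    counts = Counter()
    for loc in root.findall("sm:url/sm:loc", NS):
        path = urlsplit((loc.text or "").strip()).path
        first_segment = path.strip("/").split("/")[0]
        counts["/" + first_segment] += 1
    return counts


if __name__ == "__main__":
    print(urls_by_section(SAMPLE_SITEMAP))
```

Comparing these per-folder counts with the "in web index" figures reported for each sitemap shows whether the gap is confined to one section.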

The pages in /ARTIST are well linked from all areas of the site. Yahoo! and Bing both index this area of the site.

Any help or assistance would be greatly appreciated.

Thank you,

DM

[edited by: tedster at 5:19 pm (utc) on Jul 10, 2010]
[edit reason] no personal URLs, thanks [/edit]

tedster

5:31 pm on Jul 10, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The really puzzling part is that Bing and Yahoo are indexing those URLs - that rules out most of the "usual suspects" such as a robots.txt error or a robots meta tag problem. I assume that your robots.txt doesn't have separate rules for googlebot, and that googlebot doesn't get served content that's any different.
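
To test the robots.txt assumption, the same file can be evaluated once per crawler. This is a minimal sketch using Python's standard urllib.robotparser; the robots.txt text below is a made-up example with a Googlebot-specific group, not the poster's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with a separate Googlebot group (illustration only).
SAMPLE_ROBOTS = """\
User-agent: Googlebot
Disallow: /artist/

User-agent: *
Disallow: /cart/
"""


def crawl_permissions(robots_text, path, agents=("Googlebot", "bingbot", "Slurp")):
    """Return {agent: allowed?} for one path under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}


if __name__ == "__main__":
    print(crawl_permissions(SAMPLE_ROBOTS, "/artist/NAMEOFARTIST/"))
```

If the real file gives a different answer for Googlebot than for the other crawlers, that alone could explain one engine indexing a section the other skips.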

The only thought I have right now is that you may have a very indirect click path to the ARTIST directory - Google tends to ignore deep pages at times. Is /ARTIST part of the main navigation?

A second thought, do you have some kind of server log access so you can check for googlebot access?
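
A minimal sketch of such a log check follows, assuming the server writes the common "combined" access log format. The sample lines, IP addresses, and the function name googlebot_hits are illustrative assumptions. Matching on the user-agent string alone can be fooled by spoofed agents, so treat the counts as a first pass.

```python
import re
from collections import Counter

# Combined log format: ip ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BINGBOT_UA = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

# Hypothetical log lines, for illustration only.
SAMPLE_LOG = [
    f'192.0.2.1 - - [10/Jul/2010:12:00:00 +0000] "GET /artist/NAMEOFARTIST/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.1 - - [10/Jul/2010:12:00:05 +0000] "GET /label/NAMEOFLABEL/ HTTP/1.1" 200 4096 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.7 - - [10/Jul/2010:12:00:09 +0000] "GET /artist/NAMEOFARTIST/ HTTP/1.1" 200 5120 "-" "{BINGBOT_UA}"',
    f'192.0.2.1 - - [10/Jul/2010:12:00:12 +0000] "GET /artist/OTHERARTIST/ HTTP/1.1" 301 0 "-" "{GOOGLEBOT_UA}"',
]


def googlebot_hits(lines, prefix="/artist/"):
    """Count Googlebot requests under `prefix`, grouped by HTTP status code."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that are not in combined format
        if "googlebot" not in match.group("agent").lower():
            continue
        if match.group("path").startswith(prefix):
            counts[match.group("status")] += 1
    return counts


if __name__ == "__main__":
    print(googlebot_hits(SAMPLE_LOG))
```

Grouping by status code is a deliberate choice here: a section that Googlebot requests but that mostly answers with redirects or errors points to a different problem than one it never requests at all.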

Also, have the artists you list gone on any spamming sprees, promoting their own pages in ways that cause your site to get dinged? That seems unlikely, but it's a thought.

And finally, have you checked a few of those ARTIST pages to see if they were hacked and are now serving parasite content that is cloaked only for Google to see? You can use the "Fetch as Googlebot" tool in the Labs section of WMT to do that.

martinmartin

9:43 pm on Jul 10, 2010 (gmt 0)

10+ Year Member



Google did at one time index a lot of these pages under our IP which we have been 301 redirecting for quite some time. There are still a few indexed now under the IP.

The only thought I have right now is that you may have a very indirect click path to the ARTIST directory - Google tends to ignore deep pages at times. Is /ARTIST part of the main navigation?

>> All are directly linked on home page and browse and top sellers, etc

Link structure is like this:

/artist/NAMEOFARTIST/
/ARTIST/NAMEOFARTIST/release/NUMBER-NAMEOFRELEASE
/ARTIST/NAMEOFARTIST/release/NUMBER-NAMEOFRELEASE/TRACK/NUMBER-NAMEOFTRACK

We do have both /ARTIST and /.../RELEASE/ urls in Sitemaps. Could that be a problem?

A second thought, do you have some kind of server log access so you can check for googlebot access? >>Yes. We've done some checking. Google has spidered those urls but doesn't ever index them.

Also, have the artists you list gone on any spamming sprees, promoting their own pages in ways that cause your site to get dinged? That seems unlikely, but it's a thought. >> There are a few that have placed inbound links to their pages out there. And we have used inbound CPA and affiliate type ads to generate leads. But we stopped doing that over a year ago now. It wasn't spammy. Or scammy.

And finally, have you checked a few of those ARTIST pages to see if they were hacked and are now serving parasite content that is cloaked only for Google to see? You can use the "Fetch as Googlebot" tool in the Labs section of WMT to do that. >> I'll check but kinda doubt that.

tedster

12:28 am on Jul 11, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google did at one time index a lot of these pages under our IP which we have been 301 redirecting for quite some time. There are still a few indexed now under the IP.


That could be an important clue. Have you verified that IP addresses are redirected with a 301 status in the http header?
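
One way to verify this is to request a page by IP address without following redirects and inspect the raw status and Location header. The sketch below uses Python's standard http.client, which never follows redirects on its own. The IP 192.0.2.10, the host www.example.com, and both function names are placeholders, not the poster's real values.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(host, path="/", timeout=10):
    """Send one HEAD request (no redirect following); return (status, Location)."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_permanent_redirect_to(status, location, canonical_host):
    """True only for a 301 whose Location points at the canonical hostname."""
    if status != 301 or not location:
        return False
    return urlsplit(location).hostname == canonical_host.lower()


if __name__ == "__main__":
    # Placeholder values: substitute the site's real IP and hostname.
    status, location = fetch_status("192.0.2.10", "/artist/NAMEOFARTIST/")
    print(status, location, is_permanent_redirect_to(status, location, "www.example.com"))
```

A 302, a meta refresh, or a 200 that merely shows the same content under the IP would all fail this check.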

tedster

12:39 am on Jul 11, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



An added thought - if indeed an accurate 301 redirect has been in place for that long, and Google still has pages indexed under the IP address, then I think a reconsideration request has a chance of turning things around for you.

phranque

5:48 am on Jul 11, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Like this: /artist/NAMEOFARTIST/
ARTIST/NAMEOFARTIST/release/NUMBER-NAMEOFRELEASE
ARTIST/NAMEOFARTIST/release/NUMBER-NAMEOFRELEASE/TRACK/NUMBER-NAMEOFTRACK

We do have both /ARTIST and /.../RELEASE/ urls in Sitemaps. Could that be a problem?

It's not clear from your post, but is your server case sensitive in terms of URLs?
If you serve your artist content for any case in the URL, you may have a serious canonicalization problem,
i.e. /artist/... /ARTIST/... /Artist/... /ArTiSt/... etc.
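
To look for that kind of duplication, the paths seen in logs or crawl data can be grouped by their lowercased form. This is a minimal sketch; the sample paths and the function name case_variants are made up for illustration.

```python
from collections import defaultdict


def case_variants(paths):
    """Group paths that differ only by letter case; keep groups with 2+ variants."""
    groups = defaultdict(set)
    for path in paths:
        groups[path.lower()].add(path)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    seen = ["/artist/a/", "/ARTIST/a/", "/Artist/a/", "/label/b/"]
    print(case_variants(seen))
```

Any group it reports is a set of URLs that a case-insensitive server would answer with the same content.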

martinmartin

3:16 pm on Jul 21, 2010 (gmt 0)

10+ Year Member



Yes, it is case sensitive, but the caps in the example here were only for emphasis.
:-)

martinmartin

3:21 pm on Jul 21, 2010 (gmt 0)

10+ Year Member



It seems that the only pages that are not followed are those that have buy links. All of our buy links have rel=nofollow. Could this cause Google any problems? A typical buy link is:

<a class="cart_icon" href="#" id="c_Song_151235" onclick="new Ajax.Request('/execute/add_cart_item/151235/Song', {asynchronous:true, evalScripts:true}); return false;" rel="nofollow" title="Add This Song To Your Cart">Buy</a>
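
To see which anchors on a page carry rel=nofollow and which do not, the markup can be run through a small audit. This is a minimal sketch using Python's standard html.parser; the artist link in the sample is a hypothetical addition for contrast, and the class and function names are illustrative. It only reports attributes and says nothing about how Google treats them.

```python
from html.parser import HTMLParser

# The buy link from the post, plus one hypothetical ordinary link for contrast.
SAMPLE_HTML = '''
<a class="cart_icon" href="#" id="c_Song_151235" onclick="new Ajax.Request('/execute/add_cart_item/151235/Song', {asynchronous:true, evalScripts:true}); return false;" rel="nofollow" title="Add This Song To Your Cart">Buy</a>
<a href="/artist/NAMEOFARTIST/">Artist page</a>
'''


class LinkAudit(HTMLParser):
    """Collect href, nofollow flag, and onclick presence for every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        self.links.append({
            "href": attributes.get("href"),
            "nofollow": "nofollow" in rel_tokens,
            "has_onclick": "onclick" in attributes,
        })


def audit_links(html):
    """Parse an HTML fragment and return one dict per anchor."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for link in audit_links(SAMPLE_HTML):
        print(link)
```

Running it over a full /ARTIST page would show whether nofollow is confined to the buy links or has leaked onto the navigation links as well.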