I firmly believe that this issue is what is causing pages to drop.
except that the majority of people who have lost pages, don't use google sitemaps
|except that the majority of people who have lost pages, don't use google sitemaps |
And don't forget about all the people who don't have punctuation in their domains.
Anyone who does use sitemaps getting this error:
This service is unavailable.
Please check back later.
Yes I am getting the "Service unavailable" as well..
I meant I do not think the / is affecting only sitemaps. I think it's also affecting the way Google indexes.
It would not surprise me if this is also a factor in dropped pages.
"I meant I do not think the / is affecting only sitemaps. I think it's also affecting the way Google indexes."
I am curious about that too. For one... why has it been taking this darned long to fix?
Search for an email address with a hyphen in it.
See that you get a bunch of results.
Replace the hyphen with a space, and search again.
See that a large number of supplemental results appear, and the snippet does show that same email address again.
More work to do.
Is anyone else having trouble with the links on the sitemaps index stats page? I click on the link, a new blank window opens and nothing shows up.
I checked our site by copying and pasting the link from the sitemaps page into our default Google (188.8.131.52) and it shows 770 pages without the trailing slash, 779 with the slash. Isn't this just the opposite of what Vanessa said would happen?
It is probably taking a long time to fix because it is affecting their whole system.
I quite like the trailing slash thingy.
With trailing slash - shows pages on our site indexed by Google
Without trailing slash - as above but also includes supplementals
The good thing from Google's point of view is that it's getting harder to distinguish what they mean to do versus what is actually not working.
To put it politely, I think nobody knows what's going on in certain areas of functionality, at Google or even out here on WebmasterWorld at times, which is a worry when trying to establish some stability - it could take ages.
I mean, look at this for example:
I do a site: query on our sitemap pages on one site and get 2 pages,
on another site with the same structure [ but unique content ] updated 2 months later I get a correct 28 pages. The first site is showing a drop in sitemap pages, but actually shows more pages cached across the site.
How can Google say that site map content which is used for assisting the bots through the site be excluded from caching [ or maybe I've missed a new innovation!?!? ]
We currently own a hyphenated domain name..
We still see NO CHANGES..
With a slash and without a slash...
24 pages from our site listed..
496 pages supplemental results..
Is this considered a fix?
|We still see NO CHANGES.. |
No changes on our sites either in terms of number of pages showing up in the index of a generic google.com site: search.
All pages seem to be indexed when doing a site: search on various specific data centres, complete with unique descriptions for each page.
One thing I have noticed when using the generic google.com search as opposed to a specific DC, is that all pages which do show up display only the (same) general site description rather than the individual page description.
Has anyone else noticed this, or is it just me? :(
"All pages seem to be indexed when doing a site: search on various specific data centres, complete with unique descriptions for each page. "
Could you give me the IP Address of the datacenter?
There's a good list of apparently updated DCs in msg #45 of this [webmasterworld.com] thread. I'm not so sure they're truly updated - some of my results are pretty old, but at least the pages are listed, albeit with some as supplementals.
I also tried this one 184.108.40.206.
All the same crap..
Only 24 pages indexed from our site..
Some datacenters include 496 supplemental results to it, some don't..
Supplemental results going back 01/26/05.
|Some datacenters include n supplemental results to it, some don't.. |
True. And doesn't help one iota if Joe Public surfer is getting the datacentre which shows only n (very small number) of results.
Weird thing is, Googlebot's been all over my sites in the last few days like some 06/06/06 demon! I gave up trying to fathom Google's algos after Florida, and am just thankful we get most of our traffic from Yahoo and MSN.
Do you use Google Sitemap?
Does it show when Google has indexed your site?
Was it before the fix or after?
I just noticed a difference. On one site I manage, all but two pages have been supplementaled or deindexed. One is my index page. The other is a popup page. What is funny is the popup page is dup content. It is a list of laws and regulations copied from an authority site for regulating the "widgets" this site sells. It is only for the convenience of my visitors and all credit is given to the authority site.
Now here is what is different about that page from all others. It is the ONLY one that points to my homepage as [mysite.com...] . All of the rest of the pages in this site point to mysite.com/...so is this something?
The only 301 redirect on my site is from non-www to www.
Now I also have a site where all pages point to mysite/index.html and it is fully indexed except for the links page, which I named directory html.
Both have google sitemaps.
I noticed something else regarding the site command. inurl and allinurl return a number of cached pages totally different from both of the site commands, with and without trailing slashes.
So now, I get:
site:www.mysite.com -> 5 pages + lots of supplementals
site:www.mysite.com/ -> 4 pages + lots of supplementals
inurl:www.mysite.com -> 8 pages no supplementals
inurl:www.mysite.com/ -> 8 pages no supplementals
I don't have mysite.com except in urls from my website.
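For anyone who wants to repeat that comparison on their own domain, here is a minimal sketch in Python that just builds the four query strings listed above. It does not contact Google; the function name `site_query_variants` is made up for illustration, and `www.mysite.com` is the placeholder host from the post.

```python
def site_query_variants(host):
    """Return the site:/inurl: queries with and without a trailing slash."""
    host = host.rstrip("/")  # normalise so the slash is added exactly once
    return [
        f"site:{host}",
        f"site:{host}/",
        f"inurl:{host}",
        f"inurl:{host}/",
    ]


# Placeholder host from the post; paste each line into the search box by hand.
for query in site_query_variants("www.mysite.com"):
    print(query)
```

Running each query against the same datacentre is probably the only way to make the counts comparable, given how much the results differ between DCs.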
I've just seen a slight change for one of my sites when doing a site: search on google.com - ie taking pot luck which datacentre it uses rather than using a specified datacentre.
BUT, although there are now significantly more results, it's still showing the general site description for each result instead of the individual descriptions I'm seeing for each page if I use a specific datacentre.
No change so far on the other sites.
F-Rose - no, I don't use Google Sitemaps, but I did recently add my own sitemap component in an attempt to get the bot through the site. Given Googlebot has been through the site like a dose of salts since I added it, maybe that's why more results are showing in the index. Hopefully this improvement will roll out across my other sites.
" I did recently add my own sitemap component"
Is it a regular site map which should be included on every site, or is it something else?
Could you be more specific about this?
|Is it a regular site map which should be included on every site, or is it something else? |
My sites run on Joomla, and I added a sitemap component made to run with it. No faffing around trying to design one myself, just uploaded and there it was done! I should perhaps have mentioned I did have a sitemap beforehand, but this one is much, much better.
Mainly did so for two reasons:
1. There was a debate going on speculating whether Google sitemaps were a good idea, and as I don't like giving G too much information, I preferred implementing the Joomla one,
2. There was another thread discussing disappearing pages being from predominantly level 3 and 4, so added site map so everything is spiderable at level 2.
Could you please send me a sticky mail with a link to your site map?
It would be greatly appreciated..
"2. There was another thread discussing disappearing pages being from predominantly level 3 and 4, so added site map so everything is spiderable at level 2. "
I was a part of that discussion. Some webmasters who have created rather large site maps seem to have lost pages again. Since our site is much smaller than theirs (a couple thousand pages) we were able to split up the site maps for each of our 15 sections. The largest site map has between 100 and 200 links and seems to have no problem (knock on wood). The bulk of our site did get re-indexed and is now being crawled in full about every day. Rankings are slowly going back up. When our site does come back we may take those site maps and redo them so that there are no more than 100 links by drilling down a level, but for now they seem to be working to get our pages crawled frequently and indexed.
We do have a Google XML sitemap that has been in place since it started. We never really saw a huge benefit crawl-wise from it, and never tried a plain text sitemap. New pages seem to be picked up a bit quicker without having to go through the site, but that is about it. The onsite site maps seem to work a lot better to get heavier crawling/re-crawling of existing pages. This is my take on it.
We were fortunate enough that google was typically crawling 3 levels. A 2 level site may have some problems since the site map would reside on level 2 and any links would be level 3 which Gbot may not be so quick to crawl but would not hurt to try (google has always recommended one anyway). Our site maps make sure that all links are on level 3 (again that will change at a later date) with a simple outline structure and no design elements.
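The "no more than 100 links per site map page" idea mentioned above can be sketched roughly like this in Python. It is only an illustration, not how our pages are actually generated: the helper name `split_links`, the example URLs, and the 230-link section are all hypothetical.

```python
def split_links(links, max_per_page=100):
    """Split a list of links into consecutive pages of at most max_per_page."""
    if max_per_page < 1:
        raise ValueError("max_per_page must be at least 1")
    return [
        links[i:i + max_per_page]
        for i in range(0, len(links), max_per_page)
    ]


# Hypothetical section with 230 article URLs.
section_links = [f"/widgets/article-{n}.html" for n in range(1, 231)]
pages = split_links(section_links)

# One on-site site map page per chunk, each linked from the section index.
for number, page in enumerate(pages, start=1):
    print(f"sitemap page {number}: {len(page)} links")
```

Each chunk would then become its own plain HTML site map page, which keeps every page under the link count while still putting the deep pages one click from a site map.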
"Since our site is much smaller than theirs (a couple thousand pages) we were able to split up the site maps for each of our 15 sections."
Are you talking about Google site map, or your own site map?
In that statement I was talking about an onsite sitemap. We do use an XML sitemap submitted to Google Sitemaps that has shown little effect. That sitemap incorporates ALL of the pages on our site. It isn't split up. I believe that Google Sitemaps says up to 50,000(? - Can't remember for sure) links is OK.
The onsite site map has worked wonders so far (knock on wood again) and pages seem to be "sticking" now.
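On the 50,000 figure: the Sitemaps protocol does cap a single sitemap file at 50,000 URLs, so a bigger site would need several files. Below is a rough Python sketch of writing URL lists into one or more XML documents under that cap. The function name `build_sitemaps` and the example URLs are made up, and the namespace shown is the sitemaps.org 0.9 one, which is an assumption here and may not be the schema in use at the time of this thread.

```python
import xml.etree.ElementTree as ET

# Assumption: sitemaps.org 0.9 namespace, not necessarily what Google used here.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50000  # per-file URL limit in the Sitemaps protocol


def build_sitemaps(urls, max_urls=MAX_URLS):
    """Return a list of XML strings, one per sitemap file."""
    documents = []
    for start in range(0, len(urls), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + max_urls]:
            entry = ET.SubElement(urlset, "url")
            ET.SubElement(entry, "loc").text = url
        documents.append(ET.tostring(urlset, encoding="unicode"))
    return documents


# Hypothetical URLs; a site of a couple thousand pages fits in a single file.
example_urls = [f"http://www.mysite.com/page-{n}.html" for n in range(1, 4)]
for document in build_sitemaps(example_urls):
    print(document)
```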
The discussion that was being referred to was using an on-site sitemap linked directly off the home page, so that deep pages would reside on a higher level (level 3). For some it worked well and for others not. Don't know the factors why though.
|The discussion that was being referred to was using an on-site sitemap linked directly off the home page, so that deep pages would reside on a higher level (level 3). For some it worked well and for others not. Don't know the factors why though. |
That's right, and that's what I did, except I linked the sitemap from the main menu so it's accessible from every page. The visitors seem to like it too ;) So with one click, they and bots can see a link to every article on the site.