| 8:16 am on Aug 7, 2006 (gmt 0)|
There is also a UK version.
Should UK sites be managed here, or does it make no difference?
| 8:26 am on Aug 7, 2006 (gmt 0)|
Interesting example with the NYTimes - it is hard to find examples which are OK to discuss at WebmasterWorld, but NYTimes should be fine.
In a way it is interesting that they have not already been penalized like the majority of sites that have had this issue.
I guess a PR10 on the www and a PR7 on the non-www are enough to overcome a lot of penalties.
Lots of webmasters have been crying out for a fix for this issue for ages. Whether this Sitemaps preference results in improvements to the ranking of sites affected by the bug is the real question, in my opinion. If it is successful, then perhaps it would be right for Google to make a general announcement that using Webmaster Central is the way to deal with any issues.
| 10:56 am on Aug 7, 2006 (gmt 0)|
I went through it and I guess it is OK. I still have no intention of verifying my sites, but I can see some benefits to using it. Right now, my favorite thing to do is look at competitor stats and look over their robots.txt files. While it is basic, I can at the very least use common operators on a competitor's site and also see the directories blocked in robots.txt, all in one place.
Granted, that has always been out there, but being able to do it from one place on Google in a couple of clicks is pretty cool.
| 11:48 am on Aug 7, 2006 (gmt 0)|
"Angonasec, what browser are you using? I'll check into why the tools doesn't expand for you when you click the +."
I'm using FF 220.127.116.11 (The latest) on Mac OSX 10.4.7 also the latest.
It is a browser compatibility bug, because I do see the drop down menu when I use Safari.
If only I could get a helpful reply to my emails asking why our NFP site is being punished by the algo...
| 11:59 am on Aug 7, 2006 (gmt 0)|
I'm also using FF 18.104.22.168 on Mac OSX 10.4.7 and haven't noticed any problems. Specifically, the +Tools does expand on the My Sites page, and I've been all over the rest of the site without noticing any other problems.
| 12:14 pm on Aug 7, 2006 (gmt 0)|
Thanks jay5r, that's helpful to know.
One additional factor that may help Vanessa is that I use the FF extensions "Noscript" and "Adblock Plus", but with both set to allow the Sitemaps pages to run whatever they like, i.e. unblocked and allowed.
Despite this the + Tools menu does not drop down.
| 7:10 pm on Aug 7, 2006 (gmt 0)|
>> Problem is these redirects don't always resolve the problem very fast. I've seen some sites penalized up to 9 months when trying to fix a www vs. non-www issue with many of the old domain pages still in cache up to 3 years later never being updated. <<
As far as I can tell, seeing old non-www URLs in the SERPs as Supplemental Results isn't usually a problem in and of itself. Google holds on to those for years. That is to be expected, but GoogleGuy has just confirmed that they are going to be updated to more recently spidered data soon.
The correct measure of whether the problem is fixed is to see how many www pages get listed. Every time that I have added a 301 redirect to a site that has never had one before, the number of www pages listed has rapidly increased, the URL-only www listings have turned into full listings, and the number of www Supplemental Results has rapidly decreased.
The non-www URL-only results have declined in number too. The number of non-www Supplemental Results has varied, sometimes staying static or sometimes falling, but often increasing by a small amount a month or two after the redirect was first implemented.
Where those non-www Supplemental Results appeared in the SERPs, the redirect on the site still manages to deliver the visitor to the correct www version of the page.
For most sites without the redirect in place, I almost always also found a poor usage of the <title> tag, the same meta description in use on multiple pages, and an incorrect link back to the root index page: always use the "http://www.domain.com/" format, always omitting the index.html index file filename from the link itself.
Clearing up all of those issues has always helped a site get over the listings problems. Xenu LinkSleuth has often been a great help here too.
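To illustrate the root-link advice above, here is a rough Python sketch. It is only an illustration: the function name canonical_link and the INDEX_FILES set are made up for this example, and www.domain.com is just the placeholder host used in this post.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical set of index filenames to strip from links.
INDEX_FILES = {"index.html", "index.htm"}

def canonical_link(url, preferred_host="www.domain.com"):
    """Return the link in the preferred www form, without an index filename."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Treat the bare domain and the www host as the same site.
    if host == preferred_host or "www." + host == preferred_host:
        host = preferred_host
    path = parts.path or "/"
    # Drop a trailing index filename so the link ends at the folder.
    head, _, tail = path.rpartition("/")
    if tail.lower() in INDEX_FILES:
        path = head + "/"
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))
```

Running every internal link through something like this before publishing should leave one consistent form for the root page and for each folder.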
Finally, remember that when you link to a folder you should always include a trailing / on the URL. A link to just /folder on a page at www.domain.com could see the link pointing to www.domain.com/folder, which then gets automatically redirected to domain.com/folder/ (without the www!) to add the trailing /, and then redirected onwards to www.domain.com/folder/ to add the www back on again.
The intermediate step at domain.com/folder/ could kill your listings.
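The double hop described above can be modelled in a few lines of Python. This is a toy model under stated assumptions, not how any particular web server is implemented: SERVER_NAME, WWW_HOST, FOLDERS, next_hop and redirect_chain are all hypothetical names, and the model simply assumes the server builds its trailing-slash redirect from the bare domain name.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical model: the server's own name is the bare domain, and a
# site-wide 301 rule sends every bare-domain request to the www host.
SERVER_NAME = "domain.com"
WWW_HOST = "www.domain.com"
FOLDERS = {"/folder"}  # paths that are directories on this imaginary site

def next_hop(url):
    """Return the URL this model redirects to, or None if the page is served."""
    parts = urlsplit(url)
    # Directory requested without a trailing slash: the server appends it,
    # building the new URL from its own configured name (no www).
    if parts.path in FOLDERS:
        return urlunsplit((parts.scheme, SERVER_NAME, parts.path + "/", "", ""))
    # Site-wide 301 from the bare domain to the www host.
    if parts.netloc == SERVER_NAME:
        return urlunsplit((parts.scheme, WWW_HOST, parts.path, "", ""))
    return None

def redirect_chain(url):
    """Follow redirects in the model and return every URL visited."""
    chain = [url]
    while True:
        target = next_hop(chain[-1])
        if target is None:
            return chain
        chain.append(target)
```

In this model, redirect_chain("http://www.domain.com/folder") passes through http://domain.com/folder/ before ending at http://www.domain.com/folder/, while a link that already has the trailing / is served with no redirect at all.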
| 7:19 pm on Aug 7, 2006 (gmt 0)|
All of that has been accounted for and the problem has still existed on some sites.
The URLs are redirected properly; the title, meta, and slashes are all correct. The only problem is what is showing in the index. All links have been checked and give the proper 301 through Xenu. This was all checked a very long time ago.
Google simply has problems and it's obvious. Hopefully the new option fixes the issue.
| 9:36 pm on Aug 7, 2006 (gmt 0)|
I just set up my site with Google's new control tool (sitemap) and I love it! Anyone know how long it will take G to crawl/index the content I reference in my sitemap?
| 10:46 pm on Aug 7, 2006 (gmt 0)|
|a.) Of the millions of websites, how many of them have different pages for www.domain.com and domain.com?! |
b.) Assuming the answer to the above is none, why haven't G engineers figured this out yet?
That is a terrible assumption to make. I can think of quite a few very important domains where that is not the case. This is especially true with domains that have several levels of subdomains, like large corporations and .edu domains.
Historically, there weren't many entities that would route the traffic to different machines based on the port; you routed it based on the domain name. If anything, the default domain was used as the email machine. Then you would have a few default machines with domain names for the services that they offered, such as ftp.example.com. The new, funky thing called the World Wide Web got its own server, usually set up by some geek in a lab that just wanted to play with it. Naturally they gave it a subdomain of www.
Granted, the majority of sites have both the primary domain and the www subdomain pointing to the same site, but that does not mean that assuming they are the same is the correct way to handle it.
While Google is aware that they have a problem that they need to sort out, I would rather have them come up with something that will work right in all cases, not only the cases where the webmasters or hosting services are ignorant.
| 7:17 am on Aug 8, 2006 (gmt 0)|
Okay, I believe most/all U.S. users should see radically fresher supplemental results now. The earliest page I saw was from Feb 2006, and most of the ones that I looked at averaged in the ~2 month old range.
As data gets copied to more places, the fresher supplemental results should eventually be visible everywhere, not just the U.S.
| 7:51 am on Aug 8, 2006 (gmt 0)|
Can't wait to see my USA supplemental results from 2005 get sorted out.
"fresher supplemental results" is that good or bad?
| 8:23 am on Aug 8, 2006 (gmt 0)|
Is that data heading out from the 22.214.171.124 dc (it has hit a fair few more now too)?
It is just that for me the data refresh on the 27th July (the second 27th one) seemed to hit all DCs but not that one - so is that DC (and associated ones) still at a less advanced stage in some aspects but a more advanced stage in others?
| 2:37 pm on Aug 8, 2006 (gmt 0)|
Woohoo, supplemental result cleanup - finally! All my pages up to 02-Jan-2006 are gone. Also, the ranking of my websites jumped up.
Now the million dollar question - when / how to get rid of supplemental results after 02-Jan-2006?
Ideas for sitemaps team:
1) Realtime url removal tool from index or any kind of url removal / xml support
2) Duplicate content indicator
3) https/http issue - [domain.com...] == https://www.domain.com (like [domain.com...] == [domain.com...]); this will save us a lot of CPU :)
| 2:47 pm on Aug 8, 2006 (gmt 0)|
Okay, the main site I was worried about went from 640 results to 12,200 overnight, most of which are supplemental cache from March 2006, which is definitely a step in the right direction. I'm pretty sure I know why they went supplemental, so I'm pretty sure I can get most of 'em out. On the other hand, another client went from 9600 pages to 78, so now I gotta figure that one out.
GG, it'd be a good thing if they added the URL Removal tool to WebmasterCentral, and I'm all for it, since I really hate how many different times I have to log in to different places (under different Google accounts, mind you) in order to get things done, but my post was actually about the fact that I was having TROUBLE with the url removal tool, and there's no place I can report it or get help with it. That's mainly what I was asking for to be added to the overall picture.
| 3:00 pm on Aug 8, 2006 (gmt 0)|
znakedwrx: Google offers a url removal tool. Just put the right meta tags on the pages and remove them?
Did I win the million dollars?
| 3:14 pm on Aug 8, 2006 (gmt 0)|
NedProf: Sorry, you did not ;) The million dollar question was about supplemental results after 02-Jan-2006 :)
Regarding url removal - noindex is fine for pages which are accessible. What happens if some pages were indexed by my mistake? I remove them, but Google still shows the pages that I don't want it to.
My comment about the current url removal tool: SLOW SLOW SLOW.
It took me over 40 days to remove 50 pages. Let's say that's OK.
But man - it took me 1 hour to submit those 50 pages to the Google removal tool! That's why I say: make something like remove-me.XML and problem solved. Another option is to use the current sitemap.xml but add a new flag: NOINDEX
If we can help Google to find pages, then we can help them to remove pages too :)
| 3:28 pm on Aug 8, 2006 (gmt 0)|
With the choice of how to display urls www vs non www in site maps, does that now mean we do not have to 301 the pages?
| 4:08 pm on Aug 8, 2006 (gmt 0)|
Suggestion for url removal tool in Webmaster central
Clearly, Google can generate a list of all pages of a site that are indexed in Google's databases
Can you not simply include a checkbox beside each URL?
Plus a delete all selected URLs push button
| 4:45 pm on Aug 8, 2006 (gmt 0)|
Actually, I'd rather have a bulk upload tool of some kind. I just got through removing several hundred urls, and I have several hundred more to remove, and it's a pain going through and entering them one by one - it's only somewhat less of a pain to click checkboxes for each one. It'd actually be easier to download a csv file of the 404s reported in the tool, remove any that you don't want removed, and then upload it back up. Except that my sitemaps account has 61 websites in it currently, and it'd be even better to be able to do it en masse, rather than site by site.
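The download, prune, and re-upload workflow above could look roughly like this in Python. This is purely a sketch of the idea: the tool's real export format is not known here, so the "url" and "status" column names and the urls_to_remove function are assumptions made up for this example.

```python
import csv
import io

def urls_to_remove(csv_text, keep):
    """Return the 404 URLs from a CSV export, minus any the webmaster wants kept.

    Assumes a hypothetical export with 'url' and 'status' columns.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["url"]
        for row in reader
        if row["status"] == "404" and row["url"] not in keep
    ]
```

The resulting list is what would be uploaded back in one go, instead of entering each URL by hand.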
| 4:54 pm on Aug 8, 2006 (gmt 0)|
One other question: when we select our preference, www vs non-www, will Google adjust the back links and the PageRank accordingly?
| 9:09 pm on Aug 8, 2006 (gmt 0)|
It's great that there's a new capability to:
"Display URLs as www.mysite.com (for both www.mysite.com and mysite.com)
Display URLs as mysite.com (for both www.mysite.com and mysite.com) "
Is there any capability (maybe it is already there as part of the above?) of setting the https to http? Right now when you search site:mydomain.com you get https://mydomain.com at the top of the list returned.
| 9:16 pm on Aug 8, 2006 (gmt 0)|
I haven't seen an http or https ability. Usually that comes from not using absolute links on a secure page, so that when someone (or a bot) follows a link off of the page they view the site on https.
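A quick way to see the effect is to resolve links the way a browser or bot would, using Python's standard urljoin. The page and link paths below are made-up examples on the mydomain.com placeholder.

```python
from urllib.parse import urljoin

# A hypothetical secure page on the site.
secure_page = "https://mydomain.com/checkout/"

# A relative link inherits the scheme of the page it appears on.
relative_result = urljoin(secure_page, "/products/")

# An absolute link keeps whatever scheme was written into it.
absolute_result = urljoin(secure_page, "http://mydomain.com/products/")

print(relative_result)  # https://mydomain.com/products/
print(absolute_result)  # http://mydomain.com/products/
```

So any relative link on an https page leads a crawler to the https copy of the rest of the site, which is how those URLs end up listed.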
| 10:26 pm on Aug 8, 2006 (gmt 0)|
znakedwrx: you're removing the pages one at a time? Last time I wanted to remove all the pages in a directory (Actually, they were images from a shared directory which Google had indexed by mistake) I used the "Remove pages, subdirectories or images using a robots.txt file" and it zapped all the indexed files which were denied in my robots.txt in one go.
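Before submitting a robots.txt like that, it can help to check which URLs it actually denies. Here is a minimal sketch using Python's standard urllib.robotparser; the /shared-images/ directory, the sample rules, and the blocked_urls helper are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that denies one shared image directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /shared-images/
"""

def blocked_urls(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that the given robots.txt text denies to the agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]
```

Note that this only shows what the file denies according to Python's parser; it says nothing about how quickly the removal tool acts on it.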
| 11:14 pm on Aug 8, 2006 (gmt 0)|
Matt Cutts has recently said that the 301 redirect is still the best fix for www and non-www issues.
He also said that PR gets merged too.
Removing multiple pages is best done by listing them in the robots.txt file and submitting that, but I am not sure if that touches old supplemental results in any way.
| 11:53 pm on Aug 8, 2006 (gmt 0)|
Why not 301 redirect obsolete URLs to some place similar and useful? This would be good for the user (they still get somewhere) and should help quickly make those URLs disappear from SEs.
| 12:09 am on Aug 9, 2006 (gmt 0)|
For a URL already supplemental, adding a redirect does not make the old URL listing go away, but it does get the user directed to some sort of content on the site.
| 1:10 am on Aug 9, 2006 (gmt 0)|
Well, Google needs to answer about the PageRank and back links and whether we still need to do 301s. We just recently acquired a site that has 5000 pages; half are indexed as www and the other half as non-www (back links and PR lie with both www and non-www accordingly).
I am hesitant to do a quick fix in Sitemaps because the site ranks well in the SERPs, and if it does not pass PR and back links, it could tank in the SERPs.
It is one thing to say we think it will pass pr and back links and another thing to say google tested it and it seems to work.
| 1:12 am on Aug 9, 2006 (gmt 0)|
Matt Cutts still advises to add the 301 redirect. He commented on that in his blog in the last couple of days.
| 1:13 am on Aug 9, 2006 (gmt 0)|
The Tools would be great if they correctly showed useful information, specifically, why a site with decent PageRank is being penalized or filtered in the SERPs. If honest webmasters had this information, I think you would be surprised how many of us would make a serious effort to clear up the issues that are causing these penalties and filters. This rampant speculation about why this and why that is happening, and all the testing of “theories”, is what gets us upset and in many cases leads to many of the black hat tactics used to get our pages reinstated. Put tools in place that provide us with information we can use to fix problems with our sites. Obviously Google knows very clearly why a penalty or filter is being generated, so why can’t Google provide us with this information so we can fix the problem?
| 1:14 am on Aug 9, 2006 (gmt 0)|
If that's the case, then this is just a cosmetic fix and does not deal with the real issue. Googleguy, can you clarify this?