|
How Google reads URL with "#"
|
jcmiras
#:766668
| 5:08 am on Feb. 1, 2006 (utc 0) |
Does anyone have an idea? I'm afraid it can cause duplicate content because, for example, -www.domain.com/page1.htm and -www.domain.com/page1.htm#part2 will basically point to the same webpage. Would it also cause some sort of URL unfriendliness? Thanks.
|
Wizard
#:766669
| 7:03 am on Feb. 1, 2006 (utc 0) |
I'd say the #anchor is not actually part of the URL; the browser doesn't send it to the server with the HTTP request. Google doesn't treat /page.html#anchor as a different URL from /page.html. It might be possible that keywords after the # mark matter a little, but in Google's link database everything after the # is stripped. I'm sure about this after using the site: command to check one of my sites, which uses #anchor links extensively. The site is old and completely crawled; links with # show up in the source of my pages in Google's cache, but the site: command shows only the bare URLs.
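
If you want to see the "everything after # is stripped" behaviour for yourself, here is a minimal sketch using Python's standard library. It is only an illustration of dropping the fragment, not a claim about how Google's crawler is actually implemented; the helper name canonical() and the http:// prefix on the example URLs are assumptions added for the demo.

```python
from urllib.parse import urldefrag


def canonical(url: str) -> str:
    """Return the URL with any #fragment removed (hypothetical helper)."""
    # urldefrag splits a URL into (url_without_fragment, fragment).
    return urldefrag(url).url


# Example URLs modelled on the ones in this thread (scheme added as an assumption).
urls = [
    "http://www.domain.com/page1.htm",
    "http://www.domain.com/page1.htm#part2",
]

# After stripping fragments, both links collapse to a single page.
unique = {canonical(u) for u in urls}
print(unique)
```

Both links end up as the same entry in the set, which matches what the site: command shows: one page, no duplicates.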
|
Dijkgraaf
#:766670
| 8:19 pm on Feb. 1, 2006 (utc 0) |
Google does not have a problem with the # in URLs; it correctly ignores it. Some badly written bots, however, do have problems with it and actually send requests that include the fragment. But these are usually hostile bots (email harvesters, site copiers) that you don't want visiting anyway.
|
mrMister
#:766671
| 8:28 pm on Feb. 1, 2006 (utc 0) |
| I'd say #anchor is not a part of URL actually |
Agreed, the #anchor is not part of the URL, but it is a component of the URI.
|
g1smd
#:766672
| 12:50 am on Feb. 2, 2006 (utc 0) |
I linked to a page using ONLY /the.page.html#sectionA and /the.page.html#sectionB and Google indexed the page, and correctly listed it as /the.page.html with no problems at all.
|
jomaxx
#:766673
| 1:00 am on Feb. 2, 2006 (utc 0) |
I've never had a problem with Googlebot not understanding this, but Google's AdSense ("Mediapartners") spider sometimes converts the "#" to a hex code and appends that to my page names -- resulting in a couple of dozen 404s every day. Hard to believe they can't detect and fix this bug, but it's been going on for at least a year, and it happened as recently as yesterday.
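
To show why the hex-encoded "#" ends up as a 404, here is a rough sketch in Python. It assumes the hex code in question is %23 (the standard percent-encoding of "#"), and the helper strip_escaped_fragment() is a hypothetical workaround of my own, not something Google or any server provides.

```python
from urllib.parse import quote, urlsplit

# "#" percent-encodes to "%23".
encoded_hash = quote("#")

# A normal link: the part after "#" is a fragment and never reaches the server.
good = urlsplit("http://www.domain.com/page1.htm#part2")

# The same link with "#" escaped: the fragment is now glued onto the path,
# so the server is asked for a file that does not exist.
bad = urlsplit("http://www.domain.com/page1.htm" + encoded_hash + "part2")

print(good.path, good.fragment)
print(bad.path, bad.fragment)


def strip_escaped_fragment(path: str) -> str:
    """Hypothetical repair: cut the path at the first escaped '#'."""
    return path.split("%23", 1)[0]


print(strip_escaped_fragment(bad.path))
```

A server-side rewrite along the lines of strip_escaped_fragment() could redirect those requests to the real page, assuming none of your legitimate file names contain %23.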
|
g1smd
#:766674
| 1:31 am on Feb. 2, 2006 (utc 0) |
Send in a report using the feedback forms.
|