
Google SEO News and Discussion Forum

How Google reads URLs with "#"

 5:08 am on Feb 1, 2006 (gmt 0)

Does anyone have an idea? I'm afraid it can cause duplicate content because, for example, www.domain.com/page1.htm and www.domain.com/page1.htm#part2 will basically point to the same webpage.

Would it also cause some sort of URL unfriendliness?




 7:03 am on Feb 1, 2006 (gmt 0)

I'd say #anchor is not a part of the URL actually; the browser doesn't send it to the server with the HTTP request.
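
To illustrate the point about the browser not sending the anchor, here is a minimal Python sketch of how a client might derive the request target from a link. The helper name `request_target` is made up for this example, and the URL is the placeholder from the first post; real browsers do more work than this.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path (plus query, if any) that would go on the HTTP request line.

    Hypothetical helper for illustration: the fragment is never included.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # parts.fragment is deliberately ignored: it stays on the client side.
    return target


print(request_target("http://www.domain.com/page1.htm#part2"))  # prints /page1.htm
```

The part after # is only used by the browser to scroll to the right place once the page has loaded.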

Google doesn't treat /page.html#anchor as a different URL from /page.html. It is possible that keywords after the # mark matter a little, but in Google's link database everything after the # is stripped.

I'm sure about this after checking one of my sites, which uses #anchor links extensively, with the site: command. The site is old and completely crawled. Links with # show up in the source of my page in Google's cache, but the site: command shows only the plain URLs.
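
A rough sketch of what "stripping everything after #" could look like when collecting links. This is only an illustration in Python, not Google's actual code; `canonical_urls` is a hypothetical name, and the URLs are placeholders.

```python
from urllib.parse import urldefrag


def canonical_urls(links):
    """Drop fragments and return each distinct URL once, in first-seen order."""
    seen = []
    for link in links:
        url, _fragment = urldefrag(link)  # splits "page#anchor" into ("page", "anchor")
        if url not in seen:
            seen.append(url)
    return seen


links = [
    "http://www.domain.com/page1.htm",
    "http://www.domain.com/page1.htm#part2",
]
print(canonical_urls(links))  # prints ['http://www.domain.com/page1.htm']
```

With fragments removed this way, both links collapse to a single entry, which matches what the site: command shows.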


 8:19 pm on Feb 1, 2006 (gmt 0)

Google does not have a problem with the # in URLs; it correctly ignores them.

Some other badly written bots, however, do have problems with them and actually send requests that include them. But these are usually hostile bots (email harvesters, site copiers) that you don't want visiting anyway.
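
If you want to spot those bots in your logs, one possible check is to look for a literal # in the request target, since a well-behaved client never sends one. A small Python sketch, assuming the log gives you the raw request line; `target_has_fragment` is a made-up helper name.

```python
def target_has_fragment(request_line: str) -> bool:
    """Return True if the request target contains a literal '#'.

    Expects a request line such as "GET /page.html HTTP/1.1".
    Malformed lines (not exactly three fields) are reported as False.
    """
    parts = request_line.split()
    if len(parts) != 3:
        return False
    _method, target, _version = parts
    return "#" in target


print(target_has_fragment("GET /page.html#anchor HTTP/1.1"))  # prints True
print(target_has_fragment("GET /page.html HTTP/1.1"))         # prints False
```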


 8:28 pm on Feb 1, 2006 (gmt 0)

I'd say #anchor is not a part of the URL actually

Agreed, the #anchor is not part of the URL, but it is a component of the URI.


 12:50 am on Feb 2, 2006 (gmt 0)

I linked to a page using ONLY /the.page.html#sectionA and /the.page.html#sectionB and Google indexed the page, and correctly listed it as /the.page.html with no problems at all.


 1:00 am on Feb 2, 2006 (gmt 0)

I've never had a problem with Googlebot not understanding this, but Google's AdSense ("Mediapartners") spider sometimes converts the "#" to a hex code and appends that to my page names, resulting in a couple of dozen 404s every day.

Hard to believe they can't detect and fix this bug, but it's been going on for at least a year, as recently as yesterday.
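
For anyone wondering why that produces a 404: assuming the hex code is %23 (the percent-encoding of #), the encoded character no longer acts as a fragment delimiter, so it becomes part of the requested path. A Python sketch with a placeholder URL; `strip_encoded_fragment` is a hypothetical workaround, not anything Google or a web server provides.

```python
from urllib.parse import urlsplit

ENCODED_HASH = "%23"  # assumption: the "hex code" is the percent-encoded '#'


def strip_encoded_fragment(path: str) -> str:
    """Cut the path at the first encoded '#', returning the intended page path."""
    head, _sep, _tail = path.partition(ENCODED_HASH)
    return head


bad = "http://www.domain.com/page1.htm%23part2"
parts = urlsplit(bad)
print(parts.path)                          # prints /page1.htm%23part2
print(repr(parts.fragment))                # prints ''
print(strip_encoded_fragment(parts.path))  # prints /page1.htm
```

A server could use something like this to redirect such requests to the real page, with the caveat that it would also rewrite any legitimate path that happens to contain %23.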


 1:31 am on Feb 2, 2006 (gmt 0)

Send in a report using the feedback forms.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved