I have a website that is very new and was indexed about a month ago. It is built as static pages in a directory structure, for example www.domain.com/contact/. All 15 or so pages are done this way, with the default page in each folder being index.asp.
What is weird is that Google has found every index.asp page in the subfolders, even though I have not linked to any of them. I triple-checked all navigation, and I also just added the canonical tag to each page, pointing to the www version with the directory URL only.
I am very confused as to how Google found each default page, especially since each default page would have to be linked to directly in some form.
Any thoughts? I didn't want to drop a link in here as I was not sure if I could.
You've got one of the classic "canonical url" issues going here. How did Google find those urls? Here's one possibility: Googlebot has routines that test server responses for different variations of urls. In your case, your server did resolve the index.asp urls and for whatever reason, Google chose that version as the canonical form of the page's address. If that is what happened, it seems quite perverse not to choose the version that you have in your links -- but that's machine "intelligence" for you.
Here's another possibility. Does your server change the url in the browser window when a pure directory is requested, adding the index.asp automatically? I've seen that happen on IIS servers, and when it does, Google will index the final url, and not the intermediate one.
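One way to tell these two possibilities apart is to request both forms of the url without following redirects and look at the raw status code. The following is only a rough sketch in Python: the function names `probe` and `describe` are made up for illustration, and the host and paths in the usage comment are placeholders taken from the example in this thread, not a real site.

```python
import http.client

REDIRECT_CODES = (301, 302, 303, 307, 308)

def probe(host, path, use_https=False):
    """Request `path` once, without following redirects.

    Returns (status_code, Location header or None).
    http.client never follows redirects on its own, so the status
    returned here is the server's first answer.
    """
    conn_cls = http.client.HTTPSConnection if use_https else http.client.HTTPConnection
    conn = conn_cls(host, timeout=10)
    try:
        # HEAD keeps the check light; some servers answer HEAD
        # differently from GET, so swap in "GET" if results look odd.
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

def describe(status, location):
    """Turn a (status, location) pair into a short human-readable note."""
    if status in REDIRECT_CODES:
        return f"redirects to {location}"
    if status == 200:
        return "served directly (no redirect)"
    return f"unexpected status {status}"

# Hypothetical usage (placeholder host and paths):
# for path in ("/seo/", "/seo/index.asp"):
#     print(path, "->", describe(*probe("www.domain.com", path)))
```

If the directory url reports a redirect to the index.asp form, that points to the second possibility. If both forms are served directly with a 200, both urls resolve to the same content, which fits the first.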
At any rate, you may find some ideas and guidance in this thread:
Canonical URL Issues - including some new ones [webmasterworld.com]
That thread is always available in the Hot Topics area [webmasterworld.com], which is always pinned to the top of this forum's index page.
Thanks for the reply. You may be right that Google tested different URL paths against the server. I in no way linked to the actual page names, and in some cases I used absolute links within the navigation just to start things off right.
As I mentioned before, I also have the canonical tag on each page, referencing the absolute URL.
To answer your last question: no, the server does not add index.asp to the URL in the browser when a directory is requested.
Here is another example of my URL and what is happening:
URL indexed: www.domain.com/seo/index.asp
What I want indexed: www.domain.com/seo/
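To make the mapping above concrete, here is a small Python sketch of the rule "strip the default document, keep the folder", along with the canonical tag that would point at the result. It is illustrative only: the helper names, the list of default document names, and the http:// scheme are assumptions, and the real site is ASP on IIS, so this is not how the pages themselves are built.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumed default document names; adjust to match the server's settings.
DEFAULT_DOCS = ("index.asp", "default.asp")

def directory_url(url):
    """Return the folder form of a URL that ends in a default document.

    URLs that do not end in one of DEFAULT_DOCS are returned unchanged.
    """
    parts = urlsplit(url)
    head, _, last = parts.path.rpartition("/")
    if last.lower() in DEFAULT_DOCS:
        path = head + "/"   # drop the file name, keep the trailing slash
    else:
        path = parts.path
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))

def canonical_tag(url):
    """Build a canonical link element pointing at the folder form of `url`."""
    href = escape(directory_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'

# Example with the placeholder domain from this thread (scheme assumed):
print(directory_url("http://www.domain.com/seo/index.asp"))
print(canonical_tag("http://www.domain.com/seo/index.asp"))
```

The same helper could be run over the list of indexed URLs to see which of them would collapse onto a folder URL that is already indexed.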
Another interesting thing is that only about half the pages have actually been indexed under both the index.asp URL and the folder URL. The other half are fine.
The only thing I can think of is that my site is now on IIS 7, which I had never hosted on before this new site. Maybe IIS 7 handles directory structure differently. Something I will be poking into this afternoon ;)