I have a site that is sitting on the Intershop 4.03 platform.
We have been working on the site and have removed most of the issues associated with the poor crawlability of a dynamic site:
we removed the frames, removed the cgi-bin from the URL, removed the session ID from the URL, and included a site map listing all the DSE links (direct store entry links, i.e. direct links to individual products, with no session IDs and no cgi-bin in the URLs).
As you all know, Intershop is an e-commerce solution used by more than 20,000 people around the web, and this is an ongoing problem that has not been solved in the 5 or so years Intershop has been running.
Here is how the site and its redirects work at the moment, and what we have done so far in attempting to resolve this worldwide problem.
Previously the site would behave as follows:
www.domain.com --> ASP redirect to
store.domain.com/cgi-bin/company.storefront/12321esqdJHKJHSKJH2312j3hk1j2h3kj1he2jk1dkjn1d1/main/
When you clicked on a product, the link looked like the following:
http //store.domain.com/cgi-bin/company.storefront/12321esqdJHKJHSKJH2312j3hk1j2h3kj1he2jk1dkjn1d1/product/diamonds/2034
The session ID is automatically generated by the IIS webserver.
The site was also originally in frames; these have been removed.
It now behaves like this:
www.domain.com --> asp redirect to
www.domain.com/customer.storefront/
If you are wondering what customer.storefront is: it is a server-side mapping to a DLL file that processes information from the SQL database, based on the parameters entered after the file name.
For example, to call up a product:
www.domain.com/customer.storefront/producttemplate/23445
The parameters are this bit: producttemplate/23445
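To make the split concrete, here is a rough Python sketch of how such a URL divides into the mapping name and its parameters. The function name split_storefront_url is made up for illustration; the real Intershop DLL may well parse its input differently.

```python
from urllib.parse import urlsplit


def split_storefront_url(url, mapping="customer.storefront"):
    """Split a storefront URL into (mapping, parameters).

    Hypothetical helper: it only mirrors the description in this post,
    not the actual behaviour of the Intershop DLL.
    """
    # The examples in this post carry no scheme, so add "//" to let
    # urlsplit recognise the host part.
    path = urlsplit(url if "://" in url else "//" + url).path
    segments = [s for s in path.split("/") if s]
    if mapping not in segments:
        raise ValueError(f"{mapping!r} not found in {url!r}")
    i = segments.index(mapping)
    # Everything after the mapping name is passed on as parameters.
    return segments[i], segments[i + 1:]


mapping, params = split_storefront_url(
    "www.domain.com/customer.storefront/producttemplate/23445"
)
print(mapping, params)
```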
Now here is the problem;
The front page of the site is indexed and is performing REALLY well on ALL major search engines, sometimes achieving top, page-one results.
However, all the links on this front page are DSEs, or direct store entries, meaning they link directly to the product or category pages and contain no session IDs or anything odd like that.
They look like this: www.domain.com/customer.storefront/EN/category/3344
The front page is being cached and spidered by Google, but all the other pages that lead off this main page are being ignored!
I think it may have something to do with this server mapping:
www.domain.com/customer.storefront
Could the customer.storefront bit be stopping Google from getting into the pages?
Please take into account that this is a dynamically generated site (you know this, I am sure); the code for each template page is stored in a SQL database, so individual pages don't actually physically exist...
What if I arrange to have an ISAPI rewrite filter (something like mod_rewrite) installed on the IIS server
that changes the URLs to something more friendly, like:
www.domain.com/customer.storefront
into: www.domain.com/default.asp
and any product page from:
www.domain.com/customer.storefront/EN/product/3234
to
www.domain.com/product/2345.htm
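As a sketch only, this is the kind of mapping such a filter would have to perform, written in Python rather than as real ISAPI or mod_rewrite rules. The function name to_internal_path, the regular expression, and the fixed "EN" language segment are all assumptions made for illustration.

```python
import re

# Friendly product URLs of the form /product/<id>.htm (assumed pattern).
FRIENDLY_PRODUCT = re.compile(r"^/product/(\d+)\.htm$")


def to_internal_path(friendly_path, language="EN"):
    """Map a friendly path to the internal storefront path.

    Returns None when the path is not one of the friendly forms,
    so the request can be handled as usual.
    """
    if friendly_path == "/default.asp":
        return "/customer.storefront/"
    match = FRIENDLY_PRODUCT.match(friendly_path)
    if match:
        # Reuse the product id from the friendly URL unchanged.
        return f"/customer.storefront/{language}/product/{match.group(1)}"
    return None


print(to_internal_path("/product/3234.htm"))
print(to_internal_path("/default.asp"))
```

The rewrite would happen inside the server, so the spider only ever sees the friendly form; the links on the front page would have to be changed to the friendly form as well.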
Would that help? Any ideas?
Cheers
[edited by: pageoneresults at 5:02 pm (utc) on July 9, 2003]
[edit reason] Delinked Examples - No Specifics Please [/edit]
www.domain.com/customer.storefront
I suspect the SEs interpret the "customer" portion of the URL as a file name and "storefront" as an unrecognized extension. Since they can't identify a file type from the extension, they don't index the page.
www.domain.com/product/2345.htm
We know they will index this URL. Make yours similar.
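To illustrate the guess above (and it is only a guess; nobody outside the search engines knows how they really classify URLs), here is a small Python sketch that pulls out whatever looks like a file extension in each path segment. The list of "known" extensions is invented for the example.

```python
import posixpath
from urllib.parse import urlsplit

# Invented list for illustration; not taken from any search engine.
KNOWN_EXTENSIONS = {".htm", ".html", ".asp"}


def segment_extensions(url):
    """Return the apparent extension of every path segment that has one."""
    # The example URLs have no scheme, so add "//" to mark the host part.
    path = urlsplit(url if "://" in url else "//" + url).path
    extensions = []
    for segment in path.split("/"):
        ext = posixpath.splitext(segment)[1]
        if ext:
            extensions.append(ext)
    return extensions


for url in (
    "www.domain.com/customer.storefront/EN/category/3344",
    "www.domain.com/product/2345.htm",
):
    found = segment_extensions(url)
    recognised = all(ext in KNOWN_EXTENSIONS for ext in found)
    print(url, found, "recognised" if recognised else "unrecognised")
```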