1st site:
First Deepcrawl picked up all 77 products (id=1 thru id=77)
2nd site:
Has been through 3 Deepcrawls and to date has yet to have any product detail pages (abc=00001-99999) indexed. It should be noted that I used a 3-character identifier here that doesn’t contain “id”, whereas I used the actual “id=” on the 1st site. In addition, both sites have fewer than 100 products; I just used a 5-digit product id numbering convention on this 2nd site.
On May 9th, Freshbot grabs a new product added to the DB. The kicker is I didn’t have a new product id at the time, so I assigned it a “0”. So Freshbot grabbed a “product-detail.asp?abc=0”.
My Opinion:
It appears that, at least on newer sites, Deepbot & Freshbot initially shy away from large ID numbers. Makes sense: they must think there’s perhaps more work there than they are ready to commit to yet on a new customer.
Whenever possible, start ids at “id=1” instead of “id=10000”. Nice lesson from Freshbot & more DB work to do before the crawlers return. I feel like the next Deepcrawl will pick up my “id=10000”s, but it’s another long wait if he doesn’t.
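For anyone wanting to try the same renumbering, here is a minimal sketch of the idea, not my actual DB code. It maps existing 5-digit ids to compact ids starting at 1 and rebuilds the detail URLs. The function names and the sample ids are made up for illustration; only "product-detail.asp" and the "abc" parameter come from my site.

```python
def renumber_ids(old_ids):
    """Map existing product ids to compact ids starting at 1, keeping sort order."""
    unique_sorted = sorted(set(old_ids))
    return {old: new for new, old in enumerate(unique_sorted, start=1)}


def detail_url(product_id, page="product-detail.asp", param="abc"):
    """Build a product detail URL for the given id."""
    return f"{page}?{param}={product_id}"


# Hypothetical sample ids in the old 5-digit convention.
old_ids = [10000, 10001, 10007, 10042]
mapping = renumber_ids(old_ids)
urls = [detail_url(mapping[old]) for old in old_ids]
```

Keep in mind that renumbering changes every product URL, so any internal links pointing at the old ids have to be regenerated from the same mapping.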
Google itself states that indexing of dynamic pages is dictated by a number of factors including PR and site size (I think... possibly I read that somewhere else). How big are your sites? Could that have something to do with it? Or I guess it could be that the cut-off is between 7 and 12 characters.
They both went online with around 75 pages. The 2nd older site (PR4, should be PR5 this update but who knows now) has put on 10 more static pages and many more backlinks but the products on both hover under 100.
It's just odd that Freshbot ONLY grabbed the first product detail page ever off of the site, & it had a new anomalous one-digit id, with product detail links all around it with 5-digit ids. Freshbot has returned since the first grab and only grabbed the one-digit id again.
I'd say there are very high odds it's just a coincidence, but it just might expedite indexing of a new dynamic site if one starts at "id=1" whenever possible, couldn't hurt.
Freshbot is at the moment picking up all of these changed 2-digit id product detail urls/pages. And it's not because it's fresh w/ the digit change: these product detail URLs are dynamically generated and change frequently; Freshbot (& Deepbot) never gave them a look until that accidental "id=0".
I wonder if using 'id' has anything to do with it?
Didn't appear to here as I changed only the digits on id# & left the "abcID=" format intact.
This APPARENTLY had an effect on Freshbot, as he is now so aroused he's disobeying the robots.txt and gobbling up disallowed duplicate product detail pages (same product, different picture, different disallowed page).
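Before blaming the bot, it may be worth double-checking that the robots.txt rule really covers the URLs being fetched. A rough sketch using Python's standard urllib.robotparser is below. The rule and the "product-detail-alt.asp" page name are assumptions for illustration, not my real file, and this parser may not interpret every rule exactly the way Google's crawler does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks a duplicate detail page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /product-detail-alt.asp
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


main_ok = is_allowed(ROBOTS_TXT, "Googlebot", "/product-detail.asp?abc=5")
dup_ok = is_allowed(ROBOTS_TXT, "Googlebot", "/product-detail-alt.asp?abc=5")
```

If dup_ok comes back False for a URL that still shows up in the logs, the rule is written correctly and the bot really is ignoring it.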