After all this work, I expect some rewards, but I'm curious about the time frames and how internal pages are ranked.
Under my old dynamic site, Google would actually spider my site; however, all internal pages had no PR, despite the home page having a PR of 5, and all the internal pages were supplemental.
My understanding is that now, with static content, my home page PR will drop to 4, first-level links will get a PR of 3, second-level links a PR of 2, etc.
My new static-looking pages were crawled after only 3 days; however, I don't see the internal pages indexed yet.
My understanding is that Toolbar PageRank is only updated about 4 times a year, and it has already been close to 90 days since the last Toolbar update. I figure, though, that PR is calculated internally and used by Google much more quickly.
I'm looking for a time frame from others who have gone from dynamic to static: how long did it take for internal pages to get indexed and become visible in search results?
Our site has been listed for seven years, so I figure the sandbox (if it actually exists and is not an algorithmic artifact) will not be a factor.
I guess I'm a little impatient to reap the fruits of my labour.
One thing we've all learned over the past 12 months is that you can take Google to the water, you may even get him to drink ... but whether he swallows is a whole different ball game.
Best Wishes & Just Desserts for your Hard Work.
Col :-)
www.example.com/cgi-bin/script.pl?page=pagex&cart_id=number
to
www.example.com/pagex for spiders and bots and
www.example.com/pagex/cartnumber for users
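For reference, the mod_rewrite side of that mapping looks roughly like this. This is a minimal sketch only, assuming Apache with mod_rewrite in a document-root .htaccess; script.pl and the parameter names are just the ones from the URLs above.

RewriteEngine On
# Don't rewrite requests for real files (images, CSS, the script itself).
RewriteCond %{REQUEST_FILENAME} !-f
# /pagex/cartnumber -> /cgi-bin/script.pl?page=pagex&cart_id=cartnumber (users)
RewriteRule ^([^/]+)/([^/]+)$ cgi-bin/script.pl?page=$1&cart_id=$2 [L]
RewriteCond %{REQUEST_FILENAME} !-f
# /pagex -> /cgi-bin/script.pl?page=pagex (spiders and bots)
RewriteRule ^([^/]+)$ cgi-bin/script.pl?page=$1 [L]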
I did have a hundred or so old dynamic URLs indexed, but they were all supplemental and didn't count for much anyway. I'm letting those 404 out to get rid of them.
With a PR of 5, I expect Google to index my whole site (around 500 product pages). I was hoping for 6 weeks or less.
My assumption is that the sandbox applies to the site, not individual pages?
Everyone here is pretty much in the same boat.
All the best
Col :-)
Google is doing something strange.
I checked my raw log file, and Google is still fetching the old dynamic URLs as of today, even though the new static site has been up for a week.
The only thing I can figure is that Google spiders the URI path only on a first run, then spiders the actual content at a later date.
However, it would not make sense for Google to do this, as it would add overhead; Google is better off doing everything in one pass.
The site went online at the end of August. Within 2 weeks Google had the new pages indexed with new titles and descriptions, and the old ASP pages turned Supplemental Results (Matt Cutts recently stated that SR does not carry a penalty, so that's not a problem).
Keyword rank steadily increased until major keywords were ranking on the first page of results within 3 months. PR recently turned to PR 3 on all but a few pages.
This appears to be the norm for redesigned websites. I think you just need to wait a bit longer to see results.
If the old URLs serve content with status "200 OK" then Google will continue to index them. You need the old URLs to each serve a 301 redirect to their new respective URL, or send "404 Not Found" for each request. All of your internal linking needs to use the new style URLs on every page. Use Xenu LinkSleuth to make sure that is so.
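If you're on Apache, the 301s can be done with mod_rewrite along these lines. This is just a sketch, assuming the old script.pl URLs from earlier in this thread; the query string has to be matched with RewriteCond, because RewriteRule patterns never see it.

RewriteEngine On
# Old: /cgi-bin/script.pl?page=pagex&...  ->  New: /pagex
RewriteCond %{QUERY_STRING} (^|&)page=([^&]+)
# The trailing "?" drops the old query string from the redirect target.
RewriteRule ^cgi-bin/script\.pl$ /%2? [R=301,L]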
I guess we'll never really know how important an individual page or link is to google, so it's probably best to treat them all with equal respect.
All the best
Col :-)
I guess Google did scan the proper directories. When I was looking at the weblogs, I forgot that the URL paths are what they look like after mod_rewrite, not what Google actually saw.
www.example.com/pagex for spiders and bots and www.example.com/pagex/cartnumber for users
Am I correct in assuming that you are doing something like:
www.example.com/blue_widgets
for bots and
www.example.com/blue_widgets/c2v3f56r65ws
for normal users?
If so, you could run into some real trouble with duplicate content penalties as well as getting slammed for cloaking. You would be much better off, IMHO, with the following:
www.example.com/blue_widgets.html
for bots and
www.example.com/blue_widgets.html?cart=c2v3f56r65ws
for normal users
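With that scheme, the rewrite only has to map the path, and the cart parameter rides along in the query string. Something like this would do it (again just a sketch with the hypothetical names above, assuming the script.pl setup from earlier in the thread; the QSA flag appends the incoming query string to the rewritten one):

RewriteEngine On
# /blue_widgets.html          -> script sees page=blue_widgets
# /blue_widgets.html?cart=... -> script sees page=blue_widgets&cart=...
RewriteRule ^([^/]+)\.html$ cgi-bin/script.pl?page=$1 [QSA,L]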
Actually, you would be even better off not using session IDs at all, since the vast majority of users have cookies enabled. I used to have my site set up as in my example, and even using IP delivery to detect bots, I ended up with pages with session IDs getting indexed.
I resisted the idea of eliminating SIDs for a long time, assuming that a significant number of users would have cookies turned off. I set some code up to log whenever someone tried to add an item to their cart with cookies disabled, and it was very rare; it usually happened when someone bookmarked a cart page and tried to go back to it. Of course, YMMV.
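If anyone wants to run the same experiment, it doesn't even need application code; Apache can log it directly. A sketch, assuming mod_setenvif and mod_log_config, where the "cart" substring in the add-to-cart URL is hypothetical:

# Flag requests to the cart, then clear the flag if any Cookie header
# is present, so only cookie-less cart requests reach the log.
SetEnvIf Request_URI "cart" cookieless_cart
SetEnvIf Cookie "." !cookieless_cart
CustomLog logs/cookieless_carts.log combined env=cookieless_cart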
As far as duplicate content goes, the old dynamic pages have no PR and the new static-looking pages will have PR, so if there is a duplicate content penalty, Google will get rid of the old dynamic pages, which is what I want anyway.
I don't think GoogleGuy is around anymore, but I have heard these concerns before, and I have heard them refuted. It would be nice to hear it from an authoritative source, to dispel this once and for all.
Is this a must? I've been maintaining links to the old URLs, which 301. So far MSN and Yahoo have been quick to replace the old with the new. A month after I activated the 301s, Google is starting to index the new URLs (which go to subdomains of the old URL).
So far, running a search for a text snippet that belongs to both the old and the new URL returns only the new URL.