
Changed all my URLs

Decided to remove the filename and query string...


hitchhiker

2:56 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



A sudden decision, but I went for it: last night I changed all 470 new pages (added this month, not in the index yet) to rewritten clean URLs.

a) Will Google see those new URLs (assuming the update is not within the next 2 days)?
b) Even though the deep crawl got some of the old ones (which still work), will it recognise the new ones and move over?
c) IMPORTANT: should I use

www.widgets.com/something/index.htm
or just
www.widgets.com/something/

It's amazingly refreshing to no longer display dynamic URLs!

magicsoftware

2:59 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



Hi hitchhiker,

How did you manage to change so many pages so quickly?
IMO www.widgets.com/something/ is better, since it's shorter.
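Whichever form you pick, it's worth making the other one redirect to it, so the spider only ever sees a single canonical URL for each page. A minimal mod_rewrite-style sketch of that idea (the rule itself is illustrative, not from anyone's actual config):

```apache
RewriteEngine On
# Permanently redirect any .../index.htm request to its bare directory URL,
# so only www.widgets.com/something/ ever gets indexed
RewriteRule ^/(.*)index\.htm$ /$1 [R=301,L]
```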

garylo

3:26 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



In any case, the changes will not be reflected in the coming update.

hitchhiker

3:27 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



URL rewriting (in this case on IIS)

Previously I had:

.com/widget.aspx?lc=(xx Pageid)&lcid=(xx localeid)

I changed, and then dropped LCID in favour of separate domains:

.de/something/ = German something page
.ru/something/ = Russian something page
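For illustration, a rule along these lines does the internal mapping. IIS has no mod_rewrite built in, but the common third-party filters (e.g. ISAPI_Rewrite) borrow mod_rewrite's syntax; the lc parameter name is from my old URL above, and the pattern itself is just a sketch:

```apache
# Serve the clean per-locale URL from the old dynamic handler internally;
# the locale now comes from the domain, so only the page id is needed
RewriteRule ^/([^/]+)/$ /widget.aspx?lc=$1 [L]
```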

Critter

3:28 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



Little-known secret for changing dynamic URLs: do a search on "apache mod_rewrite"
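A minimal example of the kind of rule that search turns up (script and parameter names here are made up for illustration):

```apache
RewriteEngine On
# Serve the clean URL /widgets/42/ from the existing dynamic script,
# without the visitor (or the spider) ever seeing the query string
RewriteRule ^/widgets/([0-9]+)/$ /generic.php?param=$1 [L]
```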

Peter
<^_^>

hitchhiker

3:28 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



garylo? why?

MetropolisRobot

3:30 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



I'm assuming that hitchhiker discovered the wonders of the rewriter within Apache (an arcane piece of software that is very addictive).

I did the same thing and here are my observations (although I did it with a servlet mapping system).

1) Leave the old pages up for one deep-crawl update cycle and then phase them out. Google seems to get unhappy if your site changes radically.

2) It resulted in URLs that were easier to send to people, and that was a Good Thing.

3) It does not seem to have made any overall difference to the number of URLs in Google at the end of the day; basically, Google was indexing the parameter versions too.

MetropolisRobot

3:32 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



Oh, and the reason you won't be in THIS update?

The deepcrawl for this update happened 2 weeks ago.

Basically the cycle is:

google update at Tzero
deepcrawl at T+5 to T+10
google update at T+28(+), in which the index features what the deepcrawl found at T+5 to T+10

where T = time in days

cheers
p

hitchhiker

3:35 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



MetropolisRobot:
Had to re-invent the wonders of mod_rewrite:
<iis><mutter><fume><groan>

Exactly the info I was after though, thanks. I'm just hoping in the long run it'll also pay off not having any file extensions showing up on my site, as in that URI guidelines document (all URLs should stay the same, etc.).

Every shift in tech has always meant a new set of extensions... it's getting messy, and I'm out of that game now.

MetropolisRobot

3:36 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



IMHO, Hitchhiker, a good move. I am well happy with my changes.

garylo

3:37 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



garylo? why?

The coming update is indexed from a deep crawl that has already been done; you probably missed that. The deep crawl that will affect your changes will start following the coming update.

hitchhiker

3:38 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



Yep, I was deep-crawled thoroughly in early March, but I'm hoping the fresh crawl can do the trick by noticing the changes? Yeah, yeah, I know, long shot...

Perhaps if I PERM (301) redirect the old URLs instead of letting them through...

<MUTTER><MORE REWRITING!>
<edit>I got the deepcrawl time wrong</edit>
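For what it's worth, a sketch of that PERM-redirect idea in mod_rewrite-style syntax (the lc parameter name is from my earlier post; the pattern itself is illustrative). The trailing ? on the target strips the old query string from the redirect:

```apache
# 301 the old dynamic URL to its clean equivalent so spiders move over
RewriteCond %{QUERY_STRING} ^lc=([^&]+)
RewriteRule ^/widget\.aspx$ /%1/? [R=301,L]
```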

[edited by: hitchhiker at 3:55 pm (utc) on April 6, 2003]

mbennie

3:52 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



You should be pleased with the results, hitchhiker.

I did the same thing (it's a bit tricky with IIS) a few months ago. The next cycle, all of my pages got indexed, and most of them hold the #1 spot for their keyphrase.

hitchhiker

3:58 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



Nice, mbennie. I'm pretty p**sed off with the .NET model as far as indexing is concerned (especially those JS postbacks, which never get indexed).

I think most IIS admins should look into this; it's just one of those things. As much as GG assured us that they try to crawl all pages (and I'm sure they will soon), my .aspx pages never seemed to do that well. I know this will change things.

MetropolisRobot

4:01 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



my something.php or something.jsp pages have always done better than my generic.php?param= or generic.jsp?param= pages, if that sentence makes any sense ;)

I'm assuming that .asp, .aspx, etc. have the same issues.

hitchhiker

4:05 pm on Apr 6, 2003 (gmt 0)

10+ Year Member



Yep, no real evidence, but query-less pages just seem to do better. I've removed everything now, even the filename.type; maybe I went too far, but that's future-proofing, to be honest.

I am now proudly..

www.widgetworld.com/somewidgetpage/
www.widgetworld.se/somewidgetpage/ (Swedish widgets)

hitchhiker

7:44 am on Apr 10, 2003 (gmt 0)

10+ Year Member



Today, 5 days later, Google at 5am spidered 268 of 470 pages.
Previously it got around 80.

Perhaps it's a coincidence :)

Thanks ppl for an excellent forum!

snook

12:31 pm on Apr 10, 2003 (gmt 0)

10+ Year Member



We just finished doing the same thing to our site (mod_rewrite) a few days ago.
Glad the update hasn't come yet...

Regular Googlebot has come a couple of times since our change, but we haven't been so lucky: none of the new URLs have been picked up yet. The only thing I see different is that they picked up some new text on our index page.

snook

12:44 pm on Apr 10, 2003 (gmt 0)

10+ Year Member



Oops, I re-read this thread and see the deep crawl did come :(

I hope I get as lucky as you, hitchhiker.

killroy

1:07 pm on Apr 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just back from a massive rewriting session of around 20-30,000 URLs from some REALLY old script. This might be useful to others. I finally managed my desired DOUBLE rewrite effect.
I now redirect all old URLs via 301 to the new URLs, and then rewrite the new URLs back to the old ones internally. It works BEAUTIFULLY!

Catch all old URLs that are not new URLs and 301 them to the new form:
RewriteCond %{REQUEST_URI} !^/sitearea(.*)$
RewriteRule ^/cgi-bin/script.ext(.*)$ /sitearea$1 [R=301,L]

Now catch all new addresses and map them back to the original ones internally:
RewriteRule ^/sitearea(.*)$ /cgi-bin/script.ext$1 [L]

So all existing Google links turn smoothly into the new addresses, and Googlebot is already busy following the 301s too.

BTW: what exactly does Google do with a 301? Consider it a link to the target page? Index the target page but not count it as a link?

SN