I removed all links to page1.htm from my site and added links to page2.htm.
In this case, how does Googlebot work? Does it come to my site looking for page1.htm, or does it follow the links from my home page?
If I leave page1.htm on the server without links, will Googlebot and/or Freshbot visit it again?
If not, does page1.htm disappear from the search engine over time?
Thanks for the help. Even though these questions look hypothetical, they are not. It is a long story.
Often Google makes a first pass and takes only one page. Then it comes back after a while and digs a bit deeper, and so on.
If you update your pages quite often and the number of pages grows (with different content, of course), Google will visit you more and more often.
If there are no more references to a page (not even from other pages on the internet), Google will take the page out, but this can take some time.
"This page has moved click here to continue" and send them to page2.htm. You could also add a JavaScript redirect with a delay of a few seconds if you wanted. Google doesn't seem to mind those in this type of situation.
Google will eventually index page2 and probably stop listing page1 in the SERPs, as it won't contain any relevant on-page terms. Deleting any page is risky: Google may eventually remove it from the index, but what about other search engines?
So many webmasters delete pages that are still indexed by some engines. They must think the user is going to say "Ohhh a 404, I'll just retype the domain name and start from the home page again!" ;)
I have 400 dynamic pages, and I changed the URLs to a friendlier format for both humans and SEs.
Now I have 400 pages indexed in Google in this format:
hhtp://www.mydomain.com/items.asp?itemnumber=1
The new format is hhtp://www.mydomain.com/Widgets/green widget.htm. If a search engine serves the old format, the custom 404 file I created takes the user to the correct page by converting the URL to the new format I am using.
Both pages are created on the fly from a database, so I cannot redirect one to the other, etc.
All the new links on my home page are in the new format
hhtp://www.mydomain.com/Widgets/green widget.htm and there is no reference to the old URLs. Do you think Googlebot will figure out that the old pages are gone and slowly forget about them?
Also, would this in any way be considered spam (having two URLs for one page)?
Thanks in advance.
>If I leave page1.htm on the server without links, will Googlebot and/or Freshbot visit it again?
>Also, would this in any way be considered spam (having two URLs for one page)?
Yes ... and this is a dangerous practice. Having virtually every page mirrored within your site and not linked to from anything can get you banned. They are called orphaned pages. You need to take down the original pages (remove the content) and replace them with a redirect.
[edited by: Marcia at 7:01 pm (utc) on Jan. 9, 2003]
[edit reason] formatting problem [/edit]
The forum has a bug here. I think it is related to the quote-style code: it can misbehave when quotes are nested and one closing code is lost.
Liane, is that true? Can you get banned because two links on the same page point to the same page? I wouldn't think so.
I am talking about having a page which is not linked to from the main site. That is an orphaned page and is viewed as a doorway page by some SEs, including Google. The double whammy in this case is to have an orphaned page or pages that duplicate the content of pages on the site.
It is a definite mistake to take that route. It's best to take down the content and replace the URL with a redirect to the new page.
Is this what we are looking for?
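For pages that are generated on the fly, the redirect suggested here could be issued by the same script that answers the old URLs. What follows is only a rough sketch in Python (the site in this thread runs ASP, so it is an illustration, not the poster's code). The `ITEMS` table, the function name `redirect_for` and the category/name fields are hypothetical stand-ins for the real database lookup.

```python
from urllib.parse import urlsplit, parse_qs, quote

# Hypothetical stand-in for the database: item number -> (category, item name).
ITEMS = {1: ("Widgets", "green widget")}


def redirect_for(old_url):
    """Return (status, headers) for a request to an old-style URL.

    Old format: /items.asp?itemnumber=N
    New format: /Category/item name.htm (percent-encoded in the header)
    """
    parts = urlsplit(old_url)
    if parts.path.lower() != "/items.asp":
        return 404, {}

    values = parse_qs(parts.query).get("itemnumber", [])
    try:
        item_number = int(values[0])
    except (IndexError, ValueError):
        # Missing or non-numeric item number.
        return 404, {}

    item = ITEMS.get(item_number)
    if item is None:
        return 404, {}

    category, name = item
    # quote() turns the space in "green widget" into %20.
    location = "/" + quote(category) + "/" + quote(name) + ".htm"
    # 301 = moved permanently, so crawlers can replace the old URL with the new one.
    return 301, {"Location": location}


if __name__ == "__main__":
    print(redirect_for("http://www.mydomain.com/items.asp?itemnumber=1"))
```

A permanent (301) status is used in the sketch because the old URL is not expected to come back; unknown item numbers still get a plain 404.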
Liane>
Sorry if I misled you into thinking there are two physical pages on the site; actually there are no physical pages at all. All my pages are created on the fly from a database, and all of them are accessible through the original URL and now also through the new URL format.
Do you still think this is spam?
hakre>
I think you are breaking the forum. Becoming a full member in 2 days, congratulations and keep up the good work.
>Yes ... and this is a dangerous practice. Having virtually every page mirrored within your site and not linked to anything can get you banned.
No it won't. I've dealt with this issue a zillion times. Worst case scenario is Google understands that there is duplication and drops/doesn't display the duplicates with the lowest PageRank. In this case, it would be the orphan pages.
The real issue this scenario causes is maxing out page limits for your site.
The thing to keep in mind is that once Google extracts links from a page and stores those pages in its database, it will revisit those pages on a regular basis. It doesn't "start from scratch" each month by revisiting and extracting links from your homepage, so simply removing all links to the old url will not remove them from the db.
So you need to tell Googlebot that you don't want the old URLs indexed anymore. Based on your examples, the easiest way would be to simply use your robots.txt file.
User-agent: Googlebot
Disallow: /items
That should prevent Google from reindexing any url with the old format.
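One way to sanity-check what a rule like this matches is Python's standard `urllib.robotparser`. The sketch below is a minimal one: it assumes the example URLs from earlier in the thread (written with `http` and the space percent-encoded), and the helper name `is_allowed` is made up for illustration. Googlebot's own matching may differ in details from this parser.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt lines from the post above.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /items
"""

# Example URLs from the thread (assumed scheme and encoding).
OLD_URL = "http://www.mydomain.com/items.asp?itemnumber=1"
NEW_URL = "http://www.mydomain.com/Widgets/green%20widget.htm"


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (OLD_URL, NEW_URL):
        print(url, "allowed" if is_allowed("Googlebot", url) else "blocked")
```

The rule is compared against the start of the URL path, so `/items` covers `/items.asp?itemnumber=1` while the new `/Widgets/...` URLs stay crawlable. Crawlers other than Googlebot are not affected by this block.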
>So you need to tell Googlebot that you don't want the old URLs indexed anymore. Based on your examples, the easiest way would be to simply use your robots.txt file.
>User-agent: Googlebot
>Disallow: /items
Even though Items is not a physical directory?
Should I do that right away, or wait for the new pages to be indexed first?
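/item
will disallow both /item.asp and /item/. The rule is matched against the beginning of the URL path, so it does not matter whether a physical directory exists.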
As far as putting it up, I would put it up right now. The old pages and new pages will be requested during the same crawl cycle, so there isn't any drawback from having it up.
Disallowing /Item was a big mistake.
I should have waited until I saw my new pages indexed, or at least being visited by Googlebot.
I watched my pages drop from Google like flies. The good news is that Google has been coming to the new location and has even indexed all the links on the main page.
Does this mean I will have a deep crawl this month?
This post is just to help others avoid the mistake I made.