Forum Moderators: open


Saving Pagerank

saving internal PR with robots.txt or js links


squared

7:33 pm on Sep 12, 2002 (gmt 0)

10+ Year Member



I'm trying to use PR efficiently on my site. I have a TOS and privacy policy that are linked to. Since I'm interested in getting more PR to my content pages, I'd like to stop the PR leak to the TOS and PP. If I were to disallow Googlebot from crawling my TOS and privacy policy pages with robots.txt, would this stop PR from leaking to these pages? Or should I use javascript links instead? Or are there any other ways to stop the PR leak?
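A minimal robots.txt along these lines would look like the sketch below (the /tos.html and /privacy.html paths are just placeholders for wherever those pages actually live):

```
User-agent: Googlebot
Disallow: /tos.html
Disallow: /privacy.html
```

Note that robots.txt only stops crawling; whether it also stops PR from flowing to the disallowed URLs is exactly the open question here.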

-Squared

Jane_Doe

11:50 pm on Sep 14, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> If the page is in the cache it is real, otherwise it is guessed.

Are you sure? I was just wondering how you knew this to be true.

I have a site with a new page that was just added to Google mid-cycle. It does have a cache page, but it is not showing any backward links, though it has a few. The page has a PR3 on the toolbar, the same PR as the other new pages on the site that are not in Google yet. Most of the rest of the site's pages are PR4 to PR5.

This new page has links from two external PR5 sites, plus a number of internal links. It seems to me like the PR3 is still a "guess" PR, even though the page is in the index and has a cache page. I would expect the page to have at least a PR4 after the next update.

startup

3:55 am on Sep 15, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A site is your property; don't allow the TOS of any SE to dictate how something you own can be viewed. It is your content and bandwidth. Try billing an SE for requesting a page that it can't read. Give it a link it can understand. You want any browser that visits your site to be able to display your content the way you want it displayed.
If for any reason you are not sure about what is and isn't acceptable, use a test site. And make sure to employ the no-cache tag.
I have the utmost respect for the quality of Google's results. Google does allow you to cloak, even if the TOS states otherwise.

stuntdubl

5:02 am on Sep 15, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am using the two cache tags:
<meta http-equiv="Cache-Control" content="no-cache">
<meta http-equiv="Pragma" content="no-cache">

I don't think I am accomplishing what I am trying to. Here is what I am looking to do: I want Google to cache pages, and I would also like users' machines to only cache pages until they shut down their browser (if they start it again, it would re-download the page in case there is a page update).
Is there a way to do this, and what exactly am I doing right now?

cminblues

5:49 am on Sep 15, 2002 (gmt 0)

10+ Year Member



stuntdubl:

---------------------------
I want google to cache pages
---------------------------
-> That's fine.
If there isn't a meta tag like NAME='ROBOTS' (or 'GOOGLEBOT') CONTENT='NOARCHIVE',
and no robots.txt denying your files to all spiders or to Googlebot,
then Google will cache your pages.
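Written out in full, the meta tag being described here would look like this (noarchive only suppresses the cached copy; it does not stop crawling or indexing):

```html
<meta name="googlebot" content="noarchive">
```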

---------------------------
I would also like users machines to
only cache pages until they
shut down their browser

---------------------------
-> This depends on how your web server sends header data to the client,
and, of course, on how the client understands it, hehe ;)
In other words, this is not a Google question, it's an RFC question.
[ietf.org ]
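A rough sketch of the response headers involved (no standard header means exactly "keep until the browser closes"; Cache-Control: private keeps the page out of shared proxy caches, and no-cache lets the browser store the page but forces it to revalidate with the server on each use, which approximates the behavior stuntdubl wants):

```http
HTTP/1.1 200 OK
Content-Type: text/html
Cache-Control: private, no-cache
```

How faithfully this is honored varies by browser, which is the point about it being an RFC question rather than a Google one.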

cminblues

ann

4:57 pm on Sep 16, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you have SSI enabled (not real sure if that is necessary, but I believe I read somewhere that it is required for this little trick), then you can chmod the page you wish visitors to not cache to 745, which will cause the server to parse it each time it is called, so it is served up fresh.

Google will still cache the page.

I do that with my Daily Horoscopes page.

:)
Ann

Ohh yeah, this is done on Apache, not sure about other servers.
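On Apache, the chmod trick Ann describes is tied to the XBitHack directive of mod_include; a sketch, assuming mod_include is enabled:

```apache
# httpd.conf or .htaccess
XBitHack on
```

With XBitHack on, any text/html file whose user-execute bit is set (e.g. chmod 745) is parsed for SSI on each request, and Apache sends no Last-Modified header for such files, which is what discourages caches from holding a stale copy.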
