Search engine spiders and encryption

How do spiders handle encrypted html?

         

hawkmva

4:18 pm on Mar 23, 2005 (gmt 0)

10+ Year Member



Hello;

I hope this is the right forum. I checked the robots.txt and the search engine forums and could not
find information about search engine spiders ignoring
encrypted parts of an html file. The robots.txt lets one "disallow" whole pages from being spidered,
but what about parts of a page?

If only a segment of index.html is encrypted, is
there a way to have spiders ignore the encrypted part
of the file by using html tags or the like?

And what happens if the spider DOES crawl the encrypted part?

Does this confuse the crawler and therefore hurt the page's ranking?

Not a typical question but I would appreciate any help. (smile).

-Aaron

encyclo

4:49 pm on Mar 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Welcome to WebmasterWorld [webmasterworld.com], hawkmva.

There is no method of excluding parts of pages - it's an all-or-nothing approach. When you mention encryption within an HTML file, what do you mean exactly?

Lord Majestic

5:01 pm on Mar 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Encrypted parts of HTML will certainly not be understood as intended (i.e. they won't be decrypted) and are likely to be ignored. It would be safer to embed encrypted data in comment tags, or perhaps even JavaScript; however, this may be treated as an attempt to use JavaScript to generate content different from that shown to the search engine.

hawkmva

8:39 am on Mar 24, 2005 (gmt 0)

10+ Year Member



Hello again and thanks for the replies....

Here is how my index.html is set up.

<html>

<head><title>Bob's Widgets</title></head>

<!-- html code

This is the html code for my page

yada... yada... yada...

-->

<!--

Following is the password code for entry into the site

It is javascript...

Since this code specifies the actual password, I encrypt it.

===================================================

First the un-encrypted version...

===================================================
-->

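<script type="text/javascript">

// Compare the entered password with the hard-coded one.
function check() {
var input_pass = document.formular.pass.value;
if (input_pass == "come-on-in") {
alert('Password Correct! Click OK to enter!');
// The password doubles as the file name of the members page.
window.location.href = input_pass + ".html";
}
else
{
window.location = "pw-error.html";
}
}

</script>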

<!--

=====================================================

Now the encrypted version .....

========================================================
-->

<script language=JavaScript>[ Non human readable encrypted stuff ] </script>

<!--

regular HTML follows......

yada yada yada

-->

</html>

I encrypted the segment by using the following...

<snip>

I was just wondering how a search engine spider would
treat the encrypted text.

Thanks - Aaron

[edited by: trillianjedi at 8:29 am (utc) on Mar. 25, 2005]
[edit reason] Removed specifics and URL drop - please see TOS. Thanks. [/edit]

zCat

9:01 am on Mar 24, 2005 (gmt 0)

10+ Year Member



I hate to tell you this, but that code is not encrypted. It's merely "escaped", i.e. characters are replaced by a percent sign followed by their hexadecimal code. "%3C" is '<', "%20" is a space, that kind of thing.
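To show how little protection escaping offers, here is a minimal sketch. The sample string and the `escapeAll` helper are made up for illustration (they are not the snipped code or the tool from the original post), and the helper assumes ASCII input. The point is that one built-in call undoes the "encryption" without any key.

```javascript
// Hypothetical sample text; not the snipped code from the original post.
const original = '<p>secret text</p>';

// Percent-escape every character, roughly what such "encryption" tools do.
// Assumes ASCII input, so each character maps to exactly one %XX pair.
function escapeAll(text) {
  return Array.from(text)
    .map((ch) => '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0'))
    .join('');
}

const escaped = escapeAll(original);

// Reversing it needs no password or key, just a standard function.
const recovered = decodeURIComponent(escaped);

console.log(escaped);
console.log(recovered === original);
```

Anyone who views the page source can do the same thing in a browser console.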

What search engines do with such stuff I've no idea; technically it's absolutely no problem to convert the escaped code back into HTML or whatever, although unless the robot understands Javascript it won't know which context the code will appear in.

From the website you mentioned:

If you have sensitive information on your website that is subject to unauthorized use, you need to encrypt it! ... This utility will encrypt your HTML source code to prevent others from viewing it or copying it.

I haven't laughed so much in days.

As a way of hiding passwords... I presume you don't have any sensitive information beyond that page which you absolutely have to keep secret?

hawkmva

5:06 pm on Mar 24, 2005 (gmt 0)

10+ Year Member



Zcat;

Thanks but huh?

As a way of hiding passwords... I presume you don't have any sensitive information beyond that page which you absolutely have to keep secret?

Correct. But why the question? (smile).

Also in the index.html file is my paypal payment code.
One line of sensitive information is the "success" page.

It contains the password the customer is purchasing.

Following are the un-encrypted and the encrypted versions of this paypal code. I believe paypal REALLY
encrypts their code correctly.

<SNIP>

Does the spider get confused by this paypal encryption? I don't know.

I read somewhere that encryption, and I assume hexadecimal coding too, would hinder search engine crawling.

But then again, I don't suppose paypal would produce
encrypted code that would hamper search engine spiders.


-Aaron

-Hawkmva

[edited by: trillianjedi at 8:32 am (utc) on Mar. 25, 2005]
[edit reason] Too much for me to examplify! Repost if necessary without encryption and specifics. [/edit]

Lord Majestic

5:11 pm on Mar 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



d=unescape(m);document.write(d);

I say anybody who uses JavaScript to write anything is asking for trouble, since search engines might not fully understand what you are doing, but they can certainly deduce that your JavaScript will show some other content than what they (search engines) see, and therefore your page can be flagged as spam.

But anyhow, since your "encrypted stuff" is placed between comment tags, it means that most search engines would simply ignore it.

Leosghost

5:16 pm on Mar 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Nice if folks didn't do that to the scrolling without coming back and fixing it via their edit button ..

zCat

7:01 pm on Mar 24, 2005 (gmt 0)

10+ Year Member




As a way of hiding passwords... I presume you don't have any sensitive information beyond that page which you absolutely have to keep secret?

Correct. But why the question? (smile).

Well, because putting the authentication mechanism in the client is about as secure as giving someone an envelope with the key to your store in it and trusting them not to open it until they've given the correct answer to a question written on the front ;-)
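A rough sketch of what "opening the envelope" looks like. The page source below is a hypothetical string modelled on the check() function posted earlier, and `findComparedLiterals` is an illustrative helper, not anything from a real tool. It simply scans the script text for string literals used in an `==` comparison.

```javascript
// Hypothetical page source, modelled on the check() function posted earlier.
const pageSource = `
function check() {
  var input_pass = document.formular.pass.value;
  if (input_pass == "come-on-in") {
    window.location.href = input_pass + ".html";
  }
}`;

// Collect every double-quoted string literal that appears right after "==".
function findComparedLiterals(source) {
  const pattern = /==\s*"([^"]*)"/g;
  return Array.from(source.matchAll(pattern), (match) => match[1]);
}

const found = findComparedLiterals(pageSource);
console.log(found);
```

No script is even needed in practice: "View Source" in the browser shows the same thing, and escaping the code only adds one unescape step.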

That other stuff from Paypal does look like real encryption though. I presume it would be simply ignored by spiders etc.

hawkmva

7:35 pm on Mar 24, 2005 (gmt 0)

10+ Year Member



Well, because putting the authentication mechanism in the client is about as secure...

Hmmm. Ok I see...

Tell me. When did I put the who in the what :-)...

Hawkmva

hawkmva

9:03 pm on Mar 24, 2005 (gmt 0)

10+ Year Member



Lord Majestic;

But anyhow, since your "encrypted stuff" is placed between comments tags

Actually the "encrypted stuff" is outside the comment
tags. I want the spider (and the browser)
to see it. That is, it is outside the

<!-- and --> tags.

(smile) -Hawkmva

Lord Majestic

9:36 pm on Mar 24, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, I stand corrected: your encrypted stuff is in a JavaScript block, which should be ignored by search engines regardless of whether it is inside comments or not. Either way, having this kind of stuff is not a good idea.
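As a rough illustration only, and assuming a crawler that does not run JavaScript and throws away script elements (real spiders may behave differently), this is roughly what would be left to index. The page and the `indexableText` helper are hypothetical.

```javascript
// Hypothetical page: one visible heading plus an escaped payload
// that only appears if the browser runs document.write.
const page = [
  '<html><body>',
  "<h1>Bob's Widgets</h1>",
  '<script>document.write(unescape("%3Cp%3EMembers%20only%3C/p%3E"));</script>',
  '</body></html>',
].join('\n');

// Assumed crawler behaviour: drop script elements, then strip remaining tags
// and collapse whitespace. A regex is good enough for this small sketch.
function indexableText(html) {
  return html
    .replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, '')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

const text = indexableText(page);
console.log(text);
```

Under that assumption, anything written out by the script never reaches the index, which is also why content that exists only inside JavaScript does nothing for the page.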

hawkmva

10:50 pm on Mar 24, 2005 (gmt 0)

10+ Year Member



Lord Majestic;

Ahhh... Now That's really helpful... *Smile*

All the replies are helpful and I appreciate them
all.

Knowing that I can go ahead and submit the site without worrying about search engine spiders crawling it is a great relief. Especially since all the encryption methods I know of involve embedding the javascript between javascript tags *probably wrong*. Anyway,
like I said, the replies have helped immensely and
I've never seen such a helpful and responsive forum!

P.S.

Sorry about the super scrolling. I should have truncated and/or trucated the encryption I presented
here for display. Won't happen again.

-Hawkmva

hawkmva

10:55 pm on Mar 24, 2005 (gmt 0)

10+ Year Member



Sorry about the super scrolling. I should have truncated and/or trucated the encryption I presented
here for display. Won't happen again.

Oops, I meant Sorry about the super scrolling. I should have truncated and/or WRAPPED the encryption...

-Hawkmva

larryhatch

1:53 pm on Mar 25, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I once told somebody not to worry about others who might eavesdrop
on our email exchanges.
I explained that I sent my messages in secret "ASCII code",
and that this was turned back into readable text on her end.
No, she wasn't blonde. - Larry