
SEM Research Topics Forum

    
Links from PDFs on high-quality sites like NASA.gov
How do my competitors do that?
webjuice




msg:3985183
 7:05 pm on Sep 6, 2009 (gmt 0)

I am using a tool to analyze my competitors' backlinks, and I was quite surprised to find that they have links from places like NASA.gov and other trustworthy, high-PR websites.

The thing is, I would like to understand how they do it. According to my analysis tool, all these high-quality links come from PDF files. But when I open the PDFs I cannot find the link in question, and it would not make sense for my competitor to have a link in these PDFs anyway.

Obviously something sneaky is going on. But what? Anyone know anything about this?

/webjuice

 

abilitydesigns




msg:3987198
 7:48 am on Sep 10, 2009 (gmt 0)

It's difficult to judge your claim of a sneaky tactic without seeing the actual URL.

But Google started using optical character recognition (OCR) to extract text from PDFs from late 2008 onwards.

Basically, it takes snapshots of the PDFs as input, runs optical character recognition on them, and indexes the result just like regular text.

If it can see the text, wouldn't it see the links too?

If you want the geeky details about OCRopus, the open source OCR software that Google sponsors,
refer to: [code.google.com...]

(If you have Acrobat Pro 9, you can see the option under Documents => OCR Text Recognition => Recognize Text using OCR)

-AD

claus




msg:3988967
 3:10 am on Sep 14, 2009 (gmt 0)

If your tool is showing you something that you can't verify by manual inspection, especially something that doesn't seem logical or rational...

Well, I would think about the value of that tool for a minute, and just perhaps ... I'd ask the tool maker what was up with that before I scrapped it.

webjuice




msg:3989030
 6:56 am on Sep 14, 2009 (gmt 0)

Hi guys, thanks for taking the time to give your thoughts on the subject. I think you're both on to something. I can see that all the PDFs actually contain the brand name of my competitor (he has a generic name; let's say it was "stool"). So OCR technology at least recognizes the keyword "stool" as important for the PDF. However, how the connection to my competitor's website is made, I don't know (there are no visible links). Perhaps, as Claus says, it is an error (I just don't believe this, as my competitor works in the field of SEO). There must be more to it, but what?
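
One way to check for links that are not visible on the page is to look inside the PDF file itself, since clickable links are normally stored as link annotations with a URI action. Here is a minimal sketch of that check in Python. It only scans the raw bytes for literal `/URI (...)` entries, so it will miss targets written as hex strings or stored inside compressed object streams; the function names and the sample bytes are made up for illustration.

```python
import re

# Matches a URI action with a literal-string target, e.g. "/URI (http://example.com/)".
# Hex-string targets and annotations inside compressed object streams are not handled.
URI_PATTERN = re.compile(rb"/URI\s*\(((?:\\.|[^\\()])*)\)")


def find_uri_actions(pdf_bytes: bytes) -> list[str]:
    """Return the targets of URI actions found in the raw bytes of a PDF."""
    targets = []
    for match in URI_PATTERN.finditer(pdf_bytes):
        raw = match.group(1)
        # Undo the backslash escapes PDF uses for parentheses and backslashes.
        raw = re.sub(rb"\\([()\\])", rb"\1", raw)
        targets.append(raw.decode("latin-1"))
    return targets


def links_to_domain(pdf_bytes: bytes, domain: str) -> list[str]:
    """Keep only the URI targets that mention the given domain."""
    return [t for t in find_uri_actions(pdf_bytes) if domain.lower() in t.lower()]


if __name__ == "__main__":
    # Hypothetical fragment of a PDF link annotation, used only for illustration.
    sample = (
        b"<< /Type /Annot /Subtype /Link "
        b"/A << /S /URI /URI (http://www.example.com/stool) >> >>"
    )
    print(links_to_domain(sample, "example.com"))
```

To try it on a downloaded file, read the file in binary mode and pass the bytes to `links_to_domain` together with the competitor's domain. An empty result does not prove there is no link, for the reasons above, but a hit would show where the tool found it.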

stephen186




msg:3989042
 8:08 am on Sep 14, 2009 (gmt 0)

I do not know how your competitors are getting links from this. But I know for sure that such links do work. Try to get some links from internal pages of sites with high domain PR and you will see the results.

martinibuster




msg:3989069
 9:14 am on Sep 14, 2009 (gmt 0)

Cross-check your tool with [search.yahoo.com...]

In the following search, replace example.com with the domain name in question.

linkdomain:example.com site:.gov

Does it show links from those PDF files?
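
If you need to run this check for several competitor domains, the query string can be generated rather than typed. This is just a sketch in Python: the `linkdomain:` and `site:` operators are the ones from the search above, while the helper name and the clean-up of pasted URLs are assumptions added for convenience.

```python
def linkdomain_query(domain: str, site_filter: str = ".gov") -> str:
    """Build a 'linkdomain:<domain> site:<filter>' query string."""
    domain = domain.strip().lower()
    # Drop the scheme and any path if a full URL was pasted in.
    if "://" in domain:
        domain = domain.split("://", 1)[1]
    domain = domain.split("/", 1)[0]
    return f"linkdomain:{domain} site:{site_filter}"


if __name__ == "__main__":
    for name in ["example.com", "http://Example.com/some/page"]:
        print(linkdomain_query(name))
```

Both inputs produce the same query, which you can paste into the search box and then compare the listed PDF URLs with what your tool reports.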


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved