
Forum Moderators: coopster & jatar k


Grabbing remote source code

   
7:20 pm on Jul 21, 2008 (gmt 0)

5+ Year Member



I'm trying to read the source code from a website, find data in it, and save that data for later use. But I am getting an error that my browser is not supported. Is there any way to get past this?
7:31 pm on Jul 21, 2008 (gmt 0)

WebmasterWorld Senior Member eelixduppy is a WebmasterWorld Top Contributor of All Time 5+ Year Member



Welcome to WebmasterWorld! :)

You are getting this error from the URL you are requesting? Sounds like they are checking user-agents and displaying content accordingly.

7:33 pm on Jul 21, 2008 (gmt 0)

5+ Year Member



Yes. Is there any way to "fake" the user agent, though? This is driving me crazy, lol
7:51 pm on Jul 21, 2008 (gmt 0)

WebmasterWorld Senior Member eelixduppy is a WebmasterWorld Top Contributor of All Time 5+ Year Member



One way is to use cURL to change the user-agent. It looks like this:

curl_setopt($ch, CURLOPT_USERAGENT, $useragent);

Read up on cURL at the documentation: [php.net...]
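For context, here is a minimal sketch of how that one line might sit inside a complete request. The function names (build_fetch_options, fetch_source) and the timeout value are assumptions made up for illustration, not something from the docs; the cURL calls and constants themselves are the standard PHP ones.

```php
<?php
// Build the cURL options for a GET request with a custom User-Agent.
// The function name and the timeout are illustrative assumptions.
function build_fetch_options(string $url, string $useragent): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $useragent,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_TIMEOUT        => 15,   // give up after 15 seconds
    ];
}

// Fetch a page and return its source, or null if the request failed.
function fetch_source(string $url, string $useragent): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_fetch_options($url, $useragent));
    $body = curl_exec($ch);
    // The handle is released when $ch goes out of scope.
    return $body === false ? null : $body;
}
```

Calling fetch_source('http://www.google.com/', $useragent) with a browser-like string in $useragent should return the page source as a string, which you can then search and save.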

7:52 pm on Jul 21, 2008 (gmt 0)

5+ Year Member



Thanks for the info, and the Welcome :) I've been a reader here for quite some time, but decided it was time to register and get asking :) Thanks again, I will give this a go :)
8:07 pm on Jul 21, 2008 (gmt 0)

5+ Year Member



Alright, is there any chance you could give me a snippet of code that would set a user agent and read the source code from, say, google.com?
8:20 pm on Jul 21, 2008 (gmt 0)

5+ Year Member



I got it :-P

But it's still not working quite right. :( It will load pages like google.com fine, but when I try to load a page from Facebook, it either gives me a blank page or brings me to the login page. :-(

8:28 pm on Jul 21, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You can use cURL to log in to password-protected sites, like Facebook - you need to read the cURL docs, as mentioned above.
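As a rough sketch of the general idea: POST the login form, keep the cookies in a jar file, then request the protected page on the same handle. The login URL, form field names, and jar path are all hypothetical here; a real site's form will differ and may need extra hidden fields or tokens. The helper names are made up for illustration.

```php
<?php
// Options for POSTing a login form while keeping cookies in a jar file.
// $loginUrl, the field names in $fields, and $cookieJar are hypothetical.
function build_login_options(string $loginUrl, array $fields, string $cookieJar, string $useragent): array
{
    return [
        CURLOPT_URL            => $loginUrl,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields), // URL-encoded form body
        CURLOPT_COOKIEJAR      => $cookieJar, // cookies are written here when the handle is released
        CURLOPT_COOKIEFILE     => $cookieJar, // cookies are read from here; also enables the cookie engine
        CURLOPT_USERAGENT      => $useragent,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
    ];
}

// Log in, then fetch a page that needs the session cookie.
function login_and_fetch(string $loginUrl, array $fields, string $pageUrl, string $cookieJar, string $useragent): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_login_options($loginUrl, $fields, $cookieJar, $useragent));
    if (curl_exec($ch) === false) {
        return null; // the login request itself failed
    }
    // Reuse the same handle so the session cookies stay in memory.
    curl_setopt($ch, CURLOPT_HTTPGET, true); // switch back from POST to GET
    curl_setopt($ch, CURLOPT_URL, $pageUrl);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}
```

Whether this is enough for any particular site depends on what its login form expects, so treat it as a starting point only.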
8:32 pm on Jul 21, 2008 (gmt 0)

5+ Year Member



lol, thanks bcolflesh. Sorry for being a pain, just impatient, have been trying to get this working for sooooo long now!
8:34 pm on Jul 21, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Try a G search for:

php authenticating with curl

8:34 pm on Jul 21, 2008 (gmt 0)

WebmasterWorld Senior Member eelixduppy is a WebmasterWorld Top Contributor of All Time 5+ Year Member



If you are looking for something quick, there may be scripts already written for logging into Facebook. A quick Google search or a search at [phpclasses.org...] might be in order. :)

[edit]
beaten to it ;)

8:42 pm on Jul 21, 2008 (gmt 0)

5+ Year Member



thanks again folks, I am just at a loss as to why there would be nothing returned at all from facebook?
8:59 pm on Jul 21, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



thanks again folks, I am just at a loss as to why there would be nothing returned at all from facebook?

1. The session you initiated with cURL isn't logged in.
2. Facebook is checking for a cookie that you haven't set with cURL.
3. The user-agent you are sending with cURL doesn't fit a pattern Facebook allows.
...

The list goes on - most folks don't want Joe Blow to scrape them and don't make it easy.
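To narrow down which of these is happening, it can help to look at what actually came back. Here is a small sketch that checks the status code, whether you were redirected, and whether the body is empty. The function names and messages are made up for illustration; curl_getinfo and curl_error are the standard calls.

```php
<?php
// Summarise why a fetch may have come back empty or redirected.
// The function name and the messages are illustrative assumptions.
function describe_response(int $status, string $requestedUrl, string $effectiveUrl, string $body): string
{
    if ($status >= 400) {
        return "HTTP error $status";
    }
    if ($effectiveUrl !== $requestedUrl) {
        // cURL may also normalise the URL slightly, so check by eye too.
        return "redirected to $effectiveUrl";
    }
    if (trim($body) === '') {
        return 'empty body';
    }
    return 'ok';
}

// Fetch a URL and report what happened.
function fetch_and_describe(string $url, string $useragent): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_USERAGENT      => $useragent,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        return 'cURL error: ' . curl_error($ch);
    }
    $status    = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $effective = (string) curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    return describe_response($status, $url, $effective, $body);
}
```

A redirect to a login URL would point at reasons 1 or 2 above; an error status or an empty body with no redirect may point more toward reason 3.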

9:00 pm on Jul 21, 2008 (gmt 0)

5+ Year Member



Great info, thanks a lot. I will see what I can dig up.
 
