Web Scraping - Grabbing Numerical Values from Source Code - PHP Server Side Scripting forum at WebmasterWorld

Forum Moderators: coopster

jamielaw

2:27 am on Nov 18, 2009 (gmt 0)

Hi,

I'm trying to grab some values from a webpage.

Example Data:

data[0] = 11;
data[1] = 37;
data[2] = 89;
data[3] = 23;
data[4] = 36;

I'd like to extract just the values into an array e.g. [11, 37, 89, 23, 36].

Numeric indexes (data[0-9]) appear only in this section of the code. Elsewhere the indexes are alphanumeric (data[a-z0-9]).

The part I'm stuck with is matching the values and getting just those.

$html = file_get_html("http://www.example.co.uk/");
preg_match('/^data[.[0-9]*.]$/', $html);

My preg_match is wrong but I'm struggling to figure it out.

Any help would be appreciated - thanks!

jamielaw

2:41 pm on Nov 24, 2009 (gmt 0)

Any suggestions?

TheMadScientist

3:05 pm on Nov 24, 2009 (gmt 0)

Hi jamielaw,

Welcome to WebmasterWorld!

Personally, when someone says 'scraping' or something similar my first thought is my websites and how I can block your access entirely, not help you do it, so I usually don't even look at the thread, except to see what I might need to do to block yet another scraper...

Since it looks like you're only after some numbers, and it doesn't look like it's going to do anyone else any damage, I don't mind saying that you definitely need to escape the square brackets you want to match literally in your pattern, as \[ and \], and I would make each dot optional by writing .? instead of .

preg_match('/^data\[.?[0-9]*.?\]$/', $html);
That should actually get you quite a bit closer.
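
To get from "closer" to the array asked for, here is a minimal sketch rather than a definitive answer. It makes a few assumptions: that the page source is available as a plain string (for example from file_get_contents), that the values are unsigned integers, and that every line looks like the example data. The $html sample below, including the data[abc] line, is made up for illustration. Two things differ from the pattern above: without the m modifier, ^ and $ only match at the very start and end of the whole string, so they are dropped here, and preg_match_all with a capture group is used so that all the values come back rather than just the first match.

```php
<?php
// Hypothetical sample standing in for the fetched page source.
$html = <<<'EOT'
<script>
data[0] = 11;
data[1] = 37;
data[2] = 89;
data[3] = 23;
data[4] = 36;
data[abc] = 99;
</script>
EOT;

// \[(\d+)\] accepts numeric indexes only, so data[abc] is skipped.
// The second group (\d+) captures the value after the equals sign.
preg_match_all('/\bdata\[(\d+)\]\s*=\s*(\d+)\s*;/', $html, $matches);

// $matches[2] holds the captured values as strings; cast them to integers.
$values = array_map('intval', $matches[2]);

print_r($values);
```

$matches[1] holds the captured indexes, so array_combine($matches[1], $values) could be used if the values need to stay keyed by their original index.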

Sorry about the length of time for anyone to give you a reply, but AFAIK most members here generally avoid helping people scrape. It's one of those things we usually make you figure out on your own if it's what you're determined to do.