tricky webrequest? how to code this?

         

wancherng

4:31 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



If a website has a table of data in the middle of the page, how do I write code that performs a web request, extracts only that data, and puts it in my database?
I already have code that does a web request, but it returns the whole page. I don't want that; I only want the data in the table.
Does anybody know how to do this?
Thanks in advance for the help.

ziggystardust

8:15 pm on Apr 9, 2003 (gmt 0)

10+ Year Member



You're going to have to use a couple of basic functions.

Left

temp = left(strHtml, 10)

left takes the first 10 chars in strHtml and returns them to temp.

Right

temp = right(strHtml, 10)

right takes the last 10 chars in strHtml and returns them to temp.

Instr

temp = instr(strHtml, "c")

instr returns the position (an integer) in strHtml where the searched string (in this case "c") is located.
ex.
strHtml = "abcdefghijk" 
temp = instr(strHtml, "c")

temp would then be 3, since "c" is the third char in strHtml.

Len

temp = len(strHtml)

len returns the length of strHtml to temp. In the above case, len would return 11.
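For anyone following along in Python rather than VBScript, here is a rough sketch of the same four operations on the same example string. The variable names are made up for illustration. Note that Python's find() counts from 0 and returns -1 when nothing is found, so adding 1 gives the same numbers InStr would (including 0 for "not found").

```python
# Python analogues of the VBScript string functions described above.
str_html = "abcdefghijk"

first_three = str_html[:3]          # like Left(strHtml, 3)
last_three = str_html[-3:]          # like Right(strHtml, 3)
position = str_html.find("c") + 1   # like InStr(strHtml, "c"); find() is 0-based
length = len(str_html)              # like Len(strHtml)
```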

There are a couple more, but this is all you need to start. I don't really know what info you want, but say you wanted to grab just the table (and there was only one table on the page):

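temp = instr(1, strHtml, "</table>", vbTextCompare) 'Check where the table ends (case-insensitive, so </TABLE> matches too)

strHtml = left(strHtml, temp - 1) 'Keep everything before the closing tag

temp = instr(1, strHtml, "<table", vbTextCompare) 'Check where the table starts (matches <table> and <table border=...>)

strHtml = right(strHtml, (len(strHtml) - temp) + 1) 'Keep everything from the opening tag onward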
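The same "find where the table starts and ends" idea can be sketched in Python with a regular expression instead of Left/Right. This is only a sketch under some assumptions: extract_first_table is a made-up helper name, the page contains no nested tables (the match stops at the first closing tag), and unlike the snippet above it keeps the closing </table> tag in the result.

```python
import re

# Non-greedy match from the first <table ...> to the first </table>,
# ignoring case and allowing the table to span several lines.
TABLE_RE = re.compile(r"<table\b.*?</table>", re.IGNORECASE | re.DOTALL)


def extract_first_table(html):
    """Return the first <table>...</table> block in html, or None if absent."""
    match = TABLE_RE.search(html)
    return match.group(0) if match else None
```

If the page has no table at all, the function returns None instead of raising an error, so the caller can decide what to do.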

Hope this helped you
//ZS

jpjones

10:59 am on Apr 10, 2003 (gmt 0)

10+ Year Member



As long as the website you are pulling information from does not change its structure, or as long as there are markers around the table you ultimately want to rip (e.g. a unique image name), it should be quite easy.

You need to delete all content up to a known point in the page (e.g. the marker), either using regular expressions or a function to do this.
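A rough Python sketch of the "delete everything up to the marker" step, assuming a plain string marker is enough (a regular expression would work the same way). The helper name text_after_marker and the marker used in the usage note are made up for illustration.

```python
def text_after_marker(html, marker):
    """Return the part of html that follows the first occurrence of marker.

    Returns None when the marker is missing, which is a hint that the
    page structure has changed.
    """
    _before, found, after = html.partition(marker)
    return after if found else None
```

For example, text_after_marker(page, 'src="prices.gif"') would drop everything up to and including that image reference, so the next table in the remaining text is the one you want.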

You then step through the table, again using regular expressions or a specific function you create, to extract the relevant info from the table and insert it into a database.

If you want the individual data elements in the table inserted into individual columns of your database, one way of doing it would be to break the table into an array of <tr></tr> rows, then step through it, breaking each row into an array of <td>Content</td> cells. From there it should be simple to extract the content and do the database insert.
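Here is one way the rows-then-cells approach could look in Python, ending with the database insert. It is a sketch under assumptions: the table has exactly two columns, the cells contain plain text with no nested tags, and the database is SQLite with a made-up table called scraped (columns col1 and col2). The function names are invented for the example.

```python
import re
import sqlite3

# Capture the inside of each <tr>...</tr> and <td>...</td>, ignoring case
# and any attributes on the opening tag.
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.IGNORECASE | re.DOTALL)
CELL_RE = re.compile(r"<td\b[^>]*>(.*?)</td>", re.IGNORECASE | re.DOTALL)


def table_to_rows(table_html):
    """Break a table into a list of rows, each a list of cell strings."""
    rows = []
    for row_html in ROW_RE.findall(table_html):
        cells = [cell.strip() for cell in CELL_RE.findall(row_html)]
        if cells:  # skip rows without <td> cells, e.g. header rows using <th>
            rows.append(cells)
    return rows


def store_rows(connection, rows):
    """Insert two-column rows into the hypothetical 'scraped' table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS scraped (col1 TEXT, col2 TEXT)"
    )
    # Parameter placeholders keep scraped text from being run as SQL.
    connection.executemany(
        "INSERT INTO scraped (col1, col2) VALUES (?, ?)", rows
    )
    connection.commit()
```

Typical use would be rows = table_to_rows(table_html) followed by store_rows(sqlite3.connect("data.db"), rows), where data.db is a placeholder file name.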

HTH,
JP

duckhunter

3:10 am on Apr 11, 2003 (gmt 0)

10+ Year Member



I'll add to ziggy's code. Grab the table as he showed:

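temp = instr(1, strHtml, "</table>", vbTextCompare) 'Check where the table ends (case-insensitive)
strHtml = left(strHtml, temp - 1) 'Keep everything before the closing tag
temp = instr(1, strHtml, "<table", vbTextCompare) 'Check where the table starts
strHtml = right(strHtml, (len(strHtml) - temp) + 1) 'Keep everything from the opening tag onward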

Now, assuming strHtml = "<TABLE><TR><TD>1.1</TD><TD>1.2</TD></TR><TR><TD>2.1</TD><TD>2.2</TD></TR>"

arRows = Split(strHtml, "<TR>", -1, vbTextCompare) 'Array of Rows (case-insensitive split)

For x = 0 to Ubound(arRows) 'Start Looping through Rows; x = 0 is the text before the first <TR>, so data rows start at 1

    arColumns = Split(arRows(x), "<TD>", -1, vbTextCompare) 'Array of Columns for current row

    For y = 1 to Ubound(arColumns) 'Start Looping through Cols
        If len(trim(arColumns(y))) > 0 Then 'Make sure you have data
            iTDPos = instr(1, arColumns(y), "</TD>", vbTextCompare)
            If iTDPos > 0 Then 'Skip cells with no closing tag instead of erroring in left()
                strTDRemoved = left(arColumns(y), iTDPos-1) 'Removes everything from the </TD> onward, including any </TR> or </TABLE>
                response.Write "Row: " & x & " Col: " & y & " = " & strTDRemoved & "<BR>"
            End If
        End If
    Next
Next
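One limitation of splitting on the literal "<TR>" and "<TD>" strings is that it misses tags with attributes, such as <td align="right">. As an alternative technique (not what the code above does), here is a sketch in Python using the standard library's html.parser, which handles attributes and upper/lower case for you. The class and function names are made up for the example, and it assumes the tables are not nested.

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []             # finished rows, each a list of cell strings
        self._current_row = None   # cells of the row being read, if any
        self._current_cell = None  # text pieces of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TR> and <tr> both arrive as "tr".
        if tag == "tr":
            self._current_row = []
        elif tag == "td" and self._current_row is not None:
            self._current_cell = []

    def handle_data(self, data):
        if self._current_cell is not None:
            self._current_cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._current_cell is not None:
            self._current_row.append("".join(self._current_cell).strip())
            self._current_cell = None
        elif tag == "tr" and self._current_row is not None:
            self.rows.append(self._current_row)
            self._current_row = None


def parse_rows(html):
    """Return the table cells in html as a list of rows."""
    parser = TableCellParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Calling parse_rows(strHtml) on the sample string above should give the same four values, grouped into two rows, ready to be written out or inserted into a database.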