Scrape two-level table data queried from a website. In the second-level data, look for a keyword and, if it is found, update that record with the keyword.
Output required: CSV
Tools to be used: I think Web-Harvest and Solvent (to generate XQuery) with Java is the best solution, but you can use another open source tool if you want.
Steps to browse the website:
1. Go to the URL.
2. Enter filter criteria from a file or a UI (whichever is cheaper to code).
3. Hit the Search button.
4. Click the View button next to each record to go to the next level.
5. Click the "Get data" button on the second-level page.
6. The page will display one more table.
7. Look for the keyword "cancellation" in the new table.
8. If found, update the record with "cancel".
9. For each record that is not cancelled, get more details for it:
   a. Go to the URL.
   b. Enter search criteria from up to 3 fields of the record being looked up.
   c. Click the Search button.
   d. Get the data from the page and update the record.
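Steps 7 and 8 and the CSV output could be sketched roughly as below, in plain Java with no scraping library. This is only an illustration under assumptions: the class ScrapeSketch, the Row type and all method names are hypothetical, the second-level table is assumed to be already extracted into lists of cell strings, the keyword match is assumed to be case-insensitive, and the status is assumed to go in an extra last column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class ScrapeSketch {
    // Hypothetical record: first-level fields plus one status column.
    static final class Row {
        final List<String> fields;
        String status = "";
        Row(List<String> fields) { this.fields = fields; }
    }

    // Step 7: search every cell of the second-level table for the keyword.
    // Case-insensitive matching is an assumption, not stated in the posting.
    static boolean containsKeyword(List<List<String>> table, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        for (List<String> tableRow : table) {
            for (String cell : tableRow) {
                if (cell != null && cell.toLowerCase(Locale.ROOT).contains(needle)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Step 8: mark the record with "cancel" when the keyword is present.
    static void flagIfCancelled(Row row, List<List<String>> secondLevelTable) {
        if (containsKeyword(secondLevelTable, "cancellation")) {
            row.status = "cancel";
        }
    }

    // Quote a field when it contains a comma, quote or line break,
    // doubling any embedded quotes (the usual CSV convention).
    static String csvField(String value) {
        String v = (value == null) ? "" : value;
        if (v.contains(",") || v.contains("\"") || v.contains("\n") || v.contains("\r")) {
            return "\"" + v.replace("\"", "\"\"") + "\"";
        }
        return v;
    }

    // One CSV line: the record's fields followed by its status.
    static String toCsvLine(Row row) {
        List<String> out = new ArrayList<>();
        for (String field : row.fields) {
            out.add(csvField(field));
        }
        out.add(csvField(row.status));
        return String.join(",", out);
    }

    public static void main(String[] args) {
        // Made-up sample data, only to exercise the methods above.
        Row row = new Row(Arrays.asList("1001", "Smith, John"));
        List<List<String>> table =
            Arrays.asList(Arrays.asList("2009-01-05", "Notice of Cancellation"));
        flagIfCancelled(row, table);
        System.out.println(toCsvLine(row)); // prints 1001,"Smith, John",cancel
    }
}
```

Records whose status stays empty after this check would be the ones passed on to step 9 for the extra lookup.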
Hi there, I have done at least a dozen scraping projects at GAF. Everything you wrote makes sense to me and is achievable. I have done several similar things in the past. Please hire me, and please PM me in any case!
Thanks,
umer