Need tools/scripts built to scrape unstructured biographical text from a few specific websites, process the text to find each entity's current location, then use that data to search 3 phone directories (paid and unpaid, some with an API, some without) to find the most likely matches for the person described in the biographical text and dump the results into a CSV file.
Some of the bio text includes multiple locations, only one of which is current, hence the need for natural language processing. E.g.: "john smith was born in london, england and went to college in cambridge. he now lives in dublin with his wife and 3 cats and works in belfast."
The tool needs to identify that the entity is named John Smith and that he lives in Dublin, as this will be used to search the phone directories to find the most likely match. To make it more difficult, there may be multiple John Smiths in Dublin, so a plan to save these as multiple possible results needs to be included.
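To make the expected behaviour concrete, here is a rough sketch of the two steps described above: pulling the name and current location out of a bio, then keeping every plausible directory match and writing them to CSV. It is only an illustration under assumptions. The cue phrases, the function names (extract_person, rank_candidates, to_csv), the 0.7/0.3 score weights, the 0.6 cut-off and the name/city/phone record layout are all made up for this example, and the regex heuristic stands in for whatever NLP approach the final tool actually uses.

```python
import csv
import io
import re
from difflib import SequenceMatcher

# Assumed heuristic: the name is whatever precedes the first "was"/"is".
NAME_RE = re.compile(r"^\s*([A-Za-z'-]+(?:\s+[A-Za-z'-]+)+?)\s+(?:was|is)\b", re.IGNORECASE)

# Assumed cue phrases for a *current* residence. "born in", "college in"
# and "works in" are deliberately not matched.
CURRENT_RE = re.compile(
    r"\b(?:(?:now|currently)\s+)?(?:lives|resides|is based)\s+in\s+"
    r"([A-Za-z][A-Za-z' -]*?)(?=\s+(?:with|and|where)\b|[,.;]|$)",
    re.IGNORECASE,
)


def extract_person(bio):
    """Return (name, current_city); either may be None if no cue is found."""
    name = NAME_RE.search(bio)
    city = CURRENT_RE.search(bio)
    return (
        name.group(1).title() if name else None,
        city.group(1).strip().title() if city else None,
    )


def score(candidate, name, city):
    """Blend fuzzy name similarity with an exact city match (weights are arbitrary)."""
    name_sim = SequenceMatcher(None, candidate["name"].lower(), name.lower()).ratio()
    city_match = 1.0 if candidate["city"].lower() == city.lower() else 0.0
    return round(0.7 * name_sim + 0.3 * city_match, 3)


def rank_candidates(candidates, name, city, min_score=0.6):
    """Keep every candidate above the cut-off, best first, so ties are not lost."""
    scored = [dict(c, score=score(c, name, city)) for c in candidates]
    kept = [c for c in scored if c["score"] >= min_score]
    return sorted(kept, key=lambda c: c["score"], reverse=True)


def to_csv(rows):
    """Serialise the ranked candidates as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["name", "city", "phone", "score"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Calling extract_person on the sample bio gives the name and city to search for. The directory lookups (API or scraped) would then produce the candidate records passed to rank_candidates, and every record above the cut-off is kept rather than only the top one.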
Please include in your bid a general outline of your understanding of the problem and how you intend to solve it, plus the word "bioscrape" to show you read this far. Generic cut/paste messages and those ignoring the above will not be read.
29 freelancers are bidding on average $575 for this job
I need further details to get the exact idea; contact me via chat.
Relevant Skills and Experience
I have 3 years' experience building web scrapers in Python.
Proposed Milestones
$500 USD - Project complete