Web crawler / scraper for Bloomberg
Budget $80-240 HKD
The website detects the script as a robot.
The code below needs to be fixed so that it can crawl the site successfully:
$ch = curl_init();
$headers = array();
$headers[] = "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:62.0) Gecko/20100101 Firefox/62.0";
$headers[] = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$headers[] = "Accept-Language: zh-TW,zh-HK;q=0.8,en-US;q=0.5,en;q=0.3";
$headers[] = "Dnt: 1";
$headers[] = "Connection: keep-alive";
$headers[] = "Upgrade-Insecure-Requests: 1";
$headers[] = "Te: Trailers";
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "GET");
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');
curl_setopt($ch, CURLOPT_URL, "[login to view URL]:IND");
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // boolean flag; the redirect limit is set by CURLOPT_MAXREDIRS above
curl_setopt($ch, CURLOPT_REFERER, "[login to view URL]:IND");
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true); // verify the server certificate; disabling this makes the host check above meaningless
curl_setopt($ch, CURLOPT_FAILONERROR, false);
curl_setopt($ch, CURLOPT_HEADER, 1);
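// $cookie (a cookie string) and COOKIE (a cookie-jar file path) are not defined in this snippet;
// they are assumed to be defined earlier in the script.
curl_setopt($ch, CURLOPT_COOKIE, $cookie);
curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE);
curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE);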
$result = curl_exec($ch);
if (curl_errno($ch)) {
    echo 'Error: ' . curl_error($ch);
}
curl_close($ch);
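Because the script sets CURLOPT_HEADER, $result contains the response headers followed by the body. A minimal sketch of how the response could be split and checked for signs of a bot block is given below. The helper names split_response and looks_blocked are hypothetical, and the status codes and marker strings are illustrative assumptions, not Bloomberg's confirmed behaviour.

```php
<?php
// Sketch only: helper names, status codes and marker strings are assumptions.

/**
 * Split a raw cURL response (fetched with CURLOPT_HEADER enabled)
 * into header text and body. $headerSize is the value reported by
 * curl_getinfo($ch, CURLINFO_HEADER_SIZE).
 */
function split_response(string $raw, int $headerSize): array
{
    return [
        'headers' => (string) substr($raw, 0, $headerSize),
        'body'    => (string) substr($raw, $headerSize),
    ];
}

/**
 * Heuristic check: does the response look like a bot-detection page?
 */
function looks_blocked(int $status, string $body): bool
{
    // Assumed: status codes often used to refuse automated clients.
    if (in_array($status, [403, 429, 503], true)) {
        return true;
    }
    // Assumed marker strings; replace them with text from the real block page.
    foreach (['captcha', 'are you a robot', 'unusual activity'] as $marker) {
        if (stripos($body, $marker) !== false) {
            return true;
        }
    }
    return false;
}
```

With the script above, the inputs would come from curl_getinfo($ch, CURLINFO_HEADER_SIZE) and curl_getinfo($ch, CURLINFO_HTTP_CODE), both read before curl_close($ch) is called.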
9 freelancers are bidding on average $234 for this job
Hi sir, this is Lin, and I am a scraping expert. Please check my reviews and you will see. Can we discuss more details about this? Thanks, Lin
I can update your PHP scraping script so that it will not be detected as a robot. I can start right now and complete it in the next 2 hours.
Hi, dear. Nice to meet you. I've read your post carefully. I'm a PHP expert. Please discuss more details by chat. Regards, Gao M.
Hi, I have checked the requirements. I will write this script using the PHP Selenium WebDriver. I'm available to start now. Please share additional details. Regards, Mohan
I see the code only loads the page. What exactly needs to be scraped, and where should it be stored? I would like to discuss more details of the project.
Hi, do you know how the Bloomberg website detected your web crawler as a robot even though you are presenting it as Firefox 62.0? They use JavaScript-based pings to monitor their website traffic. When we fetch the page using cURL ...
This sounds pretty straightforward. I have done a lot of web scraping work. Please have a look at my feedback and get in touch if you have any questions. Thanks, Simon
Hello my friend, I HAVE FIXED YOUR CODE! Send me a message and I will send you the fixed script. I can help you with any problem you have.