Python Scrapy Project (SQLite): Web & FTP Scraping
Language: Python 3 (Anaconda environment)
Other modules: BeautifulSoup, re, lxml, XPath, and anything else necessary
1) Scrape the required web & FTP data to local storage on a configurable schedule (daily, weekly, monthly, or at a specific time)
2) Use SQLite to store the scraped data, organized by source
3) Maintain an index/log in SQLite to track each item's history status (success / not found / fail / error) in [login to view URL]
4) Based on the index in (3), decide between three scraping actions: continue the routine / if not found, retry after 2 months / stop the routine permanently
5) The proxy list and User-Agent list must be editable at any time
6) Concurrent scraping (50-100 threads)
7) Dynamic scheduling of the scraping workload (100,000 links or more to scrape per day)
8) Must prevent the 'database is locked' error in the SQLite DB during concurrent scraping
9) As long as a device has Anaconda3 & Scrapy installed, a simple copy & paste of the project folder must migrate the project to another PC
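The status index and the three-way retry decision in requirements (3)-(4) could be sketched as follows. All names here are assumptions for illustration (the `scrape_log` table, the status strings, and the mapping of fail/error to "stop forever"); the real project would adapt them to its own schema.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical schema: one row per URL with its last known status.
SCHEMA = """
CREATE TABLE IF NOT EXISTS scrape_log (
    url TEXT PRIMARY KEY,
    status TEXT CHECK (status IN ('success', 'not_found', 'fail', 'error')),
    last_attempt TEXT
)
"""

RETRY_AFTER = timedelta(days=60)  # "retry after 2 months"

def next_action(conn: sqlite3.Connection, url: str) -> str:
    """Decide the scraping action for a URL from its logged history."""
    row = conn.execute(
        "SELECT status, last_attempt FROM scrape_log WHERE url = ?", (url,)
    ).fetchone()
    if row is None:
        return "continue"        # never seen: scrape as part of the routine
    status, last_attempt = row
    if status == "success":
        return "continue"        # keep the routine going
    if status == "not_found":
        last = datetime.fromisoformat(last_attempt)
        if datetime.utcnow() - last >= RETRY_AFTER:
            return "retry"       # not found: retry after ~2 months
        return "skip"            # still inside the 2-month window
    return "stop"                # fail/error: drop from the routine forever
```

A scheduler would call `next_action` for each candidate URL before enqueueing it, and the spider would upsert a row into `scrape_log` after every attempt.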
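For requirements (5)-(6), one common approach is a Scrapy downloader middleware that rotates proxies and User-Agents from plain text files, paired with Scrapy's concurrency settings. `CONCURRENT_REQUESTS` and `CONCURRENT_REQUESTS_PER_DOMAIN` are real Scrapy setting names; the middleware class and the file names `proxies.txt` / `user_agents.txt` are assumptions for this sketch.

```python
# settings.py sketch -- real Scrapy setting names, illustrative values.
CONCURRENT_REQUESTS = 100            # maps the "50-100 thread" requirement
CONCURRENT_REQUESTS_PER_DOMAIN = 16  # avoid hammering any single host

import random
from pathlib import Path

class RotatingProxyUAMiddleware:
    """Downloader middleware: pick a random proxy and User-Agent per request
    from plain text files, so both lists stay editable at any time.
    (Hypothetical class and file names: proxies.txt, user_agents.txt.)"""

    def __init__(self):
        self.proxies = Path("proxies.txt").read_text().split()
        self.user_agents = Path("user_agents.txt").read_text().splitlines()

    def process_request(self, request, spider):
        request.meta["proxy"] = random.choice(self.proxies)
        request.headers["User-Agent"] = random.choice(self.user_agents)
        return None  # let Scrapy continue processing the request
```

The middleware would still need to be registered under `DOWNLOADER_MIDDLEWARES` in `settings.py`; reloading the files on a timer (rather than once in `__init__`) would let list edits take effect without restarting the crawl.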
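Requirement (8), avoiding `database is locked` under 50-100 concurrent scrapers, is usually handled by combining WAL journal mode, a generous busy timeout, and a single dedicated writer thread fed by a queue so that only one connection ever writes. A minimal sketch (function and queue names are illustrative):

```python
import queue
import sqlite3
import threading

def open_db(path: str) -> sqlite3.Connection:
    """Open SQLite tuned for many readers and one writer."""
    conn = sqlite3.connect(path, timeout=30, check_same_thread=False)
    # WAL lets readers proceed while a write is in progress.
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait up to 30 s for a lock instead of raising 'database is locked'.
    conn.execute("PRAGMA busy_timeout=30000")
    return conn

# All scraper threads enqueue (sql, params); only writer_loop touches the DB
# for writes, so SQLite's single write lock is never contended.
write_q: queue.Queue = queue.Queue()

def writer_loop(path: str) -> None:
    """Single writer thread: drain write_q until a (None, None) sentinel."""
    conn = open_db(path)
    while True:
        sql, params = write_q.get()
        if sql is None:
            break
        conn.execute(sql, params)
        conn.commit()
    conn.close()
```

Scraper threads never call `execute` on the shared database directly; they only `write_q.put(...)`, which eliminates writer-vs-writer contention entirely.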
The final version of the source code is to be submitted to us as the indication that the project is complete.
Please mention "pythonscrapyframework" as a secret passphrase to indicate that you have read the above project requirements.