Hello
I have a website link crawler/fetcher already; it's pretty good, but it needs some modifications.
- After the link crawl it will pick the first link from the pile, load that page, and find the first URL on the page that contains [login to view URL] inside the link.
- It will then crawl for links on this new [login to view URL] URL; after it retrieves all links, it will find a new blogspot URL and automatically crawl all of its links too.
- A checkbox will be available to switch automatic crawling on or off.
- Speed improvement of link retrieval is also needed; I'm looking for a way to reduce the time it takes to fetch a large number of links.
The goal is to crawl for new links automatically and faster.
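To make the chaining behaviour above concrete, here is a minimal sketch of how the automatic crawl loop could work. It assumes the match pattern is "blogspot" (the actual URL text is redacted as [login to view URL] in this post), that `fetch(url)` is a stand-in for the existing single-page downloader, and that the `enabled` flag mirrors the on/off checkbox; names are illustrative, not from the existing code.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags in document order."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def first_matching_link(html, pattern):
    """Return the first link on the page whose URL contains `pattern`, or None."""
    parser = LinkExtractor()
    parser.feed(html)
    for link in parser.links:
        if pattern in link:
            return link
    return None

def auto_crawl(start_url, fetch, pattern, max_hops=10, enabled=True):
    """Crawl a page, find the first link containing `pattern`, crawl that
    page too, and repeat until no new match is found or `max_hops` is hit.
    `fetch(url)` is assumed to return the page HTML as a string."""
    seen = set()
    all_links = []
    url = start_url
    while enabled and url and url not in seen and len(seen) < max_hops:
        seen.add(url)
        html = fetch(url)
        parser = LinkExtractor()
        parser.feed(html)
        all_links.extend(parser.links)   # collect every link on this page
        url = first_matching_link(html, pattern)  # hop to the next matching URL
    return all_links
```

The `seen` set and `max_hops` guard keep the automatic mode from looping forever if two pages link to each other.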
Additional Info:
IMPORTANT! 80% of the described task is already done by a programmer who completed almost the whole project but didn't have the knowledge to finish it completely, so I require you to add the following new features too.
- Change the proxy after a certain number of requests to the search engines, based on a given list of proxies.
- Add AOL, Lycos, LookSmart, Teoma, HotBot, and Ask Jeeves to the search engine sources.
- Increase the speed with which links are downloaded.
- Make the filter return all links that the engine returns when used with the inurl: command; currently it only returns filtered links from the first 1000 results.
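The proxy-rotation feature can be sketched as a small counter over the proxy list. This is only an illustration of the rotation logic, assuming the threshold and proxy strings are supplied by the user; it does not cover the actual HTTP plumbing of the existing crawler.

```python
import itertools

class ProxyRotator:
    """Cycle through a proxy list, switching to the next proxy after
    every `requests_per_proxy` requests to the search engines."""
    def __init__(self, proxies, requests_per_proxy):
        self._cycle = itertools.cycle(proxies)   # wraps around at the end of the list
        self._limit = requests_per_proxy
        self._count = 0
        self._current = next(self._cycle)

    def current(self):
        """Proxy to use for the next request; rotates automatically."""
        if self._count >= self._limit:
            self._current = next(self._cycle)
            self._count = 0
        self._count += 1
        return self._current
```

Each search-engine request would call `current()` to get the proxy to route through, so the switch happens transparently after the configured number of requests.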