I need a crawler written in Java that crawls all products on [login to view URL], [login to view URL] and [login to view URL], including all metadata, and writes them into a predefined XML structure.
Technologies to be used:
- Clean, plain Java (POJO) code with minimal library use
- [login to view URL] to convert HTML to XHTML
- [login to view URL] or Network IO
- Link extraction using Regex
- XSLT to transform XHTML to Output-XML
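The last two items of this list could look roughly like the sketch below. It uses only the JDK (java.util.regex and javax.xml.transform). The class name PageProcessor, the method names and the href pattern are assumptions made for illustration, not part of the specification, and the regex is deliberately simple: it only finds quoted href attributes in anchor tags. The HTML-to-XHTML step is left out, so transform expects input that is already well-formed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; names are illustrative only.
public class PageProcessor {

    // Simplified pattern: a quoted href attribute inside an <a ...> tag.
    // Group 1 is the quote character, group 2 the link target.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the raw href values in document order (not resolved or deduplicated).
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2));
        }
        return links;
    }

    // Applies an XSLT stylesheet to well-formed XHTML and returns the output XML.
    // A failure is rethrown unchecked so a caller can catch it in one place,
    // for example to trigger the mail escalation.
    public static String transform(String xhtml, String xslt) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }
}
```

In a real crawler the extracted links would still need to be resolved against the page URL and filtered by hostname before being queued.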
Other requirements:
- Mail escalation if transformation fails due to HTML-changes
- Configurable: Threads, Hostname, XSLT per Hostname
- Output as XML-Dump in a directory
- Stateless / No Database
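One possible way to cover the "Configurable" item without any extra library is a java.util.Properties file. The sketch below assumes a hypothetical key layout (threads=250 and host.<hostname>.xslt=<stylesheet>); neither the key names nor the class name CrawlerConfig come from the specification.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical configuration holder; key names are assumptions.
public class CrawlerConfig {

    private static final String HOST_PREFIX = "host.";
    private static final String XSLT_SUFFIX = ".xslt";

    private final int threads;
    private final Map<String, String> xsltByHost = new HashMap<>();

    public CrawlerConfig(Properties props) {
        // Falls back to 200 download threads if the key is absent.
        this.threads = Integer.parseInt(props.getProperty("threads", "200"));
        for (String key : props.stringPropertyNames()) {
            boolean isHostKey = key.startsWith(HOST_PREFIX)
                    && key.endsWith(XSLT_SUFFIX)
                    && key.length() > HOST_PREFIX.length() + XSLT_SUFFIX.length();
            if (isHostKey) {
                // The hostname is whatever sits between "host." and ".xslt".
                String host = key.substring(HOST_PREFIX.length(),
                        key.length() - XSLT_SUFFIX.length());
                xsltByHost.put(host, props.getProperty(key));
            }
        }
    }

    public int getThreads() {
        return threads;
    }

    // Returns the stylesheet configured for the host, or null if none is set.
    public String xsltFor(String host) {
        return xsltByHost.get(host);
    }
}
```

Because everything is read from one file at start-up, this stays compatible with the stateless, no-database requirement.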
Meta:
- Simple!
- Unit-Tests provided
- Maven2 Build
- Low CPU Usage
- Massively multithreaded (> 200 download threads)
- Proven test run
- Rock-Solid
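The multithreading and low-CPU items could be approached with a fixed thread pool from java.util.concurrent, since download threads spend most of their time blocked on network I/O. The following is only a sketch under that assumption: the class name DownloadPool and the choice to record a failed download as null are illustrative, and each task stands in for whatever fetches a single page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper; runs download tasks on a fixed-size pool.
public class DownloadPool {

    // Runs all tasks on `threads` worker threads and returns their results
    // in the same order as the tasks. A task that throws yields null.
    public static List<String> runAll(List<Callable<String>> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has completed or failed.
            for (Future<String> f : pool.invokeAll(tasks)) {
                try {
                    results.add(f.get());
                } catch (ExecutionException e) {
                    results.add(null); // failed download, left for the caller to handle
                }
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while downloading", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool size would come from the configuration, for example a value above 200 as required here.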
I have plenty more jobs if this is programmed well.
Working day and night, I can complete it in 5 days. I am eager to get this project. Can you give me a chance to show you better service? You can trust me to give you my best.
I am eager to get this project, which is why I am placing this low-cost bid.