What I need or Require
The development of a web crawler that searches the Arabic blogosphere and social media networks and represents the results in a way similar to [login to view URL]. The main requirement is a web crawler and indexer closely modeled on [login to view URL]. Please refer to: [login to view URL]
Functional and Content Identification
The required solution is inspired by a recent trend in web development, ‘Social Story Telling’. The trend aims to produce meaningful stories by connecting online expressions into an emotional composition.
The application will consist of two core components:
1. Content collection and qualification.
2. Content delivery and distribution.
Automated blogosphere and social media search crawler / indexer / data collector
• The search crawler will methodically search the tags and textual content of the Arabic blogosphere and social media sites such as Flickr for a given term or phrase.
• Once a sentence containing one of the predefined search terms is found, the system looks backward to the beginning of the sentence and forward to the end of the sentence, and then saves the full sentence in a database. Alternatively, the application could extract a predefined number of words before and after the identified search term and record them in the database.
• Once saved, the sentence will be scanned to see if it includes one or more terms in a pre-identified list.
• Every qualified sentence or extract will represent an Arabic Voice.
• If an image is found in the post, the image is saved along with the sentence.
• The application will extract the date and time of the post where the search term / qualified voice/sentence was found.
• A high percentage of all blogs are hosted by one of several large blogging companies (Blogger, MySpace, MSN Spaces, LiveJournal, etc.). For these blogs, the application will examine the URL format of the blog post and use it to extract the username of the post's author. Given the author's username, the application will automatically traverse the blogging site to find that user's profile page. From the profile page, the application will extract the age, gender, country, state, and city of the blog's owner.
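The extraction steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a specified design: the function names, the set of sentence terminators, the plain substring matching, and the two host suffixes are all hypothetical choices, and real Arabic text would likely need normalization (diacritics, letter variants) before matching.

```python
from typing import List, Optional
from urllib.parse import urlparse

# Assumed sentence terminators, including the Arabic question mark (U+061F).
SENTENCE_END = ".!?\u061F\n"

# Assumed URL patterns: the author name is taken from the subdomain.
# Other hosts would need their own rules.
HOST_SUFFIXES = (".blogspot.com", ".livejournal.com")


def extract_sentence(text: str, term: str) -> Optional[str]:
    """Return the full sentence around the first occurrence of term."""
    pos = text.find(term)
    if pos == -1:
        return None
    # Look backward to the previous terminator (or the start of the text).
    start = max(text.rfind(ch, 0, pos) for ch in SENTENCE_END) + 1
    # Look forward to the next terminator (or the end of the text).
    hits = [text.find(ch, pos + len(term)) for ch in SENTENCE_END]
    hits = [i for i in hits if i != -1]
    end = min(hits) + 1 if hits else len(text)
    return text[start:end].strip()


def extract_window(text: str, term: str, n_words: int) -> Optional[str]:
    """Alternative: keep n_words before and after the search term."""
    pos = text.find(term)
    if pos == -1:
        return None
    before = text[:pos].split()
    before = before[max(0, len(before) - n_words):]
    after = text[pos + len(term):].split()[:n_words]
    return " ".join(before + [term] + after)


def qualifying_terms_in(sentence: str, term_list: List[str]) -> List[str]:
    """Return the pre-identified terms that appear in the saved sentence."""
    return [t for t in term_list if t in sentence]


def username_from_url(url: str) -> Optional[str]:
    """Guess the author's username from the post URL (assumed patterns)."""
    host = urlparse(url).hostname or ""
    for suffix in HOST_SUFFIXES:
        if host.endswith(suffix):
            name = host[: -len(suffix)]
            if name.startswith("www."):
                name = name[4:]
            return name or None
    return None
```

A sentence counts as a qualified voice when `qualifying_terms_in` returns a non-empty list. Profile-page traversal is left out here because it depends on each site's page layout.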
Other Requirements
The application will be in Arabic.
Timeframe
3-4 weeks for the development of the crawler and indexer (excluding interface and content delivery)
READY TO ACCEPT YOUR WORK.
Hello sir, we have gone through your details and can complete your work easily and efficiently. Please let us know.
Sincerely,
R.K