SPECS This project is a proof of concept that will require a very simple user interface, some kind of backend DB, and either a custom-made primitive speech recognition engine (really just audio analysis) or use of an SDK. I am not picky about how this is done; it just needs to work on Windows.

The gist of this project is to take a video clip of someone speaking (the audio will be very clear and very good: think broadcast-quality mic and no background noise) and its corresponding transcript, and allow the user to jump to various video segments based solely on the words highlighted in the transcript. Think of it as Microsoft Word with a little video window. Users should be able to highlight any section of the transcript and play the corresponding video clip.

NOTE: the transcript the user will provide is JUST words, with no timecode annotations. The first task is to analyze the audio and figure out when the words (or groups of words) are spoken in the video. This doesn't have to be incredibly precise, but it has to be more or less on cue.

EXAMPLE Each word should be loaded into an XML database and tagged with the timecode of the point in the video at which that word is spoken. So if the user selects the sentence “We don’t know how strong that is but potentially if that’s quite strong they may not use the sites if they can’t nest in the ones or nearby where they’ve used.”, the backend should be able to note the starting timecode (of “WE”) and the ending timecode (of “USED”). The user will click a PLAY button and the video clip will play that section.
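To make the EXAMPLE above concrete, here is a minimal sketch of how the per-word timecodes might be stored and queried. It is only an illustration, not a required design: the element and attribute names (`transcript`, `word`, `index`, `start`, `end`), the two helper functions, and the timings are all hypothetical placeholders, and the audio analysis that would produce real timings is not shown.

```python
import xml.etree.ElementTree as ET


def build_transcript_xml(timed_words):
    """Build an XML tree with one <word> element per spoken word.

    timed_words is a list of (text, start_seconds, end_seconds) tuples.
    The tag and attribute names are assumptions made for this sketch.
    """
    root = ET.Element("transcript")
    for index, (text, start, end) in enumerate(timed_words):
        word = ET.SubElement(
            root,
            "word",
            index=str(index),
            start=f"{start:.2f}",
            end=f"{end:.2f}",
        )
        word.text = text
    return root


def selection_to_timecodes(root, first_index, last_index):
    """Return (start, end) in seconds for a highlighted run of words.

    The start comes from the first selected word and the end from the
    last selected word, as described in the EXAMPLE.
    """
    words = root.findall("word")
    if not 0 <= first_index <= last_index < len(words):
        raise ValueError("selection is outside the transcript")
    start = float(words[first_index].get("start"))
    end = float(words[last_index].get("end"))
    return start, end


# Placeholder timings, invented purely for illustration.
timed_words = [
    ("We", 12.40, 12.55),
    ("don't", 12.55, 12.80),
    ("know", 12.80, 13.05),
    ("how", 13.05, 13.20),
    ("strong", 13.20, 13.60),
]

root = build_transcript_xml(timed_words)
start, end = selection_to_timecodes(root, 0, 4)
print(f"Play from {start:.2f}s to {end:.2f}s")
print(ET.tostring(root, encoding="unicode"))
```

In this sketch the GUI would only need to translate the highlighted text into the index of its first and last word, then seek the video to the returned start time and stop at the returned end time.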
## Deliverables
GUI AND USER EXPERIENCE The GUI will simply be a video window and a transcript (word-processing) window, with no functionality except that you can select words.

Workflow:
1) User clicks “Browse” to locate the video clip. The clip loads into the video window.
2) User pastes the transcript text into the transcript box.
3) User clicks the “CREATE” button, which runs the analysis to generate the timecodes and XML for the transcript and loads them into the XML database.
4) User highlights any section of the transcript and clicks “PLAY”, and the video plays that portion.

-------

1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).
## Platform
Windows XP