16th MuMe Work Meeting

imec, 27 May 2010 (members only)
The goal of the work meetings is to have interactive sessions amongst the MuMe-Community members:
- Discussing recent evolutions of multimedia techniques.
- Discussing operational issues of the community.
- Discussing upcoming activities.
| 13:30 | Registration |
| Welcome/Coffee | |
| 14:00 | Welcome
Paul Six, Coordinator MuMe Community - imec |
| Welcome + status of the community | |
| 14:20 | Digital Signage and Mobile Office applications @ dZine.
Marc Goovaerts, CTO - dZine |
| A presentation of dZine, its activities and applications. Digital signage applications range from airports and movie theaters, through implementations in buses and trains, to advertising in stores and supermarkets. The second set of activities is grouped under our Mobile Office applications. These include rescue services and home care services. We will briefly cover our multimedia capabilities and challenges. | |
| 14:50 | Media Asset Management in a Professional Context.
Maarten Verwaest, CEO and founder - Limecraft |
| Media production technology suppliers and system integrators have lately been focusing on the transition to high definition and to tapeless workflows. Architectures are usually based on a central media asset management system, surrounded by specialised software satellites such as ingest, post-production and playout applications. However, given that none of the commercially available media asset management systems implements proper indexing mechanisms, it is plausible that this approach does not scale, in the sense that potentially relevant items are quickly hidden by large amounts of irrelevant items. During earlier research conducted by VRT-medialab, we quantified this problem in medium-sized databases by defining the retrievability of audiovisual assets in terms of precision and recall, and we investigated the solution space. Limecraft, an ambitious start-up company and a spin-off of VRT-medialab and IBBT, addresses this problem as well, and we will explain how we are tackling it in the particular case of high-end production and large-scale production facilities. | |
| 15:30 | A Multimodal Approach to Audiovisual Text-to-Speech Synthesis.
Wesley Mattheyses, researcher - VUB/ETRO |
| Audiovisual text-to-speech systems convert a written text into an audiovisual speech signal. Typically, the visual mode of the synthetic speech is synthesized separately from the audio; the latter being either natural or synthesized speech. The possible perception of mismatches between these two information streams, which could degrade the quality, requires experimental exploration. In order to increase the intermodal coherence in synthetic 2D photorealistic speech, we extended the well-known unit selection audio synthesis technique to work with multimodal segments containing original combinations of audio and video. In this presentation we discuss our synthesis strategy and we summarize the results of listening experiments we conducted. | |
| 16:00 | Break |
| Coffee | |
| 16:30 | Robust temporal alignment of speech utterances and its application to the automatic replacement of motion picture dialogues.
Pieter Soens, researcher - VUB/ETRO |
| A system for the temporal alignment of speech utterances modifies the timing structure of a first utterance (replacement, dub) so as to synchronize it with a second utterance (reference, guide), which has the same textual content and has been produced by the same or by a different speaker. In the past, a number of systems have been developed that perform this task in an automated manner, thereby significantly reducing the amount of time required for manual or natural alignments. However, it was also observed that these systems often deliver results that are of unacceptable quality and/or insufficiently synchronized with the reference utterance. In this talk, we present a robust system for the automatic temporal alignment of two renditions of the same speech utterance and demonstrate its usefulness for Automatic Dialogue Replacement, a well-established post-production procedure in the audio-for-video industries. The proposed system operates in two steps: during analysis, the timing relationships between the speech segments of the utterance that serves as a timing reference and the corresponding speech segments in the original utterance are measured by means of a dedicated dynamic time warping algorithm. The obtained warping paths are then processed and used to synthesize a high-quality, natural-sounding speech utterance that precisely time-synchronizes with the reference. Subjective audio-visual listening tests performed within the context of a difficult Automatic Dialogue Replacement task demonstrated that the proposed system achieves a significant improvement compared to the industry-standard benchmark, both in achieved lip-synchronization accuracy and in the overall sound quality of the synthesized speech utterances. | |
| 17:00 | FPGA-based System-on-Chip Design for DSP Applications
Ramses Valvekens, CEO - EASICS |
| Recent FPGAs are implemented in 65 and 40 nanometer technology. Their enormous integration density allows for the implementation of a complete System-on-Chip on a single FPGA. This lecture will elaborate on this idea, showing benefits as well as limitations. It will compare the FPGA-based solution with alternative implementation candidates such as ASIC, ASSP and DSP processor. A case study will be presented: real-time image processing for an industrial machine. | |
| 17:40 | Closing |
| Networking drink with sandwiches | |
Members, please register your attendance via MuMe-Community@imec.be
If you are interested in joining our Multimedia community, please check the information on "Membership" via New member?
| Date | Hour | Location | Street | City |
|---|---|---|---|---|
| 27 May 2010 | 13.30-18.00 | imec | Kapeldreef 75 | Leuven |
