Here are several projects and demos that I am currently involved in or have been involved in in the past:
THOFU: Technologies for the HOtel of the FUture
The main objective of this project is to design the hotel of the future, covering its construction, its components,
the interaction with users, security, and integration with its surroundings and the Internet.
Gesfor Group leads this project with a budget of 23 million euros.
In the project, our team will be involved in the work package on intelligent and adaptive interfaces within
a high-tech hotel, researching new paradigms of interaction and studying usability and user experience.
National Consortium (CENIT) granted by the Spanish Science and Innovation Council (CEN-2010-1019). Period: September 2010 - December 2013
EmoLib: Emotion identification from text
EmoLib is a library that extracts affect and emotions from an input text by tagging the text according to the feeling it expresses or
conveys. EmoLib is written in Java.
This demo was developed by Alexandre Trilla as part of his PhD thesis.
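To illustrate the general idea of tagging a text with an emotion label, here is a minimal sketch in Python. It is not EmoLib's method or API: the lexicon, the label names, and the function `tag_emotion` are all hypothetical, and the approach shown (counting keyword matches) is only one simple way such tagging could work.

```python
import re
from collections import Counter

# Hypothetical toy lexicon mapping words to emotion labels.
# These entries are illustrative and are not EmoLib's resources.
TOY_LEXICON = {
    "happy": "happiness",
    "joy": "happiness",
    "sad": "sadness",
    "cry": "sadness",
    "angry": "anger",
    "furious": "anger",
}


def tag_emotion(text, lexicon=TOY_LEXICON, default="neutral"):
    """Return the emotion label whose lexicon words occur most often in text."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    # Count one vote per token that appears in the lexicon.
    counts = Counter(lexicon[w] for w in words if w in lexicon)
    if not counts:
        # No emotional keyword found: fall back to the default label.
        return default
    return counts.most_common(1)[0][0]
```

For example, `tag_emotion("I am so happy, what a joy!")` yields the label `happiness` under this toy lexicon, while a sentence with no lexicon words falls back to `neutral`.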
evMIC: Multimodal, Immersive and Collaborative Virtual Environments
The main objective of this project is to create an interoperable, user-centric platform for building virtual
learning environments that overcomes current limitations and aligns with the current definition of "The Future Internet."
Besides contributing to state-of-the-art documents on speech technologies, multimodal processing, graphics and virtual reality, our team is going to participate in developing
interfaces to interact with the virtual environment involving expressive text-to-speech synthesis, multimodal affect analysis, and 3D avatar modelling and synthesis.
Singular Strategic Project granted by the Spanish Industry, Tourism and Trade Council (TSI-020301-2009-25). Period: 2009 - 2011
INREDIS: INterfaces for RElations between Environment and people with DISabilities
The main objective of this project is to develop base technologies for creating communication and interaction channels between people with disabilities and their environment.
Technosite leads this project with a budget of 23.6 million euros.
Besides contributing to detailed state-of-the-art documents on speech technologies, multimodal processing, graphics and virtual reality, our team is going to participate in developing applications
involving expressive text-to-speech synthesis, multimodal affect analysis, and 3D avatar modelling and synthesis.
National Consortium (CENIT) granted by the Spanish Industry, Tourism and Trade Council (CEN-2007-2011). Period: 2007 - 2010
MD-TTS: Multi-Domain Text-To-Speech Synthesis by Automatic Domain Classification
OK wav files correspond to correct classification results with respect to the manual labelling, whereas KO wav files stand for:
i) the neutral domain for HAPPY and SENSUAL files, and ii) a wrong classification domain for MISSCL files.
I would like to thank David García for his support when developing this web demo.
MAGNUS: Mouse Advanced GNU Speech
It is an application for controlling the mouse pointer through Catalan voice commands.
This application aims to provide voice-based accessibility for people with reduced mobility.
This project was developed by Alexandre Trilla as part of his Master's thesis.
Project members:
Project Coordinator: The Generalitat de Catalunya's Education Department
The Acoustics Section of Enginyeria i Arquitectura La Salle
SAVE: Expressive AudioVisual Synthesis
The project focuses on research into a multimodal output interface
with highly expressive content, so that the end user perceives the output
as highly natural. The project proposes the study and development of a novel
expressive audiovisual synthesis system based on a photorealistic talking head.
Project granted by the Spanish Science and Technology Council (TEC2006-08043/TCM).
Period: 2007 - 2009
SALERO: Semantic AudiovisuaL Entertainment Reusable Objects
Our group is involved in this project to develop innovative multilingual text-to-speech
techniques for expressive speech synthesis in the cross-media production framework
(e.g. movies, games and broadcast).
Project supported by the European Commission (IST-FP6-027122). Period: 2006 - 2009
VIRTUAL WEATHERMAN: SAM
Automatic service for weather forecast on demand (TV, Internet and mobile devices) by means of a virtual speaker called Sam.
Our group has developed the corpus-based Text-to-Speech system embedded in the forecast application.
Project members:
The Catalan Broadcasting Corporation (CCRTV)
The Interactive Technology Group (Pompeu Fabra University)
The Speech Technologies Area of Enginyeria i Arquitectura La Salle
Project supported by CCRTV and CIDEM (RDITSCON04-0005). Period: May 2004 - April 2005
IntegraTV-4all
Adapted leisure, information and remote assistance services via a television set, with advanced natural language voice
communication functionalities for people with sensory disabilities and the aged.
Our group has developed an audio-visual alarm clock, integrated in the hotel TV menu, as a result of improving our
previous virtual speaker (see virtual speaker section).
Project members:
Project Coordinator: TMT Factory
Fundación ONCE
Universidad Politécnica de Madrid (UPM)
Universidad Carlos III de Madrid
The Speech Technologies Area of Enginyeria i Arquitectura La Salle
Project supported by the Spanish Science and Technology Council (FIT-350301-2004-2).
Period: September 2004 - December 2005
VIRTUAL SPEAKER
Here you can download three videos of the first version of our realistic virtual speaker:
Project granted by the Spanish Science and Technology Council (FIT-150500-2002-410). Period: 2002
ONLINE TEXT-TO-SPEECH SYNTHESIS (CATALAN)
Interactive demo for synthesizing an input text in Catalan. It generates an audio file,
which can be downloaded. Moreover, the demo allows the specification of different speech parameters, such as rhythm and emotion.
WEIGHT TUNING INTERFACE FOR SPEECH SYNTHESIS (CATALAN)
This is a web platform, based on evolutionary computation, which has been designed to find the optimal weight configuration of the cost function
for unit selection text-to-speech synthesis:
(restricted access: e-mail me to ask for a login).
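As a rough illustration of how an evolutionary search over cost-function weights can work, here is a minimal sketch in Python. It is not the platform's actual algorithm: the function names, the elitist selection scheme, the mutation settings, and especially `toy_fitness` with its `TARGET` vector are assumptions made for the example. In a real system the fitness would come from listener ratings or an objective quality measure.

```python
import random


def normalize(weights):
    """Scale the weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def mutate(weights, rng, sigma=0.1):
    """Add Gaussian noise to each weight, keep it positive, and renormalize."""
    child = [max(1e-6, w + rng.gauss(0.0, sigma)) for w in weights]
    return normalize(child)


def evolve_weights(fitness, n_weights, generations=50, pop_size=20,
                   n_parents=5, seed=0):
    """Elitist evolutionary search: keep the best individuals, mutate them."""
    rng = random.Random(seed)
    population = [
        normalize([rng.random() + 1e-6 for _ in range(n_weights)])
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Select the fittest individuals as parents (they survive unchanged).
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        # Fill the rest of the population with mutated copies of parents.
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - n_parents)]
        population = parents + children
    return max(population, key=fitness)


# Hypothetical stand-in fitness: closeness to an arbitrary target vector.
TARGET = [0.5, 0.3, 0.2]


def toy_fitness(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


best = evolve_weights(toy_fitness, n_weights=3)
```

Keeping the parents in the next generation means the best fitness never decreases between generations, and normalizing the weights keeps the search on configurations that sum to 1, which is one common convention for weighted cost functions.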
ITP - Speech Processing Interface
This is an interface for speech labelling (automatic and/or manual). Pitch
marks, phoneme boundaries, pitch curve, spectrogram and prosodic features are some of the speech parameters that can be extracted by means of this interface.
It runs on Windows platforms.
(Under Development)
ALGTEC (ALGEBRA & TECHNOLOGY)
ALGTEC is a multimedia application that helps and motivates engineering students
to learn Algebra concepts. It presents some Algebra concepts applied to
technological situations by means of a virtual teacher.
First version: based on Microsoft Agents and Text-to-Speech
In the future it will incorporate our Virtual Speaker
It runs on Windows platforms.
(Under Development - Catalan and Spanish versions)