Feature #2143
audition/Utterance proto file
Status: | Resolved | Start date: | 12/19/2014 |
---|---|---|---|
Priority: | Normal | Due date: | |
Assignee: | S. Schulz | % Done: | 100% |
Category: | Type Proposal | | |
Target version: | - | | |
Description
please see the attached patch
Associated revisions
Added proto/sandbox/rst/audition/{Phoneme,Utterance}.proto
fixes #2143
- proto/sandbox/rst/audition/Phoneme.proto: new file; contains Phoneme
  type which is a phoneme category and a duration
- proto/sandbox/rst/audition/Utterance.proto: new file; contains
  Utterance type which is a sequence of phonemes with associated audio
Signed-off-by: Johannes Wienke <jwienke@techfak.uni-bielefeld.de>
Added proto/sandbox/rst/audition/{Phoneme,Utterance}.proto
fixes #2143
- proto/sandbox/rst/audition/Phoneme.proto: new file; contains Phoneme
  type which is a phoneme category and a duration
- proto/sandbox/rst/audition/Utterance.proto: new file; contains
  Utterance type which is a sequence of phonemes with associated audio
Signed-off-by: Johannes Wienke <jwienke@techfak.uni-bielefeld.de>
(cherry picked from commit 08cceebed230721249629fd64d28c0d68a9135d8)
History
#1 Updated by J. Wienke over 9 years ago
- Category set to Type Proposal
I had a look at the patch and there are several smallish things we need to sort out:
Formally:
- Code style: please run the codeCheck.py from the project subfolder and add missing newlines around variables etc.
- Please convert endline comments and // comments to real doc comments.
- Phoneme message comment...
- Phoneme::symbol: the comment seems to be specific to some kind of software that I don't even know. Since RST should be generic, we need a software-independent, generally understandable description of this field. Is there an alphabet description that could be referenced? Or is it implementation-specific? Could you please try to adapt the comment accordingly.
- What is the purpose of Phoneme::text? This is not clear from the comment.
#2 Updated by S. Schulz over 9 years ago
formally:
i can't do that right now as i am not at work
content:
1+2) a phoneme is not software dependent, see
http://www.phon.ucl.ac.uk/home/sampa/german.htm
http://en.wikipedia.org/wiki/Phoneme
The comment referring to mbrola is there as this seems to be some kind
of "standard" implementation known to everybody in that field.
3) you mean utterance.text? That field is a descriptor for the content.
For example you send an utterance with an audio file saying "my name is flobi"
you will have the associated phoneme list and a textual description (=.text)
One purpose is debugging (e.g. when looking at the rsb logger you can see what this binary blob represents).
The other one is that one might need the textual description to show it on a GUI while the robot
is playing the audio file. Converting the phonemes back to text is not possible, or let's say not easy:
example: Trotz=trOts / Schutz=SUts / hübsch = hYpS
#3 Updated by S. Schulz over 9 years ago
- File 0001-test.patch added
i just revised my patch and fixed all your comments. please have a look.
#4 Updated by J. Moringen over 9 years ago
We came up with the attached questions and improvements. Please do one final revision.
#5 Updated by J. Moringen over 9 years ago
- Status changed from New to In Progress
- Assignee changed from J. Wienke to S. Schulz
- % Done changed from 0 to 80
#6 Updated by J. Wienke about 9 years ago
ping
#7 Updated by J. Wienke over 8 years ago
Simon? Any feedback here?
#8 Updated by J. Wienke over 8 years ago
Simon?
#9 Updated by S. Schulz over 8 years ago
ok, please use your suggested patch and include it into RST.
thanks!
#10 Updated by J. Wienke over 8 years ago
There are TODOs in the patch. We cannot merge it until these are fixed / specified.
#11 Updated by S. Schulz over 8 years ago
there you go ;)
#12 Updated by J. Wienke over 8 years ago
One last question. Is it necessary to make all members of Utterance required? E.g. including a SoundChunk sounds quite hard for some potential use cases of this type.
#13 Updated by S. Schulz over 8 years ago
we can discuss that.
soundchunk: i could not imagine a use without this.
when you do not need the audio one would have a plain phonemelist.
What about introducing a PhonemeList (or whatever you would like to call it) datatype:
message PhonemeList { repeated Phoneme phonemes = 1; }
And then including this in the Utterance?
I would vote for having the string description as required so that the users
are forced to fill this in. Otherwise they just skip that part and that makes debugging
quite hard. Another use of the description might be additional on screen text rendering in noisy environments.
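A minimal sketch of how this proposal could look in proto2 syntax. It is not the content of the attached patch: the package name, field names, field numbers, the duration unit, and the SoundChunk stand-in are assumptions made only for illustration.

```protobuf
syntax = "proto2";

// Hypothetical package, chosen for this sketch only.
package sketch.audition;

// Stand-in for the audio payload so the sketch is self-contained.
// The thread refers to an existing SoundChunk type instead.
message SoundChunk {
    optional bytes data = 1;
}

// A phoneme category and a duration, as described in the commit message.
message Phoneme {
    // Phoneme symbol, e.g. in SAMPA notation ("S", "U", "ts").
    required string symbol = 1;
    // Duration of the phoneme; milliseconds are an assumption.
    required uint32 duration = 2;
}

// Plain list of phonemes for consumers that do not need the audio.
message PhonemeList {
    repeated Phoneme phonemes = 1;
}

// A sequence of phonemes together with its text and audio.
message Utterance {
    required PhonemeList phonemes = 1;
    // Required so that senders cannot skip the textual description.
    required string text = 2;
    required SoundChunk audio = 3;
}
```

With this layout, a consumer that only needs the phoneme sequence could exchange a PhonemeList on its own, while Utterance keeps all three members required.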
#14 Updated by J. Wienke over 8 years ago
Ok, so I propose that you add the annotation to Phoneme to generate the list type automatically.
For the description field: can we rephrase this as the "textual representation of the utterance" or something like transcription? So far we do not have debugging information in the data. But in all the use cases mentioned, this is basically the textual representation of the utterance (screen rendering requires this).
#15 Updated by S. Schulz over 8 years ago
"Ok, so I propose that you add the annotation to Phoneme to generate the list type automatically."
how do i do that?
#17 Updated by S. Schulz over 8 years ago
And how is the collection named afterwards?
or is there a documentation for that collection feature somewhere?
#18 Updated by J. Wienke over 8 years ago
Not yet documented. It's quite new. The default name will be TypeCollection, but can be changed:
https://code.cor-lab.org/projects/rst/repository/revisions/a91af7f95f1db3bc53204fb92cb9a72e4464e319
#19 Updated by S. Schulz over 8 years ago
i attached an updated patch to this thread
#20 Updated by S. Schulz over 8 years ago
- Status changed from In Progress to Resolved
- % Done changed from 80 to 100
Applied in changeset rst-proto|08cceebed230721249629fd64d28c0d68a9135d8.