h1. MSS - MotionSpeechSync

The MotionSpeechSync (MSS) component provides a way to play back speech with synchronized motions and pauses on the Nao robot.

h2. Dependencies

At the moment, MSS depends on RSB-Python, NAOqi(-Python) and (only if you want to use the XTT input interface) XTT-Python.

h2. Building/Installing

MSS is written in Python. If the prerequisites are installed, building and installing MSS should be as simple as:
<pre>
python setup.py build
python setup.py install --prefix=/wherever/you/want/to/install/it
</pre>
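
If you install into a custom prefix like this, the install location has to be on your @PYTHONPATH@ before Python can find MSS. The exact @site-packages@ path below depends on your prefix and Python version and is only an example:

<pre>
export PYTHONPATH=/wherever/you/want/to/install/it/lib/python2.7/site-packages:$PYTHONPATH
</pre>
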
If your Nao robot has the RSB "cross-image" installed, a current version of MSS should already be installed on the robot, and you do not need to install anything on your local machine (besides maybe RSB and XTT, if your clients want to use them).

h2. Starting

-The @setup.py@ build installs a launcher script which should by default be called just @mss@. If you can't or don't want to use this generated script- (see bug #278 for why this script is currently unusable), you can also launch it directly with something like @python mss/__init__.py@.

h3. Starting automatically with NAOqi

TODO

h2. Usage

These are the command line options at the time of writing, but be sure to check @mss --help@ yourself. They should be more or less self-explanatory:

<pre>Usage: mss [options]

Options:
  -h, --help            show this help message and exit
  --output=INTERFACE    Output interface to be used (default naoqi)
  --input=INTERFACE(S)  Input interface(s) to be used, comma separated
                        combination of xtt and rsbrpc (default 'rsbrpc,xtt')

  XTT options:
    Options specific to the XTT input interface.

    --xtt-scope=SCOPE   Scope on which the XTT task server operates (default
                        /mss/xtt)

  RSB RPC options:
    Options specific to the RSB RPC input interface.

    --rpc-scope=SCOPE   Scope on which the RSB RPC server listens (default
                        /mss/rsbrpc)

  NAOqi options:
    Options specific to the NAOqi output interface.

    -b IP               IP of the local broker (for the NaoQi interface,
                        default 0.0.0.0)
    -p PORT             Port of the local broker (for the NaoQi interface,
                        default 9699)
    --pip=IP            The parent broker IP (for the NaoQi interface, default
                        127.0.0.1)
    --pport=PORT        The parent broker port (for the NaoQi interface,
                        default 9559)
</pre>
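
For example, to start MSS directly (see the note on the launcher script above) with only the RSB RPC input interface and a NAOqi running on the robot, an invocation could look something like this (the robot IP is just a placeholder):

<pre>
python mss/__init__.py --input=rsbrpc --pip=192.168.1.42 --pport=9559
</pre>
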
h3. Interfaces (XTT/RPC)

MSS currently provides two interfaces for submitting tasks to the program: one via the eXtensible Task Toolkit (XTT), the other through a simple RSB-based remote procedure call (RPC) server.

The RSB RPC interface provides an RPC server callable at the configured scope with a single method, @say(text)@. This method returns when the robot is done saying the text and performing all the motions given in the input text. As an example, calling this from e.g. Python is quite easy:
<pre>import rsb
mss = rsb.createRemoteServer("/mss/rsbrpc")
mss.say("[bow] Hello, I am Nao!")</pre>

The XTT interface accepts tasks submitted by task clients (using XTT-Java or XTT-Python) at the configured scope and sets the task state to COMPLETED once the robot is done. The task specification XML document should look like this:
<pre><?xml version="1.0" encoding="utf-8"?>
<MSS>
  <say text="text" />
</MSS>
</pre>
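
For instance, reusing the example sentence from the RPC section above, a concrete task specification could look like this:

<pre><?xml version="1.0" encoding="utf-8"?>
<MSS>
  <say text="[bow] Hello, I am Nao!" />
</MSS>
</pre>
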
The @text@ in both interface inputs should follow the rules described below.

h3. Input format

The input strings provided to this component follow a simple text format, where:
* any normal *words* (and basic punctuation like ., comma, ? or !) are treated as text that should be said through Text-to-Speech (e.g. @'I am a talking robot!'@),
* anything in *[square brackets]* is taken to be the name of a *motion* that should start playing at this point in the sentence (e.g. @[lookRight]@),
* and anything in *(parentheses)* is taken to be a *pause*, specified in milliseconds (e.g. @(1000)@).

The beginning of each motion is synchronized (as closely as the limitations of the NAOqi API allow) with the beginning of the word following the motion in the input string. If there is no such word, the motion is played without any accompanying text.

To give an example:

<pre>Hello, World! [bow] I am Nao, (2000) a small humanoid robot. [standardPose]</pre>

This translates to the following sequence of events on the robot:
# Say "Hello, World!",
# Start the motion named "bow", synchronized with saying the first word of the following sentence ("I" in this case),
# Say "I am Nao" (while the motion named "bow" executes in parallel for as long as it takes),
# Wait two seconds (2000 ms),
# Say "a small humanoid robot",
# Do the motion named "standardPose".

It is of course also possible to specify just some text (e.g. @'Hello, World!'@) or just a motion (e.g. @'[lookLeft]'@), or any combination of any number of the three types (motions, pauses and normal words).
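
To illustrate how such a string decomposes into the three types, here is a small Python sketch of a tokenizer for this format. It only illustrates the rules above and is not the parser MSS actually uses:

<pre>import re

# Illustrative only: split an MSS input string into spoken text,
# [motion] and (pause) elements as described above.
TOKEN = re.compile(r"\[(?P<motion>[^\]]+)\]|\((?P<pause>\d+)\)|(?P<text>[^\[\(]+)")

def tokenize(input_string):
    elements = []
    for match in TOKEN.finditer(input_string):
        if match.group("motion"):
            elements.append(("motion", match.group("motion")))
        elif match.group("pause"):
            elements.append(("pause", int(match.group("pause"))))
        else:
            text = match.group("text").strip()
            if text:
                elements.append(("say", text))
    return elements

print(tokenize("Hello, World! [bow] I am Nao, (2000) a small humanoid robot. [standardPose]"))
# [('say', 'Hello, World!'), ('motion', 'bow'), ('say', 'I am Nao,'),
#  ('pause', 2000), ('say', 'a small humanoid robot.'), ('motion', 'standardPose')]</pre>
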
TODO: More examples?!

You can also create a [[newBehavior| new motion]].

h3. Supported Motions

TODO