D. Klotz, 06/27/2011 07:23 PM
Started to describe the RSB interfaces of the ISR component

h1. ISR

ISR (short for Incremental Speech Recognition) is a speech recognition component based on the "ESMERALDA framework":http://www.irf.tu-dortmund.de/cms/en/IS/Research/ESMERALDA1/index.html.

h2. Interfacing ISR through RSB

ISR exposes several input and output interfaces through RSB. All of them live under a sub-scope of the top-level scope @/isr@ (or of another top-level scope, which can be configured as described below).

h3. Building and starting ISR with RSB support

To build Esmeralda/ISR with RSB support, pass a configuration argument like the following to cmake:
<pre>cmake [...] -D WITH_RSB=1 [...]</pre>

To start ISR with RSB support, pass the following options to the @isr@ binary:
<pre>isr [...] -o rsb:isr -m rsb [...]</pre>
The @-o rsb:isr@ option tells ISR to use RSB for communication and to publish and listen for everything under the superscope @/isr@. To use a different top-level scope, supply it like this: @-o rsb:ScopeName@. The rest of this document assumes @/isr@ as the top-level scope, so you may need to adjust the sub-scopes accordingly.
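
The scope layout described above can be sketched in a few lines of Python. This is only an illustration: the helper @isr_scopes@ is hypothetical and not part of ISR or RSB, and it assumes that the name given after @rsb:@ maps to the top-level scope in the same way that @rsb:isr@ maps to @/isr@. The sub-scope names @hyp@, @status@ and @param@ are the ones mentioned in this document.

```python
# Hypothetical helper (not part of ISR or RSB): derive the sub-scopes
# named in this document from the scope name passed to "-o rsb:<name>".
# Assumption: "<name>" maps to "/<name>", as "rsb:isr" maps to "/isr".
SUB_SCOPES = ("hyp", "status", "param")


def isr_scopes(top_level="isr"):
    """Map each sub-scope name to its full scope string."""
    root = "/" + top_level.strip("/")
    return {name: root + "/" + name for name in SUB_SCOPES}


if __name__ == "__main__":
    # Default setup, i.e. "isr [...] -o rsb:isr -m rsb [...]"
    print(isr_scopes()["hyp"])
    # A different top-level scope, i.e. "-o rsb:ScopeName"
    print(isr_scopes("ScopeName")["hyp"])
```

Keeping the top-level scope in one place like this makes it easier to adjust all sub-scopes together when ISR is started with a non-default scope.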

*TODO:* I'm not sure what the @-m rsb@ option actually does. Sebastian? Lars?

h3. Speech recognition hypotheses

ISR publishes its recognition results under the scope @/isr/hyp@. They are sent as XML strings in a format like the one shown in the example below. If ISR is configured to work incrementally, incremental results are also published there; the final, stable hypotheses are those where the @stable@ attribute of the @speech_hyp@ element is @1@.

*Example data:*

This is an example of the recognition result for the sentence "hello robot", together with the resulting grammar tree (for a specific pre-defined grammar), as it would have been published under @/isr/hyp@:

<pre>
<speech_hyp [...] generator="isr" origin="grm_track_xcf" stable="1" update="0" bxml:id="29">
<TIMESTAMP>1290275288296</TIMESTAMP>
<seq>
  <part begin="330" end="630" id="0" type="word">hello</part>
  <score acoustic="362.48" combined="362.48" id="0" lm="0.00" />
  <part begin="630" end="1030" id="1" type="word">robot</part>
  <score acoustic="602.56" combined="602.56" id="1" lm="0.00" />
</seq>
<grammartree cancel="0" fault="0" skip="0">
  <nonterminal name="$S">
    <nonterminal name="Greeting">
      <terminal refid="0" />
      <nonterminal name="$RobotName">
        <terminal refid="1" />
      </nonterminal>
    </nonterminal>
  </nonterminal>
</grammartree>
</speech_hyp>
</pre>
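
A receiver of such a message could process it roughly as follows. This is a minimal sketch using only Python's standard @xml.etree.ElementTree@ module. The sample string is a trimmed copy of the example above: the elided attributes (@[...]@) and the @bxml:id@ attribute are left out, because the namespace declaration for the @bxml@ prefix is not shown here and the parser rejects an undeclared prefix. The function names are hypothetical, and the sketch assumes that the @refid@ of a @terminal@ refers to the @id@ of a @part@, as the example suggests. How the XML string arrives from RSB is not covered.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example message (assumption: elided attributes
# and the bxml:id attribute are not needed for this illustration).
SAMPLE = """\
<speech_hyp generator="isr" origin="grm_track_xcf" stable="1" update="0">
<TIMESTAMP>1290275288296</TIMESTAMP>
<seq>
  <part begin="330" end="630" id="0" type="word">hello</part>
  <score acoustic="362.48" combined="362.48" id="0" lm="0.00" />
  <part begin="630" end="1030" id="1" type="word">robot</part>
  <score acoustic="602.56" combined="602.56" id="1" lm="0.00" />
</seq>
<grammartree cancel="0" fault="0" skip="0">
  <nonterminal name="$S">
    <nonterminal name="Greeting">
      <terminal refid="0" />
      <nonterminal name="$RobotName">
        <terminal refid="1" />
      </nonterminal>
    </nonterminal>
  </nonterminal>
</grammartree>
</speech_hyp>
"""


def is_stable(hyp):
    """True for final hypotheses, i.e. stable="1" on speech_hyp."""
    return hyp.get("stable") == "1"


def words_by_id(hyp):
    """Map the id of every word part to its text, in document order."""
    return {
        part.get("id"): part.text
        for part in hyp.iterfind("seq/part")
        if part.get("type") == "word"
    }


def tree_to_tuple(node, words):
    """Turn the grammar tree into nested (name, children) tuples.

    Terminals are replaced by the word their refid points to.
    """
    if node.tag == "terminal":
        return words[node.get("refid")]
    return (node.get("name"), [tree_to_tuple(child, words) for child in node])


def parse_hypothesis(xml_string):
    """Parse one message into a plain dictionary."""
    hyp = ET.fromstring(xml_string)
    words = words_by_id(hyp)
    root = hyp.find("grammartree/nonterminal")
    tree = tree_to_tuple(root, words) if root is not None else None
    return {"stable": is_stable(hyp), "words": list(words.values()), "tree": tree}


if __name__ == "__main__":
    result = parse_hypothesis(SAMPLE)
    if result["stable"]:
        print(" ".join(result["words"]))
        print(result["tree"])
```

When ISR runs incrementally, a handler built on @parse_hypothesis@ could simply ignore every message for which @stable@ is false and act only on the final result.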

h3. Signal information

*TODO:* Add information about the signal status optionally published at the scope @/isr/status@ (if enabled via the isr option @-x@).

h3. Changing the acoustic energy thresholds

*TODO:* Add information about the signal thresholds read under the scope @/isr/param@.