<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Spokestack's Blog RSS]]></title><description><![CDATA[AutoML tools and open source libraries for mobile, web, and embedded software. Built by Developers, for Developers. Get started free.]]></description><link>http://github.com/dylang/node-rss</link><generator>GatsbyJS</generator><lastBuildDate>Fri, 27 Aug 2021 15:00:12 GMT</lastBuildDate><item><title><![CDATA[Learn to Use Custom Wake Word and Text-to-Speech on a Raspberry Pi]]></title><description><![CDATA[Following this guide teaches you how to deploy your models on an embedded device.]]></description><link>https://www.spokestack.io/blog/learn-to-use-custom-wake-word-and-text-to-speech-on-a-raspberry-pi</link><guid isPermaLink="false">https://www.spokestack.io/blog/learn-to-use-custom-wake-word-and-text-to-speech-on-a-raspberry-pi</guid><pubDate>Wed, 23 Jun 2021 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;One of the primary motivations for working on &lt;a href=&quot;https://github.com/spokestack/spokestack-python&quot;&gt;spokestack-python&lt;/a&gt; was to allow our models to run on embedded devices like &lt;a href=&quot;https://www.raspberrypi.org/&quot;&gt;Raspberry Pi&lt;/a&gt;. 
We are excited to show you how easy it is to use &lt;a href=&quot;/docs/concepts/wakeword&quot;&gt;Wake Word&lt;/a&gt; and &lt;a href=&quot;/docs/concepts/tts&quot;&gt;TTS&lt;/a&gt; models on these devices.&lt;/p&gt;&lt;h2 id=&quot;spokestack-account&quot;&gt;&lt;a href=&quot;#spokestack-account&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Spokestack Account&lt;/h2&gt;&lt;p&gt;You will want to &lt;a href=&quot;/account/login&quot;&gt;log in&lt;/a&gt; and get your API keys for this tutorial. 
If you do not already have a Spokestack account, please &lt;a href=&quot;/account/create&quot;&gt;create one&lt;/a&gt;.&lt;/p&gt;&lt;h2 id=&quot;hardware&quot;&gt;&lt;a href=&quot;#hardware&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Hardware&lt;/h2&gt;&lt;p&gt;This tutorial is geared toward the &lt;a href=&quot;https://www.raspberrypi.org/products/raspberry-pi-4-model-b/&quot;&gt;Raspberry Pi 4B&lt;/a&gt; and &lt;a href=&quot;https://www.raspberrypi.org/products/raspberry-pi-zero-w/&quot;&gt;Zero W&lt;/a&gt;. Technically, the minimum hardware requirements for this tutorial are a device that runs &lt;a href=&quot;https://www.python.org&quot;&gt;Python&lt;/a&gt;, a microphone, and at least one speaker. The recommended hardware is listed below. In addition, we have created a &lt;a href=&quot;https://www.adafruit.com/wishlists/524930&quot;&gt;wishlist&lt;/a&gt; on Adafruit that is exactly what we used for this tutorial. If you have issues using other hardware or want to show us what you made, feel free to &lt;a href=&quot;/support&quot;&gt;contact us&lt;/a&gt;. 
We are working on more hardware guides, so stay tuned!&lt;/p&gt;&lt;h3 id=&quot;parts-list&quot;&gt;&lt;a href=&quot;#parts-list&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Parts List&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://www.adafruit.com/product/4296&quot;&gt;Raspberry Pi 4 Model B&lt;/a&gt; or &lt;a href=&quot;https://www.adafruit.com/product/3708&quot;&gt;Zero W&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://www.adafruit.com/product/4757&quot;&gt;Adafruit Voice Bonnet&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://www.adafruit.com/product/3351&quot;&gt;Mono Enclosed Speaker&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://www.adafruit.com/product/4298&quot;&gt;Official Raspberry Pi Power Supply&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://www.adafruit.com/product/2693&quot;&gt;SD/MicroSD Memory Card&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Note: The Zero W runs a little slower than we would like&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h2 id=&quot;raspberry-pi-setup&quot;&gt;&lt;a href=&quot;#raspberry-pi-setup&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; 
d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Raspberry Pi Setup&lt;/h2&gt;&lt;p&gt;For the initial setup of the Raspberry Pi, we recommend following the &lt;a href=&quot;https://learn.adafruit.com/adafruit-voice-bonnet/overview&quot;&gt;Adafruit Voice Bonnet tutorial&lt;/a&gt;. That guide walks you through everything from OS installation to sound configuration. In addition to the Adafruit instructions, a few Spokestack-specific steps and tips follow. Run them while connected to your Raspberry Pi via SSH.&lt;/p&gt;&lt;h3 id=&quot;audio&quot;&gt;&lt;a href=&quot;#audio&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Audio&lt;/h3&gt;&lt;p&gt;PulseAudio and the Adafruit Voice Bonnet do not interact well, so you will want to disable PulseAudio with the following:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;shell&quot;&gt;&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;systemctl --user stop pulseaudio.socket \
pulseaudio.service&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If you would like to enable PulseAudio afterward you can restart the service with:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;shell&quot;&gt;&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;systemctl --user start pulseaudio.socket pulseaudio.service&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;system-dependencies&quot;&gt;&lt;a href=&quot;#system-dependencies&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;System Dependencies&lt;/h3&gt;&lt;p&gt;The following are some system dependencies that need to be installed before installing &lt;code&gt;spokestack&lt;/code&gt; on the Raspberry Pi.&lt;/p&gt;&lt;p&gt;&lt;code&gt;sudo apt-get -y install portaudio19-dev libblas-dev libmp3lame-dev&lt;/code&gt;&lt;/p&gt;&lt;h3 id=&quot;install-rust-for-tokenizers&quot;&gt;&lt;a href=&quot;#install-rust-for-tokenizers&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 
0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Install Rust for Tokenizers&lt;/h3&gt;&lt;p&gt;The command to install Rust is taken directly from the &lt;a href=&quot;https://www.rust-lang.org/tools/install&quot;&gt;instructions&lt;/a&gt;. On a Raspberry Pi 4, I didn’t have any issues compiling Rust, but if you are using the Zero, you may need to cross-compile. We are currently working on an easy solution for the smaller embedded devices.&lt;/p&gt;&lt;p&gt;&lt;code&gt;curl --proto &amp;#x27;=https&amp;#x27; --tlsv1.2 -sSf https://sh.rustup.rs | sh&lt;/code&gt;&lt;/p&gt;&lt;h3 id=&quot;tflite-interpreter&quot;&gt;&lt;a href=&quot;#tflite-interpreter&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;TFlite Interpreter&lt;/h3&gt;&lt;p&gt;For this, we can go with TensorFlow’s recommended apt package. We’ve used the &lt;code&gt;pip&lt;/code&gt; versions in the past, but this one is easier to install. 
These commands are directly from the original &lt;a href=&quot;https://www.tensorflow.org/lite/guide/python#install_tensorflow_lite_for_python&quot;&gt;instructions&lt;/a&gt;.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;shell&quot;&gt;&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;&lt;span class=&quot;token builtin class-name&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;deb https://packages.cloud.google.com/apt coral-edgetpu-stable main&amp;quot;&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;tee&lt;/span&gt; /etc/apt/sources.list.d/coral-edgetpu.list
&lt;span class=&quot;token function&quot;&gt;curl&lt;/span&gt; https://packages.cloud.google.com/apt/doc/apt-key.gpg &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; apt-key &lt;span class=&quot;token function&quot;&gt;add&lt;/span&gt; -
&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;apt-get&lt;/span&gt; update
&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;apt-get&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;install&lt;/span&gt; python3-tflite-runtime&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&quot;installing-spokestack&quot;&gt;&lt;a href=&quot;#installing-spokestack&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Installing Spokestack&lt;/h2&gt;&lt;p&gt;Spokestack should be installed through &lt;code&gt;pip&lt;/code&gt;. 
We are currently using &lt;code&gt;v0.0.20&lt;/code&gt; for this tutorial.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;shell&quot;&gt;&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;pip &lt;span class=&quot;token function&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;token assign-left variable&quot;&gt;spokestack&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0.0&lt;/span&gt;.20&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&quot;testing-with-a-project&quot;&gt;&lt;a href=&quot;#testing-with-a-project&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Testing with a Project&lt;/h2&gt;&lt;p&gt;We will test with our &lt;a href=&quot;https://github.com/spokestack/python-hello-world&quot;&gt;“Hello, World!” project&lt;/a&gt;. Keep in mind we installed the dependencies in the previous sections, so you will not need to follow that project’s README.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;shell&quot;&gt;&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;&lt;span class=&quot;token function&quot;&gt;git&lt;/span&gt; clone https://github.com/spokestack/python-hello-world.git
&lt;span class=&quot;token builtin class-name&quot;&gt;cd&lt;/span&gt; python-hello-world&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Add your API keys to &lt;code&gt;const.py&lt;/code&gt; by setting &lt;code&gt;KEY_ID&lt;/code&gt; and &lt;code&gt;KEY_SECRET&lt;/code&gt;. You should now be able to run the app. The project will automatically download the default wake word models. Once the app is running, the default text-to-speech voice will respond when you say “Hey, Spokestack!”&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;shell&quot;&gt;&lt;pre class=&quot;language-shell&quot;&gt;&lt;code class=&quot;language-shell&quot;&gt;python app.py&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&quot;wrapping-up&quot;&gt;&lt;a href=&quot;#wrapping-up&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Wrapping Up&lt;/h2&gt;&lt;p&gt;In this tutorial, we covered how to set up Spokestack on an embedded device. This should get you started using Spokestack in your own projects. 
If you run into any trouble be sure to reach out through our &lt;a href=&quot;/support&quot;&gt;support channels&lt;/a&gt;.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[What's a Keyword Model, and Why Would I Use One?]]></title><description><![CDATA[We'll show you where keyword recognition models fit in with wake words, ASR, and NLU; and we'll help you decide if they're right for your app.]]></description><link>https://www.spokestack.io/blog/whats-a-keyword-model</link><guid isPermaLink="false">https://www.spokestack.io/blog/whats-a-keyword-model</guid><pubDate>Mon, 14 Jun 2021 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/78f9b5a55187c8fc263cc030d53655df/8537d/concepts-keywords.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACXklEQVQoz42Rz0sUYRjH5w8QYql0TV13V3dm3Z2dmZ1f+8M0NZCgxW01RQNbLx0yD9Ul8lARbUVQFETRoaKigxJUongOIgiCIpAtQslT1qWgi1jwife19dShge+8z/vOM5/3+zyPkkwm0eJx1PYYdcEAO3c1EAg2EWxqIhJtobG1nuZYCwmtDdspoLtXyHVPcag8zeKNWc4dnyab9zAMA13XUQQwkUwSVzUa21sIRcM0hiJEomE6tDChtlZCkWZ836UwcRv34DcCxjJG3yozC6vMPZ/lzqVJejpdOhJJlFQqJclitQwT0zCxDAPTNDEMk3g8xc2z+5mfu87pB1UuPP1Kd3kZpa7Ktugcrm1yYN9u7LRJUjgUIE3TiMViUqqq/tVm3BpuY3Kih1PH9pCwjrBN/UgwtUSDsULEWcTP+MTUOLqekqYkcGBggFKptLWOjo5SKBQoFouMjIzQ318gpatY/lF26Mu0OFWC5gphexHf99D1JLVKFc/zJGhsbEz+LICDg4MMDw8zPj7O0NAQxWKJzryFlZlCqa+iBN6hBD6wXZsnm81IkBiKdOj7vuyX2AiJ2LKsrTORaFlpdD3BiZNneP12gxevfvDyzTrPFpbo7e3Ftm1c15VSHMdBSBwKpdNpua8liNj3PDQtTqVyns3nt3x/WfvM3r4+MpmMlAQKgHBUk3BWcykuqMHF4CqVi2wAP3/B97V1qu8/0dXdJfNFpRJYA/xLNajosyj/cLnMvcdPuPtohvu3HnL18jU5ZdGW2uVKLpfjfyXAjm3hpC0cz8Z2bLLZ7Nb3fD7PHxNAl0qq9e5wAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;What are Keyword Recognition Models?&quot; title=&quot;What are Keyword Recognition Models?&quot; src=&quot;/static/78f9b5a55187c8fc263cc030d53655df/05162/concepts-keywords.png&quot; srcSet=&quot;/static/78f9b5a55187c8fc263cc030d53655df/2eeed/concepts-keywords.png 294w,/static/78f9b5a55187c8fc263cc030d53655df/0d6a1/concepts-keywords.png 588w,/static/78f9b5a55187c8fc263cc030d53655df/05162/concepts-keywords.png 1175w,/static/78f9b5a55187c8fc263cc030d53655df/8537d/concepts-keywords.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;So you’ve read the &lt;a href=&quot;/features/keyword&quot;&gt;description of a keyword recognition model&lt;/a&gt;, but you still have questions. Not a problem! Keyword recognition (sometimes called “keyword spotting”) isn’t quite the household term that “ASR”, “NLU”, and “TTS” are. In this post, we’ll lay out what this type of model does, where it fits into the Spokestack speech processing model, and why you might (or might not) want to use one.&lt;/p&gt;&lt;h2 id=&quot;how-keyword-recognition-works&quot;&gt;&lt;a href=&quot;#how-keyword-recognition-works&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How Keyword Recognition Works&lt;/h2&gt;&lt;p&gt;The Spokestack libraries support keyword recognition via three separate TensorFlow Lite models, much like our &lt;a href=&quot;/features/wake-word&quot;&gt;wake word models&lt;/a&gt;. They’re very lightweight and run &lt;em&gt;entirely on-device&lt;/em&gt;. The first two models preprocess and encode audio for the third, which is trained to detect a number of specific words or phrases chosen ahead of time. In the research literature, each word or phrase would be called a “class”, and the task of choosing one of these from a collection of many is known as &lt;a href=&quot;https://en.wikipedia.org/wiki/Multiclass_classification&quot;&gt;multiclass classification&lt;/a&gt;. 
In our web interface, we call the classes “keywords” to avoid overusing technical jargon. The notion of a “class” is important, though, as we’ll see later on.&lt;/p&gt;&lt;p&gt;All this processing happens in near-real time, with keyword recognition running every time Spokestack’s speech pipeline deactivates, and we’ll talk about what &lt;em&gt;that&lt;/em&gt; means in the next section.&lt;/p&gt;&lt;h2 id=&quot;so-is-it-asr-nlu--a-unicorn&quot;&gt;&lt;a href=&quot;#so-is-it-asr-nlu--a-unicorn&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;So Is It ASR, NLU, … A Unicorn?&lt;/h2&gt;&lt;p&gt;First, the short and sweet answer: in Spokestack’s libraries, keyword recognition is a type of automatic speech recognition (ASR). It can be used with or without a wake word, but using it as your app’s ASR means that your app shouldn’t need NLU at all. Let’s unpack that a bit.&lt;/p&gt;&lt;p&gt;Spokestack transforms speech audio into text via a series of processing steps collectively called the &lt;a href=&quot;/features/speech-pipeline&quot;&gt;speech pipeline&lt;/a&gt;. The speech pipeline includes voice activity detection (VAD), wake word detection, and ASR. 
These stages are mix-and-match: you can use any or all of them (though it only makes sense to run them in the order I’ve listed them here).&lt;/p&gt;&lt;p&gt;When ASR is actively processing audio to transcribe it into text, we say the pipeline is “active”. You get to choose whether that activation happens as a result of:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;the VAD detecting speech (all speech audio will be run through ASR)&lt;/li&gt;&lt;li&gt;a wake word being detected (in which case you should be running VAD to send only speech audio through the wake word detector)&lt;/li&gt;&lt;li&gt;a button press in the app’s UI&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;All the most common configurations are made available via &lt;a href=&quot;/docs/concepts/speech-pipeline#customizing-the-pipeline&quot;&gt;pipeline profiles&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Using a keyword recognition model as ASR means that it will process audio received while the speech pipeline is active, making its decision when the pipeline deactivates. Deactivation happens when speech stops for a preset amount of time (usually a few hundred milliseconds) or when the pipeline has been active for a particularly long time. Both of these durations are &lt;a href=&quot;/docs/machine-learning/pipeline-configuration#runtime-tunable-parameters&quot;&gt;configurable&lt;/a&gt;, but the profiles include sensible defaults that should work in most situations.&lt;/p&gt;&lt;p&gt;The speech pipeline informs your app about activation, deactivation, and speech recognition events asynchronously through an event listener, or &lt;a href=&quot;https://en.wikipedia.org/wiki/Observer_pattern&quot;&gt;observer&lt;/a&gt;. 
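&lt;/p&gt;&lt;p&gt;As a rough illustration of that listener pattern, here is a minimal, self-contained Python sketch. The &lt;code&gt;ToyPipeline&lt;/code&gt; class, its methods, and its event names are hypothetical stand-ins rather than Spokestack’s API, and the order of the simulated events is only illustrative.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;from collections import defaultdict


class ToyPipeline:
    """Hypothetical stand-in for a speech pipeline; not Spokestack's API."""

    def __init__(self):
        # Map each event name to the callbacks registered for it.
        self._listeners = defaultdict(list)

    def on(self, event_name, callback):
        """Register a callback to run whenever event_name is emitted."""
        self._listeners[event_name].append(callback)

    def emit(self, event_name, payload=None):
        """Notify every listener registered for event_name."""
        for callback in self._listeners[event_name]:
            callback(payload)


received = []
pipeline = ToyPipeline()
pipeline.on("activate", lambda _: received.append("listening"))
pipeline.on("deactivate", lambda _: received.append("idle"))
pipeline.on("recognize", lambda transcript: received.append(transcript))

# Simulated events; a real pipeline emits them as it processes audio.
pipeline.emit("activate")
pipeline.emit("deactivate")
pipeline.emit("recognize", "hello")
print(received)&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The app only registers callbacks; the pipeline decides when to call them, which is what keeps the app code independent of audio timing.&lt;/p&gt;&lt;p&gt;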
See the documentation for your chosen platform on how to establish a listener.&lt;/p&gt;&lt;h2 id=&quot;did-you-say-i-dont-need-nlu&quot;&gt;&lt;a href=&quot;#did-you-say-i-dont-need-nlu&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Did You Say I Don’t Need NLU?&lt;/h2&gt;&lt;p&gt;OK, so it’s ASR, but how does a keyword model make NLU unnecessary? Recall our discussion about keyword classes above. A single class (or keyword, in the web interface) can have multiple members, just like a class in a schoolroom has multiple students. We call these class members “utterances”, and combining utterances with keywords gives you superpowers. OK, not quite, but it’s close.&lt;/p&gt;&lt;p&gt;When you include multiple utterances in a keyword, the model will detect any of those utterances in the user’s speech, but it will transcribe the utterance using the name of the &lt;em&gt;class&lt;/em&gt;. This means that, for example, you can name a keyword “volume_up” and give it the utterances “volume up”, “turn it up”, “louder”, and so on. If a user says any of those utterances, your app will receive a transcript of &lt;code&gt;volume_up&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;This collapsing of a collection of utterances into a single command is a stripped-down version of what NLU does. You’ve configured the keyword model by hand, so you have a complete list of all the keywords you could possibly see in transcripts. 
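&lt;/p&gt;&lt;p&gt;Because that list is finite and known ahead of time, handling a transcript can be as simple as a dictionary lookup. The following Python sketch assumes your listener receives the keyword name as a plain string; the handler functions, the &lt;code&gt;state&lt;/code&gt; dictionary, and the &lt;code&gt;volume_down&lt;/code&gt; keyword are made up for illustration.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;state = {"volume": 5}


def volume_up():
    state["volume"] = min(state["volume"] + 1, 10)


def volume_down():
    state["volume"] = max(state["volume"] - 1, 0)


# One entry per keyword (class) configured for the model.
HANDLERS = {
    "volume_up": volume_up,
    "volume_down": volume_down,
}


def handle_transcript(transcript):
    """Run the handler for a keyword transcript; return False if none matches."""
    handler = HANDLERS.get(transcript)
    if handler is None:
        return False
    handler()
    return True


# "volume up", "turn it up", and "louder" would all arrive as "volume_up".
handle_transcript("volume_up")
print(state["volume"])&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;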
If the model isn’t confident enough that it heard one of the utterances it was trained with, it will fire a timeout event instead of a recognition event, which is the equivalent of a &lt;a href=&quot;/docs/concepts/nlu#explicit-fallback-intents&quot;&gt;fallback intent&lt;/a&gt; in NLU.&lt;/p&gt;&lt;h2 id=&quot;great-im-using-this-for-all-the-things&quot;&gt;&lt;a href=&quot;#great-im-using-this-for-all-the-things&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Great! I’m Using This for All the Things!&lt;/h2&gt;&lt;p&gt;Not so fast. Keyword models are a powerful tool, but other forms of ASR and NLU do exist for a reason. 
Let’s talk about some of the reasons for and against using keyword recognition ASR.&lt;/p&gt;&lt;h3 id=&quot;pro--network-usage&quot;&gt;&lt;a href=&quot;#pro--network-usage&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Pro ✅: Network Usage&lt;/h3&gt;&lt;p&gt;Keyword recognition runs entirely on the device running your app. No network connection is necessary, saving both time and data, and making the primary functionality of your app’s voice interface available anywhere your user happens to be. 
This is great for, say, a music app used by trail runners out in the woods.&lt;/p&gt;&lt;h3 id=&quot;pro--power-consumption&quot;&gt;&lt;a href=&quot;#pro--power-consumption&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Pro ✅: Power Consumption&lt;/h3&gt;&lt;p&gt;No network requests means less battery drain, and keyword models are more lightweight than on-device ASR models, requiring fewer cycles to do their work.&lt;/p&gt;&lt;h3 id=&quot;pro--privacy&quot;&gt;&lt;a href=&quot;#pro--privacy&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Pro ✅: Privacy&lt;/h3&gt;&lt;p&gt;On-device ASR means your users’ speech never travels to the cloud.&lt;/p&gt;&lt;h3 id=&quot;pro--simplicity&quot;&gt;&lt;a href=&quot;#pro--simplicity&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; 
focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Pro ✅: Simplicity&lt;/h3&gt;&lt;p&gt;If you have an app that is naturally controllable via a limited set of commands, keyword models make your life easier. You don’t need to use Spokestack’s NLU module at all, which saves you writing training data for said model, downloading its files, and accounting for it in your app model.&lt;/p&gt;&lt;h3 id=&quot;con--limited-vocabulary&quot;&gt;&lt;a href=&quot;#con--limited-vocabulary&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Con ❌: Limited Vocabulary&lt;/h3&gt;&lt;p&gt;The last point has a flipside. The fact that a keyword model’s vocabulary is limited means that your app won’t be good at handling unexpected requests — in fact, it’s likely to give you a timeout event if the user says something you didn’t anticipate, leaving you with no idea what the user said. 
This makes it more difficult to log user requests and use that information to improve your model.&lt;/p&gt;&lt;p&gt;In other words, if you want to support requests made in full sentences or long phrases and allow your users to come up with novel ways to word their requests, a keyword model isn’t for you.&lt;/p&gt;&lt;h3 id=&quot;con--no-nlu&quot;&gt;&lt;a href=&quot;#con--no-nlu&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Con ❌: No NLU&lt;/h3&gt;&lt;p&gt;Again with the flipsides. NLU isn’t &lt;em&gt;just&lt;/em&gt; for collapsing similarly worded requests into the same command to make them easy to process. The command part is the &lt;code&gt;intent&lt;/code&gt; in an NLU result, but there are also &lt;code&gt;slots&lt;/code&gt;. You need slots if your users’ request can be parameterized — if you think about a spoken command like it’s a method/function call in your app, the &lt;code&gt;intent&lt;/code&gt; is the function’s name, and the &lt;code&gt;slots&lt;/code&gt; are its arguments.&lt;/p&gt;&lt;p&gt;You can’t reasonably support a command like “Order me a large pizza with pepperoni and sausage” with a keyword model, because you’d need a different keyword for &lt;em&gt;each&lt;/em&gt; size and topping combination possible. Talk about a combinatorial explosion. 
That sort of command is exactly where NLU models excel.&lt;/p&gt;&lt;h3 id=&quot;con--data-hungry&quot;&gt;&lt;a href=&quot;#con--data-hungry&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Con ❌: Data-Hungry&lt;/h3&gt;&lt;p&gt;Keyword recognition models have to be trained. For a small project with just a few users, a &lt;a href=&quot;/blog/what-are-personal-ai-models&quot;&gt;personal keyword model&lt;/a&gt; should be fine; you can just ask your users to record data samples and use that as your training data.&lt;/p&gt;&lt;p&gt;For an app you want to distribute widely, you’ll want to collect data for each of your utterances from a variety of voices to ensure it works well for as many of your users as possible. 
&lt;a href=&quot;/pricing#pro&quot;&gt;We can help with data collection&lt;/a&gt;, but there is a cost involved, and it gets larger the more samples you need to collect.&lt;/p&gt;&lt;h2 id=&quot;summary&quot;&gt;&lt;a href=&quot;#summary&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Summary&lt;/h2&gt;&lt;p&gt;Here’s what we’ve covered:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Keyword recognition is a type of ASR, taking the place of platform-provided or cloud ASR in the speech pipeline.&lt;/li&gt;&lt;li&gt;You decide the vocabulary for a keyword model ahead of time.&lt;/li&gt;&lt;li&gt;When you use keyword recognition-based ASR, you don’t need to train an NLU model.&lt;/li&gt;&lt;li&gt;They’re a great tool, but whether they’re the &lt;em&gt;right&lt;/em&gt; tool is specific to your use case.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;If you still have questions, let us know &lt;a href=&quot;https://forum.spokestack.io/&quot;&gt;in the forum&lt;/a&gt; or via any other link in the Community section below!&lt;/p&gt;</content:encoded></item><item><title><![CDATA[A Swear Jar in 100 Lines of Python]]></title><description><![CDATA[Use Spokestack's keyword recognizer model and Python library to help save up for a rainy day.]]></description><link>https://www.spokestack.io/blog/keyword-recognizer-python-tutorial</link><guid isPermaLink="false">https://www.spokestack.io/blog/keyword-recognizer-python-tutorial</guid><pubDate>Thu, 
10 Jun 2021 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/b1200aeff60fda2979f91bcfccda0fd4/8537d/keyword-recognizer-python-tutorial.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACSUlEQVQoz22QbUtTcRjG9zVCQtZc55ydh+08budsWg4yCmNIpjjUzJymKb0oJA1MkoSyhqmJlIiQJVjNh7RWQQjlB+h93+UX59iWSS9+/G/+XPd1X/cdUmQR1e1A8/IEtaqixTUipwS6B3Pc+zqNuT2PtDZNodjNhVyKcDiKpqmBVlGUAL/2CcliBKNjCyP/AUWMoKoamqYFosJAH9ffzhP/skDf3iqPX8wxNjVBJp2paipGVUNFltCtM+jWWfzaFwqiTKQ2QtvYJPlyCX13lgcHO+wfHFD6uMd4YZDm+gbEWAztT8q/horKydoawuETqIpCJCrRn0+zsTLM852fzHz7Rf3nl6jrD6m5cYWaBhtXjOEaBvJ/EyoKglCHHKtD1xOcFhSGerKsFPMUn95k9Psy6voj6jdnyezOY410IUWixI7c7ygh27ZJJpM4yRSp1CG65aDUiTTeHqC5vExmaxbn/ROSGzPYw13YahzTsojH48GJKgQJHcfBsiwsywzeYICbwonrtIzf4s6nN3SuL5AtL5HcLOIMdWJrCdy0h+d5QYDD/kNCpmliGAa6rgckEgl0y8QWZKTRXlpX5xhpK9D+bhFt+xn6UCeGJGPadjDcN3Rdl0qwUMWkgr9GwtBRoxLnRnu4ur/E+bUi3u4CLT8WabrbiybI6KZR1fuBfPPA0P84ji8UBZn2riYmSve5Xn5N/+4rpkqTtF67iCQqJPTEP3o/mL9tyI97nJTr4rkuppMk3ZiltTnH5Us5vMYslpOsao73+Tf9DdEgks/ccyNtAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;swear jar&quot; title=&quot;swear jar&quot; src=&quot;/static/b1200aeff60fda2979f91bcfccda0fd4/05162/keyword-recognizer-python-tutorial.png&quot; srcSet=&quot;/static/b1200aeff60fda2979f91bcfccda0fd4/2eeed/keyword-recognizer-python-tutorial.png 294w,/static/b1200aeff60fda2979f91bcfccda0fd4/0d6a1/keyword-recognizer-python-tutorial.png 588w,/static/b1200aeff60fda2979f91bcfccda0fd4/05162/keyword-recognizer-python-tutorial.png 1175w,/static/b1200aeff60fda2979f91bcfccda0fd4/8537d/keyword-recognizer-python-tutorial.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;In this tutorial, we’re going to use Spokestack’s AutoML training tool to create a keyword recognizer model and use it, along with the &lt;a href=&quot;https://github.com/spokestack/spokestack-python&quot;&gt;spokestack-python&lt;/a&gt; library, to make a digital swear jar. The full code is &lt;a href=&quot;https://github.com/spokestack/swear-jar-python&quot;&gt;on GitHub&lt;/a&gt; if you’d like to run it for yourself.&lt;/p&gt;&lt;p&gt;The concept of a swear jar is simple: you have a list of words or short phrases that you’d like to stop saying, or stop &lt;em&gt;someone else&lt;/em&gt; from saying. Every time one of those words does slip out, you drop a coin in a jar. When the jar’s full, you take it to the bank and open a savings account to spend when you’re better behaved.&lt;/p&gt;&lt;p&gt;We should note that if swearing’s not your thing, this concept also lets you reward yourself for saying nice things; call it a “compliment jar”. We wouldn’t want to limit this tutorial to us reprobates.&lt;/p&gt;&lt;p&gt;Anyway, the first thing that might come to mind for this use case is a &lt;a href=&quot;/docs/concepts/wake-word&quot;&gt;wake word detector&lt;/a&gt;: you don’t want your app running ASR all day, continuously streaming a pipe of data to wherever; you just want it to notice when you’ve said certain things. But there’s a wrinkle: if movie ratings have taught us anything, it’s that certain words are just … ickier than others. We’d like to make it more expensive to say &lt;em&gt;&amp;lt;insert epithet here&amp;gt;&lt;/em&gt; than, say, &lt;em&gt;&amp;lt;insert milder epithet here&amp;gt;&lt;/em&gt;. (You didn’t think we were actually going to suggest profanity for your list, did you? This is a family-friendly tech site we’re running here.)&lt;/p&gt;&lt;p&gt;A &lt;a href=&quot;/docs/concepts/keywords&quot;&gt;keyword recognizer&lt;/a&gt; is perfect for this. 
Trained on a small set of utterances, it acts as a wake word recognizer and ASR in one, letting you know which utterance it heard without activating a cloud service in the process—your bad habits stay on your device forever.&lt;/p&gt;&lt;p&gt;Now that we know what we’re doing and why, let’s get to &lt;em&gt;doing&lt;/em&gt; it.&lt;/p&gt;&lt;h2 id=&quot;creating-a-model&quot;&gt;&lt;a href=&quot;#creating-a-model&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Creating a Model&lt;/h2&gt;&lt;p&gt;&lt;em&gt;&lt;strong&gt;Note&lt;/strong&gt;: You’ll need a Spokestack Maker or higher account to follow along with model creation. Running the sample app is totally free, so go ahead and take it for a spin first. Keep in mind that the models provided with the sample code are &lt;a href=&quot;/docs/concepts/keywords#personal-keyword&quot;&gt;personal models&lt;/a&gt;, so &lt;a href=&quot;what-are-personal-ai-models&quot;&gt;performance with your voice may vary&lt;/a&gt;. They were trained with a relatively deep voice.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;If this is your first time using Spokestack, you’ll need &lt;a href=&quot;/create&quot;&gt;an account&lt;/a&gt;. If you have a free account, you’ll need to &lt;a href=&quot;/pricing#maker&quot;&gt;upgrade to the Maker tier&lt;/a&gt;. Once that’s taken care of, head over to the &lt;a href=&quot;/account/keyword&quot;&gt;keyword tool&lt;/a&gt;. 
Click the “New Model” button and name the model whatever you like. We’re going with “swear jar” here, saving our creativity for the actual list of words.&lt;/p&gt;&lt;p&gt;Speaking of, now’s the moment you’ve been waiting for: channel your inner George Carlin and click that “add keyword” button until you’ve accumulated enough to break (or reinforce) your habit.&lt;/p&gt;&lt;p&gt;While adding keywords, you’ll notice that you also have the option to add multiple utterances to each keyword. Keywords and utterances interact like this: the keyword is the text that will be returned to your app when the model recognizes any utterance listed under it. So if one of your swears is, say, “beef”, but you also want to stop saying “beefsteak”, you might group those together by making them separate utterances under the keyword “beef”.&lt;/p&gt;&lt;p&gt;Once everything is typed in, find a nice secluded space where no one will overhear enough to be concerned about your mental state, and prepare to swear at your computer. Or, as we call it, “Tuesday”.&lt;/p&gt;&lt;p&gt;Click the “record” button next to an utterance to start recording samples for it. You’ll need at least three samples of each utterance to train a model, but there’s no upper limit. More samples only help performance, so really get in touch with your inner thespian. Don’t go &lt;em&gt;too&lt;/em&gt; far, though; keep in mind that you want your training samples to still sound like you, so use whatever emotional affect you want your app to recognize as your voice.&lt;/p&gt;&lt;p&gt;When you’ve recorded all your samples, click “train”. 
Model training takes a few minutes, so take this opportunity to do some deep breathing to wrap up your day’s therapy session.&lt;/p&gt;&lt;p&gt;Don’t forget to download your model once it’s done training; we’re about to need it.&lt;/p&gt;&lt;h2 id=&quot;writing-the-app&quot;&gt;&lt;a href=&quot;#writing-the-app&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Writing the App&lt;/h2&gt;&lt;p&gt;Our final product will be a simple Python command-line app that you can keep running in the background on any computer with a microphone attached. We’ll be using &lt;a href=&quot;https://github.com/spokestack/spokestack-python&quot;&gt;spokestack-python&lt;/a&gt; to do all the signal processing. You can make a full-featured voice app with the library, including NLU and TTS for responses, but we only need the &lt;a href=&quot;/docs/python/speech-pipeline&quot;&gt;speech pipeline&lt;/a&gt; to demonstrate the usefulness of a keyword model.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;agc&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;webrtc &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; AutomaticGainControl
&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;asr&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;keyword&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tflite &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; KeywordRecognizer
&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;io&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pyaudio &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; PyAudioInput
&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;nsx&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;webrtc &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; AutomaticNoiseSuppression
&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pipeline &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; SpeechPipeline
&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;vad&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;webrtc &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; VoiceActivityDetector&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; VoiceActivityTrigger

FRAME_WIDTH &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;10&lt;/span&gt;
SAMPLE_RATE &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;16000&lt;/span&gt;
MODEL_DIR &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;models&amp;quot;&lt;/span&gt;
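
# Hypothetical stand-in for WORDS, which the original listing does not define.
# It maps each keyword class to its cost in cents. The names and amounts here
# are placeholders chosen for illustration; the real table lives in the sample
# repository. Keys must appear in the same order as the classes in the model's
# metadata.json, and each value also names the sound file played by the
# handler further down (for example audio/25.wav).
WORDS = {
    "beef": 25,
    "crud": 10,
}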

pipeline &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; SpeechPipeline&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
    input_source&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;PyAudioInput&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
        frame_width&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;FRAME_WIDTH&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; sample_rate&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;SAMPLE_RATE
    &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    stages&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;
        AutomaticGainControl&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;sample_rate&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;SAMPLE_RATE&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; frame_width&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;FRAME_WIDTH&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
        AutomaticNoiseSuppression&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;sample_rate&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;SAMPLE_RATE&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
        VoiceActivityDetector&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
            frame_width&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;FRAME_WIDTH&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
            sample_rate&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;SAMPLE_RATE&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
            vad_fall_delay&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;500&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
        VoiceActivityTrigger&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
        KeywordRecognizer&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;WORDS&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;keys&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; model_dir&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;MODEL_DIR&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;token comment&quot;&gt;# we&amp;#x27;ll talk about this function in just a bit&lt;/span&gt;
pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;context&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;add_handler&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;recognize&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; swear_heard&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;OK, I&amp;#x27;ll be listening. No funny business.&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;Press Ctrl-c to exit.\n&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;run&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Believe it or not, this is the bulk of the code you’ll need for this app. We’re setting up the speech pipeline with the help of some hard-coded default values. &lt;code&gt;WORDS&lt;/code&gt; maps the keyword classes we just created to their values in cents so we can report how much we owe later on; see &lt;a href=&quot;https://github.com/spokestack/swear-jar-python&quot;&gt;the GitHub repository&lt;/a&gt; for its full definition.&lt;/p&gt;&lt;p&gt;In most cases, you’ll want to read keyword classes from the &lt;code&gt;metadata.json&lt;/code&gt; file distributed alongside the model, but for the sake of demonstration, it’s easier to put them in a static dictionary. The key gotcha is that if you hard-code them, you still have to list the keywords in the same order as they appear in &lt;code&gt;metadata.json&lt;/code&gt; because the model will reference them by index.&lt;/p&gt;&lt;p&gt;If you do want to load from the metadata file, this is how you’d do it in Python:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; os
&lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; json

&lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;load_keyword_classes&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;model_dir&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    metadata &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; os&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;path&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;join&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;model_dir&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;metadata.json&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
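    &lt;span class=&quot;token comment&quot;&gt;# json.load expects an open file object, not a path&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;metadata&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;as&lt;/span&gt; metadata_file&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        config &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; json&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;load&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;metadata_file&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;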
    &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;clazz&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;name&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; clazz &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt; config&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;classes&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The other piece of our puzzle is the app’s reaction when it hears a word on our list. For that we’ll use the pipeline event we registered above:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;io&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pyaudio &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; PyAudioOutput
&lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; os
&lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; wave

AUDIO_DIR &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;audio&amp;quot;&lt;/span&gt;
OUTPUT &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; PyAudioOutput&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;num_channels&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; sample_rate&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;44100&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;swear_heard&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;context&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
  value &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; WORDS&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;get&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;context&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;transcript&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; value&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    sound_path &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; os&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;path&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;join&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;AUDIO_DIR&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string-interpolation&quot;&gt;&lt;span class=&quot;token string&quot;&gt;f&amp;quot;&lt;/span&gt;&lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;value&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;.wav&amp;quot;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;with&lt;/span&gt; wave&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;sound_path&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;#x27;rb&amp;#x27;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;as&lt;/span&gt; coin&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
      frame &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; coin&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;readframes&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1024&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
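      &lt;span class=&quot;token comment&quot;&gt;# readframes() returns bytes; keep reading until the file is exhausted&lt;/span&gt;
      &lt;span class=&quot;token keyword&quot;&gt;while&lt;/span&gt; frame&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        OUTPUT&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;write&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;frame&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        frame &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; coin&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;readframes&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1024&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;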
    &lt;span class=&quot;token keyword&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string-interpolation&quot;&gt;&lt;span class=&quot;token string&quot;&gt;f&amp;quot;Uh-oh! I heard &lt;/span&gt;&lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;context&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;transcript&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;. That&amp;#x27;ll be &lt;/span&gt;&lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;value&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt; cents.&amp;quot;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Notice that this function was registered to listen to the &lt;code&gt;recognize&lt;/code&gt; event, hinting at the fact that the keyword recognizer is acting as our ASR here. 
If we were using a wake word detector instead, it would fire the &lt;code&gt;activate&lt;/code&gt; event, and we’d have to wait for &lt;code&gt;recognize&lt;/code&gt; from a separate ASR to get a meaningful value in &lt;code&gt;context.transcript&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;Thanks to some pre-recorded and cleverly named audio files, we’re able to not only print a scolding message to the terminal but also play the sound of a hard-earned coin dropping into the jar.&lt;/p&gt;&lt;h2 id=&quot;go-deeper&quot;&gt;&lt;a href=&quot;#go-deeper&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Go Deeper&lt;/h2&gt;&lt;ul&gt;&lt;li&gt;Keep a running total of the amount deposited throughout the day&lt;/li&gt;&lt;li&gt;Grab a housemate and train using both your voices&lt;/li&gt;&lt;li&gt;???&lt;/li&gt;&lt;/ul&gt;&lt;h2 id=&quot;conclusion&quot;&gt;&lt;a href=&quot;#conclusion&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 
13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;&lt;p&gt;To run our application from the command line, you’ll need to wrap the functions above in standard &lt;code&gt;if __name__ == &amp;quot;__main__&amp;quot;&lt;/code&gt; Python boilerplate (again, see &lt;a href=&quot;https://github.com/spokestack/swear-jar-python&quot;&gt;the sample code&lt;/a&gt; for a working setup).&lt;/p&gt;&lt;p&gt;Other than that, though, the keyword model and the code above are almost all you need to make a working swear jar using Spokestack.&lt;/p&gt;&lt;p&gt;Have fun, and happy (?) swearing! Get in touch (contact info of all types is listed below) to let us know if you enjoyed this tutorial or if there’s anything else you’d like to see us talk about.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[What Are Personal AI Models?]]></title><description><![CDATA[Spokestack Maker introduces sophisticated AutoML tools for creating personal AI models. What are they, and how should they be used?]]></description><link>https://www.spokestack.io/blog/what-are-personal-ai-models</link><guid isPermaLink="false">https://www.spokestack.io/blog/what-are-personal-ai-models</guid><pubDate>Thu, 10 Jun 2021 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/a0f9288bd857b23ef6fc3a86995255fc/8537d/what-are-personal-ai-models.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACYUlEQVQoz32Ra0uTYRjHn69hzqlpz+ZzmmsHd2prhLggiMDIeqUdpBALe6NUaI4gqK/hi8pDEKRz6Meo3iZrOcHm2vnwzF/czxiUL7rgx3Xf3Pf1539dl6TrOpqmIbJAUVTCQR8LDyct5mau8WDmBtdnp0lM32F26hZLU7cJeb0oqoquaxi6jqHpaLqO1BUSooZhMKKoRAIuko9jvJy/zKsnMebuxxi9O4l27yaPEld4HY8Tcblwqipu1yh9g/2cG+zD0A2krjtVVfD7Aywl11h584mF5Q3mn73n6coWi8ktXiQ3eL66zuziBvOrWyy//cji6hqBQIhz9l56Bmwdwa5Dp9NJOBwmn8/TjUKlShMwwcqC3EmT43yZar1BJnto1cjDMrqioapqx6HL5bJajkajZDI/aJltyrU6R4UTCtUKv2tVivUaxUado195vh9kyB4e8uXrNyLhCLb+XuxDfeiajuTz+RgbG8Pr9TI+Pk4ud8QpUGu0qFZrNBoN2mab09NT2iZU6ia1pmn9yfzMEb0UpW/Ahn3IjmG4kPx+P0LU7XYTj8fJ5XJWu+VqjVK5TKvVxDRNi2KxyEH2hEKpRr3V5iCTZWJiAlv/ID39Q/h9PiSPx4NAbDgWi1EqlfhfFCqdmYo4zhdIJK4yfGEYWT6Pz+dHEs4EQjAUCrG5ucn+/j7bOyk+p3bZTqXZtvIuO7tp0nudeyq9x7sP652lOGQMY8QamyQW0kWIOhwOZFm2sti8oihoqmohzk7nCE6HA4dDtt5FnXtUmLqI1+tBCgaD/I1w+Q/hsOWiy9n3s7V/AG1MCkLdBtKBAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;What are Personal AI Models?&quot; title=&quot;What are Personal AI Models?&quot; src=&quot;/static/a0f9288bd857b23ef6fc3a86995255fc/05162/what-are-personal-ai-models.png&quot; srcSet=&quot;/static/a0f9288bd857b23ef6fc3a86995255fc/2eeed/what-are-personal-ai-models.png 294w,/static/a0f9288bd857b23ef6fc3a86995255fc/0d6a1/what-are-personal-ai-models.png 588w,/static/a0f9288bd857b23ef6fc3a86995255fc/05162/what-are-personal-ai-models.png 1175w,/static/a0f9288bd857b23ef6fc3a86995255fc/8537d/what-are-personal-ai-models.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;The &lt;a href=&quot;/pricing#maker&quot;&gt;Spokestack Maker&lt;/a&gt; service is designed for two audiences: (1) hobbyists who want to personalize their projects, and (2) developers who want to prototype a project as realistically as possible before committing to training a universal wake word model or studio-quality text-to-speech (TTS) voice. This is a great way to test drive the technology without breaking the bank.&lt;/p&gt;&lt;h2 id=&quot;what-are-the-differences-between-personal-and-universal-voice-ai-models&quot;&gt;&lt;a href=&quot;#what-are-the-differences-between-personal-and-universal-voice-ai-models&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;What are the Differences Between “Personal” and “Universal” Voice AI Models?&lt;/h2&gt;&lt;p&gt;Like any neural models, the personal models created with these tools are only as good as the data they’re trained on. In other words, a wake word created in Maker will not respond to everyone else in the world as well as it does to you if it’s shipped in a mobile app — it will only work reliably for voices that resemble the one(s) it was trained with.&lt;/p&gt;&lt;p&gt;For a home automation project, though, this specificity might be a feature — have you ever wished that your smart speaker didn’t listen to the kids quite so well? 
(Of course, if you &lt;em&gt;do&lt;/em&gt; want your DIY smart speaker to listen to everyone, you can call the kids over to the computer and have them help you train it!)&lt;/p&gt;&lt;p&gt;Similarly, with Maker personal TTS you can train a clone of your voice in just 75 sentences. However, its pronunciation and similarity to the human it’s mimicking aren’t going to be quite as good as those of a voice trained on dozens of hours of audio recorded in a studio. It’ll still be plenty recognizable (check out the voices on our showcase along with their training criteria), but high-quality samples, and lots of them, always produce better models than low-resource data. If you happen to have good recording equipment and a quiet environment, we do have thousands of scripts queued up in the training tool, so go ahead and record as much as you have patience for. Our automatic model trainer will adjust to make the best use of the data you provide it.&lt;/p&gt;&lt;p&gt;With Spokestack Maker, you’ll still be able to use the NLU model trainer and TTS showcase, but you’ll also get access to three new tools for training your own &lt;a href=&quot;/docs/concepts/wake-word&quot;&gt;wake word&lt;/a&gt;, &lt;a href=&quot;/docs/concepts/keywords&quot;&gt;keyword&lt;/a&gt;, and &lt;a href=&quot;/docs/concepts/tts&quot;&gt;TTS&lt;/a&gt; models using data you record yourself.&lt;/p&gt;&lt;p&gt;You’ll be able to download your custom models and make unlimited requests to your custom TTS voice as long as your Maker subscription is active. 
If you decide you’d like to take it to the next level and make a universal wake word/keyword model or a studio-quality TTS voice, &lt;a href=&quot;mailto:hello@spokestack.io?subject=I%20want%20to%20know%20more%20about%20Spokestack&amp;#x27;s%20Universal%20voice%20models&quot;&gt;let us know&lt;/a&gt;!&lt;/p&gt;</content:encoded></item><item><title><![CDATA[How to Change Alexa's Voice]]></title><description><![CDATA[Use Spokestack's TTS service to avoid your skill sounding like all the others.]]></description><link>https://www.spokestack.io/blog/how-to-change-alexas-voice</link><guid isPermaLink="false">https://www.spokestack.io/blog/how-to-change-alexas-voice</guid><pubDate>Mon, 07 Jun 2021 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/8e6ff22e235df8fc7b0ef4fc57b996a7/8537d/how-to-change-alexas-voice.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACFklEQVQoz4WQTWsTURSGB9z6B9xYF62dzzuZTE1TxsaY7yYxbVrazlitUlPBulCklairghtFKCiCqFBBFARXVhQFpWipURciuBbBP+DK7SP3xtRaii5ezpnz8dx3juY4Dv+Tbds4jsA0DUxDx7EtJdsyEMJBCLExp7WHtwNsgRk6+xIDVMdnqESnKYzNkkiV6e7uoae3F+v3zrbArbJMg0R/QHSySXTqAg8fr/D0xStern7g/oNHhPURAt9HNww027JwB5LEKxVEMIih65imiWEYShIof7U0epzasbOMnTjH6zfrfPr8hfVWi7X3H7l9eZEonVZONbmUSqUIw5CJyUkVoyhSkrnrugg3zvDMJZp3V7m4/I6FO29pLrc4c/05faUpyvUjjBwqY9kOmnSQz+eZm5uj0WhQLBZJp9Nks1kymQyuo7Nn105mF+9xc+0nV1e+ce3Zd648+cqN1R/kji4gkgV0vUfdWovH4wRBQK1Wo1qtIr9jsZhyJnOztwtz9w6y9WmmmreI5pc4PL9EOL9Ebvo8iWJE/0AKL+bieV4bKJfl3aQkTDZkXfWEw97uLnXLRG6C5FCIl64gUkMkC6McLNTo8+P4vq/mNQmTkI7UzYRQ0fMkvD0sc9/vYzBTplQbp1gZJth/AFeIjceVQ7nYkQR11KnJoY5j+aBwbGKuUNG2rb/mlMPN7v6lDliqXfuTb+79AqQ0hfGAe6LZAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;How to Change Alexa&amp;#x27;s Voice&quot; title=&quot;How to Change Alexa&amp;#x27;s Voice&quot; src=&quot;/static/8e6ff22e235df8fc7b0ef4fc57b996a7/05162/how-to-change-alexas-voice.png&quot; srcSet=&quot;/static/8e6ff22e235df8fc7b0ef4fc57b996a7/2eeed/how-to-change-alexas-voice.png 294w,/static/8e6ff22e235df8fc7b0ef4fc57b996a7/0d6a1/how-to-change-alexas-voice.png 588w,/static/8e6ff22e235df8fc7b0ef4fc57b996a7/05162/how-to-change-alexas-voice.png 1175w,/static/8e6ff22e235df8fc7b0ef4fc57b996a7/8537d/how-to-change-alexas-voice.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;At Spokestack, one of our core goals is to make it possible for every mobile app or voice skill to have its own unique audible voice. Amazon and Google make it easy to use their text-to-speech services, but then your skill (/action) ends up sounding like all the rest. If you’ve ever wondered how to change Alexa’s voice, we have an &lt;a href=&quot;https://voicetechpodcast.com/articles/development/how-to-change-alexas-voice/&quot;&gt;article on Voice Tech Podcast&lt;/a&gt; just for you.&lt;/p&gt;&lt;p&gt;It’ll give you some background on what’s required to actually replace the system voice and walk you step by step through the use of &lt;a href=&quot;https://github.com/spokestack/alexa-custom-tts&quot;&gt;a low-code example in Python&lt;/a&gt; to set up a skill of your own. You can use a free Spokestack account to try everything out, and if you agree that your skill should stand out from the crowd, you can upgrade to &lt;a href=&quot;/pricing#maker&quot;&gt;the Maker tier&lt;/a&gt; and create a completely new voice that only you can use.&lt;/p&gt;&lt;p&gt;The whole process is a very low-effort way to see the benefits of a Spokestack account in action, so head over to &lt;a href=&quot;https://voicetechpodcast.com/articles/development/how-to-change-alexas-voice/&quot;&gt;Voice Tech Podcast&lt;/a&gt; for the whole article!&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Introducing Spokestack Maker - AutoML Personal Voice for Creators]]></title><description><![CDATA[Spokestack Maker introduces sophisticated AutoML tools for voice interfaces to a new audience of prototypers, enthusiasts, and makers.]]></description><link>https://www.spokestack.io/blog/introducing-spokestack-maker</link><guid isPermaLink="false">https://www.spokestack.io/blog/introducing-spokestack-maker</guid><pubDate>Wed, 26 May 2021 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; 
style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/a815377a9187c0af894dd34947f7a1bb/8537d/introducing-spokestack-maker.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACkklEQVQoz12SaUtUYRTH79dwm9yYcZy7OM29M6OjY1JumRmhSPQiyqAoS3pRhJKVBVEfoTdFBGYoQYtF4gv7CFG2qGXZ4kjOTM5y7yz6i+eiZB34cw7P85wfZ3kkRVGQZRnhFVnG45ExfCrdHQEGThzkQn8vx48eoOPoYVqPHKKvp5veQBBD0/DIsp2rKgqqrCArCpIN2oSqqoq72kN9UGNkYBdXzuzi+tkwp/sa0Y51I/f1cLJtDzd37ybs9VKtyHi1GkrKd1BYVoKqqH+BHk81fn+QiyP3uXzrMeeGJzg9OMbApXEuXHvE0Mg4Q1fHOTU4Rv/wKMM37nH+0m2CwRCFjiIKSov/BVZVVREKhYhGo2yZZcZhPS0iNjayQJaslSaZTLGyssLsu/eE6kM4K50o1WJcHiTRqqZpdsvhcJivS9/I5tZJmRapZJxc1rSVz1nkrAQZM8FqNMr8widev3lLQ0OYoh2FOCpKUGQFyTAMAoEAuq7T3NxMJBKxq0tncvxOmFiZPGYmx4+fyywufiKyskp8LUXs9xpzC5+pr2+gpLQYR4UDVdWQ/H4/Aur1emlqamJ5edkGpswMViaHZWVIplKk0yammeZXzGIpkiWXz9vv7t65Q0Gxg8LSSvyGgeTz+RASG25sbCSRSPC/pdMpYrEY8Xjcvk+srZHfBM7MvMJR6sDlqsAw/EiiMiEBrKurY2JigunpaSafv+Dl1BSPnzzh6bNJPs4v8PrNLB/m5vm69IUvS99JmhlGHzykrLwMTXOj6waSWMiWBNTlcuF0Om0vNu92u9ENg/1dXezr7KS1rZ2WljZaWlppb99rd1Wj1eDd6UPXfUi1tbVsl6hyu8RXEudbo9E3/ZbE/Lfn/gFQoxCzAlAyIwAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Introducing Spokestack Maker - AutoML Personal Voice for Creators&quot; title=&quot;Introducing Spokestack Maker - AutoML Personal Voice for Creators&quot; src=&quot;/static/a815377a9187c0af894dd34947f7a1bb/05162/introducing-spokestack-maker.png&quot; srcSet=&quot;/static/a815377a9187c0af894dd34947f7a1bb/2eeed/introducing-spokestack-maker.png 294w,/static/a815377a9187c0af894dd34947f7a1bb/0d6a1/introducing-spokestack-maker.png 588w,/static/a815377a9187c0af894dd34947f7a1bb/05162/introducing-spokestack-maker.png 1175w,/static/a815377a9187c0af894dd34947f7a1bb/8537d/introducing-spokestack-maker.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Spokestack exists to help put voice into software because voice is a humane way to interface with software and because software needs better, more accessible tools to utilize voice.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;/pricing#maker&quot;&gt;Spokestack Maker&lt;/a&gt; introduces sophisticated machine learning tools for voice to a new audience of prototypers, enthusiasts, and makers. Create, train, distribute, and use state-of-the-art personal wake word, keyword, and text-to-speech (TTS) models, all with &lt;em&gt;your&lt;/em&gt; own voice.&lt;/p&gt;&lt;p&gt;Now all developers, not just voice assistant experts or machine learning specialists, can create software featuring voice. Spokestack Maker turns voice technology into just another interface developers can utilize, like a mouse &amp;amp; keyboard or a touchscreen, to interact with users.&lt;/p&gt;&lt;p&gt;Lowering the barriers to voice technology means reducing technical cost in addition to the sticker shock that custom machine learning models often carry. Voice AI is a difficult field, full of papers with irreproducible results, hidden pitfalls, and undocumented code. You can end up spending all day in Jupyter notebooks babysitting training jobs instead of building your killer app. We know; we’ve done it before! 
&lt;a href=&quot;/pricing#maker&quot;&gt;Spokestack Maker&lt;/a&gt; turns voice into just another interface developers utilize—like a mouse &amp;amp; keyboard or a touchscreen—to interact with users.&lt;/p&gt;&lt;h3 id=&quot;spokestack-is-open-source&quot;&gt;&lt;a href=&quot;#spokestack-is-open-source&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Spokestack is open-source&lt;/h3&gt;&lt;p&gt;Spokestack already offers powerful, industry-leading libraries for utilizing voice in &lt;a href=&quot;/docs/python/getting-started&quot;&gt;Python&lt;/a&gt;, &lt;a href=&quot;/docs/react-native/getting-started&quot;&gt;React Native&lt;/a&gt;, &lt;a href=&quot;/docs/android/getting-started&quot;&gt;Android&lt;/a&gt;, &lt;a href=&quot;/docs/node/getting-started&quot;&gt;Node&lt;/a&gt;, and &lt;a href=&quot;/docs/ios/getting-started&quot;&gt;iOS&lt;/a&gt;. With &lt;a href=&quot;/pricing#maker&quot;&gt;Spokestack Maker&lt;/a&gt;, that voice becomes personal and customizable, utilizing state of the art self-service AutoML technology.&lt;/p&gt;&lt;p&gt;With &lt;a href=&quot;/pricing#maker&quot;&gt;Spokestack Maker&lt;/a&gt;, you have the power of three new tools for training your own &lt;a href=&quot;/features/wake-word&quot;&gt;wake word&lt;/a&gt;, &lt;a href=&quot;/features/keyword&quot;&gt;keyword&lt;/a&gt;, and &lt;a href=&quot;/features/tts&quot;&gt;TTS&lt;/a&gt; models using data you record yourself. 
You’ll still have access to the same &lt;a href=&quot;/features/nlu&quot;&gt;NLU&lt;/a&gt; model trainer and TTS showcase available to all existing free accounts.&lt;/p&gt;&lt;h3 id=&quot;live-maker-q--a-with-the-spokestack-team&quot;&gt;&lt;a href=&quot;#live-maker-q--a-with-the-spokestack-team&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Live Maker Q &amp;amp; A with the Spokestack team&lt;/h3&gt;&lt;p&gt;Have more questions, or want a live demo of Maker? We’re &lt;a href=&quot;https://spokestack.ck.page/62c51983c6&quot;&gt;handing out swag and answering questions&lt;/a&gt; Wednesday June 2 at 1pm EDT. The whole Spokestack team will be on to talk about why we created Maker, some ideas of how you can use the tools in your software, and answer any questions from the chat! Can’t make it then? 
The live event will be archived &lt;a href=&quot;https://www.youtube.com/channel/UCn1kViAiPO-XzCfREvGI_AA&quot;&gt;on our YouTube channel&lt;/a&gt; for on-demand viewing.&lt;/p&gt;&lt;h3 id=&quot;features-and-pricing&quot;&gt;&lt;a href=&quot;#features-and-pricing&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Features and Pricing&lt;/h3&gt;&lt;p&gt;Want to test drive cutting-edge machine learning features like keyword recognition, wake word activation, and custom AI voices without breaking the bank? &lt;a href=&quot;/pricing#maker&quot;&gt;Spokestack Maker&lt;/a&gt; brings this enterprise-level technology to the creator market for the first time.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;/pricing#maker&quot;&gt;Spokestack’s Maker subscription pricing&lt;/a&gt; is tailored for two audiences: hobbyists who want to personalize their projects and developers who want to prototype a project as realistically as possible before committing to training a universal wake word model or studio-quality TTS voice.&lt;/p&gt;&lt;p&gt;You’ll be able to download, train, and retrain your personal models and enjoy API access to your personal TTS voice as long as your Maker subscription is active. 
If you decide you’d like to take it to the next level and make a universal wake word/keyword model or a studio-quality TTS voice, &lt;a href=&quot;mailto:hello@spokestack.io?subject=Join%20Pro%20Waitlist&quot;&gt;join the Spokestack Pro waitlist!&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Why Haven't We Seen a Killer Voice App?]]></title><description><![CDATA["Voice app" is a convenient shorthand for "Alexa Skill" or "Google Action", but does it do more harm than good?]]></description><link>https://www.spokestack.io/blog/killer-voice-app</link><guid isPermaLink="false">https://www.spokestack.io/blog/killer-voice-app</guid><pubDate>Mon, 10 May 2021 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:840px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/b17be953eda2ea1a60f2cc5e13df36ad/e2bc6/killer-voice-app.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:71.42857142857143%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAOCAYAAAAvxDzwAAAACXBIWXMAAAsTAAALEwEAmpwYAAADE0lEQVQ4y22TW2hcVRSGz7MWLw99EKuYOmeSmXOd+0wm0ybaXJgZo01ia2KTICEkiiijIhH6oCgSKcbRlyqVPoiZOeNckmkgBW29FVpQfFA7LdTYK0LxxYr6qJ+sXY6CeODfa++z//Wvf529j2aaFpZ1E45tkU7apJMW6ZRFMm6r96ZpKvg8IxwmGNQJBO6np6cbyzL/4WmyMC2HnuBOenOjuLsr5IbfIPngu8T738J1whjmvwkilkhlKBRHmdi3n/4H9mBJQcNQHC1s2BjB+xidmufV1hVebvzGQe8XDlZv8PrGHzz29CsEA13Ytk04FOSZN9d4rfUDhz/7maOnf2Xl+DVePHKKSCKNEQ6hpeI2Hx/dz6cnjvDSe1+wuHKJ+UM/Mn/oEs8d3uKhx58i3KNzb1eA7fdYPL/S4P2Tl3m7/T3ltW9551iHypdXeLb0AroeQOvPWHx1vMSp5iz92RFu3dnhDv0ct3RdZIdRw7G6cWybu3Z0se22O3niwKNstNeprn5A3atQq3xIq+7xUa1GLBZD6w4ZDO92GdoVwo7t5fZgh+3mebbpF7nbbBKPu1i2rQ5gYuwR6o0mnlej1Vpjfb1No9mk6nm0221yuRxaKpXEjSQxLYs9Q5MM7tsi9/AFdu29xsjEBtGog9wEXdeZnp6mKQLVKp7nUavVVKzX66yurhKNRtGSySTZ3gyu67CwsMDN5081nu18RzqdVq0IWeblcplWq6WEfGxublIqlQiFQmiJRIJUKqWOfHHxSa7e+ItPtn7n66tw5psOfdle4vE4UjgSiTA4OMjy8rJy2Wg0qFQqLC0tKQ3R0oQkDgzD4MDUFJev/0T5zBbHTl/n85MniMXiyp1wRNhxHDUvFAqMj4+rAiKUzWYVlKDrugriYmBggMLwEMX8iPrIsi+CPnxh6UjuphTw9yRf88XkL8jn88zOzpIvFCkWi2ouBSTRT5ICY2NjzMzMMDk5ydzcnHLoi2syyMKHJPgQJ/8HX1iM+FF0JGqZTEb13tfXp07RLyBR8N+W/RYlzxcSiEvh/g2+bzU26D7w3gAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Why Haven&amp;#x27;t We Seen a Killer Voice App?&quot; title=&quot;Why Haven&amp;#x27;t We Seen a Killer Voice App?&quot; src=&quot;/static/b17be953eda2ea1a60f2cc5e13df36ad/e2bc6/killer-voice-app.png&quot; srcSet=&quot;/static/b17be953eda2ea1a60f2cc5e13df36ad/2eeed/killer-voice-app.png 294w,/static/b17be953eda2ea1a60f2cc5e13df36ad/0d6a1/killer-voice-app.png 588w,/static/b17be953eda2ea1a60f2cc5e13df36ad/e2bc6/killer-voice-app.png 840w&quot; sizes=&quot;(max-width: 840px) 100vw, 840px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;In the interest of fostering discussion about the state of voice tech, we’re thinking of starting a new regular feature. It’s tentatively titled “Ask the Cranky Computational Linguist” — which, of course, would be me. The title’s not necessarily official, but I do get cranky from time to time, so it fits.&lt;/p&gt;&lt;p&gt;Our first topic is the question you clicked on to get here, but the answer is part of &lt;a href=&quot;https://spokestack.substack.com/p/killer-voice-app&quot;&gt;our Substack publication&lt;/a&gt;. Posting there lets us send new issues directly to your inbox and interact in the comments section, but we don’t want to leave our web site readers behind, so we’ll add new issues as summary blog posts just like this one.&lt;/p&gt;&lt;p&gt;If you’re interested in discussions like this, let us know by liking/subscribing on Substack, and feel free to submit new topics there or &lt;a href=&quot;mailto:hello@spokestack.io?subject=Dear%20Cranky%20Computational%20Linguist&quot;&gt;via email&lt;/a&gt;. Thanks for reading!&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Converting a TensorFlow Model to TensorFlow.js in Python]]></title><description><![CDATA[Google provides a command line utility for converting TensorFlow models into TensorFlow.js format for running in a browser, but what if you want to do that conversion in code?]]></description><link>https://www.spokestack.io/blog/tfjs-conversion-in-python</link><guid isPermaLink="false">https://www.spokestack.io/blog/tfjs-conversion-in-python</guid><pubDate>Mon, 12 Apr 2021 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/5ae85c361a7a6cbf14ef980ab0f88a40/8537d/tfjs-conversion-in-python.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAABZElEQVQoz3WSzU6DUBCFh9oYI1FbFYMYRZogJED5b4WmhsTHcGPizqU7X8N3cOFTHjMTLrm2uDi5w7kz38yQS0mSgBXHsUiPde8/fzeHxmDL5RJZliJN0z0Ie1mWSc4YdADqsCAIYNvXcN37obOS67pw7CuEQbAHHYA6zPd9GJMJJkQgIliWJXdRFElM4h+DyJBcHSpAfR1eczY/xwERvl8svLamANrmEV3XSWynz3j7/IB/O4V5MkeeZ3+BeZ6jrmtR27a4cRwpfN+cor6b9sAG26etxHER4udrA+fCwNnsUoA8vRqK1us1VquVADnmBqZp9qsRFgtP7sqyhOd5MHr/8MhEWRRyVxTFIKqqavhgGBfyGYYhkiQG37PHJxdHUQzff0BVlbIRq2maYUtZmcVPgaWehYKoRkoM4kKOOVf9Mt6S80mBdKCSmpgT1aTcYLeGxb4AdQA/ASXlcbK+xVieDv0FyxpKWID04MgAAAAASUVORK5CYII=&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Converting a TensorFlow Model to TensorFlow.js in Python&quot; title=&quot;Converting a TensorFlow Model to TensorFlow.js in Python&quot; src=&quot;/static/5ae85c361a7a6cbf14ef980ab0f88a40/05162/tfjs-conversion-in-python.png&quot; srcSet=&quot;/static/5ae85c361a7a6cbf14ef980ab0f88a40/2eeed/tfjs-conversion-in-python.png 294w,/static/5ae85c361a7a6cbf14ef980ab0f88a40/0d6a1/tfjs-conversion-in-python.png 588w,/static/5ae85c361a7a6cbf14ef980ab0f88a40/05162/tfjs-conversion-in-python.png 1175w,/static/5ae85c361a7a6cbf14ef980ab0f88a40/8537d/tfjs-conversion-in-python.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;At Spokestack, we use several different types of TensorFlow models, and we have various deployment targets for them: big models for the cloud, small TensorFlow Lite models for mobile devices, and even TensorFlow.js models for browsers.&lt;/p&gt;&lt;p&gt;In cases where a single model architecture is trained and deployed to more than one of these targets, it’s nice to be able to do the training, any necessary conversion, and deployment all from one place. It’s even better if this can be managed from a single Python module.&lt;/p&gt;&lt;p&gt;This is easy enough for formats like &lt;a href=&quot;https://www.tensorflow.org/guide/saved_model&quot;&gt;SavedModel&lt;/a&gt; and &lt;a href=&quot;https://www.tensorflow.org/lite/convert&quot;&gt;TensorFlow Lite&lt;/a&gt;, but the JavaScript target is a little trickier. Google provides instructions for &lt;a href=&quot;https://www.tensorflow.org/js/tutorials/conversion/import_saved_model&quot;&gt;converting a SavedModel&lt;/a&gt;, but the only documented way to do such a conversion is with a command line utility. That’s a bit of a bummer for our Python module, since we’d really rather not shell out to a library that’s written in Python itself. 
Surely there’s a better way.&lt;/p&gt;&lt;h2 id=&quot;to-github&quot;&gt;&lt;a href=&quot;#to-github&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;To GitHub!&lt;/h2&gt;&lt;p&gt;Thankfully, the code for &lt;a href=&quot;https://github.com/tensorflow/tfjs/tree/master/tfjs-converter&quot;&gt;tfjs-converter&lt;/a&gt; is on GitHub for us all to see, so let’s dive in and see what we can find.&lt;/p&gt;&lt;hr/&gt;&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Since we’re talking about an undocumented workflow, and part of it uses protected functions, it almost goes without saying that this is all subject to change. At the time of writing, it worked using version 2.4.1 of &lt;code&gt;tensorflow&lt;/code&gt; and 3.3.0 of &lt;code&gt;tensorflowjs&lt;/code&gt;. 
Both are available via &lt;code&gt;pip install&lt;/code&gt;, though installing &lt;code&gt;tensorflowjs&lt;/code&gt; will get you a compatible version of &lt;code&gt;tensorflow&lt;/code&gt; for free.&lt;/p&gt;&lt;hr/&gt;&lt;p&gt;First things first: if you have a SavedModel on disk somewhere that you’re looking to convert, you can skip the slight messiness of the rest of this post and just do the following (&lt;a href=&quot;https://github.com/tensorflow/tfjs/blob/14cfeefb30f9e0af31cb5addfa182fc16909876a/tfjs-converter/python/tensorflowjs/converters/tf_saved_model_conversion_v2.py#L513&quot;&gt;source code&lt;/a&gt;):&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; tensorflowjs&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;converters&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;converter &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; tf_saved_model_conversion_v2 &lt;span class=&quot;token keyword&quot;&gt;as&lt;/span&gt; convert

&lt;span class=&quot;token comment&quot;&gt;# ...&lt;/span&gt;

&lt;span class=&quot;token comment&quot;&gt;# see the source code for other valid kwargs&lt;/span&gt;
convert&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;convert_tf_saved_model&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;saved_model_dir&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; output_dir&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;See? Easy. But it’s not quite what we wanted, as it requires us to export our model to the SavedModel format first, and of course that exported version has to be re-loaded to do the conversion. So let’s break it down a bit.&lt;/p&gt;&lt;p&gt;The body of &lt;code&gt;convert_tf_saved_model&lt;/code&gt; loads the model, freezes the weights, builds a protocol buffer version of the signature function we want to convert, and exports the frozen graph to JavaScript format. All those subtasks exist as protected functions, but since this is Python, no one’s going to ask any questions about us just calling those — no one except Pylint, that is, and it can be bribed with a comment line if necessary.&lt;/p&gt;&lt;p&gt;Here’s the replacement I came up with for the public function above. It’s essentially a reproduction that doesn’t require a directory as input. You of course don’t need the Pylint comments if you don’t use Pylint.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; tensorflowjs&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;converters&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;converter &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; tf_saved_model_conversion_v2 &lt;span class=&quot;token keyword&quot;&gt;as&lt;/span&gt; convert

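&lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; os

&lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; tensorflow &lt;span class=&quot;token keyword&quot;&gt;as&lt;/span&gt; tf

&lt;span class=&quot;token comment&quot;&gt;# assumption: the installed TensorFlow version string is what the converter records&lt;/span&gt;
TF_VERSION &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; tf&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;__version__

&lt;span class=&quot;token comment&quot;&gt;# ...&lt;/span&gt;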

&lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;convert_func&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;concrete_func&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; tf&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;Graph&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; output_dir&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token boolean&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;not&lt;/span&gt; os&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;path&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;exists&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;output_dir&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    os&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;makedirs&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;output_dir&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; exist_ok&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

  &lt;span class=&quot;token comment&quot;&gt;# pylint: disable=protected-access&lt;/span&gt;
  frozen_graph &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; convert&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_freeze_saved_model_v2&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
    concrete_func&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; control_flow_v2&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot;&gt;True&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

  inputs &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;x &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; x &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt; concrete_func&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;inputs &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;not&lt;/span&gt; x&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;dtype &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;resource&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;

  &lt;span class=&quot;token comment&quot;&gt;# pylint: disable=protected-access&lt;/span&gt;
  signature &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; convert&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_build_signature_def&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
    frozen_graph&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; inputs&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; concrete_func&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;outputs
  &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

  output_graph &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; os&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;path&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;join&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;output_dir&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;model.json&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
  convert&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;optimize_graph&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;frozen_graph&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; signature&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; output_graph&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; TF_VERSION&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;I’ve retained the mypy type hints because I think they help the explanation. The &lt;code&gt;concrete_func&lt;/code&gt; argument here is a full TensorFlow graph, but it has that name because the model in question uses &lt;a href=&quot;https://www.tensorflow.org/api_docs/python/tf/function&quot;&gt;&lt;code&gt;tf.function&lt;/code&gt;&lt;/a&gt; to make retrieving its signatures easier. If you have access to these functions, you can call &lt;a href=&quot;https://www.tensorflow.org/guide/function#obtaining_concrete_functions&quot;&gt;&lt;code&gt;get_concrete_function&lt;/code&gt;&lt;/a&gt; on one of them to get an input for our &lt;code&gt;convert_func&lt;/code&gt; function.&lt;/p&gt;&lt;p&gt;Another way to get a valid &lt;code&gt;concrete_func&lt;/code&gt; is to use a &lt;a href=&quot;https://www.tensorflow.org/tfx/serving/signature_defs&quot;&gt;SignatureDef&lt;/a&gt;. Let’s say you have a SavedModel from which you want to export several (but not all) signatures:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; os

&lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; tensorflow &lt;span class=&quot;token keyword&quot;&gt;as&lt;/span&gt; tf

model &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; tf&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;saved_model&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;load&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;path/to/saved_model&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; signature &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;sig1&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;sig2&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
  output_dir &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string-interpolation&quot;&gt;&lt;span class=&quot;token string&quot;&gt;f&amp;quot;path/to/output/&lt;/span&gt;&lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;signature&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt;

  &lt;span class=&quot;token comment&quot;&gt;# conversion will fail if the parent directory doesn&amp;#x27;t exist&lt;/span&gt;
  &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;not&lt;/span&gt; os&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;path&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;exists&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;output_dir&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    os&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;makedirs&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;output_dir&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; exist_ok&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

  concrete_func &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; model&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;signatures&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;signature&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
  convert_func&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;concrete_func&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; output_dir&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Notice the creation of a new output directory for every signature. This is important, as internally TensorFlow.js will create a &lt;code&gt;model.json&lt;/code&gt; and binary weights file(s) for each signature, and if you have a single output directory for all the signatures, it will overwrite each one in turn, leaving you with a single set of converted files at the end.&lt;/p&gt;&lt;h2 id=&quot;conclusion&quot;&gt;&lt;a href=&quot;#conclusion&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;&lt;p&gt;And just like that, we’ve come out the other side of our spelunking trip into the &lt;code&gt;tensorflowjs&lt;/code&gt; source code, only a little worse for wear. 
Models converted using the above method can be loaded in TensorFlow.js via &lt;a href=&quot;https://js.tensorflow.org/api/latest/#loadGraphModel&quot;&gt;&lt;code&gt;tf.loadGraphModel&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;A well-designed repository can now host code for both model training and TensorFlow.js deployment, all in the same language, without maintaining separate shell scripts for model conversion, and all it took was reading a little source code. Who knows, maybe Google will see the value in this approach and choose to officially support this workflow with more public functions and some user-facing documentation in the future.&lt;/p&gt;&lt;p&gt;I hope this speeds our collective journey to infer all the things in the browser, or at least makes it a bit more comfortable.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Question Answering With Spokestack and Transformers]]></title><description><![CDATA[This tutorial leverages HuggingFace's Transformer library to make a voice question answering bot with Spokestack.]]></description><link>https://www.spokestack.io/blog/building-a-question-answering-bot-with-python</link><guid isPermaLink="false">https://www.spokestack.io/blog/building-a-question-answering-bot-with-python</guid><pubDate>Wed, 13 Jan 2021 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/74b36cd73632a2f074735bcd25e710fa/8537d/building-a-question-answering-bot-with-python.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACPElEQVQoz3WSS2tTURSF7z9wIDiwIG1Nk9zce+773CQ36YuGNKWmaUxDEvvCiAOxgoOCE5EKFscOfLaCrUVtwYEDHQg68j8JTj45RxMqxcFiH/ZjsfY62xBCoGCraNsIx9HwfJ+kVCKSklBKiklCkiQUCgU8z8NWvYPZU29jmLBtHXOZDOb4OHY2Qz4I8Gwbz7aIHIdCHBPn87iui2VZun9ANojGaYWOECS1GpO9HlPr60z3+8zc6DO1tkrSbuOXS3iOIAwjrVLhvwoVoW2aVLe3aR6+ZeXdMfVn+yztHdDYP6B5+J6Zm7fottqsb27QarXo9/v4vq/VDgkVmaM8U75YFpXFBerXNrl7Z4mPrzqcvOzy4XmHo6ddTp7U2dqco1JdZLmxpP3M5XJDlQqGlJI4jnVRvRfm52k2O9xem+X1ToXDR1WOdqu8uD/H3k6Fres1Wu0uy406URRpMiVIQStUZKoQhSFRJAl8F9eewHMFwou0X6UkTyhjLC/G9XzSmSxZ0yGOpRai5//CCMOQIAjw/QDfE/jRLFHvC/krDxFWhgsjl8iaOSwzS3rkHOl0mse7D/jx/Zj52oImUec0EGYoU/WP+QGea+LKq8h7v8ivfeLy2ChCXCSXS5EaG8U8bzA6luLb5zfw8yud7irZbEYTKbs0obqpf+AInKiB60/iOhbpiQn9k2qDVMbUxk9Nz9Ba6SGEjev+OR/Vo7Y1isUiZ5D3KRYkxWJCuVwe5sulkl5PSnWH7pk5VfsNDHZ7Z+B3qd0AAAAASUVORK5CYII=&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Building a Question Answering Bot with Python&quot; title=&quot;Building a Question Answering Bot with Python&quot; src=&quot;/static/74b36cd73632a2f074735bcd25e710fa/05162/building-a-question-answering-bot-with-python.png&quot; srcSet=&quot;/static/74b36cd73632a2f074735bcd25e710fa/2eeed/building-a-question-answering-bot-with-python.png 294w,/static/74b36cd73632a2f074735bcd25e710fa/0d6a1/building-a-question-answering-bot-with-python.png 588w,/static/74b36cd73632a2f074735bcd25e710fa/05162/building-a-question-answering-bot-with-python.png 1175w,/static/74b36cd73632a2f074735bcd25e710fa/8537d/building-a-question-answering-bot-with-python.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;The ability to find information is a fundamental feature of the internet. Often, the information sought is the answer to a question. When it comes to answering a question about a specific entity, Wikipedia is a useful, accessible resource. This tutorial will teach you how to use Spokestack and &lt;a href=&quot;https://huggingface.co/transformers/index.html&quot;&gt;Huggingface’s Transformers&lt;/a&gt; library to build a voice interface for a question answering service using data from Wikipedia.&lt;/p&gt;&lt;h2 id=&quot;learning-objectives&quot;&gt;&lt;a href=&quot;#learning-objectives&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Learning Objectives&lt;/h2&gt;&lt;p&gt;By the end of this tutorial, you will have the following:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;The ability to give your app a voice with &lt;a href=&quot;https://www.spokestack.io/&quot;&gt;Spokestack&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;A basic understanding of how to incorporate &lt;a href=&quot;https://huggingface.co/transformers/index.html&quot;&gt;Huggingface’s Transformers&lt;/a&gt; with Spokestack.&lt;/li&gt;&lt;li&gt;An interactive voice interface that answers questions using &lt;a href=&quot;https://www.wikipedia.org/&quot;&gt;Wikipedia&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;h2 id=&quot;setting-up-the-project&quot;&gt;&lt;a href=&quot;#setting-up-the-project&quot; aria-hidden=&quot;true&quot; 
tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Setting up the Project&lt;/h2&gt;&lt;p&gt;First, let’s clone the repository that holds the project.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token function&quot;&gt;git&lt;/span&gt; clone https://github.com/spokestack/wikiqa-python
&lt;span class=&quot;token builtin class-name&quot;&gt;cd&lt;/span&gt; wikiqa-python&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now let’s set up the Python virtual environment. We use &lt;code&gt;pyenv&lt;/code&gt; (with its &lt;code&gt;pyenv-virtualenv&lt;/code&gt; plugin) to manage virtual environments, but any virtual environment tool will work.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;pyenv &lt;span class=&quot;token function&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;3.8&lt;/span&gt;.6
pyenv virtualenv &lt;span class=&quot;token number&quot;&gt;3.8&lt;/span&gt;.6 wikiqa
pyenv &lt;span class=&quot;token builtin class-name&quot;&gt;local&lt;/span&gt; wikiqa&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then install the dependencies.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;pip &lt;span class=&quot;token function&quot;&gt;install&lt;/span&gt; -r requirements.txt&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;tensorflow&quot;&gt;&lt;a href=&quot;#tensorflow&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;TensorFlow&lt;/h3&gt;&lt;p&gt;For this tutorial, we are using the full &lt;a href=&quot;http://tensorflow.org&quot;&gt;tensorflow&lt;/a&gt; package, which offers a little more functionality than the &lt;a href=&quot;https://www.tensorflow.org/lite/guide/python#install_just_the_tensorflow_lite_interpreter&quot;&gt;TFLite Interpreter&lt;/a&gt;. If this is the first time you are installing &lt;code&gt;tensorflow&lt;/code&gt;, you should follow the &lt;a href=&quot;https://www.tensorflow.org/install&quot;&gt;system-specific installation instructions&lt;/a&gt;. 
If you can already install packages from &lt;a href=&quot;https://pypi.org/&quot;&gt;PyPI&lt;/a&gt;, the following will install the library into your environment.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;pip &lt;span class=&quot;token function&quot;&gt;install&lt;/span&gt; tensorflow&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;transformers&quot;&gt;&lt;a href=&quot;#transformers&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Transformers&lt;/h3&gt;&lt;p&gt;In addition to TensorFlow, you will need to install &lt;a href=&quot;https://huggingface.co/transformers/index.html&quot;&gt;HuggingFace’s&lt;/a&gt; &lt;code&gt;transformers&lt;/code&gt; library for the question answering model.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;pip &lt;span class=&quot;token function&quot;&gt;install&lt;/span&gt; transformers&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&quot;speech-pipeline-with-profiles&quot;&gt;&lt;a href=&quot;#speech-pipeline-with-profiles&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; 
viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Speech Pipeline with Profiles&lt;/h2&gt;&lt;p&gt;Profiles are preset configurations for our &lt;a href=&quot;/docs/python/speech-pipeline&quot;&gt;Speech Pipeline&lt;/a&gt;. For this tutorial, we will use the Spokestack wake word and ASR profile. The wake word model runs on device, and can be activated by saying “Hey, Spokestack”. ASR is in the cloud though, so you will need to get your API credentials to use it. If you already have a free account, &lt;a href=&quot;/account/login&quot;&gt;log in&lt;/a&gt;. If you do not, you will need to &lt;a href=&quot;/create&quot;&gt;create&lt;/a&gt; one. The credentials can be found in your &lt;a href=&quot;/account/settings&quot;&gt;account settings&lt;/a&gt;. This is everything you need to speak to your app.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;profile&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;wakeword_asr &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; WakewordSpokestackASR

pipeline &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; WakewordSpokestackASR&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;create&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
    KEY_ID&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; KEY_SECRET&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; model_dir&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;tflite&amp;quot;&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;start&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&quot;natural-language-understanding-nlu&quot;&gt;&lt;a href=&quot;#natural-language-understanding-nlu&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Natural Language Understanding (NLU)&lt;/h2&gt;&lt;p&gt;Natural Language Understanding (NLU) is how we transform what the user says into action. For more explanation on the NLU see our &lt;a href=&quot;/docs/concepts/nlu&quot;&gt;docs&lt;/a&gt;. The NLU model for this project is already included in the GitHub repository. However, we will briefly discuss the model configuration in the next section.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;nlu&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tflite &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; TFLiteNLU

nlu &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; TFLiteNLU&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;tflite&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;model-configuration&quot;&gt;&lt;a href=&quot;#model-configuration&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Model Configuration&lt;/h3&gt;&lt;p&gt;We’ve included a pre-trained model, so you can follow along with this tutorial, but if you want to create your own, here’s a quick introduction to writing NLU training data. See our &lt;a href=&quot;/docs/machine-learning/nlu-training-data&quot;&gt;documentation&lt;/a&gt; for more information on our data format and how to train your own model. We are using a basic NLU template which includes intents like &lt;code&gt;greet&lt;/code&gt;, &lt;code&gt;accept&lt;/code&gt;, and &lt;code&gt;help&lt;/code&gt;. In addition, we need a way to identify an entity in the user utterance to perform a Wikipedia search (more on this later). The name of this intent is &lt;code&gt;ask.question&lt;/code&gt;, and the utterance templates are simple ways to ask a question. 
Naturally, these templates could be more complex and cover a wider variety of utterances, but for the purpose of this tutorial these will be enough.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;slots&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;subject&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token builtin&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;entity&amp;quot;&lt;/span&gt;

&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;generators&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;adjective&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token builtin&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;list&amp;quot;&lt;/span&gt;
values &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;long&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;tall&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;wide&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;far&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;utterances&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
values &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;who is {subject}?&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;what is {subject}?&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;what is a {subject}?&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;what is an {subject}?&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;{subject}&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;how {adjective} is the {subject}?&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;how {adjective} is {subject}?&amp;quot;&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&quot;question-answering-qa&quot;&gt;&lt;a href=&quot;#question-answering-qa&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Question Answering (QA)&lt;/h2&gt;&lt;p&gt;For the QA model, we are using a &lt;a href=&quot;https://arxiv.org/abs/1606.05250&quot;&gt;SQuAD&lt;/a&gt;-style span detection method. The problem is framed as, “given this question and this context that contains the possible answer, identify the text span that contains the answer”. At this point, you are most likely thinking, “where do we get the context?” 
We will answer that in the next section.&lt;/p&gt;&lt;h3 id=&quot;context-retrieval&quot;&gt;&lt;a href=&quot;#context-retrieval&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Context Retrieval&lt;/h3&gt;&lt;p&gt;Earlier, we mentioned that we wanted to be able to identify entities in the user’s utterance for a Wikipedia search. Context retrieval is the reason for this. Our QA model needs a context from which to draw the answer. Therefore, we assume that the question is about a specific entity. For example, “What year was Ada Lovelace born?” From this utterance, we pull out the entity “Ada Lovelace” and do a search to retrieve the Wikipedia page. 
Then, we feed the text of that Wikipedia page into the model as context for our question.&lt;/p&gt;&lt;h3 id=&quot;setting-up-the-model&quot;&gt;&lt;a href=&quot;#setting-up-the-model&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Setting Up The Model&lt;/h3&gt;&lt;p&gt;We are using a pre-trained &lt;a href=&quot;https://huggingface.co/transformers/model_doc/distilbert.html&quot;&gt;DistilBERT&lt;/a&gt; model that has been fine-tuned on the &lt;a href=&quot;https://arxiv.org/abs/1606.05250&quot;&gt;SQuAD&lt;/a&gt; dataset. We chose it for its size, the similarity of its fine-tuning task to our objective, and its availability at the time of writing. The &lt;a href=&quot;https://huggingface.co/models&quot;&gt;collection&lt;/a&gt; of pre-trained models is always growing, so feel free to try any of the other models that were fine-tuned on a similar task. You will most likely find a model that works better than this basic one. 
We would love to hear about the one you discovered, or even better, trained!&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; transformers &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; AutoTokenizer&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; TFAutoModelForQuestionAnswering

tokenizer &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; AutoTokenizer&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;from_pretrained&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;distilbert-base-cased-distilled-squad&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
model &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; TFAutoModelForQuestionAnswering&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;from_pretrained&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;distilbert-base-cased-distilled-squad&amp;quot;&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&quot;dialogue-manager&quot;&gt;&lt;a href=&quot;#dialogue-manager&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Dialogue Manager&lt;/h2&gt;&lt;p&gt;The dialogue manager for this project follows a smart-speaker-style interaction: a user asks a question, and the bot speaks the answer. The full example can be seen below. The process starts with retrieving the entity (person/place/thing) found in what the user said. Then, we take that entity and look up its page on Wikipedia. Next, we take the text of the Wikipedia page and feed it, along with the question, to the QA model. The model gives us the location of the answer, which we use to grab the answer from the text. Then, we return the answer in the form of a response. This setup is pretty simple and just meant to get you started. Feel free to expand on this dialogue manager and adapt it to your needs.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token triple-quoted-string string&quot;&gt;&amp;quot;&amp;quot;&amp;quot;
Simple QA dialogue manager
&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

&lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; tensorflow &lt;span class=&quot;token keyword&quot;&gt;as&lt;/span&gt; tf
&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; mediawiki &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; MediaWiki
&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;nlu&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;result &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; Result
&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;nlu&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tflite &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; TFLiteNLU
&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; transformers &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; AutoTokenizer&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; TFAutoModelForQuestionAnswering


&lt;span class=&quot;token keyword&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;DialogueManager&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;token triple-quoted-string string&quot;&gt;&amp;quot;&amp;quot;&amp;quot; Simple Question Answering Dialogue Manager &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;self&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; log_path&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; base_model&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token boolean&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_wiki &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; MediaWiki&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_nlu &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; TFLiteNLU&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;log_path&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_tokenizer &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; AutoTokenizer&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;from_pretrained&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;base_model&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_answerer &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; TFAutoModelForQuestionAnswering&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;from_pretrained&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;base_model&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;__call__&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;self&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; utterance&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        result &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_nlu&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;utterance&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;intent &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;ask.question&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_answer&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;result&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;elif&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;intent &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;greet&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;greet&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;elif&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;intent &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;command.exit&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;exit&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;elif&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;intent &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;request.help&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;fallback&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;_answer&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;self&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; Result&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;slots&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;# get the tagged entity for page search&lt;/span&gt;
            entity &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;slots&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;get&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;entity&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;get&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;raw_value&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;# perform the search to find the wikipedia page&lt;/span&gt;
            entity &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_wiki&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;search&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;entity&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;# get the page content to feed as context to the qa model&lt;/span&gt;
            passage &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_wiki&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;page&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;entity&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; auto_suggest&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;content
            &lt;span class=&quot;token comment&quot;&gt;# prepare qa model inputs; truncation=True keeps only the text that fits the model input limit&lt;/span&gt;
            inputs &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_tokenizer&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
                result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;utterance&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
                passage&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
                return_tensors&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;tf&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
                padding&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
                truncation&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token boolean&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;# compute answer span&lt;/span&gt;
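            outputs &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_answerer&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;inputs&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;# index the outputs: unpacking a dict-like model output would yield its keys, not the score tensors&lt;/span&gt;
            start_scores&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; end_scores &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; outputs&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; outputs&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;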
            start&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; end &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; tf&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;argmax&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;start_scores&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; tf&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;argmax&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;end_scores&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;# prepare the passage ids for slicing&lt;/span&gt;
            tokens &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_tokenizer&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;convert_ids_to_tokens&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
                &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;inputs&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;input_ids&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;numpy&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;# retrieve only the answer from the passage&lt;/span&gt;
            answer &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_tokenizer&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;convert_tokens_to_string&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;tokens&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;start &lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; end &lt;span class=&quot;token operator&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; answer
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;I don&amp;#x27;t have an answer for that&amp;quot;&lt;/span&gt;

    &lt;span class=&quot;token decorator annotation punctuation&quot;&gt;@staticmethod&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;greet&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;Hello, Ask me anything&amp;quot;&lt;/span&gt;

    &lt;span class=&quot;token decorator annotation punctuation&quot;&gt;@staticmethod&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;exit&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;Goodbye&amp;quot;&lt;/span&gt;

    &lt;span class=&quot;token decorator annotation punctuation&quot;&gt;@staticmethod&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;fallback&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
            &lt;span class=&quot;token string&quot;&gt;&amp;quot;I&amp;#x27;m having trouble understanding your request, could you please &amp;quot;&lt;/span&gt;
            &lt;span class=&quot;token string&quot;&gt;&amp;quot;repeat it&amp;quot;&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;token decorator annotation punctuation&quot;&gt;@staticmethod&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;Ask a question like, how long is the amazon river?&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;OK, now we have our response from the dialogue manager. The next question on your mind is probably “how do we give it a voice?” Check out the following section for how to set up Spokestack’s text-to-speech service.&lt;/p&gt;&lt;h2 id=&quot;text-to-speech-tts&quot;&gt;&lt;a href=&quot;#text-to-speech-tts&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Text to Speech (TTS)&lt;/h2&gt;&lt;p&gt;Now, let’s give the app a voice. Similar to the profile section, you will need your Spokestack API keys. We offer a &lt;code&gt;TextToSpeechManager&lt;/code&gt; class which requires a TTS client and an output source. In most cases, the &lt;code&gt;PyAudioOutput&lt;/code&gt; class should work. 
It uses the default system speaker.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;manager &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; TextToSpeechManager&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;TextToSpeechClient&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;KEY_ID&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; KEY_SECRET&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; PyAudioOutput&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
manager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;synthesize&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;hello, world&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;demo-male&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now that we have given our bot a voice, we can put everything together. At this point, you are just one section away from a voice interface that allows you to get answers to simple questions. Let’s move on to the complete working example.&lt;/p&gt;&lt;h2 id=&quot;putting-it-all-together&quot;&gt;&lt;a href=&quot;#putting-it-all-together&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Putting it All Together&lt;/h2&gt;&lt;p&gt;We have all the modules set up. Now we need to add the logic that will respond to events in the conversation.
For this, we use our &lt;a href=&quot;/docs/python/speech-pipeline&quot;&gt;Pipeline Events&lt;/a&gt;. Pipeline Events are simply events that occur while the pipeline is running. To use them, you decorate functions with an event decorator. Most applications will want an event handler that does something when speech is recognized. For ours, we want to process the question and play the response. This is defined in the &lt;code&gt;on_recognize&lt;/code&gt; handler. The function we are using for this example can be seen below. For more information on the included events take a look &lt;a href=&quot;/docs/python/speech-pipeline#speech-event-handlers&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token decorator annotation punctuation&quot;&gt;@pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;event&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;on_recognize&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;context&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pause&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    answer &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; dialogue_manager&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;context&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;transcript&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    manager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;synthesize&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;answer&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;demo-male&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;resume&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;


manager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;synthesize&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;dialogue_manager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;greet&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;demo-male&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;start&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;run&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That pretty much wraps it up! You should now be able to ask the bot some questions, and it will (hopefully) find the right answer. I hope you found this tutorial useful, and thanks for taking the time to read it!&lt;/p&gt;&lt;h2 id=&quot;contact-us&quot;&gt;&lt;a href=&quot;#contact-us&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Contact Us&lt;/h2&gt;&lt;p&gt;If you have any questions while getting this set up, you can post on our &lt;a href=&quot;https://forum.spokestack.io/&quot;&gt;forum&lt;/a&gt; or open an issue on &lt;a href=&quot;https://github.com/spokestack/spokestack-python/issues&quot;&gt;GitHub&lt;/a&gt;.
In addition, I am more than happy to help if you want to reach out to me personally via &lt;a href=&quot;mailto:will@spokestack.io&quot;&gt;email&lt;/a&gt; or &lt;a href=&quot;https://twitter.com/_Will_Rice&quot;&gt;Twitter&lt;/a&gt;.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Integrating Spokestack in Android]]></title><description><![CDATA[Instructions on integrating and using the Spokestack Tray UI component in Android.]]></description><link>https://www.spokestack.io/blog/integrating-spokestack-in-android</link><guid isPermaLink="false">https://www.spokestack.io/blog/integrating-spokestack-in-android</guid><pubDate>Thu, 17 Dec 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/e58e66d77427f315ee0cc269273554ac/8537d/android-tray-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACg0lEQVQoz12S30tTYRjHTxfRpRGUF0VJXSRRgXQR0R8QVFcR3QTdh4QEBd0YCWVEFBZZqGSky6lnbs7p/BVoRlvij21nP87OPD+245xOgwrJKIhPvGebRAe+PC/v+zyf9/s875EG7STbyiXwlaN/OY5XV2l4ptEykqE7pPNUNhmxNAZXNfwraXzZBIGVBM1jmiOxlv4FVmADORVfQUNWU+yoTiJVJWluyTI/nydb3MAz5MYdnsBfyDCcV7j2wubqM5tAXkHyCVdlCZjXVhm3QnzIjNIeLtDwqkj1qTR7j6U4fVFjMmvzpvM5rT4XgaJOoKDSEZqm/dMUw2taCVhx6M/F8eQNPidcxGYa6QynuVCvc+Ssxu6jaapqU7xLJPEVkviWVbzZOMGiwdnL5zlz6RzBNb3UsoCKQ6EBMT8zwoQdol9V2VOrsu94mv11GgdOpelLqs6sKl0Nr2a429NBo6uN4YKG1G/G6DOjjnoNESN47Cje1SjulMKumjjSbgWpSmHnwTjdMQXZWkQ2Y3hzJRPBdYPRdcO5QBIgAZWtkvqMKE/ep3gQNGibsQjNb/JxdpMP4U3CCz8YX9HxFlT6rRhufdHJ7zMiThT1kmwpeCyFgWxFcV7MRLnTNUnL0Azwh9L3m5+/vnPz/j2u3LjO29gnPLk4vXoEtwCaZaAAbMtS8OdVXMo09Y9u0fTyISsbW4TyW9jfYGF2jpN1J6g5fIjbrY8JfjHxmDHHUKVLqfIYzjzKj+PLJZ3/cKSwROZrkbHlPKOZNebMJR70dtHU08nruSnnctGVqPFklRJQ2O39T2JPzKdHX6Q7M49sLCBbC/SYEQbXMviLS8h2wsnZzi/X/AX4ML7Gu4SWswAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;How to Integrate Spokestack in Android&quot; title=&quot;How to Integrate Spokestack in Android&quot; src=&quot;/static/e58e66d77427f315ee0cc269273554ac/05162/android-tray-hero.png&quot; srcSet=&quot;/static/e58e66d77427f315ee0cc269273554ac/2eeed/android-tray-hero.png 294w,/static/e58e66d77427f315ee0cc269273554ac/0d6a1/android-tray-hero.png 588w,/static/e58e66d77427f315ee0cc269273554ac/05162/android-tray-hero.png 1175w,/static/e58e66d77427f315ee0cc269273554ac/8537d/android-tray-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Using voice as an interface can be a daunting proposition. First, you have to be able to actually &lt;em&gt;process&lt;/em&gt; voice as input. Then, you have to decide what to &lt;em&gt;do&lt;/em&gt; with that input. You probably want to deliver responses via voice too. After all that, you still have to decide how to integrate the voice interface with your touch interface.&lt;/p&gt;&lt;p&gt;If you’re here, you probably know that the Spokestack library helps you with the first three of those challenges. With Spokestack Tray, you’ll also have an answer for the last one.&lt;/p&gt;&lt;h2 id=&quot;what-is-it&quot;&gt;&lt;a href=&quot;#what-is-it&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;What is it?&lt;/h2&gt;&lt;p&gt;We created Spokestack Tray to give developers a full-featured visual UI to reflect an app’s voice interactions with the user. At first glance, it’s just a microphone button that sits on the side of the screen. 
When the user taps the button (or says the wake word), it opens up into a full tray that displays the user’s speech as well as system responses, reading those responses aloud as it displays them.&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/d3baa8ec9a57655d43ef37dc2958772f/android-tray-demo.gif&quot; alt=&quot;Android Spokestack Tray Example&quot;/&gt;&lt;/p&gt;&lt;h2 id=&quot;how-do-i-use-it&quot;&gt;&lt;a href=&quot;#how-do-i-use-it&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How do I use it?&lt;/h2&gt;&lt;p&gt;First, add the dependency to your app’s &lt;code&gt;build.gradle&lt;/code&gt;:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;groovy&quot;&gt;&lt;pre class=&quot;language-groovy&quot;&gt;&lt;code class=&quot;language-groovy&quot;&gt;dependencies &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;token comment&quot;&gt;// ...&lt;/span&gt;
  implementation &lt;span class=&quot;token string&quot;&gt;&amp;#x27;io.spokestack:tray:0.4.4&amp;#x27;&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The tray is implemented as a &lt;code&gt;Fragment&lt;/code&gt;, so to include it, add this to your activity’s &lt;code&gt;layout.xml&lt;/code&gt;:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;xml&quot;&gt;&lt;pre class=&quot;language-xml&quot;&gt;&lt;code class=&quot;language-xml&quot;&gt;  &lt;span class=&quot;token comment&quot;&gt;&amp;lt;!-- nested in the main layout, after other views/sublayouts --&amp;gt;&lt;/span&gt;

  &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;include&lt;/span&gt;
    &lt;span class=&quot;token attr-name&quot;&gt;&lt;span class=&quot;token namespace&quot;&gt;android:&lt;/span&gt;id&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;@+id/tray_fragment&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt;
    &lt;span class=&quot;token attr-name&quot;&gt;layout&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;@layout/spokestack_tray_fragment&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;/&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Depending on your app layout, you may also have to add &lt;code&gt;android:clipChildren=&amp;quot;false&amp;quot;&lt;/code&gt; to the fragment’s parent layout(s) to avoid the microphone tab disappearing as the tray opens.&lt;/p&gt;&lt;p&gt;Then make your activity itself extend &lt;code&gt;TrayActivity&lt;/code&gt; (a subclass of &lt;code&gt;AppCompatActivity&lt;/code&gt;), implement the methods it requires, and the library will take care of the rest. Voice interaction will be handled by the Tray, which you’ll have access to via an instance variable named &lt;code&gt;tray&lt;/code&gt; that’s initialized during &lt;code&gt;onStart&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;You’ll need an ID and secret key from your &lt;a href=&quot;/account/settings#api&quot;&gt;Spokestack account&lt;/a&gt; in order to set up the &lt;code&gt;TrayConfig&lt;/code&gt; that &lt;code&gt;TrayActivity&lt;/code&gt; requires.&lt;/p&gt;&lt;p&gt;If you’d prefer to do the setup yourself, here’s a sample that doesn’t use &lt;code&gt;TrayActivity&lt;/code&gt;, and demonstrates a minimal complete &lt;code&gt;TrayConfig&lt;/code&gt; as well:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;kotlin&quot;&gt;&lt;pre class=&quot;language-kotlin&quot;&gt;&lt;code class=&quot;language-kotlin&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; io&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tray&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;*&lt;/span&gt;

&lt;span class=&quot;token keyword&quot;&gt;class&lt;/span&gt; MyActivity &lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;AppCompatActivity&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; SpokestackTrayListener &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;

  &lt;span class=&quot;token keyword&quot;&gt;lateinit&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;var&lt;/span&gt; tray&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; SpokestackTray

  &lt;span class=&quot;token comment&quot;&gt;// ...&lt;/span&gt;

  &lt;span class=&quot;token keyword&quot;&gt;override&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;onCreate&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;savedInstanceState&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; Bundle&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;val&lt;/span&gt; config &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;getConfig&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    supportFragmentManager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;fragmentFactory &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;SpokestackTrayFactory&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;config&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;// note that the factory is instantiated and set on the manager BEFORE calling&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// `super.onCreate()`&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;super&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;onCreate&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;savedInstanceState&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

  &lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;getConfig&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; TrayConfig &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; TrayConfig&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;Builder&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;token comment&quot;&gt;// credentials from your Spokestack account&lt;/span&gt;
      &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;credentials&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;spokestack-client-id&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;spokestack-secret-key&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;wakewordModelURL&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;https://path-to-wakeword-models&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;nluURL&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;https://path-to-nlu-files&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;token comment&quot;&gt;// note the implementation of `SpokestackTrayListener` in the class declaration&lt;/span&gt;
      &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;withListener&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;token comment&quot;&gt;// optional builder customization; see the documentation for more details...&lt;/span&gt;
      &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

  &lt;span class=&quot;token keyword&quot;&gt;override&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;onStart&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// set the value of the lateinit `tray` var&lt;/span&gt;
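    tray &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; SpokestackTray&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getInstance&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getConfig&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;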
    &lt;span class=&quot;token keyword&quot;&gt;super&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;onStart&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;When you download Spokestack wake word or NLU models, you’ll have several URLs to different files. &lt;code&gt;wakewordModelURL&lt;/code&gt; and &lt;code&gt;nluURL&lt;/code&gt; above only require the path to the relevant directory, not full file URLs. So for the demo “Spokestack” wake word, set &lt;code&gt;wakewordModelURL&lt;/code&gt; to “&lt;a href=&quot;https://d3dmqd7cy685il.cloudfront.net/model/wake/spokestack/&quot;&gt;https://d3dmqd7cy685il.cloudfront.net/model/wake/spokestack/&lt;/a&gt;”.&lt;/p&gt;&lt;p&gt;The Tray is designed for seamless use across activities (for example, to allow a user to continue giving a voice command while the app switches activities), so its state is stored outside the fragment itself and survives fragment destruction. If your app needs to release resources held by the Tray and its underlying &lt;code&gt;Spokestack&lt;/code&gt; instance, call the tray’s &lt;code&gt;stop()&lt;/code&gt; method.
If you then need to re-enable voice control before the current Tray fragment instance is destroyed, you must call &lt;code&gt;start()&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;If you want to keep tray state intact after process death, you can store it in its parent activity’s &lt;code&gt;onSaveInstanceState&lt;/code&gt; and &lt;code&gt;onRestoreInstanceState&lt;/code&gt; methods using the Tray’s &lt;code&gt;getState()&lt;/code&gt; and &lt;code&gt;loadState()&lt;/code&gt; methods; see their documentation for more details.&lt;/p&gt;&lt;h3 id=&quot;responses&quot;&gt;&lt;a href=&quot;#responses&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Responses&lt;/h3&gt;&lt;p&gt;Chances are that if you’re allowing the user to talk to your app, you want the app to talk back. Tray is integrated with Spokestack’s TTS service, so synthesizing audio is just as easy as transcribing it.&lt;/p&gt;&lt;p&gt;When you extend &lt;code&gt;TrayActivity&lt;/code&gt;, one of the methods you’ll have to implement is &lt;code&gt;getTrayListener()&lt;/code&gt;, which creates and returns a &lt;code&gt;SpokestackTrayListener&lt;/code&gt;. This interface assists your app in reacting to events received and produced by the Tray. Because each use case is unique, all its methods are optional; the one we’re interested in here is &lt;code&gt;onClassification&lt;/code&gt;. 
This method is called after a user’s speech has been transcribed by ASR and classified by NLU. It supplies your app with the NLU result and asks you to return a response:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;kotlin&quot;&gt;&lt;pre class=&quot;language-kotlin&quot;&gt;&lt;code class=&quot;language-kotlin&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;override&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;onClassification&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;result&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; NLUResult&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; VoicePrompt &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;intent &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;your-special-intent&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;VoicePrompt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;I hear you loud and clear&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;VoicePrompt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
      &lt;span class=&quot;token string&quot;&gt;&amp;quot;Sorry; I didn&amp;#x27;t catch that&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
      expectFollowup &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token boolean&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The optional second parameter in the &lt;code&gt;VoicePrompt&lt;/code&gt; constructor lets the Tray know whether you’re expecting a response. If you are, it will resume active listening after your prompt is played, so the user doesn’t have to use the wake word or a button for each interaction.&lt;/p&gt;&lt;h2 id=&quot;configuration&quot;&gt;&lt;a href=&quot;#configuration&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Configuration&lt;/h2&gt;&lt;p&gt;The above sample will get you up and running with minimal fuss, but it’s far from all that Spokestack Tray offers. When you’re building a &lt;code&gt;TrayConfig&lt;/code&gt; instance, you can choose to configure and provide the underlying &lt;code&gt;Spokestack&lt;/code&gt; builder itself. This will let you do things like change ASR providers, set up custom listeners for events from individual systems, and add custom speech processing components if you need to. You can read about the Spokestack builder &lt;a href=&quot;/docs/android/turnkey-configuration&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;There is also a range of options that apply to the Tray itself, accessible via helper methods on the &lt;code&gt;TrayConfig.Builder&lt;/code&gt; instance.
See the &lt;a href=&quot;https://spokestack.github.io/spokestack-tray-android/-spokestack-tray/&quot;&gt;documentation&lt;/a&gt; for more details, specifically the documentation on &lt;a href=&quot;https://spokestack.github.io/spokestack-tray-android/-spokestack-tray/io.spokestack.tray/-tray-config/-builder&quot;&gt;&lt;code&gt;TrayConfig.Builder&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;&lt;h2 id=&quot;conclusion&quot;&gt;&lt;a href=&quot;#conclusion&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;&lt;p&gt;We hope this brief introduction to Spokestack Tray has inspired thought about what your app could do with voice control. With Tray, experimenting with this exciting interface takes just a few lines of code. See the &lt;a href=&quot;https://github.com/spokestack/spokestack-tray-android&quot;&gt;README&lt;/a&gt; for more details about the Tray and the customization options available. 
If you need help, please open a GitHub issue or check out one of our &lt;a href=&quot;/support&quot;&gt;multiple support channels&lt;/a&gt;.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Integrating Spokestack with Google App Actions, Part 3]]></title><description><![CDATA[Add a UI to your app's voice features using Spokestack Tray]]></description><link>https://www.spokestack.io/blog/integrating-spokestack-google-app-actions/part-3</link><guid isPermaLink="false">https://www.spokestack.io/blog/integrating-spokestack-google-app-actions/part-3</guid><pubDate>Tue, 15 Dec 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/57612fae0d2ab023e240afb07024484e/8537d/google-app-actions-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAAB50lEQVQoz3VSz2vUQBjNnyCIB3+gB7NrNskkM5OZTLJdsam03diDh2Z3a5FuQY3gwUtvHnpQ8NdNrCBikWqxqH+Df9uT+bYJqejh8c2E73vfe2/iMMZgEYYhVRZFZ6DTFHmewxgDrTWa/i7aWcbg2EvzIQgCDFwXfq8Hv9+nKsIQWZYRWZIk1GP7bW3OXVKn3RIESLTGaDpFUT/Crd1drDyusVRtgkUMUnDEcQwpBTjnRC6EQBRFZxU2B9/zkBYFJodfcO/HT1RHx7h7+BXlwScU4w3s7DzAZLaN2dY25vM5ZrMZ6rpGWZbwPI+IW8v2YrfHnCNdXkY0vIlqaw3H7yf49XGKzy/WcfJmFd9ereH509tgLIbv+63lLhybTZqmFHpmDIxS4LHEZmlw8GwV31+PcfTyDk7elviwP8bew5U2T85jEtOAFFoypVQLfUqutKHsonCA3HAopRHHHEIqaL14+eFwSA/WnXfsJiklBWxhAxcygZIBkuIJkvu/oUYVWDjAhcs9uK5Lg3bOKrWElrwR5jQkDRZZSvDQhdh4B7MPZOt76LnXcO78RVy9conUdvutIEtOhETwTzBwkYGbCkmi6Xe50b9OWbWLT0GuhCDVjpX7X2QGuRHI84WtpdGIamPzb9hM/wBbFXESEsdPywAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Integrating Spokestack Google App Actions 3&quot; title=&quot;Integrating Spokestack Google App Actions 3&quot; src=&quot;/static/57612fae0d2ab023e240afb07024484e/05162/google-app-actions-hero.png&quot; srcSet=&quot;/static/57612fae0d2ab023e240afb07024484e/2eeed/google-app-actions-hero.png 294w,/static/57612fae0d2ab023e240afb07024484e/0d6a1/google-app-actions-hero.png 588w,/static/57612fae0d2ab023e240afb07024484e/05162/google-app-actions-hero.png 1175w,/static/57612fae0d2ab023e240afb07024484e/8537d/google-app-actions-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;This tutorial is part of a series:&lt;/em&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;/blog/integrating-spokestack-google-app-actions/part-1&quot;&gt;Part 1&lt;/a&gt;: Working with Google App Actions&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;/blog/integrating-spokestack-google-app-actions/part-2&quot;&gt;Part 2&lt;/a&gt;: Adding your own voice experience with Spokestack&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Part 3&lt;/strong&gt; (&lt;em&gt;You are here!&lt;/em&gt;): Using Spokestack Tray to add a voice UI&lt;/li&gt;&lt;/ul&gt;&lt;hr/&gt;&lt;p&gt;Now that we know what we need to get started with voice control, both on Google’s end and in our app, we’ll polish our user experience a bit by adding visual elements to give the user more feedback about what the app is hearing and saying. With the changes we’ll make in this tutorial, an interaction with Google Assistant will look like this:&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/f9bbb2a1091088c764dab5db1a7e55eb/google-app-actions-demo.gif&quot; alt=&quot;screen capture of Google Assistant handing voice control to an app&quot;/&gt;&lt;/p&gt;&lt;p&gt;To follow along, run &lt;code&gt;git checkout spokestack-tray&lt;/code&gt; in your copy of the &lt;a href=&quot;https://github.com/spokestack/app-actions-example&quot;&gt;sample repository&lt;/a&gt;. 
Since this tutorial involves changes to the activities we’ve already introduced, we’ve made them on a separate branch for easy comparison to the “headless” version of Spokestack.&lt;/p&gt;&lt;h2 id=&quot;the-why-and-how-of-tray&quot;&gt;&lt;a href=&quot;#the-why-and-how-of-tray&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;The why and how of Tray&lt;/h2&gt;&lt;p&gt;As we’ve seen in other tutorials, you can add voice control to your app using just the &lt;a href=&quot;https://github.com/spokestack/spokestack-android&quot;&gt;Spokestack library&lt;/a&gt;. &lt;a href=&quot;https://github.com/spokestack/spokestack-tray-android&quot;&gt;Spokestack Tray&lt;/a&gt; exists to help the user visualize their voice interactions without you having to build an entire UI for them from scratch. It includes a microphone button used to activate ASR without the use of a wake word, visual feedback while ASR is active, and a chat-like interface that displays the conversation between user and app.&lt;/p&gt;&lt;p&gt;Let’s walk through the process of dropping it into our project, starting with the changes to &lt;code&gt;build.gradle&lt;/code&gt;. Since the Tray exists to mediate your users’ interaction with Spokestack, it bundles the dependencies necessary for that. 
That means we can replace all our Spokestack-related dependencies with the following:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;groovy&quot;&gt;&lt;pre class=&quot;language-groovy&quot;&gt;&lt;code class=&quot;language-groovy&quot;&gt;    implementation &lt;span class=&quot;token string&quot;&gt;&amp;#x27;io.spokestack:tray:0.4.4&amp;#x27;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Continuing with the declarative parts of the project, Tray is implemented as a &lt;code&gt;Fragment&lt;/code&gt;, so each &lt;code&gt;Activity&lt;/code&gt; that includes it needs to add a few lines to its layout XML:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;xml&quot;&gt;&lt;pre class=&quot;language-xml&quot;&gt;&lt;code class=&quot;language-xml&quot;&gt;    &lt;span class=&quot;token comment&quot;&gt;&amp;lt;!-- nested in the main layout, after other views/sublayouts --&amp;gt;&lt;/span&gt;

    &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;include&lt;/span&gt;
        &lt;span class=&quot;token attr-name&quot;&gt;&lt;span class=&quot;token namespace&quot;&gt;android:&lt;/span&gt;id&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;@+id/tray_fragment&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt;
        &lt;span class=&quot;token attr-name&quot;&gt;layout&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;@layout/spokestack_tray_fragment&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;/&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This will add a nested &lt;code&gt;ConstraintLayout&lt;/code&gt; containing the &lt;code&gt;SpokestackTray&lt;/code&gt; fragment into your layout. If you’d rather manage the layout XML yourself, take a look at &lt;a href=&quot;https://github.com/spokestack/spokestack-tray-android/blob/main/SpokestackTray/src/main/res/layout/spokestack_tray_fragment.xml&quot;&gt;the layout file&lt;/a&gt; we’re including. You’ll want to use the same tag for the fragment as it does, as it’s important elsewhere.&lt;/p&gt;&lt;p&gt;Note also the &lt;code&gt;clipChildren&lt;/code&gt; attribute on the parent layout. Depending on how your app is set up, you might also need to add this to any layouts serving as parents to the tray fragment. Our sample app does this in all its activities.&lt;/p&gt;&lt;h2 id=&quot;on-to-the-fun-stuff&quot;&gt;&lt;a href=&quot;#on-to-the-fun-stuff&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;On to the fun stuff!&lt;/h2&gt;&lt;p&gt;That takes care of the setup, so let’s move on to the code from &lt;a href=&quot;/blog/integrating-spokestack-google-app-actions/part-2&quot;&gt;part 2&lt;/a&gt; that we’ll need to change. 
Spoiler alert: Thanks to delegating management duties to Spokestack Tray, we’ll be deleting more than we add, resulting in a much simpler developer experience!&lt;/p&gt;&lt;p&gt;First, the &lt;code&gt;Voice&lt;/code&gt; class goes away entirely — all &lt;code&gt;Spokestack&lt;/code&gt; interaction goes through Spokestack Tray.&lt;/p&gt;&lt;p&gt;&lt;code&gt;VoiceActivity&lt;/code&gt; gets a new parent class, the &lt;code&gt;TrayActivity&lt;/code&gt; convenience class, which inherits from &lt;code&gt;AppCompatActivity&lt;/code&gt; and deals with some Android platform unpleasantness related to &lt;code&gt;Fragment&lt;/code&gt; construction. We’ve moved our main &lt;code&gt;Spokestack&lt;/code&gt; configuration (now configuration for the Tray, which is similar) from &lt;code&gt;Voice&lt;/code&gt; to &lt;code&gt;VoiceActivity&lt;/code&gt; so we can provide it to &lt;code&gt;TrayActivity&lt;/code&gt;. This configuration includes URLs instead of paths for wake word and NLU files since the Tray downloads them automatically; this means that our app bundle can be smaller, and we don’t have to write the download code.&lt;/p&gt;&lt;p&gt;We’ve also taken the opportunity to provide a fallback voice response that’ll be available to all subclasses:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;kotlin&quot;&gt;&lt;pre class=&quot;language-kotlin&quot;&gt;&lt;code class=&quot;language-kotlin&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;fallbackPrompt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; VoicePrompt &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;VoicePrompt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;Sorry, I didn&amp;#x27;t understand. Please try again.&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    expectFollowup &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token boolean&quot;&gt;true&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We’ll be using this fallback a lot, since we haven’t built out a full voice experience for this sample. Everything else that was in that class dealt with managing Spokestack, so it just goes away. Feels good, doesn’t it?&lt;/p&gt;&lt;p&gt;Next, let’s look at &lt;code&gt;SearchActivity&lt;/code&gt; to see how we deal with a voice command that can come from either Google or Spokestack:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;kotlin&quot;&gt;&lt;pre class=&quot;language-kotlin&quot;&gt;&lt;code class=&quot;language-kotlin&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;setUiFromIntent&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; Uri&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; fromTray&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; Boolean &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token boolean&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;token keyword&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getQueryParameter&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;item&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    runOnUiThread &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
      binding&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;searchContent&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;text &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; it
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;// `fromTray` is our cue for whether the command originally came from&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// Spokestack Tray or Google Assistant. If it&amp;#x27;s the former, the Tray&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// will automatically display and play the response; if the latter,&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// we&amp;#x27;ll need to explicitly respond.&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;fromTray&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
      tray&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;say&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;it&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The comment there reveals our secret. If the user opens this activity with a voice command from somewhere else in the app or from Google Assistant, we generate a response and pass it through Tray using the &lt;code&gt;say&lt;/code&gt; method. If &lt;code&gt;SearchActivity&lt;/code&gt; is active when the voice command is received, it can set &lt;code&gt;fromTray&lt;/code&gt; to true and generate the response elsewhere.&lt;/p&gt;&lt;p&gt;That’s as good a segue as any to talk about the other notable change in our activities — the listener. We’re converting all our nested &lt;code&gt;SpokestackAdapter&lt;/code&gt;s to &lt;code&gt;SpokestackTrayListener&lt;/code&gt;s, which doesn’t change much in our case because our sample app was only interested in NLU events to begin with. Our &lt;code&gt;nluResult&lt;/code&gt; handlers become &lt;code&gt;onClassification&lt;/code&gt; handlers, and now they’re expected to return a response for the Tray to display and read.&lt;/p&gt;&lt;p&gt;We won’t add any new logic to smooth the voice experience here other than supplying an error message. The message itself comes from &lt;code&gt;VoiceActivity&lt;/code&gt; and plays any time we can’t recognize a search request for a specific item. 
The only real change to the listener method, then, is the return:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;kotlin&quot;&gt;&lt;pre class=&quot;language-kotlin&quot;&gt;&lt;code class=&quot;language-kotlin&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;item &lt;span class=&quot;token operator&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;token function&quot;&gt;response&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;item&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;token function&quot;&gt;fallbackPrompt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&quot;conclusion&quot;&gt;&lt;a href=&quot;#conclusion&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;&lt;p&gt;Just like that, we’ve gone from a cool-but-confusing “headless” voice experience to a UI-supported voice interface that puts the user in control and removes code from our codebase. It’s a win-win. 
A quick checklist for integrating Spokestack Tray in your app:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;✅ add/update dependency&lt;/li&gt;&lt;li&gt;✅ &lt;code&gt;include&lt;/code&gt; fragment layout in activity XML&lt;/li&gt;&lt;li&gt;✅ subclass &lt;code&gt;TrayActivity&lt;/code&gt; in activities that include the fragment&lt;/li&gt;&lt;li&gt;✅ supply a &lt;code&gt;TrayConfig&lt;/code&gt; suited to your app&lt;/li&gt;&lt;li&gt;✅ implement &lt;code&gt;SpokestackTrayListener&lt;/code&gt; and supply appropriate responses in &lt;code&gt;onClassification&lt;/code&gt;&lt;/li&gt;&lt;/ul&gt;</content:encoded></item><item><title><![CDATA[Integrating Spokestack with Google App Actions, Part 2]]></title><description><![CDATA[Take your app's voice integration to the next level by having Google Assistant hand off the conversation to an in-app voice assistant.]]></description><link>https://www.spokestack.io/blog/integrating-spokestack-google-app-actions/part-2</link><guid isPermaLink="false">https://www.spokestack.io/blog/integrating-spokestack-google-app-actions/part-2</guid><pubDate>Mon, 30 Nov 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/57612fae0d2ab023e240afb07024484e/8537d/google-app-actions-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAAB50lEQVQoz3VSz2vUQBjNnyCIB3+gB7NrNskkM5OZTLJdsam03diDh2Z3a5FuQY3gwUtvHnpQ8NdNrCBikWqxqH+Df9uT+bYJqejh8c2E73vfe2/iMMZgEYYhVRZFZ6DTFHmewxgDrTWa/i7aWcbg2EvzIQgCDFwXfq8Hv9+nKsIQWZYRWZIk1GP7bW3OXVKn3RIESLTGaDpFUT/Crd1drDyusVRtgkUMUnDEcQwpBTjnRC6EQBRFZxU2B9/zkBYFJodfcO/HT1RHx7h7+BXlwScU4w3s7DzAZLaN2dY25vM5ZrMZ6rpGWZbwPI+IW8v2YrfHnCNdXkY0vIlqaw3H7yf49XGKzy/WcfJmFd9ereH509tgLIbv+63lLhybTZqmFHpmDIxS4LHEZmlw8GwV31+PcfTyDk7elviwP8bew5U2T85jEtOAFFoypVQLfUqutKHsonCA3HAopRHHHEIqaL14+eFwSA/WnXfsJiklBWxhAxcygZIBkuIJkvu/oUYVWDjAhcs9uK5Lg3bOKrWElrwR5jQkDRZZSvDQhdh4B7MPZOt76LnXcO78RVy9conUdvutIEtOhETwTzBwkYGbCkmi6Xe50b9OWbWLT0GuhCDVjpX7X2QGuRHI84WtpdGIamPzb9hM/wBbFXESEsdPywAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Integrating Spokestack Google App Actions 2&quot; title=&quot;Integrating Spokestack Google App Actions 2&quot; src=&quot;/static/57612fae0d2ab023e240afb07024484e/05162/google-app-actions-hero.png&quot; srcSet=&quot;/static/57612fae0d2ab023e240afb07024484e/2eeed/google-app-actions-hero.png 294w,/static/57612fae0d2ab023e240afb07024484e/0d6a1/google-app-actions-hero.png 588w,/static/57612fae0d2ab023e240afb07024484e/05162/google-app-actions-hero.png 1175w,/static/57612fae0d2ab023e240afb07024484e/8537d/google-app-actions-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;This tutorial is part of a series:&lt;/em&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;/blog/integrating-spokestack-google-app-actions/part-1&quot;&gt;Part 1&lt;/a&gt;: Working with Google App Actions&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Part 2&lt;/strong&gt; (&lt;em&gt;You are here!&lt;/em&gt;): Adding your own voice experience with Spokestack&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;/blog/integrating-spokestack-google-app-actions/part-3&quot;&gt;Part 3&lt;/a&gt;: Using Spokestack Tray to add a voice UI&lt;/li&gt;&lt;/ul&gt;&lt;hr/&gt;&lt;p&gt;In &lt;a href=&quot;/blog/integrating-spokestack-google-app-actions/part-1&quot;&gt;the first part&lt;/a&gt; of our tutorial, we talked about how to make an &lt;a href=&quot;https://github.com/spokestack/app-actions-example&quot;&gt;Android app’s&lt;/a&gt; features available via Google App Actions. In this part, we’ll take it to the next level and show how to continue the user interaction via voice once Google Assistant has dropped the user off inside the app.&lt;/p&gt;&lt;h2 id=&quot;from-actions-to-intents&quot;&gt;&lt;a href=&quot;#from-actions-to-intents&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;From actions to intents&lt;/h2&gt;&lt;p&gt;First, we’ll need to recreate the ability to act on natural-language user requests, which is a job for an &lt;a href=&quot;/docs/concets/nlu&quot;&gt;NLU&lt;/a&gt;. 
Luckily, we’re already halfway to configuring one, thanks to the &lt;code&gt;actions.xml&lt;/code&gt; file required by Google. As a quick reminder, it’s in the &lt;code&gt;res/values/xml&lt;/code&gt; folder in the sample app.&lt;/p&gt;&lt;p&gt;If you’ve worked with voice platforms before (or read the page linked in the last paragraph), you might notice some familiar concepts in that file, especially if you’ve used any custom intents in your app. &lt;code&gt;intentName&lt;/code&gt; is … well, the name of the intent, &lt;code&gt;parameter&lt;/code&gt;s are slots, and &lt;code&gt;queryPatterns&lt;/code&gt; are utterances. If you’re using a built-in intent, like &lt;code&gt;GET_THING&lt;/code&gt; in the sample app, &lt;code&gt;queryPatterns&lt;/code&gt; is hidden from you, handled entirely by Google, but the other things are still there.&lt;/p&gt;&lt;p&gt;We’re going to exploit that similarity and convert our XML directly into Spokestack’s &lt;a href=&quot;/docs/machine-learning/nlu-training-data&quot;&gt;NLU format&lt;/a&gt;, using it to create a custom NLU model that will replicate the features we’ve just defined for Google Assistant in our app itself. This is a great opportunity to add new intents to your in-app NLU that would be too tricky or infeasible to expose via Google Assistant.&lt;/p&gt;&lt;p&gt;We won’t go over the converted versions of all the intents here, but here’s the Spokestack version of the &lt;code&gt;navigate.settings&lt;/code&gt; intent from &lt;code&gt;actions.xml&lt;/code&gt;:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;toml&quot;&gt;&lt;pre class=&quot;language-toml&quot;&gt;&lt;code class=&quot;language-toml&quot;&gt;&lt;span class=&quot;token key property&quot;&gt;description&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;the user wishes to view the settings screen&amp;quot;&lt;/span&gt;

&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token table class-name&quot;&gt;generators.verb&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token key property&quot;&gt;type&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;list&amp;quot;&lt;/span&gt;
&lt;span class=&quot;token key property&quot;&gt;values&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;see&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;show&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;give me&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;open&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;change&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;update&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;go to&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;take me to&amp;quot;&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token table class-name&quot;&gt;utterances&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token key property&quot;&gt;values&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;settings&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;settings screen&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;settings screen please&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;{verb} settings&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;{verb} my settings&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;{verb} the settings&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;{verb} the settings screen&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;i want to see my settings&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;token string&quot;&gt;&amp;quot;i want to change my settings&amp;quot;&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In Spokestack’s format, you can achieve the same effect as Google’s conditionals (the parenthetical words and question marks in the XML) using &lt;a href=&quot;/docs/machine-learning/nlu-training-data#generators&quot;&gt;generators&lt;/a&gt;. We’ve added a few more utterances to the Spokestack config than were present in &lt;code&gt;actions.xml&lt;/code&gt; because they’re a little easier to express here.&lt;/p&gt;&lt;p&gt;The &lt;code&gt;description&lt;/code&gt; field is optional but can help you think about the interplay among your various intents.&lt;/p&gt;&lt;p&gt;For the sake of demonstration, we’ve included all the Spokestack NLU files in &lt;code&gt;src/main/assets/spokestack-nlu&lt;/code&gt;. We’ve renamed the &lt;code&gt;GET_THING&lt;/code&gt; built-in intent to &lt;code&gt;command.search&lt;/code&gt; for in-app usage and supplied our own utterances for it. You won’t need to train a model from these files to run the tutorial code, though, because the trained model is in the assets folder as well. 
We’ll talk about how to use it … right now.&lt;/p&gt;&lt;h2 id=&quot;app-logic&quot;&gt;&lt;a href=&quot;#app-logic&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;App logic&lt;/h2&gt;&lt;h3 id=&quot;integrating-spokestack&quot;&gt;&lt;a href=&quot;#integrating-spokestack&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Integrating Spokestack&lt;/h3&gt;&lt;p&gt;Now that we have both App Action and Spokestack NLU configuration in place, let’s look at how to handle voice input with Spokestack once the user’s already in your app.&lt;/p&gt;&lt;p&gt;You probably noticed in the first part of the tutorial that our sample app is a bit of a hodgepodge with no particular purpose other than to cover a few common use cases for voice control. Humor us here. 
The app has four layouts: one that serves as a main menu and three others that demonstrate different potential features. Each layout has its own activity.&lt;/p&gt;&lt;p&gt;To process voice input, all you need is an instance of &lt;code&gt;Spokestack&lt;/code&gt; and a subclass of &lt;code&gt;SpokestackAdapter&lt;/code&gt; to receive events. Since we have multiple activities and want them all to be voice-enabled, we’ve made a &lt;code&gt;Spokestack&lt;/code&gt; instance available via a singleton (see the &lt;code&gt;Voice&lt;/code&gt; object) and created an abstract &lt;code&gt;VoiceActivity&lt;/code&gt; to be extended by all activities that need access to it. Each &lt;code&gt;VoiceActivity&lt;/code&gt; is responsible for creating its own &lt;code&gt;SpokestackAdapter&lt;/code&gt; because each has its own UI that needs to be updated by voice commands. This could likely be DRYed up too, but this is a demo, after all, not production code.&lt;/p&gt;&lt;p&gt;Notice that with this design, voice-enabling any given activity takes very little extra business logic. You don’t have to fiddle with the microphone, explicitly start speech recognition, etc. That’s all handled by Spokestack, which is managed by the parent &lt;code&gt;VoiceActivity&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;There are, however, two details that are particularly important:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;The &lt;code&gt;Spokestack&lt;/code&gt; setup in &lt;code&gt;Voice&lt;/code&gt;, specifically how it deals with wake word and NLU data files&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;For this tutorial, we’re taking the simple approach of including all our data files in the &lt;code&gt;assets&lt;/code&gt; folder (thus distributing them with the app itself) and simply decompressing them to the cache folder on startup. To decrease app size, you can choose to omit these files from the distribution and download them if absent, possibly forcing the app to redownload them when the version changes. 
It’s important to think up front about how you’ll distribute updates to your NLU model, since it essentially determines which features are available via voice.&lt;/p&gt;&lt;ol start=&quot;2&quot;&gt;&lt;li&gt;The permission request in &lt;code&gt;VoiceActivity&lt;/code&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Spokestack needs the &lt;code&gt;RECORD_AUDIO&lt;/code&gt; permission to use the device’s microphone. It’s automatically included in your manifest when you declare the Spokestack dependency, but starting with Android API level 23, you’ll also need to &lt;a href=&quot;https://developer.android.com/guide/topics/media/mediarecorder#audio-record-permission&quot;&gt;request it at runtime&lt;/a&gt;. Since the user can revoke the permission at any time in their settings, we check for it and re-request every time a &lt;code&gt;VoiceActivity&lt;/code&gt; is created.&lt;/p&gt;&lt;p&gt;The way things are set up here, the microphone permission will also be requested on app startup. In a real app, you’ll want to reorganize this to explain the permission before requesting it in order to provide a better user experience.&lt;/p&gt;&lt;h3 id=&quot;recreating-the-google-assistant&quot;&gt;&lt;a href=&quot;#recreating-the-google-assistant&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Recreating the Google Assistant&lt;/h3&gt;&lt;p&gt;With our voice integration set up, let’s take a look at how it’s used. 
Open up &lt;code&gt;DeviceControlActivity&lt;/code&gt; and scroll to the bottom.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;kotlin&quot;&gt;&lt;pre class=&quot;language-kotlin&quot;&gt;&lt;code class=&quot;language-kotlin&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;inner&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;class&lt;/span&gt; Listener &lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;SpokestackAdapter&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;token keyword&quot;&gt;override&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;nluResult&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;result&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; NLUResult&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;intent &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;command.control_device&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;token keyword&quot;&gt;val&lt;/span&gt; dataUri &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; Uri&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;Builder&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;appendQueryParameter&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;device&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;slots&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;device&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;appendQueryParameter&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;command&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;slots&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;command&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;token function&quot;&gt;setUiFromIntent&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;dataUri&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;There are several other methods in &lt;code&gt;SpokestackAdapter&lt;/code&gt; that we could take advantage of to make our UI more responsive and log errors, but all we’re interested in for the purpose of this demonstration is receiving results from the app’s NLU. What we’re doing here is reusing our deep link processing that’s in place for Google Assistant: when Spokestack’s NLU gives the app an intent that matches the current scene, we use that intent’s slots to construct a URI containing the query parameters we’ve set up for our App Action. The scheme and host don’t matter because we’re already in the activity we want to be in. If we wanted to transition somewhere else, we’d have to make a full URI and use the &lt;code&gt;startActivity&lt;/code&gt; method; this is what &lt;code&gt;MainActivity&lt;/code&gt; does to route button presses to different activities.&lt;/p&gt;&lt;p&gt;At the risk of sounding like a broken record, a proper voice experience does require a bit more code than this. You’ll want to handle intents that &lt;em&gt;aren’t&lt;/em&gt; meant for the current activity, respond intelligently, and so on. To do all this in a maintainable, understandable way, you want what’s called a &lt;em&gt;dialogue manager&lt;/em&gt; component. Watch this blog for more on that in the future!&lt;/p&gt;&lt;p&gt;A couple more notes about &lt;code&gt;DeviceControlActivity&lt;/code&gt;:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Take a look at &lt;code&gt;onResume&lt;/code&gt;. It routes the URI in &lt;code&gt;intent?.data&lt;/code&gt; to &lt;code&gt;setUiFromIntent&lt;/code&gt;, just as our NLU event handler above does. The presence of a URI in &lt;code&gt;data&lt;/code&gt; is how you’ll know if you were reached via voice command, unless you’ve explicitly deep-linked to this activity somewhere else. 
If that’s the case, you’ll want to include an extra query parameter somewhere to help the app tell the links apart.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;We’ve overridden &lt;code&gt;onResume&lt;/code&gt; from &lt;code&gt;VoiceActivity&lt;/code&gt; here to avoid an awkward scenario where the TTS response starts playing before the system has completed the transition to the new activity, which causes playback to pause when the transition does finish.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;code&gt;populateVoiceMaps&lt;/code&gt; does a simple synonym mapping of potential slot values (what the user might actually &lt;em&gt;say&lt;/em&gt;) to canonical device names — in this case, we’re actually mapping straight to the UI components that represent those devices. This is because Google only lets us specify parameters for custom intents as plain text, rather than allowing the full expressive power of &lt;a href=&quot;https://developers.google.com/assistant/app/action-schema#entity&quot;&gt;entities&lt;/a&gt; that’s available to built-in intents. Hence, we can’t do that normalization in &lt;code&gt;actions.xml&lt;/code&gt;. 
In Spokestack’s format, we can fix this using a &lt;a href=&quot;/docs/machine-learning/nlu-training-data/selset&quot;&gt;selset slot&lt;/a&gt;, but since user queries could come from either Google or Spokestack, we’ve left the parsing logic in the app so both query types can be handled the same way.&lt;/li&gt;&lt;/ul&gt;&lt;h3 id=&quot;time-to-talk-back&quot;&gt;&lt;a href=&quot;#time-to-talk-back&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Time to talk back&lt;/h3&gt;&lt;p&gt;Once you’ve mastered the basics of voice navigation we’ve talked about here, you’ll naturally want to start thinking about how your app should respond to users. We’ve given quick examples of this in both &lt;code&gt;DeviceControlActivity&lt;/code&gt; and &lt;code&gt;SearchActivity&lt;/code&gt;, but let’s talk briefly about the latter.&lt;/p&gt;&lt;p&gt;In &lt;code&gt;SearchActivity&lt;/code&gt;’s &lt;code&gt;setUiFromIntent&lt;/code&gt;, we extract the “item” slot (the presumed search term) from the data URI and use it in a TTS response to the user. We end the response by asking the user if they want to search again. 
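&lt;/p&gt;&lt;p&gt;The slot lookup and response text might look roughly like the following. This is a sketch rather than the sample app’s code: it uses &lt;code&gt;java.net.URI&lt;/code&gt; so it can run off-device (on Android, &lt;code&gt;Uri.getQueryParameter&lt;/code&gt; performs the lookup for you), and the function names and response wording are made up for illustration:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;kotlin&quot;&gt;&lt;pre class=&quot;language-kotlin&quot;&gt;&lt;code class=&quot;language-kotlin&quot;&gt;import java.net.URI
import java.net.URLDecoder

// Hypothetical stand-in for android.net.Uri.getQueryParameter:
// returns the decoded value of the first query parameter called `name`.
fun queryParameter(uri: URI, name: String): String? =
    uri.rawQuery
        ?.split(&amp;quot;&amp;amp;&amp;quot;)
        ?.map { it.split(&amp;quot;=&amp;quot;, limit = 2) }
        ?.firstOrNull { it.size == 2 &amp;amp;&amp;amp; it[0] == name }
        ?.let { URLDecoder.decode(it[1], &amp;quot;UTF-8&amp;quot;) }

// Builds the text to hand to TTS. The wording is illustrative only;
// it ends with a question so the user knows they can keep talking.
fun searchResponse(data: URI?): String {
    val item = data?.let { queryParameter(it, &amp;quot;item&amp;quot;) }
    return if (item.isNullOrBlank()) {
        &amp;quot;What would you like to search for?&amp;quot;
    } else {
        &amp;quot;Here are your results for $item. Want to search for something else?&amp;quot;
    }
}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Falling back to a prompt when the slot is missing keeps the conversation going even if the activity was opened without a search term.&lt;/p&gt;&lt;p&gt;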
To make this a seamless experience for the user, we’ve added the following event handler to our &lt;code&gt;Listener&lt;/code&gt; inner class:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;kotlin&quot;&gt;&lt;pre class=&quot;language-kotlin&quot;&gt;&lt;code class=&quot;language-kotlin&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;override&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;fun&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;ttsEvent&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;event&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; TTSEvent&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;token keyword&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;event&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;type&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    TTSEvent&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;Type&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;PLAYBACK_COMPLETE &lt;span class=&quot;token operator&quot;&gt;-&amp;gt;&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;activate&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This snippet automatically reactivates ASR when Spokestack finishes playing the audio for a TTS prompt, so the user can request another search if they want. If they say nothing, the ASR will deactivate after a timeout.&lt;/p&gt;&lt;p&gt;That brings us to one last point: in its current state, voice integration in the sample app can only be accessed via wake word (“Spokestack”, in this case). It’s easy enough to add a microphone button of your choosing; its click handler should call &lt;code&gt;spokestack.activate()&lt;/code&gt;, just as our TTS listener above does. If you want to make your button work like a walkie-talkie, you can call &lt;code&gt;spokestack.deactivate()&lt;/code&gt; when the user releases it; otherwise, calling &lt;code&gt;deactivate&lt;/code&gt; is unnecessary.&lt;/p&gt;&lt;h2 id=&quot;conclusionor-is-it&quot;&gt;&lt;a href=&quot;#conclusionor-is-it&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion…or is it?&lt;/h2&gt;&lt;p&gt;Congratulations; you have an app that not only makes its features accessible via Google Assistant, but also continues that voice interaction via its very own voice layer! 
We’ve only scratched the surface of making a fully immersive voice experience here, but check out our other &lt;a href=&quot;/tutorials&quot;&gt;tutorials&lt;/a&gt; and &lt;a href=&quot;/docs&quot;&gt;documentation&lt;/a&gt; to learn more.&lt;/p&gt;&lt;p&gt;On that note, if you were frustrated by our final caveat—that we don’t have any UI feedback for our voice interactions—then &lt;a href=&quot;/blog/integrating-spokestack-google-app-actions/part-3&quot;&gt;part 3&lt;/a&gt; of the series is for you. We’ll take what we’ve developed here and drop in &lt;a href=&quot;https://github.com/spokestack/spokestack-tray-android&quot;&gt;Spokestack Tray&lt;/a&gt; so that our users can see what they’re saying and interact with Spokestack more naturally.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Integrating Spokestack with Google App Actions, Part 1]]></title><description><![CDATA[Take your app's voice integration to the next level by having Google Assistant hand off the conversation to an in-app voice assistant.]]></description><link>https://www.spokestack.io/blog/integrating-spokestack-google-app-actions/part-1</link><guid isPermaLink="false">https://www.spokestack.io/blog/integrating-spokestack-google-app-actions/part-1</guid><pubDate>Mon, 23 Nov 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/57612fae0d2ab023e240afb07024484e/8537d/google-app-actions-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAAB50lEQVQoz3VSz2vUQBjNnyCIB3+gB7NrNskkM5OZTLJdsam03diDh2Z3a5FuQY3gwUtvHnpQ8NdNrCBikWqxqH+Df9uT+bYJqejh8c2E73vfe2/iMMZgEYYhVRZFZ6DTFHmewxgDrTWa/i7aWcbg2EvzIQgCDFwXfq8Hv9+nKsIQWZYRWZIk1GP7bW3OXVKn3RIESLTGaDpFUT/Crd1drDyusVRtgkUMUnDEcQwpBTjnRC6EQBRFZxU2B9/zkBYFJodfcO/HT1RHx7h7+BXlwScU4w3s7DzAZLaN2dY25vM5ZrMZ6rpGWZbwPI+IW8v2YrfHnCNdXkY0vIlqaw3H7yf49XGKzy/WcfJmFd9ereH509tgLIbv+63lLhybTZqmFHpmDIxS4LHEZmlw8GwV31+PcfTyDk7elviwP8bew5U2T85jEtOAFFoypVQLfUqutKHsonCA3HAopRHHHEIqaL14+eFwSA/WnXfsJiklBWxhAxcygZIBkuIJkvu/oUYVWDjAhcs9uK5Lg3bOKrWElrwR5jQkDRZZSvDQhdh4B7MPZOt76LnXcO78RVy9conUdvutIEtOhETwTzBwkYGbCkmi6Xe50b9OWbWLT0GuhCDVjpX7X2QGuRHI84WtpdGIamPzb9hM/wBbFXESEsdPywAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Integrating Spokestack Google App Actions&quot; title=&quot;Integrating Spokestack Google App Actions&quot; src=&quot;/static/57612fae0d2ab023e240afb07024484e/05162/google-app-actions-hero.png&quot; srcSet=&quot;/static/57612fae0d2ab023e240afb07024484e/2eeed/google-app-actions-hero.png 294w,/static/57612fae0d2ab023e240afb07024484e/0d6a1/google-app-actions-hero.png 588w,/static/57612fae0d2ab023e240afb07024484e/05162/google-app-actions-hero.png 1175w,/static/57612fae0d2ab023e240afb07024484e/8537d/google-app-actions-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;This tutorial is part of a series:&lt;/em&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Part 1&lt;/strong&gt; (&lt;em&gt;You are here!&lt;/em&gt;): Working with Google App Actions&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;/blog/integrating-spokestack-google-app-actions/part-2&quot;&gt;Part 2&lt;/a&gt;: Adding your own voice experience with Spokestack&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;/blog/integrating-spokestack-google-app-actions/part-3&quot;&gt;Part 3&lt;/a&gt;: Using Spokestack Tray to add a voice UI&lt;/li&gt;&lt;/ul&gt;&lt;hr/&gt;&lt;p&gt;The terms “smart speaker” and “voice assistant” are now full members of our cultural vocabulary. Talking to a voice assistant has become so common that voice interaction has grown from an exciting new feature into an expectation. Consumers are habituated to the tech and &lt;a href=&quot;https://voicebot.ai/2020/11/09/national-consumer-survey-reveals-that-a-lot-of-consumers-want-voice-assistants-in-mobile-apps/&quot;&gt;would like to use it in mobile apps&lt;/a&gt; — if only the apps supported it.&lt;/p&gt;&lt;p&gt;Making that last part a reality is what Spokestack’s all about. In this tutorial, we’re going to solve one of the problems mentioned in the article linked above:&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;Those approaches enable a user to employ Siri or Google Assistant to deep link into an app to a specific screen or to execute an action. However, once the user gets to that point, the Siri or Google Assistant session has ended.&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;The “approach” the author’s talking about, in Android’s case, is the use of an &lt;a href=&quot;https://developers.google.com/assistant/app/overview&quot;&gt;App Action&lt;/a&gt; to open an app or a specific screen/&lt;code&gt;Activity&lt;/code&gt; via Google Assistant. 
This is a great feature that Google’s provided, and having the interaction between the user and Google Assistant end after your app has been opened is exactly what you want! That’s &lt;em&gt;your&lt;/em&gt; user experience to manage from there on out, and any data provided by the user (your customer) should stay between the two of you.&lt;/p&gt;&lt;p&gt;It does, however, leave you with the burden of maintaining the UX expectations set up by that initial interaction, which in this case means continuing the conversation via voice if the user wants to. Let’s talk about how to do that.&lt;/p&gt;&lt;h2 id=&quot;show-me-the-code&quot;&gt;&lt;a href=&quot;#show-me-the-code&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Show me the code!&lt;/h2&gt;&lt;p&gt;In this tutorial, we’ll be talking through &lt;a href=&quot;https://github.com/spokestack/app-actions-example&quot;&gt;a sample app&lt;/a&gt; we’ve set up to demonstrate an integration with App Actions. 
It doesn’t have any groundbreaking features or much of a UI to speak of, but it should give you a basic idea of how to handle multiple &lt;code&gt;Intent&lt;/code&gt;s/&lt;code&gt;Activity&lt;/code&gt;s via voice, using Google Assistant for the initial interaction and following up with your own voice responses from then on.&lt;/p&gt;&lt;h2 id=&quot;setup&quot;&gt;&lt;a href=&quot;#setup&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Setup&lt;/h2&gt;&lt;p&gt;Google has some &lt;a href=&quot;https://developers.google.com/assistant/app/get-started#requirements&quot;&gt;prerequisites&lt;/a&gt; for App Actions development. You’ll need to meet those in order to try out the Google Assistant interactions. 
Spokestack’s features are available either way; the only prerequisite for them is a physical device (not the emulator) to test voice input.&lt;/p&gt;&lt;h3 id=&quot;dependencies&quot;&gt;&lt;a href=&quot;#dependencies&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Dependencies&lt;/h3&gt;&lt;p&gt;OK, with that out of the way, let’s get our dependencies sorted. Check the &lt;code&gt;app/build.gradle&lt;/code&gt; file in the sample project — these are the important bits:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;groovy&quot;&gt;&lt;pre class=&quot;language-groovy&quot;&gt;&lt;code class=&quot;language-groovy&quot;&gt;implementation &lt;span class=&quot;token string&quot;&gt;&amp;#x27;io.spokestack:spokestack-android:11.5.2&amp;#x27;&lt;/span&gt;
implementation &lt;span class=&quot;token string&quot;&gt;&amp;#x27;org.tensorflow:tensorflow-lite:2.6.0&amp;#x27;&lt;/span&gt;
implementation &lt;span class=&quot;token string&quot;&gt;&amp;#x27;androidx.media:media:1.3.1&amp;#x27;&lt;/span&gt;
implementation &lt;span class=&quot;token string&quot;&gt;&amp;#x27;com.google.android.exoplayer:exoplayer-core:2.14.0&amp;#x27;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;There are several non-Spokestack dependencies here because Spokestack takes a “use only what you need” approach and doesn’t include many transitive dependencies by default. The sample app will use everything, though, so we’ll need to pull in some libraries to support wake word, NLU, and TTS that we wouldn’t need if we didn’t want that functionality.&lt;/p&gt;&lt;h3 id=&quot;declaring-app-actions&quot;&gt;&lt;a href=&quot;#declaring-app-actions&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Declaring App Actions&lt;/h3&gt;&lt;p&gt;The main change to your app when integrating App Actions is a new &lt;code&gt;res/xml/actions.xml&lt;/code&gt; file that describes the pieces of your app you want to make available to Google Assistant. 
Google’s documentation for the format is &lt;a href=&quot;https://developers.google.com/assistant/app/action-schema&quot;&gt;here&lt;/a&gt;, but in summary:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;code&gt;actions&lt;/code&gt; is the top-level element, and it contains one or more nested &lt;code&gt;action&lt;/code&gt; elements.&lt;/li&gt;&lt;li&gt;Each &lt;code&gt;action&lt;/code&gt; describes a single intent (or command) you want to expose to Google Assistant.&lt;/li&gt;&lt;li&gt;You choose between a built-in and custom intent via the &lt;code&gt;intentName&lt;/code&gt; attribute:&lt;ul&gt;&lt;li&gt;A name that matches one of Google’s &lt;a href=&quot;https://developers.google.com/assistant/app/reference/built-in-intents/bii-index&quot;&gt;built-ins&lt;/a&gt; will automatically inherit the parameters described by that built-in.&lt;/li&gt;&lt;li&gt;If the name does not match a built-in, you’ll need to specify both a &lt;code&gt;queryPatterns&lt;/code&gt; attribute that points to natural-language examples of how a user would invoke this intent and nested &lt;code&gt;parameter&lt;/code&gt; elements that describe any parameters that you want to capture in those utterances.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;Each action must have at least one nested &lt;code&gt;fulfillment&lt;/code&gt; element that describes what part of the app should handle the intent. There are &lt;a href=&quot;https://developers.google.com/assistant/app/action-schema#fulfillment&quot;&gt;several ways&lt;/a&gt; to handle this. 
Our sample app uses a custom URI scheme (&lt;code&gt;example://feature?{param1,param2}&lt;/code&gt;), but an app tied to an established web presence should probably use http/https deep links instead.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Here’s an example of one of our actions, this one using a custom intent:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;xml&quot;&gt;&lt;pre class=&quot;language-xml&quot;&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;action&lt;/span&gt;
  &lt;span class=&quot;token attr-name&quot;&gt;intentName&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;navigate.settings&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token attr-name&quot;&gt;queryPatterns&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;@array/NavigateSettingsQueries&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;fulfillment&lt;/span&gt; &lt;span class=&quot;token attr-name&quot;&gt;urlTemplate&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;example://settings&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;/&amp;gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;action&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This is the simplest action we have, because it doesn’t include any extra parameters. We’re only using this one to take the user to a certain screen in the app. The &lt;code&gt;queryPatterns&lt;/code&gt; attribute points to an item in &lt;code&gt;res/values/arrays.xml&lt;/code&gt;:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;xml&quot;&gt;&lt;pre class=&quot;language-xml&quot;&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;string-array&lt;/span&gt; &lt;span class=&quot;token attr-name&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;NavigateSettingsQueries&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;item&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;settings&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;item&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;item&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;settings (screen)? (please)?&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;item&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;item&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;go to settings&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;item&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;item&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;go to my settings&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;item&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;item&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;go to the settings&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;item&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token comment&quot;&gt;&amp;lt;!-- ... --&amp;gt;&lt;/span&gt;
&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;string-array&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Note the use of &lt;a href=&quot;https://developers.google.com/assistant/conversational/df-asdk/reference/action-package/QueryPatterns#ap-conditionals&quot;&gt;conditionals&lt;/a&gt;, a feature from another part of the Google Assistant ecosystem brought over to make these query configurations a little more concise.&lt;/p&gt;&lt;h4 id=&quot;fulfilling-the-app-actions-request&quot;&gt;&lt;a href=&quot;#fulfilling-the-app-actions-request&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Fulfilling the App Actions request&lt;/h4&gt;&lt;p&gt;As mentioned above, the &lt;code&gt;fulfillment&lt;/code&gt; element specifies what sort of URI a given intent should produce, but the &lt;em&gt;routing&lt;/em&gt; for those URIs happens in your manifest (&lt;code&gt;AndroidManifest.xml&lt;/code&gt;). An activity that should handle a deep link must declare an &lt;code&gt;intent-filter&lt;/code&gt; element that matches said link. 
Here’s the intent filter for the URI above, which leads to the settings activity:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;xml&quot;&gt;&lt;pre class=&quot;language-xml&quot;&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;intent-filter&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;action&lt;/span&gt; &lt;span class=&quot;token attr-name&quot;&gt;&lt;span class=&quot;token namespace&quot;&gt;android:&lt;/span&gt;name&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;android.intent.action.VIEW&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;/&amp;gt;&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;category&lt;/span&gt; &lt;span class=&quot;token attr-name&quot;&gt;&lt;span class=&quot;token namespace&quot;&gt;android:&lt;/span&gt;name&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;android.intent.category.DEFAULT&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;/&amp;gt;&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;data&lt;/span&gt;
    &lt;span class=&quot;token attr-name&quot;&gt;&lt;span class=&quot;token namespace&quot;&gt;android:&lt;/span&gt;host&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;settings&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt;
    &lt;span class=&quot;token attr-name&quot;&gt;&lt;span class=&quot;token namespace&quot;&gt;android:&lt;/span&gt;scheme&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;example&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;/&amp;gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;intent-filter&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The key here is the &lt;code&gt;data&lt;/code&gt; element, where we break down the custom URI we used in &lt;code&gt;actions.xml&lt;/code&gt;. This &lt;code&gt;data&lt;/code&gt; element matches a URI of “example://settings”.&lt;/p&gt;&lt;p&gt;Once the filter is declared in the manifest, links handled by an activity show up as the &lt;code&gt;data&lt;/code&gt; field of the activity’s &lt;code&gt;Intent&lt;/code&gt;. See any of the sample app’s activities for an example.&lt;/p&gt;&lt;p&gt;This is, of course, only one way of setting up deep links. For a large app (or maybe just for the sake of convenience), you might prefer an alternative solution like Airbnb’s &lt;a href=&quot;https://github.com/airbnb/DeepLinkDispatch&quot;&gt;&lt;code&gt;DeepLinkDispatch&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;&lt;h2 id=&quot;try-it-out&quot;&gt;&lt;a href=&quot;#try-it-out&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Try it out!&lt;/h2&gt;&lt;p&gt;This is a good time to take a step back and run the demo to see what we’ve been talking about. 
You can run the app itself on either an emulator or a device, but you’ll need a physical device with Google Play services and a Google developer account in order to test the actual App Actions. If you have those, head over to &lt;a href=&quot;https://codelabs.developers.google.com/codelabs/appactions/#2&quot;&gt;this Codelab&lt;/a&gt; for instructions on how to modify the app’s ID, upload a test build to the Play Console, and get the App Actions Test Tool.&lt;/p&gt;&lt;p&gt;Once you’ve uploaded the build, make sure you’re signed in to Android Studio with the same developer account you used for the upload (click the avatar icon at the top right of the IDE window to sign in). Then, follow the instructions in the “Test App Actions” section &lt;a href=&quot;https://codelabs.developers.google.com/codelabs/appactions/#4&quot;&gt;later in the Codelab&lt;/a&gt; to create a preview build.&lt;/p&gt;&lt;p&gt;You can send test actions either directly from the test tool or by invoking Google Assistant on the device as you normally would. Try saying things like “OK Google, turn the office light off” or “OK Google, find me some shoes”. We’ve worked out the kinks for this demo, so everything should go smoothly; if it doesn’t, please open an issue in the sample app repository! 
If you run into any errors as you start to make changes to the sample app, check Google’s &lt;a href=&quot;https://developers.google.com/assistant/app/troubleshoot&quot;&gt;troubleshooting tips&lt;/a&gt;; the error messages themselves aren’t always intuitive.&lt;/p&gt;&lt;h2 id=&quot;but-waittheres-more&quot;&gt;&lt;a href=&quot;#but-waittheres-more&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;But wait…there’s more&lt;/h2&gt;&lt;p&gt;That does it for the first part of the interaction — getting from Google Assistant into a specific part of your app. That’s a step toward a better user experience already, but check out the &lt;a href=&quot;/blog/integrating-spokestack-google-app-actions/part-2&quot;&gt;next part&lt;/a&gt; of our tutorial to learn how to continue the voice interface the user just tried to open. 
See you then!&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Spokestack Python Library]]></title><description><![CDATA[Launching the Spokestack Python library that brings the Spokestack speech pipeline to Raspberry Pi, Tinker Board, and PCs.]]></description><link>https://www.spokestack.io/blog/introducing-the-spokestack-python-library</link><guid isPermaLink="false">https://www.spokestack.io/blog/introducing-the-spokestack-python-library</guid><pubDate>Tue, 06 Oct 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/cf8fc58f6b200f42395fe1744662c3d7/8537d/python-library-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAAB40lEQVQoz3WRzW7TQBRG/RZsiERREmzHf+OxYztxErxo0jZJAamFDWGJKrqi6aIgxBpWJSKCIhYtEq/ABoTgzQ6aSRyaIBZHY83Pd8+9NlzXxXEc1Kq/G40Ftv2X5V55Z5Pr743NTS8I8MMQ2e0Q9zrEd3uIOCKQIa7TwLYsrCWmadJYFirfrwLVQRTH7L95y+HlFQdfvjK8uGT06Yrd+WcG5x/Ih/fZ29tnNB4xHo+ZTCa0Wi1s2143VOkl3cmE3tMjHkyPeTk74cXshFfvp5ydP2d6POLoyS6dbo88b1MUBVEUrSx1oArxPA/f9xfYFtWtGg8HPr8vBnyf9/n1sc+PeZ9v813enRWYlo1lmpjmHW13XciI45hms0mSJJo0y2gmKZ08Y2e7zXinzaN7bcb9hO0ioegmRFGMlJJAzdv3tZBCG6owpV1SFoiiJoHvEbgWQeAjZIwQEikX51mW6fmlabr23lCVwjBECKFRVUUoicIGsniGfPyTZvcQp1HnRuUW9XpN2ylUcRWowksxowwp0XMMQgKnihjOSF5DOjjl9tZNPL9CvVohDKO1+0pIhevA1c/4B5dApATJgW5TypB6raZntSq8RHclhLY2lO5/SROyRJBli7byPNdr2eYmaqZ/ACNVcgMVJ6wJAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Spokestack Python Library&quot; title=&quot;Spokestack Python Library&quot; src=&quot;/static/cf8fc58f6b200f42395fe1744662c3d7/05162/python-library-hero.png&quot; srcSet=&quot;/static/cf8fc58f6b200f42395fe1744662c3d7/2eeed/python-library-hero.png 294w,/static/cf8fc58f6b200f42395fe1744662c3d7/0d6a1/python-library-hero.png 588w,/static/cf8fc58f6b200f42395fe1744662c3d7/05162/python-library-hero.png 1175w,/static/cf8fc58f6b200f42395fe1744662c3d7/8537d/python-library-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Spokestack’s tagline, “Give your app a voice interface,” has mainly focused on mobile apps since we launched in January. Today we’re expanding the type of apps we can voice-enable to almost any project with the launch of &lt;a href=&quot;https://github.com/spokestack/spokestack-python&quot;&gt;Spokestack Python&lt;/a&gt;. Want to try it out now? &lt;a href=&quot;https://twitter.com/_Will_Rice&quot;&gt;Will Rice&lt;/a&gt;, who built the library, has written a &lt;a href=&quot;/blog/porting-the-alexa-minecraft-skill-to-python-using-spokestack&quot;&gt;tutorial&lt;/a&gt; for the “Minecraft Helper” example we developed for our &lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-1&quot;&gt;mobile platforms&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Just as they do with our &lt;a href=&quot;https://github.com/spokestack/spokestack-ios&quot;&gt;iOS&lt;/a&gt;, &lt;a href=&quot;https://github.com/spokestack/spokestack-android&quot;&gt;Android&lt;/a&gt;, and &lt;a href=&quot;https://github.com/spokestack/react-native-spokestack&quot;&gt;React Native voice libraries&lt;/a&gt;, developers can use Spokestack as a complete speech pipeline solution to talk to their customers. All you need to do to get started is &lt;a href=&quot;/create&quot;&gt;create a Spokestack account&lt;/a&gt; and get your API credentials. The same API that powers our mobile libraries also powers Spokestack Python. We want developers to be able to write once, talk everywhere using Spokestack.&lt;/p&gt;&lt;figure&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1024px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/eb002c683271707e3b67e53243f315f8/658fc/speech-pipeline.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:75.17006802721089%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAPCAYAAADkmO9VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAB/ElEQVQ4y42SS2sTURiGR7fiystGQXTVRRAXogvxF3TTVQSlu9pWstOFbqqgmyql1oUg3bgqrpTUNJMqVgq6kCITzaU1XSQxl0m0NOZS0kkm5zwyZ1pISkjnwJnv5f0e3nO+mdE4sKSUqlarVYKhMCF9mcUlXen34QjBkE6tVu9hu5d20Oh0Oqoa0R8cO3EG3+VrXLxyXe2hS1c5fuossXiihx0YKIR76p+tCo+nnzPz4iXTM7M8m51T+snTOba2qz3swECkRGG1GHQyQBPsirsd7Xi1uMt4GVkKW1UrPkVd95EPD5PTR8gsjVDQh2noPqz4ox52cKAU7jh/P9M2ApifRil8uEFW91NaGcWOBlSvmz0kUD0pNcBsQLPdwWrtYrWaShfrUN5xmT4T9/koe1TSFETWJQureRZWi7xd+8ebLyWWN9xeN3vIDaU6vdqU/K5AyrRI5nf5VbKVdjyn5/k/FHuBX9MSfUNi5CUxU/KzKDAKruf0HMbTDYVw309kE+6+s5h4lWR8PsXE/CaB12nuBVtEUi4jBF5HhlSmTaIoWd8GowzRMko7Xirb9j6ybbvQ2J0c54cS+G+luT2ZZWwyi/9mWnnjgVwP6ynw/pSJpn1HO2pw+kKCk+cSaEcM5T14aHoP3B/DgXP5Ft/Wdvi4Ulfb0Y63H9Rv5P8Z/k05Ew/3BwAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Spokestack pipeline&quot; title=&quot;Spokestack pipeline&quot; src=&quot;/static/eb002c683271707e3b67e53243f315f8/658fc/speech-pipeline.png&quot; srcSet=&quot;/static/eb002c683271707e3b67e53243f315f8/2eeed/speech-pipeline.png 294w,/static/eb002c683271707e3b67e53243f315f8/0d6a1/speech-pipeline.png 588w,/static/eb002c683271707e3b67e53243f315f8/658fc/speech-pipeline.png 1024w&quot; sizes=&quot;(max-width: 1024px) 100vw, 1024px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;figcaption&gt;You have everything you need to add a voice interface to any project.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;As with all of our libraries, developers can create a &lt;a href=&quot;/blog/wakewords-for-mobile-apps&quot;&gt;custom wake word&lt;/a&gt; and a text-to-speech voice that is unique to their brand and application. With the Spokestack Python library, the voice interface is less tied to a platform, opening up a number of application opportunities.&lt;/p&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;Build your own voice interface to talk directly to your customers&quot; src=&quot;https://www.youtube-nocookie.com/embed/AvhQ6-9nCrQ?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;p&gt;Got an Alexa or Google Assistant app you want to set up as a kiosk? Now you can build one on a low-energy, small-form device such as a Raspberry Pi or &lt;a href=&quot;https://www.asus.com/us/Single-Board-Computer/Tinker-Board/&quot;&gt;Tinker Board&lt;/a&gt;. Maybe you want to set up something on an old PC with a mic and a speaker. Simply &lt;a href=&quot;/docs/integrations/export&quot;&gt;export your NLU model&lt;/a&gt;, and you can do it!&lt;/p&gt;&lt;p&gt;Or maybe you want to build a voice bot companion for a game or social platform. 
Whatever your project, there are a ton of ways to add an independent voice interface, and you have complete creative freedom using Spokestack Python.&lt;/p&gt;&lt;p&gt;Over the next few weeks, Will is going to create video tutorials and demonstrations to show what you can do with Spokestack. We hope this will spark everyone’s imagination and stir your thoughts about what is possible with a flexible speech pipeline that’s customizable for any use case you can imagine. Be sure to follow us on &lt;a href=&quot;http://www.twitter.com/spokestack&quot;&gt;Twitter&lt;/a&gt; or &lt;a href=&quot;http://spokestack.substack.com&quot;&gt;sign up for our newsletter&lt;/a&gt; for stream announcements.&lt;/p&gt;&lt;p&gt;If you have any questions or feedback, please send us an &lt;a href=&quot;mailto:hello@spokestack.io&quot;&gt;email&lt;/a&gt;. We look forward to seeing what you build!&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Porting the Alexa Minecraft Skill to Python Using Spokestack]]></title><description><![CDATA[Spokestack makes it easy to convert a smart speaker voice app from major platforms to an app that you control.]]></description><link>https://www.spokestack.io/blog/porting-the-alexa-minecraft-skill-to-python-using-spokestack</link><guid isPermaLink="false">https://www.spokestack.io/blog/porting-the-alexa-minecraft-skill-to-python-using-spokestack</guid><pubDate>Mon, 05 Oct 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/5d6f33bc4e7c64b84c67f2a917944350/8537d/porting-the-alexa-minecraft-skill-to-python-using-spokestack.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACI0lEQVQoz3WTTWsTURSG51e4qAshQpv5/rjzPUkdsDZtWk3TkjZNW2xTVPCDCoILRRRFqN24UBFpa0mMCzf+Ilf+kUfulEmD6OLhcs+c89733HtGMU0TiWEYxWpaVoHjuqRZhgh8gjAkq9XIsowkSXBd9yJ/stY0USYDuq6jTU+jViqY1SqR5+EaBo6uEzoOSRwTRRG2bRe5krK2XJXJUzzPI2u1qPd65P0+s7u75Pv71LobpGurWMLDsW2E8PCFIAj8/ziU6DqWZbF4+I7VbyPap2esfDmmfTpg5XTIrZMh+doGW70dNjZ3WL65yu3dPnmeo2nahaBUl0JyU6/X6e7t0d7epr3VY35tnbv3Wow+rvPzuMvoqMnX13OcvbnBj6M5nj1o4IkYQ9fGTpUgCIp78X2fRqPB1uYm3U6H1vISiwtL7HVyPj2f4+TVPIO3CwwPm3x+ucDosMnj/nUMy8W2zLEpJQzDQkziOE7xwXEdDNNC001sRxboGLpKFPokkY/rBbgiIgjPH0maKjUUIUTxGHIUJHJvGhaPDu5w9P4FcRjjrZwQdgbFg1y6fIXK1QqBL5Bm4jguRqk0pkgR6axE7qtVjadPHvJ98IE0yfHv/yI5+I2mulTVKdSZKQzTLvLlCElDpUtFBv5G3oXretRq17AsHREv4YVNhOcwI+dUVcdiktKI7E6Rdv+FbCUMA5I0JYnFOUlS/C1pmo5bnUTG/wCYfHvp2fey4wAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Porting the Alexa Minecraft Skill to Python using Spokestack&quot; title=&quot;Porting the Alexa Minecraft Skill to Python using Spokestack&quot; src=&quot;/static/5d6f33bc4e7c64b84c67f2a917944350/05162/porting-the-alexa-minecraft-skill-to-python-using-spokestack.png&quot; srcSet=&quot;/static/5d6f33bc4e7c64b84c67f2a917944350/2eeed/porting-the-alexa-minecraft-skill-to-python-using-spokestack.png 294w,/static/5d6f33bc4e7c64b84c67f2a917944350/0d6a1/porting-the-alexa-minecraft-skill-to-python-using-spokestack.png 588w,/static/5d6f33bc4e7c64b84c67f2a917944350/05162/porting-the-alexa-minecraft-skill-to-python-using-spokestack.png 1175w,/static/5d6f33bc4e7c64b84c67f2a917944350/8537d/porting-the-alexa-minecraft-skill-to-python-using-spokestack.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;This is a tutorial on how to port a simple &lt;a href=&quot;https://github.com/alexa/skill-sample-python-howto&quot;&gt;Minecraft recipe skill&lt;/a&gt; to Spokestack using the &lt;a href=&quot;https://github.com/spokestack/spokestack-python&quot;&gt;spokestack-python&lt;/a&gt; library. It is similar to &lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-1&quot;&gt;our mobile tutorial series&lt;/a&gt;, but the Python version does not have any GUI components. This makes the experience closer to that of a smart speaker. We will discuss the concepts for each part of the user interaction briefly, but for a full description check out our &lt;a href=&quot;/docs/concepts&quot;&gt;documentation&lt;/a&gt;. Before we get into the programming, we will need to get API keys from our Spokestack account.&lt;/p&gt;&lt;h2 id=&quot;signing-up-for-a-spokestack-account&quot;&gt;&lt;a href=&quot;#signing-up-for-a-spokestack-account&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Signing Up for a Spokestack Account&lt;/h2&gt;&lt;ol&gt;&lt;li&gt;&lt;a href=&quot;/account/create&quot;&gt;Create&lt;/a&gt; a Spokestack account.&lt;/li&gt;&lt;li&gt;Click “Add token” in the &lt;a href=&quot;/account/settings#api&quot;&gt;API Credentials dashboard&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;Copy the secret key when it is displayed; you’ll need it later.&lt;/li&gt;&lt;/ol&gt;&lt;h2 
id=&quot;setting-up-the-project&quot;&gt;&lt;a href=&quot;#setting-up-the-project&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Setting up the Project&lt;/h2&gt;&lt;p&gt;First, let’s clone the project repository and move into its directory.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token function&quot;&gt;git&lt;/span&gt; clone https://github.com/spokestack/minecraft-skill-python
&lt;span class=&quot;token builtin class-name&quot;&gt;cd&lt;/span&gt; minecraft-skill-python&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then there are some system dependencies.&lt;/p&gt;&lt;h3 id=&quot;macos&quot;&gt;&lt;a href=&quot;#macos&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;macOS&lt;/h3&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;brew &lt;span class=&quot;token function&quot;&gt;install&lt;/span&gt; lame portaudio&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;debianubuntu&quot;&gt;&lt;a href=&quot;#debianubuntu&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Debian/Ubuntu&lt;/h3&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre 
class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token function&quot;&gt;sudo&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;apt-get&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;install&lt;/span&gt; portaudio19-dev libmp3lame-dev&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now let’s set up the Python virtual environment. We use &lt;a href=&quot;https://github.com/pyenv/pyenv&quot;&gt;&lt;code&gt;pyenv&lt;/code&gt;&lt;/a&gt; and &lt;a href=&quot;https://github.com/pyenv/pyenv-virtualenv&quot;&gt;&lt;code&gt;pyenv-virtualenv&lt;/code&gt;&lt;/a&gt; to manage virtual environments, but any virtual environment tool will work.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;pyenv &lt;span class=&quot;token function&quot;&gt;install&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;3.7&lt;/span&gt;.6
pyenv virtualenv &lt;span class=&quot;token number&quot;&gt;3.7&lt;/span&gt;.6 minecraft
pyenv &lt;span class=&quot;token builtin class-name&quot;&gt;local&lt;/span&gt; minecraft&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then install the Python dependencies.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;pip &lt;span class=&quot;token function&quot;&gt;install&lt;/span&gt; -r requirements.txt&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;tflite-runtime&quot;&gt;&lt;a href=&quot;#tflite-runtime&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;TFLite Runtime&lt;/h3&gt;&lt;p&gt;In addition to the Python dependencies, you will need to install the TFLite Interpreter. You can install it for your platform by following the instructions at &lt;a href=&quot;https://www.tensorflow.org/lite/guide/python#install_just_the_tensorflow_lite_interpreter&quot;&gt;TFLite Interpreter&lt;/a&gt;. 
Note: this is not the full &lt;a href=&quot;https://www.tensorflow.org/&quot;&gt;TensorFlow&lt;/a&gt; package.&lt;/p&gt;&lt;p&gt;If you would like to try out the final version of the Minecraft app, you can now run it with:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;python app.py&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;To follow along with the tutorial, we recommend making a new Python file named &lt;code&gt;myapp.py&lt;/code&gt; or similar so you can compare it to the original &lt;code&gt;app.py&lt;/code&gt;.&lt;/p&gt;&lt;h2 id=&quot;using-the-speech-pipeline&quot;&gt;&lt;a href=&quot;#using-the-speech-pipeline&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Using the Speech Pipeline&lt;/h2&gt;&lt;p&gt;An essential piece of any voice interface is the ability to detect when the user is speaking and then convert the spoken phrase into a text transcript. Spokestack has an &lt;a href=&quot;/docs/concepts/pipeline-configuration&quot;&gt;easy-to-use speech pipeline&lt;/a&gt; that will handle this for us. 
The speech pipeline consists of three major components: a voice detection module, a wake word trigger, and a speech recognizer.&lt;/p&gt;&lt;h3 id=&quot;microphone-input&quot;&gt;&lt;a href=&quot;#microphone-input&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Microphone Input&lt;/h3&gt;&lt;p&gt;Accepting audio input is always the first step in the pipeline. For this demo, we will use the included &lt;a href=&quot;https://github.com/spokestack/spokestack-python/blob/4009a9d8b61cd4375886c66ca0d4a87d99e12153/spokestack/io/pyaudio.py#L8&quot;&gt;input class&lt;/a&gt; that leverages &lt;a href=&quot;http://people.csail.mit.edu/hubert/pyaudio/&quot;&gt;PyAudio&lt;/a&gt; to stream microphone input to the pipeline. The class is initialized like this:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;io&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pyaudio &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; PyaudioMicrophoneInput

mic &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; PyaudioMicrophoneInput&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;voice-activity-detection&quot;&gt;&lt;a href=&quot;#voice-activity-detection&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Voice Activity Detection&lt;/h3&gt;&lt;p&gt;The second component we are adding to the pipeline is the &lt;a href=&quot;https://github.com/spokestack/spokestack-python/blob/4009a9d8b61cd4375886c66ca0d4a87d99e12153/spokestack/vad/webrtc.py#L18&quot;&gt;VoiceActivityDetector&lt;/a&gt;. This module analyzes a single frame of audio to determine if speech is present. This will be the component that allows audio to flow through the rest of the pipeline. For simplicity, we will use the default voice activity detection settings. 
The voice activity component can be initialized with the following:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;vad&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;webrtc &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; VoiceActivityDetector

vad &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; VoiceActivityDetector&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now that we have a way to determine if the audio contains speech, let’s move on to the component that activates the pipeline when it hears a specific phrase.&lt;/p&gt;&lt;h3 id=&quot;wake-word-activation&quot;&gt;&lt;a href=&quot;#wake-word-activation&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Wake Word Activation&lt;/h3&gt;&lt;p&gt;The &lt;a href=&quot;/docs/concepts/wakeword-models&quot;&gt;wake word&lt;/a&gt; component of the pipeline looks for a specific phrase in the audio input and signals the pipeline to activate ASR when it is recognized. For our purposes, we will be using “Spokestack” as the wake word. As with most voice assistants, “Hey Spokestack” will work as well. The process to initialize this component mirrors the way we set up voice activity detection. The directory passed to &lt;code&gt;model_dir&lt;/code&gt; should contain three &lt;code&gt;.tflite&lt;/code&gt; files: &lt;code&gt;encode.tflite&lt;/code&gt;, &lt;code&gt;detect.tflite&lt;/code&gt;, and &lt;code&gt;filter.tflite&lt;/code&gt;. 
These can be found inside the &lt;code&gt;tflite&lt;/code&gt; directory of the project GitHub repository.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;wakeword&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tflite &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; WakewordTrigger

wakeword &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; WakewordTrigger&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;model_dir&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;tflite&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Once the skill is actively listening for user speech, all we have to do is transcribe what the user says.&lt;/p&gt;&lt;h3 id=&quot;automatic-speech-recognition-asr&quot;&gt;&lt;a href=&quot;#automatic-speech-recognition-asr&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Automatic Speech Recognition (ASR)&lt;/h3&gt;&lt;p&gt;&lt;a href=&quot;/docs/concepts/asr&quot;&gt;ASR&lt;/a&gt; is the most critical piece of the speech pipeline, because it produces the transcript that is used to turn speech into actions. However, critical components do not have to be difficult to add. 
The following initializes the &lt;a href=&quot;https://github.com/spokestack/spokestack-python/blob/4009a9d8b61cd4375886c66ca0d4a87d99e12153/spokestack/asr/speech_recognizer.py#L12&quot;&gt;ASR component&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This is where you will need your API keys from the account console.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;asr&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;speech_recognizer &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; CloudSpeechRecognizer

recognizer &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; CloudSpeechRecognizer&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
    spokestack_id&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;your_spokestack_key&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    spokestack_secret&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;your_secret_key&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;activation-timeout-optional&quot;&gt;&lt;a href=&quot;#activation-timeout-optional&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Activation Timeout (Optional)&lt;/h3&gt;&lt;p&gt;An issue you may run into is the ASR not being activated long enough or being active for too long. To configure this for your use case, you can add the &lt;code&gt;ActivationTimeout&lt;/code&gt; component to the pipeline with a minimum and maximum value in milliseconds. This component can be initialized with the following:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;activation_timeout &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; ActivationTimeout

timeout &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; ActivationTimeout&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;min_active&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; max_active&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;5000&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;speech-pipeline&quot;&gt;&lt;a href=&quot;#speech-pipeline&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Speech Pipeline&lt;/h3&gt;&lt;p&gt;Now, we can put it all together in the &lt;a href=&quot;https://github.com/spokestack/spokestack-python/blob/4009a9d8b61cd4375886c66ca0d4a87d99e12153/spokestack/pipeline.py#L9&quot;&gt;pipeline&lt;/a&gt;. After this step, you will be able to wake the assistant by saying “Spokestack” and produce a text transcript of what is said next. 
For the Minecraft skill you would say something like, “Spokestack, what is the recipe for a snow golem?”&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pipeline &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; SpeechPipeline

&lt;span class=&quot;token comment&quot;&gt;# without timeout&lt;/span&gt;
pipeline &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; SpeechPipeline&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;input_source&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;mic&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; stages&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;vad&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; wakeword&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; recognizer&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;token comment&quot;&gt;# with timeout&lt;/span&gt;
pipeline &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; SpeechPipeline&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
    input_source&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;mic&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; stages&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;vad&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; wakeword&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; recognizer&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; timeout&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&quot;events&quot;&gt;&lt;a href=&quot;#events&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Events&lt;/h2&gt;&lt;p&gt;We know that the goal of the pipeline is to produce a transcript of the user’s speech. However, we haven’t discussed how to access that transcript. The pipeline is designed to run continuously, but we can use event handlers to access the transcript without stopping the pipeline. For this tutorial, we want to pass the completed transcript to a module that helps us &lt;em&gt;understand&lt;/em&gt; what the user has said. To accomplish this, we register an event handler with the pipeline:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token decorator annotation punctuation&quot;&gt;@pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;event&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;on_speech&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;context&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    transcript &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; context&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;transcript
    &lt;span class=&quot;token keyword&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;transcript&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In the application, we don’t want to print the transcript, but we’ve added that so you can see the results if you’ve been running the code as you follow along. In the subsequent sections, we will discuss a couple of new components and also flesh out this event handler to allow the Minecraft skill to understand the user’s request and select an appropriate response.&lt;/p&gt;&lt;h2 id=&quot;natural-language-understanding-nlu&quot;&gt;&lt;a href=&quot;#natural-language-understanding-nlu&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Natural Language Understanding (NLU)&lt;/h2&gt;&lt;p&gt;The Natural Language Understanding, or &lt;a href=&quot;/docs/concepts/nlu&quot;&gt;NLU&lt;/a&gt;, component takes a transcript of user speech and distills it into unambiguous instructions for an app. The paradigm used in most systems is the intent and slot model. Essentially, an intent is the function the user intends to invoke, and the slots are the arguments the intent needs to accomplish its action. For example, a user may say &lt;code&gt;What is the recipe for a dark prismarine?&lt;/code&gt;. 
In this case, the intent is &lt;code&gt;RecipeIntent&lt;/code&gt;, and the slot is &lt;code&gt;dark prismarine&lt;/code&gt;. The initialization of the &lt;a href=&quot;https://github.com/spokestack/spokestack-python/blob/4009a9d8b61cd4375886c66ca0d4a87d99e12153/spokestack/nlu/tflite.py#L18&quot;&gt;TFLiteNLU&lt;/a&gt; should look familiar at this point. The directory passed to &lt;code&gt;model_dir&lt;/code&gt; contains three files: &lt;code&gt;vocab.txt&lt;/code&gt;, &lt;code&gt;metadata.json&lt;/code&gt;, and &lt;code&gt;nlu.tflite&lt;/code&gt;. These files are necessary to run our on-device NLU model and are in the &lt;code&gt;tflite&lt;/code&gt; directory of the GitHub repository.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;nlu&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tflite &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; TFLiteNLU

nlu &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; TFLiteNLU&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;model_dir&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;tflite&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now is a good time to add the NLU to our &lt;code&gt;on_speech&lt;/code&gt; event handler:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token decorator annotation punctuation&quot;&gt;@pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;event&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;on_speech&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;context&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    transcript &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; context&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;transcript
    results &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; nlu&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;transcript&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now that we know what recipe the user is looking for, you may be wondering how we turn this into a response. The following section will explain just that.&lt;/p&gt;&lt;h2 id=&quot;dialogue-management&quot;&gt;&lt;a href=&quot;#dialogue-management&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Dialogue Management&lt;/h2&gt;&lt;p&gt;The Minecraft dialogue manager is fairly simple. The basic component necessary is a way to look up the recipes, and the rest is just string interpolation. Since we have a relatively limited number of recipes, we can implement the lookup as a simple dictionary in Python. 
Below is a snippet of the recipe “database”.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;DB&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; Dict&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;snow golem&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;A snow golem can be created by placing a pumpkin on top of two &amp;quot;&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;snow blocks on the ground.&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;pillar quartz block&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;A pillar of quartz can be obtained by placing a block of &amp;quot;&lt;/span&gt;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;quartz on top of a block of quartz in mine craft.&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This makes looking up a recipe very concise: &lt;code&gt;DB.get(&amp;quot;snow golem&amp;quot;)&lt;/code&gt;. A dictionary lookup alone has a weakness, though. Suppose an ASR error produces a slot that isn’t an exact match for &lt;code&gt;snow golem&lt;/code&gt;, such as &lt;code&gt;sow golem&lt;/code&gt;. A plain dictionary lookup cannot resolve that slot. We can handle small errors like this with &lt;a href=&quot;https://en.wikipedia.org/wiki/Fuzzy_matching_(computer-assisted_translation)&quot;&gt;fuzzy matching&lt;/a&gt;: based on the similarity between &lt;code&gt;snow golem&lt;/code&gt; and &lt;code&gt;sow golem&lt;/code&gt;, we can make sure that the latter resolves to the actual entity. In this tutorial, we will use the Python library &lt;code&gt;fuzzywuzzy&lt;/code&gt; to make these matches. Below is the way it is used in the tutorial repository.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; fuzzywuzzy &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; process

matched&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; score &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; process&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;extractOne&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;slot&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;raw_value&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_names&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; score &lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_threshold&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    recipe &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_recipes&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;get&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;matched&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We are simply overwriting the parsed entity with the one that is the closest match from the set of possible entities. The full dialogue manager can be seen below:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; fuzzywuzzy &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; process  &lt;span class=&quot;token comment&quot;&gt;# type: ignore&lt;/span&gt;

&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; minecraft &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; recipes
&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; minecraft&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;responses &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; Response

&lt;span class=&quot;token keyword&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;DialogueManager&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;token triple-quoted-string string&quot;&gt;&amp;quot;&amp;quot;&amp;quot;Simple dialogue manager

    Args:
        threshold (float): fuzzy match threshold (fuzzywuzzy scores range from 0 to 100)
    &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;__init__&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;self&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; threshold&lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_recipes &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; recipes&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;DB
        self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_names &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;list&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_recipes&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;keys&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_threshold &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; threshold
        self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_response &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; Response

    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;__call__&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;self&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; results&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;token triple-quoted-string string&quot;&gt;&amp;quot;&amp;quot;&amp;quot; Maps nlu result to a dialogue response.

        Args:
            results (Result): classification results from nlu

        Returns: a string response to be synthesized by tts

        &amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

        intent &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; results&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;intent
        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; intent &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;RecipeIntent&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_recipe&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;results&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;elif&lt;/span&gt; intent &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;AMAZON.HelpIntent&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_help&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;elif&lt;/span&gt; intent &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;AMAZON.StopIntent&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_stop&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_error&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;_recipe&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;self&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; results&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        slots &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; results&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;slots
        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; slots&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; key &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt; slots&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
                slot &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; slots&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;key&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
                &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; slot&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;name&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;Item&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
                    &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_fuzzy_lookup&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;slot&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;raw_value&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;# No slot named &amp;quot;Item&amp;quot; was found after checking every slot&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_not_found&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;slot&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;raw_value&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_response&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;RECIPE_NOT_FOUND_WITHOUT_ITEM_NAME&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;value

    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;_help&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;self&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_response&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;HELP_MESSAGE&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;value

    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;_stop&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;self&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_response&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;STOP&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;value

    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;_error&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;self&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_response&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;ERROR&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;value

    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;_fuzzy_lookup&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;self&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; raw_value&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        matched&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; score &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; process&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;extractOne&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;raw_value&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_names&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; score &lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_threshold&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            recipe &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_recipes&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;get&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;matched&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; recipe
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; raw_value

    &lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;_not_found&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;self&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; raw_value&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; self&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;_response&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;RECIPE_NOT_FOUND_WITH_ITEM_NAME&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;value&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;raw_value&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now we can add the dialogue manager to the event handler.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token decorator annotation punctuation&quot;&gt;@pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;event&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;on_speech&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;context&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    transcript &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; context&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;transcript
    results &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; nlu&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;transcript&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    response &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; dialogue_manager&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;results&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That was a lot to cover, but we are almost at the finish line. In the next section, we will learn how to convert the app’s text responses into speech.&lt;/p&gt;&lt;h2 id=&quot;text-to-speech-tts&quot;&gt;&lt;a href=&quot;#text-to-speech-tts&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Text to Speech (TTS)&lt;/h2&gt;&lt;p&gt;As the name suggests, &lt;a href=&quot;/docs/concepts/tts&quot;&gt;TTS&lt;/a&gt; converts written text into speech using a synthetic voice. This tutorial assumes you are using our default voice, but if you have a paid plan, you can replace &lt;code&gt;demo-male&lt;/code&gt; with the name of a custom voice. To initialize the &lt;a href=&quot;https://github.com/spokestack/spokestack-python/blob/4009a9d8b61cd4375886c66ca0d4a87d99e12153/spokestack/tts/clients/spokestack.py#L20&quot;&gt;TextToSpeechClient&lt;/a&gt;, do the following:&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This is another step where you will need your Spokestack API keys. 
However, notice that the URL for TTS is slightly different than for ASR.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tts&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;clients&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;spokestack &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; TextToSpeechClient

client &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; TextToSpeechClient&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;your_key&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;your_secret_key&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Another important part of this section is playback. We have a &lt;a href=&quot;https://github.com/spokestack/spokestack-python/blob/4009a9d8b61cd4375886c66ca0d4a87d99e12153/spokestack/io/pyaudio.py#L76&quot;&gt;PyAudio-based output class&lt;/a&gt; that plays audio through your system’s default playback device. To manage speech synthesis and playback together, we have the &lt;a href=&quot;https://github.com/spokestack/spokestack-python/blob/4009a9d8b61cd4375886c66ca0d4a87d99e12153/spokestack/tts/manager.py#L9&quot;&gt;TextToSpeechManager&lt;/a&gt;. Here is how to initialize it with the client and an output:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;io&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pyaudio &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; PyAudioOutput
&lt;span class=&quot;token keyword&quot;&gt;from&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tts&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;manager &lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; TextToSpeechManager

output &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; PyAudioOutput&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

manager &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; TextToSpeechManager&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;client&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; output&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now we can update our event handler to read the response.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;&lt;span class=&quot;token decorator annotation punctuation&quot;&gt;@pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;event&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;on_speech&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;context&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
    transcript &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; context&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;transcript
    results &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; nlu&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;transcript&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    response &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; dialogue_manager&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;results&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    manager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;synthesize&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;response&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;text&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;demo-male&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&quot;lets-run-it&quot;&gt;&lt;a href=&quot;#lets-run-it&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Let’s Run It!&lt;/h2&gt;&lt;p&gt;The example is now complete except for two final commands: we have to start and run the pipeline!&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;python&quot;&gt;&lt;pre class=&quot;language-python&quot;&gt;&lt;code class=&quot;language-python&quot;&gt;pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;start&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;run&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then you can start the skill by running this command in the terminal. The skill will keep running until you say “stop”.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;python myapp.py&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now you should be able to ask your skill how to craft things in Minecraft. If you get a chance, I recommend trying it out in-game; despite its simplicity, it is very useful.&lt;/p&gt;&lt;p&gt;At this point you may have already cloned the repository, but if you have not, check out the full example on &lt;a href=&quot;https://github.com/spokestack/minecraft-skill-python&quot;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;&lt;h2 id=&quot;contact-us&quot;&gt;&lt;a href=&quot;#contact-us&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Contact Us&lt;/h2&gt;&lt;p&gt;If you have any questions while getting this set up, we have a &lt;a href=&quot;https://forum.spokestack.io/&quot;&gt;forum&lt;/a&gt;, or you can open an issue on &lt;a 
href=&quot;https://github.com/spokestack/spokestack-python/issues&quot;&gt;GitHub&lt;/a&gt;. In addition, I am more than happy to help if you want to reach out to me personally via &lt;a href=&quot;mailto:will@spokestack.io&quot;&gt;email&lt;/a&gt; or &lt;a href=&quot;https://twitter.com/_Will_Rice&quot;&gt;Twitter&lt;/a&gt;.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Why We Built Spokestack Tray]]></title><description><![CDATA[Spokestack Tray makes it much easier to begin experimenting with adding a voice interface to your app and conversing with your customers. Get started building your own independent voice assistant.]]></description><link>https://www.spokestack.io/blog/why-we-built-spokestack-tray</link><guid isPermaLink="false">https://www.spokestack.io/blog/why-we-built-spokestack-tray</guid><pubDate>Thu, 24 Sep 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/5bdab76f681c51c1167163600f802fae/8537d/introducing-spokestack-tray.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACeUlEQVQoz12SS0gUcRzHh7p1CfUkUpqP1tadncUHpGl6EDQsfOs+ZvMFaUUmnSJS8xBiUB0iPERBhyIPIdUxOoRhbVSHsHRJoVjbebi77kOSfPCJmVVcOnyGOfz/n/l9v78RJI/OHhp2t47DoyF5VMp7NarlZfrGQvSMhmgdVGgY0slvX8XmjiC5VSQ5Sk7NI3JqHuKQowipQkNmSK3OMAUdMQqbFNz9S/h8AWZmA7z3/eLL9yg3J55zut+H1RWh2BsnXZRJE9045DiC5DZEmklSFqJuYIHzI7NcfhBi9KnO3VdhKpoXsdX46R6LMTI8Sa37BUWeuDmh5cQwlopr2OUYQlKUnLBYVjjSmuDGxDSf3gzTMx6g/sISpwYWySxe4EDuPCWySn5bCJsrZNYiemIUVl7HUnEV0bMTeXdCA5trhePdAWr7/ZTKOgcLFki3LJDl8JPp8FPZp6X0reLwxsmw9pFh7cXhjSGITg0DW6eBiuhUOdauk9cSprxbYf+hOYT0rwhpX9mXNUd5l4rNqWJ3adhdyaVkV02SXXUfSV7dE5q4ktKys0FKvUEahlaY+bDGu48JZnwGceoHdfKaNexOxTxrIHkSSHLCvC8YD9H8WhLjvaxLoUQOMjA2zzabbG/9BdbZ2tjg9vg8Vd2q2ZdoSjVsHUFsnYo5lGBPke31qFPk/M2de9PEE2HeLq/xI7RJNLbGpXOjtJy5QkHjNxyesBnbHGonpZC6kNTfp8ilUX0xzNTrOLdehpmYijH57CeFdU84fPIx1qbPSHLEFJr3dtIJuz38j9ipUtShktuoYG1RsbYpHG3VkeQ/SPI6oitixtxd5i7/AJNMKQ4naPXFAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Why We Built Spokestack Tray&quot; title=&quot;Why We Built Spokestack Tray&quot; src=&quot;/static/5bdab76f681c51c1167163600f802fae/05162/introducing-spokestack-tray.png&quot; srcSet=&quot;/static/5bdab76f681c51c1167163600f802fae/2eeed/introducing-spokestack-tray.png 294w,/static/5bdab76f681c51c1167163600f802fae/0d6a1/introducing-spokestack-tray.png 588w,/static/5bdab76f681c51c1167163600f802fae/05162/introducing-spokestack-tray.png 1175w,/static/5bdab76f681c51c1167163600f802fae/8537d/introducing-spokestack-tray.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;We started Spokestack to give developers the freedom to build the conversational experiences they wanted &lt;em&gt;without&lt;/em&gt; limitations. So we created &lt;a href=&quot;https://github.com/spokestack/spokestack-ios&quot;&gt;iOS&lt;/a&gt;, &lt;a href=&quot;https://github.com/spokestack/spokestack-android&quot;&gt;Android&lt;/a&gt;, and &lt;a href=&quot;https://github.com/spokestack/react-native-spokestack&quot;&gt;React Native&lt;/a&gt; libraries that give mobile developers complete flexibility for adding voice interfaces to their mobile apps. We made it possible for a mobile developer to design and deploy a voice interface any way they liked.&lt;/p&gt;&lt;p&gt;But sometimes a blank slate is daunting. Asking someone with no background in linguistics to build a conversational experience from scratch is like asking someone who hasn’t skied before to pull off a ski jump.&lt;/p&gt;&lt;figure&gt;&lt;img alt=&quot;Ski Jump&quot; src=&quot;./ski_jump.gif&quot; style=&quot;width:100%&quot;/&gt;&lt;figcaption&gt;Not a real developer. Don&amp;#x27;t try this at home.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;As it turns out, few mobile developers understand how to &lt;a href=&quot;https://spokestack.io/docs/design/getting-started&quot;&gt;design conversational experiences&lt;/a&gt;. And few voice developers know how to build mobile apps. At Spokestack, we’re hoping to nudge both sides closer to each other.&lt;/p&gt;&lt;p&gt;With Spokestack Tray, we’ve created a ready-made “voice kit” that still allows for customization without having to design a voice interface from scratch. Will you still need to build a conversation? Yes, but you don’t have to think about how the user will interact with the conversation. 
Spokestack Tray makes it much easier to begin experimenting with adding a voice interface to your app and conversing with your customers.&lt;/p&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;Spokestack Tray Demo&quot; src=&quot;https://www.youtube-nocookie.com/embed/0RBITe8RNco?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;p&gt;Please check out &lt;a href=&quot;/blog/introducing-spokestack-tray&quot;&gt;Spokestack Tray&lt;/a&gt; and tell us what you think. If you have any questions or feedback, email us at &lt;a href=&quot;mailto:hello@spokestack.io&quot;&gt;hello@spokestack.io&lt;/a&gt;, and let’s talk conversational interfaces!&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Introducing Spokestack Tray – a Turnkey Voice Interface for Mobile Apps]]></title><description><![CDATA[Today we announce Spokestack Tray: a UI component for iOS, Android, and React Native making it easy to add Spokestack to any mobile app. Create your own Independent Voice Assistant!]]></description><link>https://www.spokestack.io/blog/introducing-spokestack-tray</link><guid isPermaLink="false">https://www.spokestack.io/blog/introducing-spokestack-tray</guid><pubDate>Mon, 21 Sep 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/5bdab76f681c51c1167163600f802fae/8537d/introducing-spokestack-tray.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACeUlEQVQoz12SS0gUcRzHh7p1CfUkUpqP1tadncUHpGl6EDQsfOs+ZvMFaUUmnSJS8xBiUB0iPERBhyIPIdUxOoRhbVSHsHRJoVjbebi77kOSfPCJmVVcOnyGOfz/n/l9v78RJI/OHhp2t47DoyF5VMp7NarlZfrGQvSMhmgdVGgY0slvX8XmjiC5VSQ5Sk7NI3JqHuKQowipQkNmSK3OMAUdMQqbFNz9S/h8AWZmA7z3/eLL9yg3J55zut+H1RWh2BsnXZRJE9045DiC5DZEmklSFqJuYIHzI7NcfhBi9KnO3VdhKpoXsdX46R6LMTI8Sa37BUWeuDmh5cQwlopr2OUYQlKUnLBYVjjSmuDGxDSf3gzTMx6g/sISpwYWySxe4EDuPCWySn5bCJsrZNYiemIUVl7HUnEV0bMTeXdCA5trhePdAWr7/ZTKOgcLFki3LJDl8JPp8FPZp6X0reLwxsmw9pFh7cXhjSGITg0DW6eBiuhUOdauk9cSprxbYf+hOYT0rwhpX9mXNUd5l4rNqWJ3adhdyaVkV02SXXUfSV7dE5q4ktKys0FKvUEahlaY+bDGu48JZnwGceoHdfKaNexOxTxrIHkSSHLCvC8YD9H8WhLjvaxLoUQOMjA2zzabbG/9BdbZ2tjg9vg8Vd2q2ZdoSjVsHUFsnYo5lGBPke31qFPk/M2de9PEE2HeLq/xI7RJNLbGpXOjtJy5QkHjNxyesBnbHGonpZC6kNTfp8ilUX0xzNTrOLdehpmYijH57CeFdU84fPIx1qbPSHLEFJr3dtIJuz38j9ipUtShktuoYG1RsbYpHG3VkeQ/SPI6oitixtxd5i7/AJNMKQ4naPXFAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Introducing Spokestack Tray&quot; title=&quot;Introducing Spokestack Tray&quot; src=&quot;/static/5bdab76f681c51c1167163600f802fae/05162/introducing-spokestack-tray.png&quot; srcSet=&quot;/static/5bdab76f681c51c1167163600f802fae/2eeed/introducing-spokestack-tray.png 294w,/static/5bdab76f681c51c1167163600f802fae/0d6a1/introducing-spokestack-tray.png 588w,/static/5bdab76f681c51c1167163600f802fae/05162/introducing-spokestack-tray.png 1175w,/static/5bdab76f681c51c1167163600f802fae/8537d/introducing-spokestack-tray.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Today we are excited to announce our latest developer feature: Spokestack Tray, a mobile library that lets developers easily add a voice interface to their existing app through a UI component that opens inside the application. Spokestack Tray incorporates our voice services, such as a wake word, speech recognition, natural language processing, and custom text-to-speech voices, into one package that is easy to add to any existing mobile application. We’ve built a demo, the Bartender mobile app, that uses Spokestack Tray.&lt;/p&gt;&lt;p&gt;Here’s a video of Spokestack Tray working in our demo Bartender app:&lt;/p&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;Spokestack Tray Demo&quot; src=&quot;https://www.youtube-nocookie.com/embed/0RBITe8RNco?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;p&gt;Download the Bartender app and try Spokestack Tray on your own phone:&lt;/p&gt;&lt;p&gt;iOS: &lt;a href=&quot;https://apps.apple.com/us/app/get-the-bartender/id1530425843&quot;&gt;https://apps.apple.com/us/app/get-the-bartender/id1530425843&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Android: &lt;a href=&quot;https://play.google.com/store/apps/details?id=com.spokestack.bartender&quot;&gt;https://play.google.com/store/apps/details?id=com.spokestack.bartender&lt;/a&gt;&lt;/p&gt;&lt;h3 id=&quot;why-is-this-a-big-deal&quot;&gt;&lt;a href=&quot;#why-is-this-a-big-deal&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; 
width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Why is this a big deal?&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;&lt;strong&gt;We’ve made adding a voice interface to a mobile app easier.&lt;/strong&gt; We’ve simplified the process of building a voice interface for mobile apps and reduced it to just dropping a new UI element into your app! Even better, the UI element can be customized to fit into your app’s scenes and branding.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;Using Spokestack Tray for search and navigation is fast.&lt;/strong&gt; Tray manages the entire conversation on the user’s phone. We believe speed will drive adoption by both developers and users who want an easier way to find information and services when accessing mobile applications.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;User conversations are private.&lt;/strong&gt; As described above, our on-device voice interface libraries keep user conversations private. Because our natural language models operate on the device, developers and consumers do not need to worry about their conversations being listened to by third parties.&lt;/li&gt;&lt;li&gt;&lt;strong&gt;It works well with Apple Siri shortcuts and Google Assistant shortcuts.&lt;/strong&gt; Speaking of third parties, protecting conversations between companies and their customers is a big selling point in our pitch. 
We provide hooks to integrate Siri shortcuts and Google Assistant shortcuts so your users can initiate a voice interaction from anywhere while keeping you in control of your and your users’ data.&lt;/li&gt;&lt;/ul&gt;&lt;h3 id=&quot;getting-started-with-spokestack-tray&quot;&gt;&lt;a href=&quot;#getting-started-with-spokestack-tray&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Getting started with Spokestack Tray&lt;/h3&gt;&lt;p&gt;We want to make it as easy as possible for you to get started with Spokestack Tray. 
Here are the steps to get a prototype up quick:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;a href=&quot;/create&quot;&gt;Sign up for a Spokestack account&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Check out the documentation for &lt;a href=&quot;https://github.com/spokestack/react-native-spokestack-tray&quot;&gt;React Native Tray&lt;/a&gt; or &lt;a href=&quot;https://github.com/spokestack/spokestack-tray-ios&quot;&gt;iOS Tray&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Follow a tutorial: &lt;a href=&quot;/blog/integrating-spokestack-in-ios&quot;&gt;Spokestack Tray iOS Tutorial&lt;/a&gt; or &lt;a href=&quot;/blog/integrating-spokestack-in-react-native&quot;&gt;Spokestack Tray React Native Tutorial&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Get help in our &lt;a href=&quot;https://forum.spokestack.io/t/spokestack-tray/56&quot;&gt;Spokestack Tray Help&lt;/a&gt; forum&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;If you’re new to voice, we’ve put together an introduction to how Spokestack thinks about voice. For more on how to design the voice experience, walk through the &lt;a href=&quot;/docs/design/getting-started&quot;&gt;Design&lt;/a&gt; section of our docs.&lt;/p&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;Voice is just another interface&quot; src=&quot;https://www.youtube-nocookie.com/embed/wbJ8fZh-iQw?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;h3 id=&quot;customizing-spokestack-tray&quot;&gt;&lt;a href=&quot;#customizing-spokestack-tray&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; 
focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Customizing Spokestack Tray&lt;/h3&gt;&lt;p&gt;At Spokestack, we believe flexibility and customization are keys to building successful conversational interfaces. We will be adding more design resources and tools over the next few weeks. Additionally, we help you further customize your app with custom wake words, TTS, and your own cloud or on-device NLU model.&lt;/p&gt;&lt;h3 id=&quot;want-a-free-wake-word-nlu-training-and-custom-tts&quot;&gt;&lt;a href=&quot;#want-a-free-wake-word-nlu-training-and-custom-tts&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Want a free wake word, NLU training, and custom TTS?&lt;/h3&gt;&lt;p&gt;After you get a demo up and running, let us know if we can build a custom wake word or TTS for your mobile app. 
We’ll hook up the first three developers who show us a working version of Spokestack Tray in their mobile app with a free wake word, NLU training, and custom TTS. &lt;a href=&quot;mailto:hello@spokestack.io&quot;&gt;Email us&lt;/a&gt; your working demo and we’ll help you add these features to your app so you can publish it live.&lt;/p&gt;&lt;h3 id=&quot;whats-next&quot;&gt;&lt;a href=&quot;#whats-next&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;What’s next?&lt;/h3&gt;&lt;p&gt;Expect an update on design tools, the Android Tray library, and additional tutorials soon. Follow us on &lt;a href=&quot;https://www.twitter.com/spokestack&quot;&gt;Twitter&lt;/a&gt; for all updates. Can’t wait to see how people use Tray with their mobile apps.&lt;/p&gt;&lt;p&gt;Have fun talking to your customers!&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Integrating Spokestack in iOS]]></title><description><![CDATA[This tutorial guides you through the process of adding spokestack-tray-ios to an iOS app. 
Turn your app into an independent voice assistant without changing your app's navigation or infrastructure.]]></description><link>https://www.spokestack.io/blog/integrating-spokestack-in-ios</link><guid isPermaLink="false">https://www.spokestack.io/blog/integrating-spokestack-in-ios</guid><pubDate>Fri, 18 Sep 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/487ddfe2e0959d3640526c4bd306307a/8537d/ios-tray-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACXUlEQVQoz02SP0wTUQCHb9RNo4kOJrKY4Aaji5uOmhgnFx3UxAUXBk1MXBw0kYhAIhuLiQYtREJpaXtX7tqi5Y9FwNZSyh+hvV5b2tISbHsHn7l3NHDJ795L7r3vvt+7kw4KQVqx8grW0UhRprodorc/Rjy2wFZqEZdrmX39B+yqHBaDmPkglGR8ygyTyoyYSyeBInkFM6+KTZV1DelCglc9m6ihbfxTGYr5DL7xITJJD+yosOPn2YcU3QMp2AkgCauTKagc6GPsb3zm2+wmTwcNXrvyvB0tcPPeKplUlE8fe0nMj0BJhYpG+ucXUnPDsKs5wOPKMpQi1JKDLGsvCM3FuNW1yu2uNDcerXKxI46RCsHusQCVEE8e3uXxgztQ1pzKzkMZy5DFvKH7ID9GeS3E+fYEZ6784Vx7kkudCQrpCJQUcX4CWNYIjL7H53rnAJs5H019UqShe2nqXih6oeqhvK5wum0J6eyiyKnLS+jJIJbhoan7HQm7VS0MtYh4gSRgNjTnx8z5hF1Yi+D1zxOJLPB9tko4WiMUrTE9W8MsRqGiOAJZjxBoZCdFzJwfyb6ZRkDEMgI0cwHisQiur99QFRnnagL7QIO+N8952X2fvb8TWHk/jaz3COyISS2Q1YLaFSoyI0PdTLkHKNfBvVJlasVia0vn+rVOOq62EXb3i6oN3RbyC5jdUGqdw8mYhsJhSaNenOZfdYPf2TTxtQ2y6wt4XX34XD1s/hoWv40jJDst7cq2bkP3iA9yPHqpZz3UsxPUM+Ogu8EYx8x5oKbCngbFgFgj1tvjUf4DtqbvWLl3diUAAAAASUVORK5CYII=&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;How to Integrate Spokestack in iOS&quot; title=&quot;How to Integrate Spokestack in iOS&quot; src=&quot;/static/487ddfe2e0959d3640526c4bd306307a/05162/ios-tray-hero.png&quot; srcSet=&quot;/static/487ddfe2e0959d3640526c4bd306307a/2eeed/ios-tray-hero.png 294w,/static/487ddfe2e0959d3640526c4bd306307a/0d6a1/ios-tray-hero.png 588w,/static/487ddfe2e0959d3640526c4bd306307a/05162/ios-tray-hero.png 1175w,/static/487ddfe2e0959d3640526c4bd306307a/8537d/ios-tray-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;For all the time, talent, and treasure Apple has put into Siri, ASR, and NLU, integrating a custom voice-enabled app experience is still challenging!
Spokestack changes all of that. Our mission at Spokestack is to make it as easy as possible to make your apps fully voice-enabled.&lt;/p&gt;&lt;p&gt;After building the services needed to make voice interaction work, including &lt;a href=&quot;/docs/concepts/wakeword-models&quot;&gt;Wake Word&lt;/a&gt;, &lt;a href=&quot;/docs/concepts/asr&quot;&gt;Speech Recognition&lt;/a&gt;, &lt;a href=&quot;/docs/concepts/nlu&quot;&gt;Natural Language Understanding&lt;/a&gt;, and &lt;a href=&quot;/docs/concepts/tts&quot;&gt;Text-to-speech&lt;/a&gt;, we started working on ways users could integrate these services without having to completely rewrite their applications.&lt;/p&gt;&lt;p&gt;Introducing &lt;a href=&quot;https://github.com/spokestack/spokestack-tray-ios&quot;&gt;spokestack-tray-ios&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/6dc6c4ad9687295231944b01fc14ff00/ios-tray-demo.gif&quot; alt=&quot;iOS Spokestack Tray Example&quot;/&gt;&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://github.com/spokestack/spokestack-tray-ios&quot;&gt;spokestack-tray-ios&lt;/a&gt; is an iOS framework designed to work in any application, regardless of its layout or navigation. It uses &lt;a href=&quot;https://github.com/spokestack/spokestack-ios&quot;&gt;spokestack-ios&lt;/a&gt; to add voice experiences. 
With on-device wake word, ASR, and NLU, the tray’s silent mode works completely offline–TTS is the only service that requires a network.&lt;/p&gt;&lt;p&gt;With a few required props (and &lt;a href=&quot;https://github.com/spokestack/spokestack-tray-ios/blob/master/SpokestackTray/Models/TrayConfiguration.swift&quot;&gt;lots of optional ones&lt;/a&gt;), you can start building a customizable voice experience without the hassle that usually comes with listening for a wake word, working with a microphone, or playing audio in iOS.&lt;/p&gt;&lt;p&gt;This tutorial will guide you through the process of installing &lt;code&gt;spokestack-tray-ios&lt;/code&gt; as well as using the SpokestackTray framework to respond to your users.&lt;/p&gt;&lt;h3 id=&quot;installation&quot;&gt;&lt;a href=&quot;#installation&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Installation&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;CocoaPods&lt;/strong&gt; is a dependency manager for Cocoa projects. For usage and installation instructions, visit their &lt;a href=&quot;https://cocoapods.org&quot;&gt;website&lt;/a&gt;. 
To integrate &lt;a href=&quot;https://github.com/spokestack/spokestack-ios&quot;&gt;Spokestack&lt;/a&gt; into your Xcode project using CocoaPods, specify it in your Podfile:&lt;/p&gt;&lt;p&gt;&lt;code&gt;pod &amp;#x27;SpokestackTray-iOS&amp;#x27;&lt;/code&gt;&lt;/p&gt;&lt;p&gt;From your terminal in the project directory run&lt;/p&gt;&lt;p&gt;&lt;code&gt;pod install&lt;/code&gt;&lt;/p&gt;&lt;p&gt;Either in your &lt;code&gt;Podfile&lt;/code&gt; or your Project / Target settings set the minimum iOS target to &lt;strong&gt;iOS 13&lt;/strong&gt;.&lt;/p&gt;&lt;h3 id=&quot;edit-infoplist&quot;&gt;&lt;a href=&quot;#edit-infoplist&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Edit Info.plist&lt;/h3&gt;&lt;p&gt;Spokestack requires access to your device’s microphone and Apple’s speech recognition service.&lt;/p&gt;&lt;p&gt;&lt;em&gt;without these your app will crash&lt;/em&gt;&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;xml&quot;&gt;&lt;pre class=&quot;language-xml&quot;&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;key&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;NSMicrophoneUsageDescription&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span 
class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;key&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;string&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;For making voice requests&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;string&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;key&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;NSSpeechRecognitionUsageDescription&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;key&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;string&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;For understanding your voice requests&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;string&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;edit-the-appdelegateswiftm&quot;&gt;&lt;a href=&quot;#edit-the-appdelegateswiftm&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Edit the AppDelegate.(swift|m)&lt;/h3&gt;&lt;p&gt;Spokestack depends on an appropriately configured &lt;code&gt;AVAudioSession&lt;/code&gt;. For &lt;em&gt;most&lt;/em&gt; apps, this can be done in &lt;code&gt;AppDelegate.(swift|m)&lt;/code&gt;:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; session &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;AVAudioSession&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;sharedInstance&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt; session&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;setCategory&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;playAndRecord&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; options&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;defaultToSpeaker&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;allowAirPlay&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;allowBluetoothA2DP&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;allowBluetooth&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt; session&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;setActive&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token boolean&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; options&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;usage&quot;&gt;&lt;a href=&quot;#usage&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Usage&lt;/h3&gt;&lt;p&gt;The &lt;a href=&quot;https://github.com/spokestack/spokestack-tray-ios/tree/master/SpokestackTrayExample&quot;&gt;spokestack-tray-ios example app&lt;/a&gt; uses the “Spokestack” wake word and &lt;a href=&quot;/blog/porting-the-alexa-minecraft-skill-to-ios-using-spokestack&quot;&gt;sample Minecraft NLU models&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;To create your own copy of the Spokestack/Minecraft Helper, add the &lt;code&gt;SpokestackTray&lt;/code&gt; framework:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;SpokestackTray&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;Spokestack&lt;/span&gt;

&lt;span class=&quot;token keyword&quot;&gt;override&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;viewDidLoad&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;super&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;viewDidLoad&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; configuration&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;TrayConfiguration&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;TrayConfiguration&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;/// When the tray is opened for the first time this is the synthesized&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;/// greeting that will be &amp;quot;said&amp;quot; to the user&lt;/span&gt;

    configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;greeting &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&amp;quot;
    &lt;span class=&quot;token builtin&quot;&gt;Welcome&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;This&lt;/span&gt; example finds tutorials &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;Minecraft&lt;/span&gt; crafting&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;Try&lt;/span&gt; saying&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; \&amp;quot;&lt;span class=&quot;token builtin&quot;&gt;How&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;do&lt;/span&gt; I make a castle&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;\&amp;quot;
    &lt;span class=&quot;token string&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&amp;quot;

    &lt;span class=&quot;token comment&quot;&gt;/// When the tray is listening or processing speech, there is an animated gradient that&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;/// sits on top of the tray. The default values are red, white and blue&lt;/span&gt;

    configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;gradientColors &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;
        &lt;span class=&quot;token string&quot;&gt;&amp;quot;#61fae9&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;spstk_color&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;token string&quot;&gt;&amp;quot;#2F5BEA&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;spstk_color&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;token builtin&quot;&gt;UIColor&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;systemRed
    &lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;/// Part of initializing the tray is downloading the NLU and wake word models.&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;/// These are the default Spokestack models, but you can replace them with your own&lt;/span&gt;

    configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;nluModelURLs &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;
        &lt;span class=&quot;token builtin&quot;&gt;NLUModelURLMetaDataKey&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;https://d3dmqd7cy685il.cloudfront.net/nlu/production/shared/XtASJqxkO6UwefOzia-he2gnIMcBnR2UCF-VyaIy-OI/metadata.json&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;token builtin&quot;&gt;NLUModelURLNLUKey&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;https://d3dmqd7cy685il.cloudfront.net/nlu/production/shared/XtASJqxkO6UwefOzia-he2gnIMcBnR2UCF-VyaIy-OI/nlu.tflite&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;token builtin&quot;&gt;NLUModelURLVocabKey&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;https://d3dmqd7cy685il.cloudfront.net/nlu/production/shared/XtASJqxkO6UwefOzia-he2gnIMcBnR2UCF-VyaIy-OI/vocab.txt&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
    configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;wakewordModelURLs &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;
        &lt;span class=&quot;token builtin&quot;&gt;WakeWordModelDetectKey&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;https://d3dmqd7cy685il.cloudfront.net/model/wake/spokestack/detect.tflite&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;token builtin&quot;&gt;WakeWordModelEncodeKey&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;https://d3dmqd7cy685il.cloudfront.net/model/wake/spokestack/encode.tflite&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;token builtin&quot;&gt;WakeWordModelFilterKey&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;https://d3dmqd7cy685il.cloudfront.net/model/wake/spokestack/filter.tflite&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;

    configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;cliendId &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;YOUR_CLIENT_ID&amp;quot;&lt;/span&gt;
    configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;clientSecret &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;YOUR_CLIENT_SECRET&amp;quot;&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;/// The handleIntent callback is how the SpeechController and the TrayViewModel know if&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;/// NLUResult should be processed and what text should be added to the tableView.&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; greeting&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;IntentResult&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;IntentResult&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;node&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;InterntResultNode&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;greeting&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; prompt&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;greeting&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;var&lt;/span&gt; lastNode&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;IntentResult&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; greeting

    configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;handleIntent &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;intent&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; slots&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; utterance &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;switch&lt;/span&gt; intent &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;IntentResultAmazonType&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;repeat&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; lastNode
            &lt;span class=&quot;token keyword&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;IntentResultAmazonType&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;yes&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
                lastNode &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;IntentResult&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;node&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;InterntResultNode&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;search&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; prompt&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;I heard you say yes! What would you like to make?&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;IntentResultAmazonType&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;no&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
                lastNode &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;IntentResult&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;node&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;InterntResultNode&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;exit&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; prompt&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;I heard you say no. Goodbye&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;IntentResultAmazonType&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;stop&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
                    &lt;span class=&quot;token builtin&quot;&gt;IntentResultAmazonType&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;cancel&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
                    &lt;span class=&quot;token builtin&quot;&gt;IntentResultAmazonType&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;fallback&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
                lastNode &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;IntentResult&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;node&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;InterntResultNode&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;exit&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; prompt&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;Goodbye!&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;IntentResultAmazonType&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;recipe&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;

                &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; whatToMakeSlot&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;Dictionary&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;Slot&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; slots&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
                    &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; slot&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;Slot&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; whatToMakeSlot&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;Item&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
                    &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; item&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; slot&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;value &lt;span class=&quot;token keyword&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;

                    lastNode &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;IntentResult&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;node&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;InterntResultNode&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;recipe&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
                                            prompt&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&amp;quot;
                                            &lt;span class=&quot;token builtin&quot;&gt;If&lt;/span&gt; I were a real app&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; I would show a screen now on how to make a \&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;item&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;Want&lt;/span&gt; to &lt;span class=&quot;token keyword&quot;&gt;continue&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;
                                            &lt;span class=&quot;token string&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&amp;quot;
                                &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

            &lt;span class=&quot;token keyword&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;IntentResultAmazonType&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;help&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
                lastNode &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; greeting
            &lt;span class=&quot;token keyword&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
                lastNode &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; greeting
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; lastNode
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;/// Which NLUNodes should trigger the tray to close automatically&lt;/span&gt;

    configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;exitNodes &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;
        &lt;span class=&quot;token builtin&quot;&gt;InterntResultNode&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;exit&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;rawValue
    &lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;/// Callback when the tray is opened. The callback is called _after_ the animation has finished&lt;/span&gt;

    configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;onOpen &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token builtin&quot;&gt;LogController&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;shared&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;onOpen&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;/// Callback when the tray is closed. The callback is called _after_ the animation has finished&lt;/span&gt;

    configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;onClose &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token builtin&quot;&gt;LogController&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;shared&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;onClose&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;/// Callback when a `TrayListenerType` has occurred&lt;/span&gt;

    configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;onEvent &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;event &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt;
        &lt;span class=&quot;token builtin&quot;&gt;LogController&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;shared&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;onEvent &lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;event&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; tray&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;SpokestackTrayViewController&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;SpokestackTrayViewController&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; configuration&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; configuration&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    tray&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;addToParentView&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    tray&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;listen&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;clientId and clientSecret&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;The &lt;code&gt;clientId&lt;/code&gt; and &lt;code&gt;clientSecret&lt;/code&gt; props are where you pass your API tokens generated in your Spokestack account. First, &lt;a href=&quot;/create&quot;&gt;create a free account&lt;/a&gt;. Then, &lt;a href=&quot;/account/settings#api&quot;&gt;generate a token&lt;/a&gt; on the account settings page. Don’t worry, there’s no hidden subscription.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;nluModelUrls&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;These URLs point to the sample Minecraft NLU model files we have hosted on a CDN. &lt;code&gt;SpokestackTray&lt;/code&gt; downloads them automatically the first time the app launches and saves them to the app’s cache, so they only need to be downloaded once. NLU models can vary drastically in size, so it’s a good idea to keep them out of the app bundle and load them lazily instead.&lt;/p&gt;&lt;p&gt;At this point, you may be wondering what an “NLU” does. While Automatic Speech Recognition (ASR) provides us with a way to process speech into text, that text is rarely enough to figure out what the user wants the app to do. Natural Language Understanding (NLU) is the next step: it processes the text into what voice platforms call “intents”.&lt;/p&gt;&lt;p&gt;A good example is searching with voice. If your app has just said, “What would you like to search for?” and the user says, “Bananas,” you can be reasonably sure that the user means for the app to search for bananas without the help of an NLU.&lt;/p&gt;&lt;p&gt;But if the user initiated the interaction and said, “Search for bananas”, the NLU can parse that statement into an intent (e.g. “search”) with variables, or “slots” (e.g. “bananas”). 
If you were only using ASR, you’d probably end up searching for the whole sentence, “search for bananas”, rather than just “bananas”, which may yield different results. For more information on NLU in Spokestack, please check out &lt;a href=&quot;/docs/concepts/nlu&quot;&gt;our guide&lt;/a&gt;. If you’ve already built an NLU model in Alexa, Dialogflow, or Jovo, check out &lt;a href=&quot;/docs/integrations/export&quot;&gt;our guide on exporting existing NLU models from other platforms&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;handleIntent&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Intents, then, are commands for the app derived from what the user said. Now that you know where intents come from, it’s up to you to respond to them. There are two questions to answer for any given intent:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;What should the app say in response to the user? This is the return value of &lt;code&gt;handleIntent&lt;/code&gt; and is always required.&lt;/li&gt;&lt;li&gt;Should the app update the UI? Note that not all intents will need to make UI changes.
In the voice search example, if the user has just searched for “bananas”, the answer to question #1 might be to say, “Here are your search results.” The answer to question #2 would probably be to show the search results. Remember that the NLU doesn’t do the search for you; it just tells you the proper search terms.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;These answers could be written like this…&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;configuration&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;handleIntent &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;intent&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; slots&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; utterance &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt;

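    &lt;span class=&quot;token comment&quot;&gt;// `slots` is optional, so unwrap the search term before using it&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; intent &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;search&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
       &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; ingredient &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; slots&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;ingredient&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;value &lt;span class=&quot;token keyword&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token comment&quot;&gt;// searchService and viewModel stand in for objects from your own app&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; results &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; searchService&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;search&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;ingredient&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        viewModel&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;addSearchResults&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;results&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

        &lt;span class=&quot;token comment&quot;&gt;// Return a response&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;IntentResult&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;node&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;search_results&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; prompt&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;Here are your search results.&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;// Every path must return a response; the node name here is illustrative&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;IntentResult&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;node&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;search&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; prompt&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;What would you like to search for?&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;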
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;prompt&lt;/code&gt; will be synthesized using Spokestack’s TTS service, and then played using the device’s native audio player (unless the tray is in &lt;code&gt;silent&lt;/code&gt; mode).&lt;/p&gt;&lt;p&gt;The &lt;code&gt;node&lt;/code&gt; property is metadata to help you track conversation state, and the value is completely up to you. The only reason &lt;code&gt;SpokestackTray&lt;/code&gt; needs it is to determine whether to listen again after the prompt has been said.&lt;/p&gt;&lt;p&gt;If the node is specified in the &lt;a href=&quot;https://github.com/spokestack/spokestack-tray-ios/blob/master/SpokestackTray/Models/TrayConfiguration.swift#L79&quot;&gt;exitNodes&lt;/a&gt; prop, the conversation will stop and SpokestackTray will close.&lt;/p&gt;&lt;p&gt;If the node is not an exit node, &lt;code&gt;SpokestackTray&lt;/code&gt; will stay open and listen to the user again, and the process will repeat.&lt;/p&gt;&lt;h2 id=&quot;conclusion&quot;&gt;&lt;a href=&quot;#conclusion&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;&lt;p&gt;Hopefully, we’ve given you a glimpse into just how powerful &lt;code&gt;spokestack-tray-ios&lt;/code&gt; can be. 
Add the &lt;code&gt;SpokestackTray framework&lt;/code&gt; to your iOS app to start building elegant conversational user experiences.&lt;/p&gt;&lt;p&gt;For complete documentation, check out &lt;a href=&quot;https://github.com/spokestack/spokestack-tray-ios&quot;&gt;spokestack-tray-ios on GitHub&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;For support, we &lt;a href=&quot;/support&quot;&gt;offer multiple support channels&lt;/a&gt; to help you get started.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Integrating Spokestack in React Native]]></title><description><![CDATA[This tutorial guides you through the process of adding react-native-spokestack-tray into an existing React Native app. Turn your app into an independent voice assistant without changing your app's navigation or infrastructure.]]></description><link>https://www.spokestack.io/blog/integrating-spokestack-in-react-native</link><guid isPermaLink="false">https://www.spokestack.io/blog/integrating-spokestack-in-react-native</guid><pubDate>Tue, 01 Sep 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/c3b9daefbfa80418eb4a9792c8d0136b/8537d/react-native-tray-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACY0lEQVQoz2WSS2vUUBiGgyAKLrR0I4iiC62CGwXrQv+BWhGkv0BwJYgKuhFceteNiHgp1GqZtrS2Vm21i1qLddq5ZjKTmc441s4kmSQnc7Utrh5JMk4FAw8nl3Oe835fjmQkHFrIAj3hoMuCkmLzIyK4ebvI2JDJ9ITN6z6TQsJBqHXMZBVNFoikw+CwycCwiZ10kP4V6k2pJlexUw1yQcGG7UmkjTLnLucIRYtoBYe3ff3MTwax1AZCsbh43+LCXQuRtJG8VE38hFVMOYsRnSMwU+Far01nd54texSOd+XIxRx6HjxkdnwG2xWqVRY+JglOJHDSNSRP1EppYyhr6KFREp+vMzmtc+ZSniOnF9m2T6Vtv4o6J7BSFYxEBS1uU11coetUNydOnqWS+eWX7Eq1uPAmaHEHLaYhlBzZoEN7R5r2A2l2HMqw83CGdFBgp5xWVSJd59WTIV4+HvB6KxWiFoWI6bHcHE3ZppxyyAZNNu9WkNpkpK0ym3YpKLMltFgJd53eDFHNrnq4G0iuoBi1WrjS8Tc6gUHBxAebrwsNvgQbzHxrePcltYadrHkb/wyV/CDhEsuRkrdeKsYstJjtE7cpxmxmp0x6nkV5N6qyfv32uHXjDlfOXyU/X0SLCV8W9sWe0O/bOm7ssurw4t4jxvtHKK/B1FKdRQG6pnOs8ygH93Yw1vuecmaFQtT0QhSaFUruz9D/0my0+85I1tDUChWzTuh7mZBSI68WCDwfJfB0hPCnBJZ7uJuVuaMn9COb/+H2x2Vp3sAIlzCiBoWIRTm9QiWziiGXve/+fL9s9/kPGnvq3tGjJU0AAAAASUVORK5CYII=&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;How to Integrate Spokestack in React Native&quot; title=&quot;How to Integrate Spokestack in React Native&quot; src=&quot;/static/c3b9daefbfa80418eb4a9792c8d0136b/05162/react-native-tray-hero.png&quot; srcSet=&quot;/static/c3b9daefbfa80418eb4a9792c8d0136b/2eeed/react-native-tray-hero.png 294w,/static/c3b9daefbfa80418eb4a9792c8d0136b/0d6a1/react-native-tray-hero.png 588w,/static/c3b9daefbfa80418eb4a9792c8d0136b/05162/react-native-tray-hero.png 1175w,/static/c3b9daefbfa80418eb4a9792c8d0136b/8537d/react-native-tray-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Not long ago, adding voice to any mobile app was not only time-consuming, but difficult and convoluted. This was especially true when adding voice experiences on multiple platforms, as might be the case when using React Native. Our mission at Spokestack is to make it as easy as possible to make your apps fully voice-enabled.&lt;/p&gt;&lt;p&gt;After building the services needed to make voice interaction work, including &lt;a href=&quot;/docs/concepts/wakeword-models&quot;&gt;Wake Word&lt;/a&gt;, &lt;a href=&quot;/docs/concepts/asr&quot;&gt;Speech Recognition&lt;/a&gt;, &lt;a href=&quot;/docs/concepts/nlu&quot;&gt;Natural Language Understanding&lt;/a&gt;, and &lt;a href=&quot;/docs/concepts/tts&quot;&gt;Text-to-speech&lt;/a&gt;, we started working on ways users could integrate these services without having to completely rewrite their applications.&lt;/p&gt;&lt;p&gt;Introducing &lt;a href=&quot;https://github.com/spokestack/react-native-spokestack-tray&quot;&gt;react-native-spokestack-tray&lt;/a&gt;!&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/952ba181a17ca0a6dbf6318dd419881b/react-native-tray-demo.gif&quot; alt=&quot;React Native Spokestack Tray Example&quot;/&gt;&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://github.com/spokestack/react-native-spokestack-tray&quot;&gt;react-native-spokestack-tray&lt;/a&gt; is a React Native component that is designed to work in any application, regardless of its layout or navigation. It uses multiple existing React Native plugins, including &lt;a href=&quot;https://github.com/spokestack/react-native-spokestack&quot;&gt;react-native-spokestack&lt;/a&gt;, to add voice experiences. 
With &lt;strong&gt;on-device&lt;/strong&gt; wake word, ASR, and NLU, the tray’s &lt;code&gt;silent&lt;/code&gt; mode works completely offline; TTS is the only service that requires a network.&lt;/p&gt;&lt;p&gt;With a few required props (and &lt;a href=&quot;https://github.com/spokestack/react-native-spokestack-tray#spokestacktray--component-props&quot;&gt;lots of optional ones&lt;/a&gt;), you can start building a customizable voice experience without the hassle that usually comes with listening for a wake word, working with a microphone, or playing audio in iOS and Android.&lt;/p&gt;&lt;p&gt;This tutorial will guide you through the process of installing &lt;code&gt;react-native-spokestack-tray&lt;/code&gt; as well as using the &lt;code&gt;&amp;lt;SpokestackTray /&amp;gt;&lt;/code&gt; component to respond to user intents. We won’t go through the process of &lt;a href=&quot;https://reactnative.dev/docs/environment-setup&quot;&gt;setting up a new React Native project&lt;/a&gt;, but be sure to use the React Native CLI path and not Expo.&lt;/p&gt;&lt;h2 id=&quot;installation&quot;&gt;&lt;a href=&quot;#installation&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Installation&lt;/h2&gt;&lt;p&gt;First, install the dependencies. 
Here’s a one-liner to install &lt;code&gt;react-native-spokestack-tray&lt;/code&gt; and its peer dependencies.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;&lt;span class=&quot;token function&quot;&gt;npm&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;install&lt;/span&gt; react-native-spokestack-tray react-native-spokestack @react-native-community/async-storage @react-native-community/netinfo react-native-video rn-fetch-blob react-native-haptic-feedback react-native-linear-gradient react-native-permissions&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If you’re wondering why these dependencies are needed, we’ve listed all of their purposes &lt;a href=&quot;https://github.com/spokestack/react-native-spokestack-tray#each-dependency-by-its-usage&quot;&gt;in the tray’s README&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Now that you have all of the node modules, we need to update some native files. 
We’ll go through each of them by platform.&lt;/p&gt;&lt;h2 id=&quot;ios-installation&quot;&gt;&lt;a href=&quot;#ios-installation&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;iOS installation&lt;/h2&gt;&lt;h3 id=&quot;edit-podfile&quot;&gt;&lt;a href=&quot;#edit-podfile&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Edit Podfile&lt;/h3&gt;&lt;p&gt;Our main dependency (react-native-spokestack) makes use of relatively new APIs only available in iOS 13+. 
Make sure to set your deployment target to iOS 13 at the top of your Podfile:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;ruby&quot;&gt;&lt;pre class=&quot;language-ruby&quot;&gt;&lt;code class=&quot;language-ruby&quot;&gt;platform &lt;span class=&quot;token symbol&quot;&gt;:ios&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;#x27;13.0&amp;#x27;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We use &lt;a href=&quot;https://github.com/react-native-community/react-native-permissions&quot;&gt;react-native-permissions&lt;/a&gt; to check and request the Microphone permission (iOS and Android) and the Speech Recognition permission (iOS only). This library separates each permission into its own pod to avoid inflating your app with code you don’t use. Add the following pods to your Podfile:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;ruby&quot;&gt;&lt;pre class=&quot;language-ruby&quot;&gt;&lt;code class=&quot;language-ruby&quot;&gt;target &lt;span class=&quot;token string&quot;&gt;&amp;#x27;SpokestackTrayExample&amp;#x27;&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;do&lt;/span&gt;
  &lt;span class=&quot;token comment&quot;&gt;# ...&lt;/span&gt;
  permissions_path &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;#x27;../node_modules/react-native-permissions/ios&amp;#x27;&lt;/span&gt;
  pod &lt;span class=&quot;token string&quot;&gt;&amp;#x27;Permission-Microphone&amp;#x27;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token symbol&quot;&gt;:path&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;&lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter tag&quot;&gt;#{&lt;/span&gt;permissions_path&lt;span class=&quot;token delimiter tag&quot;&gt;}&lt;/span&gt;&lt;/span&gt;/Microphone.podspec&amp;quot;&lt;/span&gt;
  pod &lt;span class=&quot;token string&quot;&gt;&amp;#x27;Permission-SpeechRecognition&amp;#x27;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token symbol&quot;&gt;:path&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;&lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter tag&quot;&gt;#{&lt;/span&gt;permissions_path&lt;span class=&quot;token delimiter tag&quot;&gt;}&lt;/span&gt;&lt;/span&gt;/SpeechRecognition.podspec&amp;quot;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We need to use &lt;code&gt;use_frameworks!&lt;/code&gt; in our Podfile because a couple of our dependencies are written using Swift.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;ruby&quot;&gt;&lt;pre class=&quot;language-ruby&quot;&gt;&lt;code class=&quot;language-ruby&quot;&gt;target &lt;span class=&quot;token string&quot;&gt;&amp;#x27;SpokestackTrayExample&amp;#x27;&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;do&lt;/span&gt;
  use_frameworks&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;
  &lt;span class=&quot;token comment&quot;&gt;#...&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;For the time being, &lt;code&gt;use_frameworks!&lt;/code&gt; does not work with Flipper, so we also need to disable Flipper. Remove any Flipper-related lines in your Podfile. In React Native 0.63.2, they look like this:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;ruby&quot;&gt;&lt;pre class=&quot;language-ruby&quot;&gt;&lt;code class=&quot;language-ruby&quot;&gt;  &lt;span class=&quot;token comment&quot;&gt;# X Remove or comment out these lines X&lt;/span&gt;
  use_flipper&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;
  post_install &lt;span class=&quot;token keyword&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt;installer&lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt;
    flipper_post_install&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;installer&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;token keyword&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;token comment&quot;&gt;# XX&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Remove your existing Podfile.lock and Pods folder to ensure no conflicts, then install the pods:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;bash&quot;&gt;&lt;pre class=&quot;language-bash&quot;&gt;&lt;code class=&quot;language-bash&quot;&gt;$ npx pod-install&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Refer to the &lt;a href=&quot;https://github.com/spokestack/react-native-spokestack-tray/blob/develop/example/ios/Podfile&quot;&gt;Podfile in our example&lt;/a&gt; for a working Podfile.&lt;/p&gt;&lt;h3 id=&quot;edit-infoplist&quot;&gt;&lt;a href=&quot;#edit-infoplist&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Edit Info.plist&lt;/h3&gt;&lt;p&gt;Add the following to your Info.plist to enable permissions. 
In Xcode, also ensure your iOS deployment target is set to 13.0.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;xml&quot;&gt;&lt;pre class=&quot;language-xml&quot;&gt;&lt;code class=&quot;language-xml&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;key&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;NSMicrophoneUsageDescription&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;key&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;string&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;This app uses the microphone to hear voice commands&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;string&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;key&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;NSSpeechRecognitionUsageDescription&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;key&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;string&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;This app uses speech recognition to process voice commands&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;/&lt;/span&gt;string&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;edit-appdelegatem&quot;&gt;&lt;a href=&quot;#edit-appdelegatem&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Edit AppDelegate.m&lt;/h3&gt;&lt;h4 id=&quot;add-avfoundation-to-imports&quot;&gt;&lt;a href=&quot;#add-avfoundation-to-imports&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 
3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Add AVFoundation to imports&lt;/h4&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;objc&quot;&gt;&lt;pre class=&quot;language-objc&quot;&gt;&lt;code class=&quot;language-objc&quot;&gt;&lt;span class=&quot;token macro property&quot;&gt;&lt;span class=&quot;token directive-hash&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;token directive keyword&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;token expression&quot;&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;lt;&lt;/span&gt;AVFoundation&lt;span class=&quot;token operator&quot;&gt;/&lt;/span&gt;AVFoundation&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;h&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h4 id=&quot;audiosession-category&quot;&gt;&lt;a href=&quot;#audiosession-category&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;AudioSession category&lt;/h4&gt;&lt;p&gt;Set the AudioSession category to enable microphone input and play from the speaker by default. 
This also enables input and playback over Bluetooth, plus output over AirPlay.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;objc&quot;&gt;&lt;pre class=&quot;language-objc&quot;&gt;&lt;code class=&quot;language-objc&quot;&gt;&lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;BOOL&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;application&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;UIApplication &lt;span class=&quot;token operator&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;application didFinishLaunchingWithOptions&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;NSDictionary &lt;span class=&quot;token operator&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;launchOptions
&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
  AVAudioSession &lt;span class=&quot;token operator&quot;&gt;*&lt;/span&gt;session &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;AVAudioSession sharedInstance&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;session setCategory&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;AVAudioSessionCategoryPlayAndRecord
     mode&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;AVAudioSessionModeDefault
  options&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;AVAudioSessionCategoryOptionDefaultToSpeaker &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt; AVAudioSessionCategoryOptionAllowAirPlay &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt; AVAudioSessionCategoryOptionAllowBluetoothA2DP &lt;span class=&quot;token operator&quot;&gt;|&lt;/span&gt; AVAudioSessionCategoryOptionAllowBluetooth
    error&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;nil&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;session setActive&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;YES error&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;nil&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;

  &lt;span class=&quot;token comment&quot;&gt;// ...&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h4 id=&quot;remove-flipper&quot;&gt;&lt;a href=&quot;#remove-flipper&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Remove Flipper&lt;/h4&gt;&lt;p&gt;While Flipper works on fixing their pod for &lt;code&gt;use_frameworks!&lt;/code&gt;, we must disable Flipper. We already removed the Flipper dependencies from the Podfile above, but some code that imports Flipper remains in AppDelegate.m. There are two ways to fix this.&lt;/p&gt;&lt;ol&gt;&lt;li&gt;You can disable Flipper imports without removing any code from the AppDelegate. To do this, open your xcworkspace file in Xcode. 
Go to your target, then Build Settings, search for “C Flags”, and remove &lt;code&gt;-DFB_SONARKIT_ENABLED=1&lt;/code&gt; from the flags.&lt;/li&gt;&lt;li&gt;Remove all Flipper-related code from your AppDelegate.m.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;In our &lt;a href=&quot;https://github.com/spokestack/react-native-spokestack-tray/tree/develop/example&quot;&gt;example app&lt;/a&gt;, we’ve chosen option 1 and left the Flipper code in place so we can re-enable it if Flipper gets &lt;code&gt;use_frameworks!&lt;/code&gt; support working in the future.&lt;/p&gt;&lt;h2 id=&quot;android-installation&quot;&gt;&lt;a href=&quot;#android-installation&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Android installation&lt;/h2&gt;&lt;h3 id=&quot;edit-androidmanifestxml&quot;&gt;&lt;a href=&quot;#edit-androidmanifestxml&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Edit 
AndroidManifest.xml&lt;/h3&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;xml&quot;&gt;&lt;pre class=&quot;language-xml&quot;&gt;&lt;code class=&quot;language-xml&quot;&gt;    &lt;span class=&quot;token comment&quot;&gt;&amp;lt;!-- For TTS --&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;uses-permission&lt;/span&gt; &lt;span class=&quot;token attr-name&quot;&gt;&lt;span class=&quot;token namespace&quot;&gt;android:&lt;/span&gt;name&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;android.permission.INTERNET&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;/&amp;gt;&lt;/span&gt;&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;&amp;lt;!-- For wake word and ASR --&amp;gt;&lt;/span&gt;
    &lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;uses-permission&lt;/span&gt; &lt;span class=&quot;token attr-name&quot;&gt;&lt;span class=&quot;token namespace&quot;&gt;android:&lt;/span&gt;name&lt;/span&gt;&lt;span class=&quot;token attr-value&quot;&gt;&lt;span class=&quot;token punctuation attr-equals&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;android.permission.RECORD_AUDIO&lt;span class=&quot;token punctuation&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;/&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;edit-appbuildgradle&quot;&gt;&lt;a href=&quot;#edit-appbuildgradle&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Edit app/build.gradle&lt;/h3&gt;&lt;p&gt;Add the following lines:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;java&quot;&gt;&lt;pre class=&quot;language-java&quot;&gt;&lt;code class=&quot;language-java&quot;&gt;android &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// ...&lt;/span&gt;
    defaultConfig &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token comment&quot;&gt;// ...&lt;/span&gt;
        multiDexEnabled &lt;span class=&quot;token boolean&quot;&gt;true&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// ...&lt;/span&gt;
    packagingOptions &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        exclude &lt;span class=&quot;token string&quot;&gt;&amp;#x27;META-INF/INDEX.LIST&amp;#x27;&lt;/span&gt;
        exclude &lt;span class=&quot;token string&quot;&gt;&amp;#x27;META-INF/DEPENDENCIES&amp;#x27;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id=&quot;usage&quot;&gt;&lt;a href=&quot;#usage&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Usage&lt;/h2&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;Build your own voice interface to talk directly to your customers&quot; src=&quot;https://www.youtube-nocookie.com/embed/AvhQ6-9nCrQ?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;p&gt;The &lt;a href=&quot;https://github.com/spokestack/react-native-spokestack-tray/tree/develop/example&quot;&gt;react-native-spokestack-tray example app&lt;/a&gt; uses the “Spokestack” wake word and &lt;a href=&quot;/blog/porting-the-alexa-minecraft-skill-to-ios-using-spokestack&quot;&gt;sample Minecraft NLU models&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;In this example, the following code is used to add the &lt;code&gt;&amp;lt;SpokestackTray /&amp;gt;&lt;/code&gt; component:&lt;/p&gt;&lt;div 
class=&quot;gatsby-highlight&quot; data-language=&quot;jsx&quot;&gt;&lt;pre class=&quot;language-jsx&quot;&gt;&lt;code class=&quot;language-jsx&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token tag&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;SpokestackTray&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token attr-name&quot;&gt;clientId&lt;/span&gt;&lt;span class=&quot;token script language-javascript&quot;&gt;&lt;span class=&quot;token script-punctuation punctuation&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;process&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;env&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token constant&quot;&gt;SPOKESTACK_CLIENT_ID&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token attr-name&quot;&gt;clientSecret&lt;/span&gt;&lt;span class=&quot;token script language-javascript&quot;&gt;&lt;span class=&quot;token script-punctuation punctuation&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;process&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;env&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token constant&quot;&gt;SPOKESTACK_CLIENT_SECRET&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token attr-name&quot;&gt;handleIntent&lt;/span&gt;&lt;span class=&quot;token script language-javascript&quot;&gt;&lt;span class=&quot;token script-punctuation punctuation&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;handleIntent&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/span&gt;
  &lt;span class=&quot;token attr-name&quot;&gt;nluModelUrls&lt;/span&gt;&lt;span class=&quot;token script language-javascript&quot;&gt;&lt;span class=&quot;token script-punctuation punctuation&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    nlu&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;#x27;https://d3dmqd7cy685il.cloudfront.net/nlu/production/shared/XtASJqxkO6UwefOzia-he2gnIMcBnR2UCF-VyaIy-OI/nlu.tflite&amp;#x27;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    vocab&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;token string&quot;&gt;&amp;#x27;https://d3dmqd7cy685il.cloudfront.net/nlu/production/shared/XtASJqxkO6UwefOzia-he2gnIMcBnR2UCF-VyaIy-OI/vocab.txt&amp;#x27;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    metadata&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;token string&quot;&gt;&amp;#x27;https://d3dmqd7cy685il.cloudfront.net/nlu/production/shared/XtASJqxkO6UwefOzia-he2gnIMcBnR2UCF-VyaIy-OI/metadata.json&amp;#x27;&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;/&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;clientid-and-clientsecret&quot;&gt;&lt;a href=&quot;#clientid-and-clientsecret&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;&lt;code&gt;clientId&lt;/code&gt; and &lt;code&gt;clientSecret&lt;/code&gt;&lt;/h3&gt;&lt;p&gt;The &lt;code&gt;clientId&lt;/code&gt; and &lt;code&gt;clientSecret&lt;/code&gt; props are where you pass your API tokens generated in your Spokestack account. First, &lt;a href=&quot;/create&quot;&gt;create a free account&lt;/a&gt;. Then, &lt;a href=&quot;/account/settings#api&quot;&gt;generate a token&lt;/a&gt; on the account settings page. Don’t worry, there’s no hidden subscription.&lt;/p&gt;&lt;p&gt;Rather than hardcoding these values, we recommend saving them in your environment on your local machine and in CI for deployments. 
Once they’re saved, run the React Native packager in a new terminal and start the app using &lt;code&gt;npm run ios&lt;/code&gt; or &lt;code&gt;npm run android -- --device&lt;/code&gt; (note that Android requires &lt;a href=&quot;https://reactnative.dev/docs/running-on-device&quot;&gt;a real device&lt;/a&gt; for ASR to work).&lt;/p&gt;&lt;h3 id=&quot;nlumodelurls&quot;&gt;&lt;a href=&quot;#nlumodelurls&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;&lt;code&gt;nluModelUrls&lt;/code&gt;&lt;/h3&gt;&lt;p&gt;These URLs point to the sample Minecraft NLU model files that we host on a CDN. &lt;code&gt;SpokestackTray&lt;/code&gt; downloads them automatically the first time the app launches, then saves them to the app’s cache so they only need to be downloaded once. NLU models can vary drastically in size, so we thought it better to load them lazily than to include them in the app bundle.&lt;/p&gt;&lt;p&gt;At this point, you may be wondering what an NLU does. While Automatic Speech Recognition (ASR) gives us a way to turn speech into text, that text is rarely enough to figure out what the user wants the app to do. Natural Language Understanding (NLU) is the next step: it processes the text into what voice platforms call “intents”.&lt;/p&gt;&lt;p&gt;A good example is searching with voice. 
If your app has just said, “What would you like to search for?” and the user says, “Bananas”, you can be reasonably sure that the user means for the app to search for bananas without the help of an NLU.&lt;/p&gt;&lt;p&gt;But if the user initiated the interaction and said, “Search for bananas”, the NLU can parse that statement into an intent (e.g. “search”) with variables (e.g. “bananas”). If you were only using ASR, you’d probably end up searching for the whole sentence, “search for bananas”, rather than just “bananas”, which may yield different results. For more information on NLU in Spokestack, please check out &lt;a href=&quot;/docs/concepts/nlu&quot;&gt;our guide&lt;/a&gt;. If you’ve already built an NLU model in Alexa, Dialogflow, or Jovo, check out &lt;a href=&quot;/docs/integrations/export&quot;&gt;our guide on exporting existing NLU models from other platforms&lt;/a&gt;.&lt;/p&gt;&lt;h3 id=&quot;handleintent&quot;&gt;&lt;a href=&quot;#handleintent&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;&lt;code&gt;handleIntent&lt;/code&gt;&lt;/h3&gt;&lt;p&gt;So, intents are commands for the app based on what the user said. Now that you know how you get intents, it’s your responsibility to respond to those intents. There are two questions to answer for any given intent:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;What should the app say in response to the user? 
This is the return value of &lt;code&gt;handleIntent&lt;/code&gt; and is always required.&lt;/li&gt;&lt;li&gt;Should the app update the UI? Note that not all intents will need to make UI changes.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;In the voice search example, if the user has just searched for “bananas”, the answer to question #1 might be to say, “Here are your search results.” The answer to question #2 would probably be to show the search results. Remember that the NLU doesn’t do the search for you; it just tells you the proper search terms.&lt;/p&gt;&lt;p&gt;These answers could be written like this…&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;ts&quot;&gt;&lt;pre class=&quot;language-ts&quot;&gt;&lt;code class=&quot;language-ts&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;function&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;handleIntent&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
  intent&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;string&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  slots&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;any&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
  utterance&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;string&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;intent &lt;span class=&quot;token operator&quot;&gt;===&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;#x27;search&amp;#x27;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;// Search for recipes with &amp;quot;bananas&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;const&lt;/span&gt; results &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; searchService&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;search&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
      slots&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;ingredient
    &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;// Show the results&lt;/span&gt;
    navigation&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;navigate&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;#x27;SearchResults&amp;#x27;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
      results
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;// Return a response&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
      prompt&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;#x27;Here are your search results.&amp;#x27;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
      node&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;#x27;search_results&amp;#x27;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;prompt&lt;/code&gt; will then be synthesized using Spokestack’s TTS service. It then gets played using the device’s native audio player (unless the tray is in &lt;code&gt;silent&lt;/code&gt; mode).&lt;/p&gt;&lt;p&gt;The &lt;code&gt;node&lt;/code&gt; property is metadata to help you track conversation state, and the value is completely up to you. The only reason &lt;code&gt;SpokestackTray&lt;/code&gt; needs it is to determine whether to listen again after the prompt has been said.&lt;/p&gt;&lt;p&gt;If the node is specified in the &lt;a href=&quot;https://github.com/spokestack/react-native-spokestack-tray#optional-exitnodes&quot;&gt;&lt;code&gt;exitNodes&lt;/code&gt;&lt;/a&gt; prop, the conversation will stop and &lt;code&gt;SpokestackTray&lt;/code&gt; will close.&lt;/p&gt;&lt;p&gt;If the &lt;code&gt;node&lt;/code&gt; is not an exit node, &lt;code&gt;SpokestackTray&lt;/code&gt; will stay open and listen to the user again, and the process will repeat.&lt;/p&gt;&lt;h2 id=&quot;conclusion&quot;&gt;&lt;a href=&quot;#conclusion&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;&lt;p&gt;Hopefully, we’ve given you a glimpse into just how powerful &lt;code&gt;react-native-spokestack-tray&lt;/code&gt; can be. 
Add the &lt;code&gt;SpokestackTray&lt;/code&gt; component to your React Native app to start building elegant conversational user experiences.&lt;/p&gt;&lt;p&gt;For complete documentation, check out &lt;a href=&quot;https://github.com/spokestack/react-native-spokestack-tray&quot;&gt;react-native-spokestack-tray on GitHub&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;For support, we &lt;a href=&quot;/support&quot;&gt;offer multiple support channels&lt;/a&gt; to help you get started.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[What If You're the Product?]]></title><description><![CDATA[How to keep your voice presence from giving away presents]]></description><link>https://www.spokestack.io/blog/what-if-youre-the-product</link><guid isPermaLink="false">https://www.spokestack.io/blog/what-if-youre-the-product</guid><pubDate>Wed, 29 Jul 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:898px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/5ee00a2b7442021cb7a6ee8fd6db63db/2ddbb/what-if-youre-the-product.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:99.31972789115646%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAUCAYAAACNiR0NAAAACXBIWXMAAAsTAAALEwEAmpwYAAAFB0lEQVQ4y0WUW2xURRzGj9a227N7brt79uy57p5taSnQ1mKRokZoQUpBCoqgoRWRu6gIRa2CSEFRKFTBSxHoFhFoBLRa4QGjphFF4y0+KIqIRo2RFxPiJdHE6M+cg8SHyWRmMr/55vvmP4JjmwTNdUxcO03Gs8l6DraV/m/ewnMdMp6D69jh2LZMFMVAVXUkyUCL2xiGTTbjIAQbLTNFKpUkkUyS9XP4uVwIDA65CA1gnmthmSZGymD1EpHDPYVsWC3SekOEK0cnyHiZAGig6wkksYi26+sZN6aGslyO/qdXMbtlPIqihOoC5QE0Y+s892gRf54V+Od7gd9PC/zxncvXX8ykalQG4aK6uHgZr2xt5ZvjG3lr9z388uF2lrZeh6Jq4VX8jIOeMmmo09m+vJANCy9hR4fAyQERuJ0zpycyakT2AtAwUphqhKdWTeXzo12cGnyEU8e2smjOeGRFxc+4oa8pw2BsTSndi8aybrpP6xUl9D/u8Nt3DXx7aiSTGgKFaYOUrmPHI3TfMYEf39zCmVfXcO719Rw72E0m6+PaZgg10mnqa8p44d7JPDuvjg1NNl8dXsXZoQ5+/CzGgtt8hLiqUVdbzZTGq9nafjPv9a9jf/e9vP5ynk8+GKKlZRqJuErOz5AyUqyYN43jW+bQf9cEds2yOPfGWn499y7r2mdRXl6JEC0RaWtro7evjxePDLDvwCG6d/SwZ+8Btm3dxKZH1xPX5BCYNg062xeyc+08Ni+bxrabR3Ci7x4+fuco58//QlPTlABYwt0rVjJ49BhHBgY5cfJDHnp4PZse6WTOTS28NvgSY+rHkTZ0bNti3cqF7O3qYNuaFaxdPJuOW5sYGnyeM2e/obq6GkGMlPDYY5sZGBjgye1P8dGnnzO1uZn8rh4aGieS73qQ/MblOFYaz3Po3tjB7sc76LpvCe2t09l4Zxv8/Rf5/F50XUcoKS5m5+4+8nv3cdu8Wxl6+ySzb5pFvreXa65tYPHE4Zzes5TZMybhOTaHDuQ5fjhPX9caeh5YwPv5Bzj/8080Nk7CdV0ERYry8u7NHOx9goqKcubOnUttbS1VVZfjuS6qrLF45jXUjx5J5bBSXt2/k9XLF7BywS10zm/mh1c6OTk0SNowKSsbhjDS1vjyYAdHti0hEddQVRWxJIJlu7iOg2kYJOJJZCXB4ubRnHj2bq6o9CkQBKSiApZOqaX9rqU4bhbfL0VwFZFn5l/FmJoKRFHEcWw0VSGb9cNPwUglQ/9kRWNZSz3rl7WgazKxkgiyWESkqIjpN86lvHw4mYyPoCoqhiajyTJKUBXZLLGoSNa/AEwmtBBq6DqWkcKxLNJxmWpbQo8VExOjjLt6PK7rkc3mEDRFIYDKkoSmaXheJgQGBgcVpGkKyUScRCIRQtV4igkjLPoX1bBlps+B+xsYUz8Wy7Tw/RyCHJOQYzGCtAP/TNPk0ksEHNdl8uRmVFVBDtUroRWSolLtxelZ3siOORV81TuDCZMmhqHkAg8DWPC4Cy8roKCggLq6K8NFWZKprqpBlmLIsoQkScRiUYoKC4kUFnB5VSXjKkx2dc5nfOPkUGFp6TAEKRpFikUJ+khxMclkkng8jhSLhSEpshwqDCwJodEoMVFET2gUFxfT2DSTyhGjcByXXK4MQZElVFlGlZX/N1+8onbhGQUeBy0YB15aeoJyN81wL01NhY/reWHCQSj/AsTEwB4gR
4wYAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Doesn&amp;#x27;t look like these folks are having fun being products&quot; title=&quot;Doesn&amp;#x27;t look like these folks are having fun being products&quot; src=&quot;/static/5ee00a2b7442021cb7a6ee8fd6db63db/2ddbb/what-if-youre-the-product.png&quot; srcSet=&quot;/static/5ee00a2b7442021cb7a6ee8fd6db63db/2eeed/what-if-youre-the-product.png 294w,/static/5ee00a2b7442021cb7a6ee8fd6db63db/0d6a1/what-if-youre-the-product.png 588w,/static/5ee00a2b7442021cb7a6ee8fd6db63db/2ddbb/what-if-youre-the-product.png 898w&quot; sizes=&quot;(max-width: 898px) 100vw, 898px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;“If you’re not the customer, you’re the product.” By now, tech-savvy consumers are familiar with this warning. It’s almost a trope. Usually it’s a caveat about &lt;a href=&quot;https://www.wired.com/story/google-tracks-you-privacy/&quot;&gt;Google’s collection of personal data&lt;/a&gt; or &lt;a href=&quot;https://www.wired.com/story/cambridge-analytica-facebook-privacy-awakening/&quot;&gt;Facebook’s stewardship&lt;/a&gt; of their own data trove, but there are plenty of other applications as well. And it’s true: Personal data can be dangerous in the wrong hands.&lt;/p&gt;&lt;p&gt;It’s not true &lt;em&gt;only&lt;/em&gt; for consumers, though; it’s also true for your business. For real-world examples, see &lt;a href=&quot;https://www.theverge.com/tldr/2019/9/19/20874818/amazon-allbirds-shoe-clone-copy-sneaker-206-collective-private-label&quot;&gt;Allbirds&lt;/a&gt;. Or &lt;a href=&quot;https://www.wsj.com/articles/amazon-tech-startup-echo-bezos-alexa-investment-fund-11595520249&quot;&gt;these Alexa Fund participants&lt;/a&gt;. Or &lt;a href=&quot;https://www.businessinsider.com/amazon-echo-ubi-smart-speaker-2020-7&quot;&gt;the creator of an eerily Echo-like device&lt;/a&gt;. 
There’s a reason trade secrets are called “secrets”.&lt;/p&gt;&lt;h2 id=&quot;an-uncomfortable-truth&quot;&gt;&lt;a href=&quot;#an-uncomfortable-truth&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;An uncomfortable truth&lt;/h2&gt;&lt;p&gt;If you’re browsing this site, it’s probably safe to say you’re familiar with voice technology and its meteoric rise over the past several years. Smart speakers are in people’s homes, voice assistants are on consumers’ phones, and voice is being touted as the next great way for businesses to engage their customers. To do that, you just set up a “skill” or “action” on these platforms—you just configure some text input and output, and the smart speaker/voice assistant platforms will do all the heavy lifting of “understanding” your users’ verbal requests and turning your text responses into audio to read to them. 
You give them the data, and they give you … &lt;a href=&quot;https://www.bbc.com/worklife/article/20180411-dealing-with-clients-who-expect-you-to-work-for-free&quot;&gt;exposure&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;The Basic Elements of Voice Interfaces&quot; src=&quot;https://www.youtube-nocookie.com/embed/1x4MdTKEy3E?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;p&gt;This might seem like a reasonable trade—after all, you can’t build automatic speech recognition (ASR), natural language understanding (NLU), and text-to-speech (TTS) systems—that’s a ton of work and maintenance for an uncertain payoff, and systems like that aren’t anywhere near the scope of your business. So you sign up for a free developer account and make sure you have a presence on the platform. It’s not like you gave Google or Amazon credentials to your database, so what are you &lt;em&gt;actually&lt;/em&gt; giving up in the process?&lt;/p&gt;&lt;p&gt;To answer that question, we have to talk a bit about how the voice systems work. By now it won’t surprise you to learn that they’re built on massive troves of data, some of it public, but much of it “proprietary”. 
You don’t have to know how to &lt;a href=&quot;https://www.tensorflow.org/&quot;&gt;flow a tensor&lt;/a&gt; to know that machine learning, much of it in the form of neural networks, powers some of today’s most impressive software, voice tech systems among them.&lt;/p&gt;&lt;p&gt;I’m greatly simplifying here, but to train an ASR system, you feed a model a whole bunch of audio data (thousands of hours, if you have it) along with transcripts of that audio; thanks to the magic of math, the model learns to predict the latter from the former. Once you have a baseline system trained, you can improve it by feeding it raw audio (different from the original training data), letting it do the transcription, and correcting the transcriptions after the fact if necessary. This is &lt;a href=&quot;https://www.cnet.com/how-to/amazon-and-google-are-listening-to-your-voice-recordings-heres-what-we-know/&quot;&gt;what Amazon and Google do&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;Next in the process, NLU models learn to turn natural language (transcripts from an ASR, which can be single words or full sentences) into something a little more structured so that software can reliably act on it, and TTS systems learn to turn text responses into audio so that they can be read to the user.&lt;/p&gt;&lt;p&gt;So there you have it: The data you provide in order to set up a voice app on one of the big platforms, combined with the interactions users have with your app, is all the vendors need to continuously improve their systems. That’s not so bad, though, right? 
After all, you &lt;em&gt;want&lt;/em&gt; their voice tech to be as good as it can be so that your users have good experiences with your app.&lt;/p&gt;&lt;p&gt;Keep the cautionary tales from Amazon in mind as we go further down the rabbit hole.&lt;/p&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;How Did We Get Here?&quot; src=&quot;https://www.youtube-nocookie.com/embed/B-TIVeN2Kho?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;p&gt;A current research area in the NLU field is the so-called “end-to-end” dialogue system. It’s another neural model, but instead of translating audio into text or text into “meaning”, this one takes text input (a user request) and directly returns a text response. A quick Google search will turn up a wealth of papers on &lt;a href=&quot;https://arxiv.org/abs/1604.04562&quot;&gt;creating the models themselves&lt;/a&gt; and &lt;a href=&quot;https://www.aclweb.org/anthology/N18-3006&quot;&gt;using automated systems and crowdsourcing&lt;/a&gt; to generate or augment training data for such systems.&lt;/p&gt;&lt;p&gt;But what’s better than data generated by one model under a single set of conditions? That’s right—data collected from and annotated by humans. Enter one of the newest voice assistant features.&lt;/p&gt;&lt;p&gt;As usual, the vendors all have a different name for this feature. 
Google calls it &lt;a href=&quot;https://developers.google.com/assistant/app/overview&quot; title=&quot;Google App Actions&quot;&gt;App Actions&lt;/a&gt;, Apple has &lt;a href=&quot;https://support.apple.com/en-us/HT209055&quot; title=&quot;Siri Shortcuts&quot;&gt;Siri Shortcuts&lt;/a&gt;, and Amazon just announced &lt;a href=&quot;https://developer.amazon.com/en-US/blogs/alexa/alexa-skills-kit/2020/07/you-can-now-seamlessly-connect-alexa-skills-to-mobile-apps&quot; title=&quot;Alexa for Apps&quot;&gt;Alexa for Apps&lt;/a&gt;. They’re all fundamentally the same thing, though, with slightly different trimmings: They allow you to surface information from deep within your app in response to a request to a voice assistant controlled by someone else.&lt;/p&gt;&lt;p&gt;Since I’ve spent some words focusing on Amazon already, let’s dig a little further into their implementation of this feature. Take a look at &lt;a href=&quot;https://developer.amazon.com/en-US/docs/alexa/alexa-for-apps/skill-connection-request-reference.html#payload-example&quot;&gt;the response format&lt;/a&gt; for an Alexa skill that wants to connect to an app. Notice the &lt;code&gt;prompts&lt;/code&gt; section, which lets you provide some text for Alexa to read right before it drops the user into your app to see the response. Convenient, right? Well, sort of. 
If a user’s started an interaction like this via voice, it might be nice to actually be able to finish it with voice inside the app (more on that later), but at least this is a smooth-ish transition between the two, and it helps the user get &lt;em&gt;some&lt;/em&gt; answer even if the screen is locked.&lt;/p&gt;&lt;p&gt;But stop for a moment to think about what data Amazon has about the user interaction at this point:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;audio of the user making the request&lt;/li&gt;&lt;li&gt;Amazon’s ASR result for that audio&lt;/li&gt;&lt;li&gt;the user’s “intent” (or distillation of the ASR result into an action) that you, the developer, provided Amazon with as part of creating your skill&lt;/li&gt;&lt;li&gt;a text response to that intent, again provided by you, an actual human&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;In other words, they’d have a pretty good pipeline for collecting training data for an end-to-end dialogue system, like &lt;a href=&quot;https://www.aclweb.org/anthology/N19-2007/&quot;&gt;this one&lt;/a&gt; that some of their engineers experimented with last year. What could they &lt;em&gt;do&lt;/em&gt; with such a model? For starters, they could learn to answer initial user queries well enough to keep those users inside Alexa instead of delivering them to third-party apps at all. If they were so inclined, they could also aggregate data about popular queries and responses to prioritize development of new business lines for Amazon.&lt;/p&gt;&lt;p&gt;I don’t have any evidence to make concrete accusations about such things, but it doesn’t seem outside the realm of possibility. Keep in mind that encouraging developers to let Alexa access individual app features might also encourage those developers to provide data for features that aren’t a good fit for the smart speaker medium or were “too valuable to put on Alexa”. 
In other words, a feature that deep links into apps can potentially give them access to data that they can’t just mine from their Alexa Skills Store.&lt;/p&gt;&lt;p&gt;And text prompts aren’t the only data collection boon here. Alexa for Apps also “lets” you provide an app store ID so that Alexa can link a user directly to your app’s install page if they don’t already have it installed. This might boost your app’s discoverability if your skill somehow happens to be more popular than your mobile app, but it also lets Amazon link natural language requests to app store offerings, which is an interesting bit of data to have if you’re building, say, a recommendation engine.&lt;/p&gt;&lt;h2 id=&quot;yeah-but-what-can-i-do-about-it&quot;&gt;&lt;a href=&quot;#yeah-but-what-can-i-do-about-it&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Yeah, but what can I do about it?&lt;/h2&gt;&lt;p&gt;Given all this information, what’s the app developer to do? It’d be easy enough for me to recommend total abstinence, the business equivalent of deleting your Facebook account after the Cambridge Analytica debacle. An isolationist approach might not be fair to you, though. 
People use these voice assistants, so not participating in them or their advanced features might make your app appear to lag behind similar apps that aren’t as cautious.&lt;/p&gt;&lt;p&gt;Instead, I suggest that you think carefully about the data you’re providing in your user responses: the text that gets read to the user in a smart speaker app or before the voice assistant drops the user into your mobile app. For any given interaction, can you respond in a way that’s helpful to the user but also vague, redirecting them into an experience that &lt;em&gt;you&lt;/em&gt; control for more information?&lt;/p&gt;&lt;p&gt;“But what if the user wants or needs a voice-only experience?” you might ask. That’s a fair question: smart speakers have a distinct accessibility advantage, and some requests are just easier to speak than to tap. Finally, I have some good news for you: It’s getting easier to put voice interactions &lt;em&gt;inside&lt;/em&gt; your app and stop relying on the ASR and other components provided by giant companies.&lt;/p&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;Voice is just another interface&quot; src=&quot;https://www.youtube-nocookie.com/embed/wbJ8fZh-iQw?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;p&gt;This is Spokestack’s mission: We exist to enable businesses to take back their voice presence. Voice is our business, not advertising or retail. We provide native mobile libraries that bring a full suite of ASR and natural language tools directly to your app. 
This, in turn, helps you keep your customer interactions private and prevents your thoughtful responses to user queries from being turned into training data that can be used against you.&lt;/p&gt;&lt;p&gt;In fact, we go a couple steps further than what the major voice platforms currently let you do. Spokestack can help you:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Give your app its own wake word (so a user can &lt;em&gt;start&lt;/em&gt; a voice-only interaction from within the app, as if they’d said “Hey Google/Siri” to their phone).&lt;/li&gt;&lt;li&gt;Create your own voice to use for responses, so users can differentiate your brand from a generic AI voice.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;Spokestack Overview&quot; src=&quot;https://www.youtube-nocookie.com/embed/MW2cYSQhbZE?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;p&gt;That’s all well and good for the mobile use case, but what about the smart speakers themselves, where voice is the &lt;em&gt;only&lt;/em&gt; medium? You &lt;em&gt;have&lt;/em&gt; to give detailed data in those responses, right? Maybe, but this is where your brand’s custom voice comes into play. You don’t have to provide text responses that can be immediately appropriated by the smart speaker platform; you can use a separate TTS service to synthesize your response and just provide the smart speaker with audio. 
If they &lt;em&gt;still&lt;/em&gt; want to take your data, they could of course run this through their own ASR systems, but that injects more error and cost into the process for them. You can do something similar with the NLU systems provided by the smart speakers too, but that’s a topic for another post.&lt;/p&gt;&lt;p&gt;Don’t concede the entire medium of voice interaction to your competitors (current or potential) just because they’ve built a stack of audio and language processing tools to “offer” to you. Don’t give them any more data than necessary. Let your brand be its own mouthpiece. &lt;a href=&quot;mailto:hello@spokestack.io&quot;&gt;Ask us how&lt;/a&gt;.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Jovo Support for Spokestack Enables Mobile App Voice Assistants]]></title><description><![CDATA[It's now easier than ever to build your own Independent Voice Assistant. This new integration helps developers build cutting edge conversational experiences in mobile apps.]]></description><link>https://www.spokestack.io/blog/jovo-support-for-spokestack-enables-mobile-app-voice-assistants</link><guid isPermaLink="false">https://www.spokestack.io/blog/jovo-support-for-spokestack-enables-mobile-app-voice-assistants</guid><pubDate>Wed, 15 Jul 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/3b214589c79e6f15a52d472a2a248e24/8537d/jovo-support-for-spokestack-enables-mobile-app-voice-assistants.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACBElEQVQoz2WRz6sSURTHZy2IuHnhQtTQcdRx7r0z44zDpFj6/ElK1HtvIoIMKUHsFf0FtXTXtlW08J/o/2hbRG2CNkEUtPjGOTnTe7T4cM65c853zvdezTRNNJtNUKxWq9B1naHcMAyORK1W+++M+ug8nie0uKA4nU4xm82wWCwwHk/R7w8wmUwwGAzQ7XYxHA7R6/UwGo04p346j+eJRNCyLCyXS5yfP8V2u8XDB3dxcucWHj1eY7VaIYrOOBLr9Zp7NpsNoihKtksE/1k2oFdK0KsGDHUTRjNEtVKAYdTYGlEul1EsFhPbdAWXLMeJaZJwHZa6BrF4Byv6BuvkA0T3CerGVf5eKBSgaRqTSqXQaDTY2SXLSim0Wi14fgDfbcAev4JY/kb44gs6u+8wb79H2OkjDAMWojvb7XZIp9PIZrOQUrJwsqHv+/A8Dy2vDd+tQ43foPX8J4ZvPyF8+RXm6UeMZ6c4Pr7BgvP5HPv9Hvl8HplMBkEQwLbtBM113UPhwpZ1qOAM8v5nOJsfaN77BTl5DWnVYNsOjo6uJJYJstjpdNBut9klwZYJWl1KxcPCW8C6voPqPeOfKSVZkPJSqYRcLscPFM/ylXkeaDntr9ABISAOotKqQAkDjmNzo+s6cByHh2NXF2epZkFBIgfoxRghIYRiqJk2jF0kPQdo7qLoH4jhbVLULRmpAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Spokestack and Jovo Collaboration&quot; title=&quot;Spokestack and Jovo Collaboration&quot; src=&quot;/static/3b214589c79e6f15a52d472a2a248e24/05162/jovo-support-for-spokestack-enables-mobile-app-voice-assistants.png&quot; srcSet=&quot;/static/3b214589c79e6f15a52d472a2a248e24/2eeed/jovo-support-for-spokestack-enables-mobile-app-voice-assistants.png 294w,/static/3b214589c79e6f15a52d472a2a248e24/0d6a1/jovo-support-for-spokestack-enables-mobile-app-voice-assistants.png 588w,/static/3b214589c79e6f15a52d472a2a248e24/05162/jovo-support-for-spokestack-enables-mobile-app-voice-assistants.png 1175w,/static/3b214589c79e6f15a52d472a2a248e24/8537d/jovo-support-for-spokestack-enables-mobile-app-voice-assistants.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;This new integration helps developers build cutting edge conversational experiences for mobile apps using their Jovo interaction models and Spokestack’s mobile ASR, NLU, and TTS services.&lt;/p&gt;&lt;p&gt;Like so many in the voice community, we’re big fans of the &lt;a href=&quot;https://www.jovo.tech&quot;&gt;Jovo Framework&lt;/a&gt;! We share their love of open-source software as well as their “write once, run everywhere” approach to conversational assistants. That’s why we’re so excited to announce Spokestack integration with Jovo.&lt;/p&gt;&lt;h2 id=&quot;how-does-the-jovo-and-spokestack-integration-work&quot;&gt;&lt;a href=&quot;#how-does-the-jovo-and-spokestack-integration-work&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How does the Jovo and Spokestack integration work?&lt;/h2&gt;&lt;p&gt;Starting today, Jovo developers can take the following steps to begin building mobile voice experiences based on their Jovo interaction model with Spokestack:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Create a Spokestack account to gain access to a Spokestack API key: &lt;a href=&quot;/create&quot;&gt;https://spokestack.io/create&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;Add your Spokestack API key and secret to your Jovo project’s configuration file.&lt;/li&gt;&lt;li&gt;Run the &lt;code&gt;jovo build&lt;/code&gt; command to create a &lt;code&gt;platforms/spokestack&lt;/code&gt; 
directory.&lt;/li&gt;&lt;li&gt;Run the &lt;code&gt;jovo deploy&lt;/code&gt; command to upload your model to Spokestack.&lt;/li&gt;&lt;li&gt;Spokestack will then train the imported model for use with Spokestack’s &lt;a href=&quot;/docs&quot;&gt;embedded NLU solutions for iOS, Android, and React Native&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;From there, developers can either follow our &lt;a href=&quot;/tutorials&quot;&gt;tutorials on porting a smart speaker app to mobile&lt;/a&gt; or build whatever mobile voice experience they have in their heads!&lt;/li&gt;&lt;/ul&gt;&lt;h2 id=&quot;why-would-i-import-my-jovo-model-to-spokestack&quot;&gt;&lt;a href=&quot;#why-would-i-import-my-jovo-model-to-spokestack&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Why would I import my Jovo model to Spokestack?&lt;/h2&gt;&lt;p&gt;If you’ve experienced the power of using Jovo to build a conversation that works across the Alexa, Google, and Bixby smart speaker platforms, you may have wondered about transferring that conversational experience to a mobile app. With Spokestack, you can.&lt;/p&gt;&lt;p&gt;We take the interaction model that powers your conversation on smart speakers and port it to an embedded model that will work on a smartphone. 
So now users can say, “Siri, open {&lt;em&gt;your app name&lt;/em&gt;}” to begin a mobile app conversational experience just like the one they have on smart speakers.&lt;/p&gt;&lt;h2 id=&quot;new-to-mobile-development-want-to-see-how-spokestack-works-on-mobile&quot;&gt;&lt;a href=&quot;#new-to-mobile-development-want-to-see-how-spokestack-works-on-mobile&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;New to mobile development? Want to see how Spokestack works on mobile?&lt;/h2&gt;&lt;p&gt;We realize that the barrier to building conversations on mobile is sometimes a lack of experience building apps for iOS and Android. Sometimes it’s thinking through how voice would work on mobile. We’ve tried to address both issues.&lt;/p&gt;&lt;p&gt;On iOS, we have &lt;a href=&quot;https://apps.apple.com/us/app/spokestack-studio/id1508393980&quot;&gt;Spokestack Studio&lt;/a&gt;, which was built to show developers exactly how wake words, speech recognition, text-to-speech, and intent classification work on mobile. It even includes a port of the &lt;a href=&quot;/blog/porting-the-alexa-minecraft-skill-to-ios-using-spokestack&quot;&gt;Alexa Minecraft Helper&lt;/a&gt; skill to iOS.&lt;/p&gt;&lt;p&gt;Spokestack Studio isn’t available for Android yet, but Spokestack does work on Android. 
We have another tutorial for &lt;a href=&quot;/blog/porting-the-alexa-minecraft-skill-to-android-using-spokestack&quot;&gt;porting an Alexa Skill to Android&lt;/a&gt; that walks you step-by-step from Alexa skill to Android app. After going through our tutorial, you should be able to port your smart speaker skill to the mobile platform of your choice.&lt;/p&gt;&lt;h2 id=&quot;the-future-with-jovo&quot;&gt;&lt;a href=&quot;#the-future-with-jovo&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;The future with Jovo&lt;/h2&gt;&lt;p&gt;Like Jovo, we believe in sharing open-source software and building great developer tools. By working together, we can extend what both conversational and mobile developers can build on mobile. We’ll continue working with Jovo to improve the Spokestack integration, including adding our ASR and TTS services as options for Jovo developers. Besides the technical integration, we’re also announcing our financial support of Jovo through their &lt;a href=&quot;https://opencollective.com/jovo-framework&quot;&gt;Open Collective initiative&lt;/a&gt;. We plan on being a long-term partner and supporter of the Jovo community.&lt;/p&gt;&lt;p&gt;We’re excited to see what you’ll build using Jovo and Spokestack for mobile voice apps! 
Tell us about it at &lt;a href=&quot;mailto:hello@spokestack.io&quot;&gt;hello@spokestack.io&lt;/a&gt;.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Wake Words for Mobile Apps]]></title><description><![CDATA[Amazon added the "Hey Alexa" wake word to its mobile app. This is going to change user perceptions of mobile voice apps. Start building your Independent Voice Assistant with Spokestack.]]></description><link>https://www.spokestack.io/blog/wakewords-for-mobile-apps</link><guid isPermaLink="false">https://www.spokestack.io/blog/wakewords-for-mobile-apps</guid><pubDate>Thu, 09 Jul 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/c4b8b0fa0b06d1046d77136e417e2ac5/8537d/wakewords-for-mobile-apps.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAABwklEQVQoz3WQT2sTURRH38YiuFBQasjCiutadeOncelC6AcQBHd+BhcujAjiptSNLqRC2mlnJjGxqUnFRKoRKk0nmZm0NtF0JvOOvPnTvoQ6cOC94d5zf/eJ6m/QqRzKmE9HYHfHPHrq8nJ1wIviXxafbFFYsbC8gM+jtHaqX/xPWBvC6m7I+etfEJfrXLjR4v7iW54XCpjtDh/aTiJL6yeEmURncwjGr5Ar801EbpvZmy0Kr0oUrXXeLC/x8ME9LPc43mRCqF/0lFnC3K0ms/NfubrwnXP5JR4/e827xg7LRZuPB5LpQCcJy/0EVaRQwrXdkPydFrmFFvnbTa7d/Um1D40QaqMswGQgYfYkG92E9W6U4ETYvmTlR8DM3DbiUh1xsc7MXIP3OyNsP8LqjScCnCTc6Eks9xQ1IB7iSspOgFkZYJSOYtRZ/VM9hhOxtp8E0PuF7UlKfkq6tjqbrqTmBUx/W36A5UlUnxqstjH2I7JgIpPoqBXsPokwivh2GNDcC2AcseklQr0+k8dC/R10ygdQ9cd0hgFmd4jR/sPe4JiKH1Lqo9WdbqUGCRU3xjmbYifCciS2K+PzmXWpQ73pPwu0BmQueLKgAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Wake Words for Mobile Apps&quot; title=&quot;Wake Words for Mobile Apps&quot; src=&quot;/static/c4b8b0fa0b06d1046d77136e417e2ac5/05162/wakewords-for-mobile-apps.png&quot; srcSet=&quot;/static/c4b8b0fa0b06d1046d77136e417e2ac5/2eeed/wakewords-for-mobile-apps.png 294w,/static/c4b8b0fa0b06d1046d77136e417e2ac5/0d6a1/wakewords-for-mobile-apps.png 588w,/static/c4b8b0fa0b06d1046d77136e417e2ac5/05162/wakewords-for-mobile-apps.png 1175w,/static/c4b8b0fa0b06d1046d77136e417e2ac5/8537d/wakewords-for-mobile-apps.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Amazon added the “Hey Alexa” wake word to its mobile app. This is going to change user perceptions of mobile voice apps.&lt;/p&gt;&lt;p&gt;One of the main hurdles I’ve encountered when explaining mobile voice apps to people is that they don’t seem to understand how to start talking to mobile apps. For some reason, it’s difficult for people to get their heads around the idea that a mobile app can work the same way as a smart speaker app such as an &lt;a href=&quot;https://www.amazon.com/alexa-skills/&quot;&gt;Alexa Skill&lt;/a&gt; or a &lt;a href=&quot;https://assistant.google.com/explore&quot;&gt;Google Assistant Action&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;If you want our smart speaker skill &lt;a href=&quot;https://thebartender.io&quot;&gt;The Bartender&lt;/a&gt; to recommend a cocktail and help you make it, you have to say the following, depending on which smart speaker assistant you’re talking to:&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;“Hey Alexa, ask the Bartender for a drink.”&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;or&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;“Hey Google, open the Bartender.”&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;After invoking the Bartender, the user can start a search for a drink by name or ingredient. 
Pretty simple, right?&lt;/p&gt;&lt;p&gt;So why is it such a leap to think the same interaction can happen on mobile?&lt;/p&gt;&lt;p&gt;Because users haven’t been shown enough examples of mobile apps carrying out tasks by voice yet.&lt;/p&gt;&lt;p&gt;We think that if Alexa trains users to understand that they can say&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;“Hey Siri, open Alexa”&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;to have a conversation, it will change users’ expectations of mobile voice conversations.&lt;/p&gt;&lt;blockquote class=&quot;twitter-tweet&quot; data-dnt=&quot;true&quot;&gt;&lt;p lang=&quot;en&quot; dir=&quot;ltr&quot;&gt;Hey &lt;a href=&quot;https://twitter.com/hashtag/voicefirst?src=hash&amp;amp;ref_src=twsrc%5Etfw&quot;&gt;#voicefirst&lt;/a&gt; folks, pls read this piece by &lt;a href=&quot;https://twitter.com/sarahintampa&quot;&gt;@sarahintampa&lt;/a&gt;  on TC: &lt;a href=&quot;https://t.co/xqq9KM8Adw&quot;&gt;https://t.co/xqq9KM8Adw&lt;/a&gt;&lt;br/&gt;&lt;br/&gt;1/ Amazon just gave the Alexa mobile app the &amp;quot;Hey Alexa&amp;quot; wakeword.  This is a big deal. Teaching users to say &amp;quot;Hey Siri, open Alexa&amp;quot; to start a conversation will kick off &lt;a href=&quot;https://twitter.com/hashtag/mobilevoice?src=hash&amp;amp;ref_src=twsrc%5Etfw&quot;&gt;#mobilevoice&lt;/a&gt;&lt;/p&gt;— Spokestack (@spokestack) &lt;a href=&quot;https://twitter.com/spokestack/status/1280660757823160321&quot;&gt;July 8, 2020&lt;/a&gt;&lt;/blockquote&gt;&lt;p&gt;That’s because the Alexa mobile app will help users make the mental connection that, in addition to Siri and Google Assistant, mobile apps can carry on conversations with them on their mobile device. Specifically, they can have robust conversations that help them complete tasks they wouldn’t otherwise be able to accomplish as fast. It might be because they are driving or in some sort of hands-free situation. 
Or it might just be because voice is faster and easier than tapping and typing for certain tasks. Either way, convenience tends to drive consumer adoption once they see it.&lt;/p&gt;&lt;p&gt;For example, did you know you can open any app on your phone by asking for it? Think of your favorite app, then ask Siri or Google to open it for you. Seriously, try it now!&lt;/p&gt;&lt;p&gt;That’s convenient and definitely faster than scrolling through pages of apps to get to the service or content you need. So the next step is adding voice to your app so you can talk with your customer, just like Alexa is going to do.&lt;/p&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;The Basic Elements of Voice Interfaces&quot; src=&quot;https://www.youtube-nocookie.com/embed/1x4MdTKEy3E?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;p&gt;To be fair, Alexa isn’t the first mobile app to add a voice interface or wake word. Spotify, Home Depot, Sephora, Pandora, and Snapchat have all added voice to their apps with varying degrees of interaction. Most focus on initiating a product search, which makes sense. It’s just that the interactions, sans the ones built by Houndify, are not that robust.&lt;/p&gt;&lt;p&gt;We think all of that will change, which is why we built Spokestack. 
So if you’re looking for a wake word, on-device NLU, or custom synthetic voice to speak to your customer, email us at &lt;a href=&quot;mailto:hello@spokestack.io&quot;&gt;hello@spokestack.io&lt;/a&gt;.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Today, we export to independence!]]></title><description><![CDATA[With Spokestack, your voice app can declare independence!]]></description><link>https://www.spokestack.io/blog/today-we-export-to-independence</link><guid isPermaLink="false">https://www.spokestack.io/blog/today-we-export-to-independence</guid><pubDate>Mon, 29 Jun 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;img src=&quot;/6eb19fbff2a307802e7fb32551b439b5/independence.gif&quot; alt=&quot;Today we celebrate our Independence Day!&quot;/&gt;&lt;/p&gt;&lt;p&gt;With apologies to President Bill Pullman, let’s declare our independence from smart speakers. The voice interface, and the skills you’ve developed using it, are wonderful resources that are currently stuck in kitchens and dens all over the world. It’s time to free them to be carried with your users wherever they are, not wherever Amazon and Google allow them to be. 
Declare skill independence with Spokestack!&lt;/p&gt;&lt;h2 id=&quot;whats-independence&quot;&gt;&lt;a href=&quot;#whats-independence&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;What’s independence?&lt;/h2&gt;&lt;p&gt;NLU-capable computers are everywhere — your smartphone, your smartwatch, your laptop, your TV — but nobody is building NLU-enabled voice assistants for them. Smart speakers, on the other hand, are essentially NLU-only skill/action platforms that artificially limit what your voice assistants can accomplish. Why do you have to choose between running free of platform restrictions and having an NLU? That’s why we built Spokestack. Spokestack runs your independent voice assistants on mobile platforms like Android and iOS, along with providing integrated ASR and TTS services, in convenient, cross-platform, open-source libraries using a simple, consistent API. 
You can take your independent voice assistants with you as you walk away from the demolished alien mothership, instead of having them stuck in the house on a smart speaker!&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/5f7f5bb44f767ab7df75ef64af6b9134/will_smith.gif&quot; alt=&quot;Will Smith walks away from smoke&quot;/&gt;&lt;/p&gt;&lt;p&gt;We built Spokestack when we realized that NLU, ASR, and TTS on mobile platforms were all siloed between service providers, none of them considering the developer experience when creating complete, independent voice assistants. Not only that, all the service providers have business models that incentivize them to force everything to the cloud. So we built the Spokestack NLU service, using state-of-the-art intent and slot TensorFlow machine learning that is familiar to Alexa and Dialogflow developers, capable of running entirely on the mobile device to deliver speedy, privacy-preserving results. Originally, Spokestack was just used in our own &lt;a href=&quot;https://thebartender.io/&quot;&gt;multi-modal, cross-platform independent voice assistants&lt;/a&gt;, but since January we’ve been focused on creating a simple way for all developers — mobile, voice, smart speaker, and front-end — to make their own independent voice assistants for mobile platforms.&lt;/p&gt;&lt;h2 id=&quot;why-do-i-want-to-declare-independence&quot;&gt;&lt;a href=&quot;#why-do-i-want-to-declare-independence&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 
3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Why do I want to declare independence?&lt;/h2&gt;&lt;p&gt;Unlike smart speakers, independent voice assistants can take advantage of multiple interface modalities to allow your users to get the job done using whatever interaction they find convenient. Visual, haptic, and now voice user interfaces are all accessible to apps using Spokestack.&lt;/p&gt;&lt;p&gt;What do you gain from that? You control and learn from the NLU classification of user utterances, not the smart speaker platform. When you’re stuck on Alexa, do you get the raw utterance that your user uses when they’re in your skill, even misclassified or misunderstood ones? With Spokestack, you control your data and your users’ data.&lt;/p&gt;&lt;p&gt;If you’ve ever read the EULA for smart speaker platforms, you know how rapacious they are with your data. With Spokestack, you keep the most useful bits of your data instead of feeding it (for free) to the &lt;a href=&quot;https://en.wikipedia.org/wiki/Big_Tech&quot;&gt;FAANG&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:800px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/329556224c3020a76d4704d6ef41be48/c60e9/bastille.jpg&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:74.48979591836734%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/jpeg;base64,/9j/2wBDABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2P/2wBDARESEhgVGC8aGi9jQjhCY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2P/wgARCAAPABQDASIAAhEBAxEB/8QAFwAAAwEAAAAAAAAAAAAAAAAAAAIDAf/EABUBAQEAAAAAAAAAAAAAAAAAAAED/9oADAMBAAIQAxAAAAFdg0qzAT//xAAZEAADAQEBAAAAAAAAAAAAAAAAAQISAxP/2gAIAQEAAQUCrpRT0tnqnGjZ/8QAFREBAQAAAAAAAAAAAAAAAAAAABH/2gAIAQMBAT8BR//EABQRAQAAAAAAAAAAAAAAAAAAABD/2gAIAQIBAT8BP//EABkQAAMAAwAAAAAAAAAAAAAAAAABERAhMf/aAAgBAQAGPwKoqOEa2TH/xAAaEAEAAwEBAQAAAAAAAAAAAAABABExIUFR/9oACAEBAAE/IfnAch9NQyCDt2eiOwBVMlvOE//aAAwDAQACAAMAAAAQsA//xAAWEQADAAAAAAAAAAAAAAAAAAAAARH/2gAIAQMBAT8QapB//8QAFxEAAwEAAAAAAAAAAAAAAAAAAAERIf/aAAgBAgEBPxBYVn//xAAcEAEAAwACAwAAAAAAAAAAAAABABEhMVFBYXH/2gAIAQEAAT8QDgUiza+xWFksV6m821epvZTgAX3DErUo5NnGZPFE/9k=&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Storming the Bastille&quot; title=&quot;Storming the Bastille&quot; src=&quot;/static/329556224c3020a76d4704d6ef41be48/c60e9/bastille.jpg&quot; srcSet=&quot;/static/329556224c3020a76d4704d6ef41be48/ecd88/bastille.jpg 294w,/static/329556224c3020a76d4704d6ef41be48/896ed/bastille.jpg 588w,/static/329556224c3020a76d4704d6ef41be48/c60e9/bastille.jpg 800w&quot; sizes=&quot;(max-width: 800px) 100vw, 800px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;h2 id=&quot;what-do-i-have-to-do-to-declare-independence&quot;&gt;&lt;a href=&quot;#what-do-i-have-to-do-to-declare-independence&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;What do I have to do to declare independence?&lt;/h2&gt;&lt;p&gt;To further our mission of creating a complete developer experience for voice apps, we’re excited that you can now &lt;a href=&quot;/docs/integrations/export&quot;&gt;export your smart speaker skill&lt;/a&gt; to run on-device on the major mobile platforms, and declare your independence from smart speakers as an independent voice assistant!&lt;/p&gt;&lt;p&gt;You can leverage existing ASR and TTS services, which are the parts that actually benefit from scale and are difficult to DIY. Spokestack provides a seamless, unified API across mobile platforms (&lt;a href=&quot;/docs/ios/getting-started&quot;&gt;iOS&lt;/a&gt;, &lt;a href=&quot;/docs/android/getting-started&quot;&gt;Android&lt;/a&gt;, and &lt;a href=&quot;/docs/react-native/getting-started&quot;&gt;React/React Native&lt;/a&gt;) that makes converting your skill into an independent voice app easy.&lt;/p&gt;&lt;p&gt;Finally, like &lt;a href=&quot;https://www.imdb.com/title/tt0116629/characters/nm0000597&quot;&gt;President Bill Pullman&lt;/a&gt;, you must have the hubris to believe.&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;We are fighting for our right to live. To exist. 
And should we win…the day the world declared in one voice…Today we celebrate our Independence Day!&lt;/p&gt;&lt;/blockquote&gt;&lt;h2 id=&quot;nota-bene&quot;&gt;&lt;a href=&quot;#nota-bene&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Nota bene&lt;/h2&gt;&lt;p&gt;While it’s fun to cheekily reference a cheesy sci-fi movie to help illustrate what our technology company can help with, please take a bit of time to learn about &lt;a href=&quot;https://www.theatlantic.com/ideas/archive/2018/07/fourth-of-july-black-holiday/564320/&quot;&gt;celebrating Independence Day during the eras of Emancipation and Reconstruction&lt;/a&gt;!&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Choosing the Right iOS Wake Word Service]]></title><description><![CDATA[Spokestack offers two wake word services in iOS, `appleWakeword` and `tfLiteWakeword`. Which should you use?]]></description><link>https://www.spokestack.io/blog/choosing-the-right-ios-wakeword-service</link><guid isPermaLink="false">https://www.spokestack.io/blog/choosing-the-right-ios-wakeword-service</guid><pubDate>Tue, 16 Jun 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/2a62f9848c4c7071b6238ae72799f495/8537d/choosing-the-right-ios-wakeword-service.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACBUlEQVQoz3WRT2sTQRiHh+BNEAvStM1BSJPd7M7u7P/sbiRNhG3jpR6MNNl8AS3eKn4DeyjSmwfxoBc9NOlFD5481YP1okIEbUUofgKlRws/eWfZuKAuPMzLzrvP/N5ZZpomihiGIVfONURxB257F921LbSTe6gZW6gv11CtViGEBYP6uQqxchuiswlDV8H+LRTguor4yjWcXz5CqfIVc9oJ7tx9jZ3t+xiNUtRrVZgkbdTgDvbgDvZl/ZcwIxf2cKH+EWzhE+aNz3j6/C1evHqJJ48e4tbNNSkUJoeicigKhzCNTEipimQjq4haPVzSpyiLLyiLbzi3sIdy5TJsk6PX7cg+XdfRUBWoqiJrRgIhBCzLmkH3I4VxD/PGFIvWMSrOERbtY7Q769B1BaqmzQI0GpqEahYEAZrNJsIwLBAh8B10rl5HaekD2NwU7OIUpaX3WOmuw/dteJ5fCCBmMJL5vj+DDiCpbVvo9zfw5vAnDt6d4uDwVNY3+htyr9VqSaIokt95nidXRoXrunAcR2Lbttyk+Gk6BPALf54zDIcD6DqXBxMkjOMYeTCWS3JoBHqnaRrSNJWa72fAyY9MmaYjefkUIu+nmuRSWPwZOdTIOUeSJJhMxth9NsaDxxPsj8dIVldl+lyW91MImoxR3P9Bd2nZNkLXQRQ4WR2Gci8ftQjd6W+1ho5Jr8cSaAAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Choosing the Right iOS Wake Word Service&quot; title=&quot;Choosing the Right iOS Wake Word Service&quot; src=&quot;/static/2a62f9848c4c7071b6238ae72799f495/05162/choosing-the-right-ios-wakeword-service.png&quot; srcSet=&quot;/static/2a62f9848c4c7071b6238ae72799f495/2eeed/choosing-the-right-ios-wakeword-service.png 294w,/static/2a62f9848c4c7071b6238ae72799f495/0d6a1/choosing-the-right-ios-wakeword-service.png 588w,/static/2a62f9848c4c7071b6238ae72799f495/05162/choosing-the-right-ios-wakeword-service.png 1175w,/static/2a62f9848c4c7071b6238ae72799f495/8537d/choosing-the-right-ios-wakeword-service.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Spokestack offers two wake word services in iOS: &lt;code&gt;appleWakeword&lt;/code&gt; and &lt;code&gt;tfLiteWakeword&lt;/code&gt;. Which should you use?&lt;/p&gt;&lt;p&gt;The answer, of course, is up to you! Spokestack always gives you options, because one size does not fit everyone, and because, like you, we hate vendor lock-in.&lt;/p&gt;&lt;h2 id=&quot;wait-theres-a-wake-word&quot;&gt;&lt;a href=&quot;#wait-theres-a-wake-word&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Wait, there’s a wake word?&lt;/h2&gt;&lt;p&gt;Indeed! If you want your app to be controllable purely by voice, you need a wake word — a word (or short phrase) that tells your app “the next thing the user says is meant for you”. The wake word detection component in Spokestack is responsible for detecting any of a user-defined set of keyword phrases in &lt;a href=&quot;https://en.wikipedia.org/wiki/Real-time_computing#Criteria_for_real-time_computing&quot;&gt;soft real time&lt;/a&gt;. Once detected, the Spokestack pipeline activates, providing you with an activation event and triggering the configured speech recognition service. 
The accuracy, speed, and flexibility of wake word detection depend on which service you configure.&lt;/p&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;Wake words for iOS apps&quot; src=&quot;https://www.youtube-nocookie.com/embed/3qKJMrkbZA8?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;h2 id=&quot;ok-so-which-one-should-i-use&quot;&gt;&lt;a href=&quot;#ok-so-which-one-should-i-use&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Ok, so which one should I use?&lt;/h2&gt;&lt;p&gt;In general, &lt;code&gt;tfLiteWakeword&lt;/code&gt; will have better accuracy and faster activation. &lt;code&gt;appleWakeword&lt;/code&gt; is intended for quick demos where you want to try out different wake words for UX research and not have the overhead of having to build a TensorFlow model for every option. 
Let’s discuss the guidelines for using them in more detail by asking two questions.&lt;/p&gt;&lt;h2 id=&quot;what-wake-word-do-we-want-our-app-to-use&quot;&gt;&lt;a href=&quot;#what-wake-word-do-we-want-our-app-to-use&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;What wake word do we want our app to use?&lt;/h2&gt;&lt;p&gt;What wake word do we want our app to use? Tricky question! Answering the question means &lt;a href=&quot;/blog/user-research-for-voice-experiences&quot;&gt;UX research&lt;/a&gt;, brand identity discussions, and even linguistics consulting. 
Of course Spokestack is here to help with that.&lt;/p&gt;&lt;h3 id=&quot;prefer&quot;&gt;&lt;a href=&quot;#prefer&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Prefer&lt;/h3&gt;&lt;p&gt;Spokestack’s &lt;code&gt;appleWakeword&lt;/code&gt; uses Apple’s free on-device ASR to transcribe all speech heard while the Spokestack pipeline is running, and then filters that speech for the wake word(s) that you specify in the &lt;code&gt;SpeechConfiguration.wakewords&lt;/code&gt; configuration. This allows you to have a tight testing cycle when trying out the UX of different wake words. Boss doesn’t like your favorite “HAL” wake word idea? Just change a single line and try out their “GUNTER” idea. Just want your app to get an event when a user says “take a selfie”, or even allow your users to choose their own wake words? 
No need to muck with &lt;a href=&quot;https://voicebot.ai/2020/05/29/new-voice-selfie-app-takes-photos-using-custom-phrases/&quot;&gt;fancy embedded computing models&lt;/a&gt; — just change one line of code!&lt;/p&gt;&lt;h3 id=&quot;avoid&quot;&gt;&lt;a href=&quot;#avoid&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Avoid&lt;/h3&gt;&lt;p&gt;&lt;code&gt;appleWakeword&lt;/code&gt; should not be used in any published app, both because its performance will always be slower than &lt;code&gt;tfLiteWakeword&lt;/code&gt; and because it depends on running constant ASR over all speech while the pipeline is running, which is both a privacy concern and an overuse of resources for the actual task.&lt;/p&gt;&lt;h2 id=&quot;how-do-i-distribute-my-app-with-a-fast-accurate-efficient-on-device-wake-word&quot;&gt;&lt;a href=&quot;#how-do-i-distribute-my-app-with-a-fast-accurate-efficient-on-device-wake-word&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 
1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How do I distribute my app with a fast, accurate, efficient, on-device wake word?&lt;/h2&gt;&lt;p&gt;Spokestack’s &lt;code&gt;tfLiteWakeword&lt;/code&gt; fits the bill for you! You’ll gain fast, accurate, efficient wake word activation that runs entirely on-device. It features a state-of-the-art machine learning pipeline using attention-based models; they operate continuously, each feeding output into the next, for both efficiency and accuracy.&lt;/p&gt;&lt;p&gt;To get you started, Spokestack provides pretrained TensorFlow models that enable on-device wake word detection. These free models, however, only recognize the word “Spokestack”; in order to have your app respond to a different word or phrase, you’ll need your own custom models.&lt;/p&gt;&lt;h3 id=&quot;prefer-1&quot;&gt;&lt;a href=&quot;#prefer-1&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Prefer&lt;/h3&gt;&lt;p&gt;&lt;a href=&quot;/docs/concepts/wakeword-models&quot;&gt;Work with us to develop a custom TensorFlow model&lt;/a&gt;, and then distribute that model with your app configured to use &lt;code&gt;tfLiteWakeword&lt;/code&gt;. 
When testing your app using &lt;code&gt;tfLiteWakeword&lt;/code&gt;, be sure to consult the &lt;a href=&quot;/docs/concepts/pipeline-configuration&quot;&gt;Spokestack pipeline model hyperparameter configuration guide&lt;/a&gt;.&lt;/p&gt;&lt;h3 id=&quot;avoid-1&quot;&gt;&lt;a href=&quot;#avoid-1&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Avoid&lt;/h3&gt;&lt;p&gt;Don’t use &lt;code&gt;tfLiteWakeword&lt;/code&gt; without a TensorFlow model for the wake word you want to use, or before you’ve decided which wake word should activate your app.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Announcing the Export to Independence Contest]]></title><description><![CDATA[Export to Independence is a community contest to see who can best port a smart speaker voice app to a mobile voice assistant using Spokestack. The contest includes prizes totaling $5000. 
$5000 will also be given to charity.]]></description><link>https://www.spokestack.io/blog/announcing-the-export-to-independence-contest</link><guid isPermaLink="false">https://www.spokestack.io/blog/announcing-the-export-to-independence-contest</guid><pubDate>Mon, 15 Jun 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;a href=&quot;https://docs.google.com/forms/d/e/1FAIpQLSfXBFTLuyK8BWIFThCRNxMZwjgWBhVtE5EsCuQkvtaDaVvRqw/viewform?usp=sf_link&quot;&gt;REGISTER HERE&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/6bff408693ee84fb5cec93d3cf060e2c/8537d/announcing-the-export-to-independence-contest.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:34.69387755102041%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAHCAYAAAAIy204AAAACXBIWXMAAAsTAAALEwEAmpwYAAAB+UlEQVQozyWRPUwTAQBGb3NkURMTFom1Xnv0+nfXu2t71+u11PLTQqXFpiIlhkICqFAJiyNujuLC5KAEg0ap+FPRkCAGE0FFYiKJGsPubFyeEadvecN7+YTJ399Yb0wihjdodfboXn2PUrqBrOfxGwkC0RTBaOpwA1EHv+EQMBwMO00oZuM3kvgN+5D1heIIoVwJq7hEi3uDluAOha19YlN7iJFLeMMqYtjEJeuc9huIoRgexURSTGYnMgS0DGdCFu1aBk+kG6VcRThmveZEzz5WYpRMvsbC8lPuP2qSL9c45VOIJLLkz1fJ9pXRk514wiaSGifZVUQyG7T5e3FlFtEvfOfOvTmEN9vz3N3+RWFsi+mrwyw+eM7qq03q18YpFuLUx3Ncn6kwOzXARC1HZcBhcLDA0soaj5sfuDw5RLr6ltvNP/w8mEfYXRujNHOAq2cTUauQLCxTGp5DMW3kiIlmORQGyhQrF9HtNO2KTiLbx8PGC568XKc6UqOY78dd3MY98hWh48oEXqtGe6yCR01xUjrHUDlDNGHiVZOHyQErSySVI97RSzCWQlQ7SQ5vMvrxM0fNdxxpXcGX/SdURXD7dKRwBEnRkZQ4shblbJdD0EjgVS1kzcan28jG/4d9moVXdRDTt8jvfKGtf5fjnk8s3Jym/uMZfwEcWyD0OoQY2AAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Export to Independence Contest&quot; title=&quot;Export to Independence Contest&quot; src=&quot;/static/6bff408693ee84fb5cec93d3cf060e2c/05162/announcing-the-export-to-independence-contest.png&quot; srcSet=&quot;/static/6bff408693ee84fb5cec93d3cf060e2c/2eeed/announcing-the-export-to-independence-contest.png 294w,/static/6bff408693ee84fb5cec93d3cf060e2c/0d6a1/announcing-the-export-to-independence-contest.png 588w,/static/6bff408693ee84fb5cec93d3cf060e2c/05162/announcing-the-export-to-independence-contest.png 1175w,/static/6bff408693ee84fb5cec93d3cf060e2c/8537d/announcing-the-export-to-independence-contest.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Today we are announcing our first contest!&lt;/p&gt;&lt;p&gt;Let’s get right to the details.&lt;/p&gt;&lt;h3 id=&quot;what&quot;&gt;&lt;a href=&quot;#what&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;What&lt;/h3&gt;&lt;p&gt;Export to Independence is a community contest to see who can best port a smart speaker voice app to a mobile voice assistant using Spokestack. The contest encourages participants to export an Alexa or Dialogflow interaction model to create a mobile voice app. 
Participants will be judged on their creativity and innovation in creating intelligent mobile voice app experiences.&lt;/p&gt;&lt;h3 id=&quot;prizes&quot;&gt;&lt;a href=&quot;#prizes&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Prizes&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;The grand prize winner will receive $3,000 for the best overall voice experience on iOS or Android&lt;/li&gt;&lt;li&gt;Four runners-up will each receive $500&lt;/li&gt;&lt;li&gt;$5,000 will be donated to the &lt;a href=&quot;https://donate.splcenter.org/&quot;&gt;Southern Poverty Law Center&lt;/a&gt; on behalf of all participants, who may choose to be listed or remain anonymous in the donation.&lt;/li&gt;&lt;li&gt;All teams that submit a final app will receive a Spokestack t-shirt for up to 3 team members&lt;/li&gt;&lt;li&gt;Every person who enters and creates an account will receive a Spokestack Smiley Facemask&lt;/li&gt;&lt;/ul&gt;&lt;h3 id=&quot;dates---extended&quot;&gt;&lt;a href=&quot;#dates---extended&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 
12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Dates - Extended!&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;Contest officially starts Monday, June 15th&lt;/li&gt;&lt;li&gt;Submissions are due by Tuesday, September 15th&lt;/li&gt;&lt;li&gt;Winners will be announced Thursday, October 1st&lt;/li&gt;&lt;/ul&gt;&lt;h3 id=&quot;requirements&quot;&gt;&lt;a href=&quot;#requirements&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Requirements&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;All submissions must use Spokestack’s iOS, Android or React Native libraries in their applications.&lt;/li&gt;&lt;li&gt;Developers are not required to use an existing Alexa or Google Assistant interaction model for submission. 
We realize many folks use other NLU or custom interaction models for their conversational experiences.&lt;/li&gt;&lt;li&gt;The Spokestack Export to Independence Developer Contest is open only to individuals who are legal residents of the United States of America or the District of Columbia who are eighteen (18) years of age or older at the time of entry.&lt;/li&gt;&lt;/ul&gt;&lt;h3 id=&quot;support&quot;&gt;&lt;a href=&quot;#support&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Support&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;/tutorials&quot;&gt;Tutorials&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;/docs&quot;&gt;Documentation&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://forum.spokestack.io/&quot;&gt;Community Forum&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Live Video Workshops: Dates to be announced&lt;/li&gt;&lt;/ul&gt;&lt;h3 id=&quot;legal&quot;&gt;&lt;a href=&quot;#legal&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 
3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Legal&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;All contestants must agree to the &lt;a href=&quot;/contest-rules&quot;&gt;terms of the contest&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;If you have any further questions, please &lt;a href=&quot;mailto:hello@spokestack.io&quot;&gt;contact us&lt;/a&gt;. We’re incredibly excited to begin bringing together the smart speaker and mobile developer communities! We believe this will be the start of a completely new type of mobile app and the realization of voice-driven user experiences promised all the way back in 2011 when Siri first launched.&lt;/p&gt;&lt;p&gt;We look forward to seeing what you build, and we’re thrilled to help you!&lt;/p&gt;&lt;p&gt;Good luck!&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://docs.google.com/forms/d/e/1FAIpQLSfXBFTLuyK8BWIFThCRNxMZwjgWBhVtE5EsCuQkvtaDaVvRqw/viewform?usp=sf_link&quot;&gt;REGISTER HERE&lt;/a&gt;&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Implementing Reprompts with Spokestack]]></title><description><![CDATA[A tutorial for reprompting a user after a period of inactivity]]></description><link>https://www.spokestack.io/blog/implementing-reprompts-with-spokestack</link><guid isPermaLink="false">https://www.spokestack.io/blog/implementing-reprompts-with-spokestack</guid><pubDate>Tue, 09 Jun 2020 05:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/0cbda2e34bbcb7db827140e133f69a57/8537d/implementing-reprompts-with-spokestack.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACM0lEQVQoz22SO08UURSA187CH2DHzzCx0sTGytbCEIyGgLXRRKMgSqGJrUQTiYIQIFBggRTGhBWw2AYERMDwcBeXZXferzv3DvuZe2eJkHiTL3Nmcs4355yZQpxmnCQSyiBURsORvHrfIIwVichYKAX4kUQ2MYSJJANGxif4MDZu4sIpoVDmGiaKrKnY2Em4eG2LhdIfSkv7zC2WsWyPlw/uMj05iT5N4EbXLa53dpj7QiT+dXUcCwVVH4YXHe4N7vNwaJ9HI2Xuv6lwYEmmxkb5Wpw3HR0BfU/76X3SZ+SFqNVVjiJKm6RxnUp5nYGiy/MZixczDv0fbS6177JXEaYTPXIQpy3hMx739JrYjBwKRRAr/CglUmDvzrI818Pc9zpXu39x5fYmlzs2abuwQbkqSDNp9hcJaeQ3u+7Q3tmdj2z5ioaXU3cldS+jbnn4dpndSsKZ86sUzq0YzratsV0ReLHCCZT5QKoJrweHGHj7zsQFLbL9PMHW+Lk0kJBkUFqK+TJv8bnYoPjN5dAW6CYOHUnNlqaJ9Chfga4vuKFC40X6jQovlMQSlte2GB6dYKdmceAJynZAKFOCKDX5TqhysSupWoJDJ82FWuLHJ4gkiYL1zR3Gp6b5uddgdjvi00pEKps4vjBCXaf3rnFbcj1l4fhhkCjz/4VJhh/nY+iTZYofVsjq74gk0bKUIM5MvqFVryc0wpojzT5Oo6jZKdWGoGqlOK7C9SUH9v9yJbVjbMlfZqIgBSvfnq0AAAAASUVORK5CYII=&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Implementing Reprompts with Spokestack&quot; title=&quot;Implementing Reprompts with Spokestack&quot; src=&quot;/static/0cbda2e34bbcb7db827140e133f69a57/05162/implementing-reprompts-with-spokestack.png&quot; srcSet=&quot;/static/0cbda2e34bbcb7db827140e133f69a57/2eeed/implementing-reprompts-with-spokestack.png 294w,/static/0cbda2e34bbcb7db827140e133f69a57/0d6a1/implementing-reprompts-with-spokestack.png 588w,/static/0cbda2e34bbcb7db827140e133f69a57/05162/implementing-reprompts-with-spokestack.png 1175w,/static/0cbda2e34bbcb7db827140e133f69a57/8537d/implementing-reprompts-with-spokestack.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Reprompts are a convenient feature provided by most smart speaker platforms. In plain English, a reprompt is a special message given to the user when both of the following occur:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;The app requests information from the user, leaving the mic open to listen for the answer&lt;/li&gt;&lt;li&gt;The user remains silent for a pre-set length of time&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Platforms differ in the number of reprompts they allow an app to give the user before the platform itself takes control and shuts off the mic: some allow only one, while others let you deliver up to four different prompts before finally giving up on the user. With Spokestack, you can choose that number for yourself. In an app where the user is expected to switch back and forth between voice and gesture input frequently, reprompts might be inappropriate altogether, but in one that’s designed to be used hands-free, you might want to give the user a couple of chances to answer a question.&lt;/p&gt;&lt;p&gt;In Spokestack, reprompting is a matter of responding to the timeout event sent by the speech pipeline. On Android, this event is received by the &lt;code&gt;OnSpeechEventListener&lt;/code&gt;; in Swift, by the &lt;code&gt;SpeechEventListener&lt;/code&gt;; and in React Native, by a listener attached to the &lt;code&gt;onTimeout&lt;/code&gt; event.&lt;/p&gt;&lt;p&gt;The specifics will vary based on how you’ve set up your app, but the basic pseudocode for the process would be:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;js&quot;&gt;&lt;pre class=&quot;language-js&quot;&gt;&lt;code class=&quot;language-js&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; event is &lt;span class=&quot;token constant&quot;&gt;TIMEOUT&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; dialog_manager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;is_waiting_for_response&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt;
    reprompt &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;get_next_reprompt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; reprompt&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt;
      Spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;synthesize&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;reprompt&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;token comment&quot;&gt;// re-open the microphone to listen for the&lt;/span&gt;
      &lt;span class=&quot;token comment&quot;&gt;// answer again, but don&amp;#x27;t do it here since&lt;/span&gt;
      &lt;span class=&quot;token comment&quot;&gt;// you don&amp;#x27;t want to listen to the reprompt itself&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;And that’s it! For the reprompts, at least. Configuring the dialog manager to know when reprompts are appropriate is left as an exercise to the reader. As always, if you have any issues, don’t hesitate to reach out via &lt;a href=&quot;https://forum.spokestack.io/&quot;&gt;Discourse&lt;/a&gt;, &lt;a href=&quot;https://stackoverflow.com/questions/tagged/spokestack&quot;&gt;Stack Overflow&lt;/a&gt;, or a &lt;a href=&quot;https://github.com/spokestack&quot;&gt;GitHub&lt;/a&gt; issue.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Porting the Alexa Minecraft Skill to Android Using Spokestack]]></title><description><![CDATA[Make a voice-based app for smart speakers. Spokestack makes it easy to convert a smart speaker voice app to a mobile app. Follow our process.]]></description><link>https://www.spokestack.io/blog/porting-the-alexa-minecraft-skill-to-android-using-spokestack</link><guid isPermaLink="false">https://www.spokestack.io/blog/porting-the-alexa-minecraft-skill-to-android-using-spokestack</guid><pubDate>Tue, 09 Jun 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/620a9ed21d4bc5e9e2f9bacd1948e58d/8537d/porting-the-alexa-minecraft-skill-to-android-using-spokestack.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACJ0lEQVQoz3WTTWsTURSGZ+dSCoIuREsXNc733DtzMx+JDlibNq2maUmbpi22KSr4DYILRRRFrN24UBFpa22NCzf+Ijf+kkfulMQguni43DPnvPc9954xHMdBY9t2sTquW+AHAWmWIaOQSCmySoUsy0iShCAI/uQP1zoOxnDAsizM8XHOj43hlEqUhSCwbXzLQvk+SRxTLpfxPK/I1fRr+6sxfIoQgqzRoNrpkHe7XFhfJ9/cpNJeIl2Yx5UC3/OQUhBKSRSF/3GosSxc12V66w3zX3s09/aZ+7RDc++Aub1Dru4eki8ssdJZY2l5jdkr81xb75LnOaZp/hHU6lpIb6rVKu2NDZqrqzRXOkwuLHLjZoPe+0V+7LTpbdf5/GKC/ZeX+b49wePbNYSMsS1z4NSIoqi4lzAMqdVqrCwv0261aMzOMD01w0Yr58OTCXafT3LwaorD13U+Ppuit1XnQfcSthvguc7AlKGUKsQ0vu8XH/zAx3ZcTMvB83WBVbhQKiSLFUIqAlkmUkePpE31NQwpZfEYehQ0eu/YLnfvX2f77VNiFSPmdlGtLwSez8jJE5wePYOKQpSKiOO4GKW+MUOLaGd99L5UMnn08A7fDt6RJjnhrZ/E937hmYJjZ48zMnqKwD3K1yOkDfVdGjrwN/ougkBQqVzEdS1kPINQdaTwKZ0rYZnmQEzTN6K7M7Tdf6Fb0S0laUoSyyOSpPhb0jQdtDqMjv8GTDh7jtmE3qgAAAAASUVORK5CYII=&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Porting the Alexa Minecraft Skill to Android using Spokestack&quot; title=&quot;Porting the Alexa Minecraft Skill to Android using Spokestack&quot; src=&quot;/static/620a9ed21d4bc5e9e2f9bacd1948e58d/05162/porting-the-alexa-minecraft-skill-to-android-using-spokestack.png&quot; srcSet=&quot;/static/620a9ed21d4bc5e9e2f9bacd1948e58d/2eeed/porting-the-alexa-minecraft-skill-to-android-using-spokestack.png 294w,/static/620a9ed21d4bc5e9e2f9bacd1948e58d/0d6a1/porting-the-alexa-minecraft-skill-to-android-using-spokestack.png 588w,/static/620a9ed21d4bc5e9e2f9bacd1948e58d/05162/porting-the-alexa-minecraft-skill-to-android-using-spokestack.png 1175w,/static/620a9ed21d4bc5e9e2f9bacd1948e58d/8537d/porting-the-alexa-minecraft-skill-to-android-using-spokestack.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;This post is part of the &lt;em&gt;Porting a Smart Speaker Voice App to Mobile&lt;/em&gt; series, which discusses how to turn an Alexa skill into a mobile app using Spokestack as a replacement for Amazon’s voice services. Other articles in the series can be found at the following links:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Part 1: &lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-1&quot;&gt;Voice Apps on Smart Speakers&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Part 2: &lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-2&quot;&gt;Voice Apps on Mobile&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Part 3: &lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-3&quot;&gt;Import an Alexa or Dialogflow Interaction Model&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Tutorial: &lt;a href=&quot;/blog/create-an-alexa-compatible-dialog-manager-in-swift&quot;&gt;Create an Alexa-Compatible Dialog Manager in Swift&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Tutorial: &lt;a href=&quot;/blog/porting-the-alexa-minecraft-skill-to-ios-using-spokestack&quot;&gt;Porting the Alexa Minecraft Skill to iOS Using Spokestack&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Tutorial: Porting the Alexa Minecraft Skill to Android Using Spokestack (You are here!)&lt;/li&gt;&lt;li&gt;Tutorial: Porting the Alexa Minecraft Skill to React Native Using Spokestack (Coming Soon)&lt;/li&gt;&lt;/ul&gt;&lt;hr/&gt;&lt;p&gt;This tutorial will walk you through the details of porting an Alexa skill to Android. To be specific, we’ll be recreating a skill that lets the user ask for a recipe in &lt;a href=&quot;https://www.minecraft.net/en-us/&quot;&gt;Minecraft&lt;/a&gt;. The full code for our finished app is &lt;a href=&quot;https://github.com/spokestack/minecraft-skill-android&quot;&gt;here&lt;/a&gt;; feel free to download it for reference as you follow along. We’ll go through it step by step in the tutorial for the sake of discussion.&lt;/p&gt;&lt;p&gt;First, a quick note on language choice. 
Android development has been moving toward Kotlin as its preferred language. Our other Android guides are written with that in mind, but we’re going to switch gears for this one. Amazon provides a &lt;a href=&quot;https://developer.amazon.com/en-US/docs/alexa/alexa-skills-kit-sdk-for-java/overview.html&quot;&gt;Java SDK&lt;/a&gt; for Alexa development, so in order to ease friction between the two ecosystems, we’ll port the skill to Java instead of Kotlin. The example code here shouldn’t be too complex to translate into Kotlin if you’re familiar with it.&lt;/p&gt;&lt;p&gt;With that preamble out of the way, let’s get to coding. Our first job will be to set up the mobile-specific pieces you don’t have to deal with when setting up an Alexa skill.&lt;/p&gt;&lt;h2 id=&quot;app-scaffolding&quot;&gt;&lt;a href=&quot;#app-scaffolding&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;App Scaffolding&lt;/h2&gt;&lt;p&gt;On Android, the &lt;code&gt;Activity&lt;/code&gt; is one of the fundamental building blocks of an app. Roughly speaking, an &lt;code&gt;Activity&lt;/code&gt; is the code for a single screen’s behavior (minus a description of the visual layout, which is done elsewhere). If you’re using &lt;a href=&quot;https://developer.android.com/studio&quot;&gt;Android Studio&lt;/a&gt; (and you probably should be for this), you’ll want to open a new project and use the “Empty Activity” template. 
This will create all the boilerplate we’ll need, along with a &lt;code&gt;MainActivity&lt;/code&gt; we’ll be referring to throughout the rest of this guide. When you’re setting up a new project, Android Studio also asks for a minimum SDK. We chose 21 for this guide.&lt;/p&gt;&lt;p&gt;To avoid cluttering this guide too much, we’ll omit the full &lt;code&gt;build.gradle&lt;/code&gt; files and the detailed code for requesting microphone permissions in &lt;code&gt;MainActivity&lt;/code&gt;. See the example files for copy/paste-able examples. Here are the important things you’re looking for in &lt;code&gt;app/build.gradle&lt;/code&gt;:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;The &lt;code&gt;native-dependencies&lt;/code&gt; plugin application and the block associated with it that retrieves Spokestack’s native library (at the top of the file)&lt;/li&gt;&lt;li&gt;The &lt;code&gt;ndkVersion&lt;/code&gt; line in the &lt;code&gt;android&lt;/code&gt; block. See &lt;a href=&quot;https://developer.android.com/studio/projects/install-ndk&quot;&gt;here&lt;/a&gt; for information on installing the NDK, and make sure your version number in &lt;code&gt;build.gradle&lt;/code&gt; matches the one you install.&lt;/li&gt;&lt;li&gt;The &lt;code&gt;compileOptions&lt;/code&gt; block, also under &lt;code&gt;android&lt;/code&gt;. 
These options allow some of our dependencies to build properly.&lt;/li&gt;&lt;li&gt;The Spokestack library dependency and others associated with it (at the bottom of the file)&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;And in the &lt;code&gt;build.gradle&lt;/code&gt; in your root directory:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;A &lt;code&gt;classpath&lt;/code&gt; dependency that retrieves the &lt;code&gt;native-dependencies&lt;/code&gt; plugin&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;OK, time to dive into &lt;code&gt;MainActivity&lt;/code&gt; and set up Spokestack to handle user speech.&lt;/p&gt;&lt;h2 id=&quot;can-you-hear-me-now&quot;&gt;&lt;a href=&quot;#can-you-hear-me-now&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Can you hear me now?&lt;/h2&gt;&lt;p&gt;In order to turn your phone into a (better) smart speaker, you need to take control of the microphone and process user input from it. That’s why we requested the permissions in the previous section. The Spokestack component used to actually &lt;em&gt;do&lt;/em&gt; something with that data is called &lt;code&gt;SpeechPipeline&lt;/code&gt;. It handles collecting audio from the user and turning it into text (automatic speech recognition, or ASR), and there are other components for extracting meaning from that text and for generating audio in response to the user. These exist separately in the Spokestack library so that an app can pick and choose which ones it needs. 
We need them all for this app, so let’s make a class to contain and control them all. Let’s call it … I don’t know … &lt;code&gt;Spokestack&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;Since we’re building all three components there, this is another file we won’t discuss in its entirety; you can find the full version in the demo project. For the purpose of this guide, we’re just going to pretend it exists and talk about how to interact with it in &lt;code&gt;MainActivity&lt;/code&gt;.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;java&quot;&gt;&lt;pre class=&quot;language-java&quot;&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;MainActivity&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;AppCompatActivity&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;Spokestack&lt;/span&gt; spokestack&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;token annotation punctuation&quot;&gt;@Override&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;protected&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;onCreate&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;Bundle&lt;/span&gt; savedInstanceState&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;super&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;onCreate&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;savedInstanceState&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;viewBinding &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;ActivityMainBinding&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;inflate&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getLayoutInflater&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token function&quot;&gt;setContentView&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;viewBinding&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getRoot&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token comment&quot;&gt;// see if we were granted the microphone permission&lt;/span&gt;
        &lt;span class=&quot;token comment&quot;&gt;// during a previous session; if so, go ahead and build&lt;/span&gt;
        &lt;span class=&quot;token comment&quot;&gt;// Spokestack components&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;checkMicPermission&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token function&quot;&gt;buildSpokestack&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;// other code...&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;buildSpokestack&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token comment&quot;&gt;// extract the models from the asset bundle if we need to&lt;/span&gt;
        &lt;span class=&quot;token function&quot;&gt;checkForModels&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;spokestack &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;Spokestack&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getApplicationContext&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;// start the speech pipeline&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;launch&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;catch&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;Exception&lt;/span&gt; e&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token class-name&quot;&gt;Log&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;logTag&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;Problem starting Spokestack&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; e&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This is all we need to interact with the components we’ve built. The &lt;code&gt;start&lt;/code&gt; method gives Spokestack control of the microphone via its &lt;code&gt;SpeechPipeline&lt;/code&gt; so that we can start hearing the user. If you look at how we’ve set the pipeline up for this project, though (in &lt;code&gt;Spokestack&lt;/code&gt;), you’ll notice a line that sets the pipeline’s “profile” to &lt;code&gt;PushToTalkAndroidASR&lt;/code&gt;. This means that we’re not using a wake word (e.g., “Alexa”) to tell the app to start actively listening to the user. Spokestack does support wake words, and you can see an example configuration in &lt;a href=&quot;/docs/android/cookbook&quot;&gt;our Android cookbook&lt;/a&gt;, but we’re going to use a button here for the sake of simplicity. That means we’ll need a microphone button and a handler that starts sending audio through speech recognition when the button is tapped:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;java&quot;&gt;&lt;pre class=&quot;language-java&quot;&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;token comment&quot;&gt;// still in MainActivity&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;activateAsrTapped&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;View&lt;/span&gt; view&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;spokestack&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;activateAsr&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That wasn’t so bad, was it? We’re manually activating the ASR, but the configuration we’ve set up in the &lt;code&gt;Spokestack&lt;/code&gt; class will handle deactivating it after speech stops.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: We’re using the Android ASR here because it’s the easiest way to demo ASR, but it’s not available on all devices. See our &lt;a href=&quot;/docs/concepts/asr&quot;&gt;ASR documentation&lt;/a&gt; for more information on it and the other ASR providers Spokestack integrates with.&lt;/p&gt;&lt;p&gt;Also note that &lt;a href=&quot;https://developer.android.com/guide/topics/media/mediarecorder&quot;&gt;the Android emulator cannot record audio&lt;/a&gt;. You’ll need to test ASR on a real device.&lt;/p&gt;&lt;p&gt;With those caveats behind us, our next job is to make an effort to actually understand the user…&lt;/p&gt;&lt;h2 id=&quot;integrating-nlu&quot;&gt;&lt;a href=&quot;#integrating-nlu&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Integrating NLU&lt;/h2&gt;&lt;p&gt;&lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-3&quot;&gt;Part 3&lt;/a&gt; of this series includes a refresher on the concept of natural language understanding (NLU) as well as instructions for converting your Alexa model into a format usable by Spokestack, 
so we’ll just cover the Android-specific parts here.&lt;/p&gt;&lt;p&gt;The configuration for the Spokestack NLU is in the &lt;code&gt;Spokestack&lt;/code&gt; file, just like our ASR setup. It requires three external files — a &lt;a href=&quot;https://www.tensorflow.org/lite&quot;&gt;TensorFlow Lite&lt;/a&gt; model, a JSON metadata file that describes its output, and a vocabulary file used to transform ASR results into input for the model. These files aren’t huge (typically &amp;lt; 20 MB total), but they’re big enough that you probably want to distribute them compressed to keep your app download size down.&lt;/p&gt;&lt;p&gt;To do that, place all three files in the &lt;code&gt;src/main/assets&lt;/code&gt; directory under your app’s root directory. You may have to manually create &lt;code&gt;assets&lt;/code&gt;, but the others should have been created along with your project. Files in the &lt;code&gt;assets&lt;/code&gt; directory have to be decompressed at runtime to be used. We typically do that by decompressing them to the application’s cache directory on startup, then on subsequent starts checking if the files exist and decompressing again if the user has cleared the cache or the app has been updated to a new version. This is the &lt;code&gt;checkForModels()&lt;/code&gt; method from the previous section. 
In the demo app, this code is in &lt;code&gt;MainActivity&lt;/code&gt;, but you could put it elsewhere for the sake of cleanliness:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;java&quot;&gt;&lt;pre class=&quot;language-java&quot;&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;checkForModels&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// PREF_NAME and VERSION_KEY are static Strings set at the top of the file;&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// we want PREF_NAME to uniquely refer to our app, and VERSION_KEY to be&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// unique within the app itself. NONEXISTENT is an int constant used as the&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// default when no version code has been saved yet&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;modelsCached&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token function&quot;&gt;decompressModels&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;int&lt;/span&gt; currentVersionCode &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;BuildConfig&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;VERSION_CODE&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token class-name&quot;&gt;SharedPreferences&lt;/span&gt; prefs &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;getSharedPreferences&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;PREF_NAME&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;Context&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;MODE_PRIVATE&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;int&lt;/span&gt; savedVersionCode &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; prefs&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getInt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;VERSION_KEY&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; NONEXISTENT&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;

        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;currentVersionCode &lt;span class=&quot;token operator&quot;&gt;!=&lt;/span&gt; savedVersionCode&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token function&quot;&gt;decompressModels&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;

            &lt;span class=&quot;token comment&quot;&gt;// Update the shared preferences with the current version code&lt;/span&gt;
            prefs&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;edit&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;putInt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;VERSION_KEY&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; currentVersionCode&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;apply&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;modelsCached&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token class-name&quot;&gt;String&lt;/span&gt; nluName &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;nlu.tflite&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token class-name&quot;&gt;File&lt;/span&gt; nluFile &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;File&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getCacheDir&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;/&amp;quot;&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;+&lt;/span&gt; nluName&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; nluFile&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;decompressModels&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token function&quot;&gt;cacheAsset&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;nlu.tflite&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token function&quot;&gt;cacheAsset&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;metadata.json&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token function&quot;&gt;cacheAsset&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;vocab.txt&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token function&quot;&gt;cacheAsset&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;minecraft-recipe.json&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;catch&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;IOException&lt;/span&gt; e&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token class-name&quot;&gt;Log&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;e&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;logTag&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;Unable to cache NLU data&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; e&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;cacheAsset&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;String&lt;/span&gt; assetName&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;IOException&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token class-name&quot;&gt;File&lt;/span&gt; assetFile &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;File&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getCacheDir&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;/&amp;quot;&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;+&lt;/span&gt; assetName&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// copy in chunks: a single read() is not guaranteed to return the whole asset,&lt;/span&gt;
    &lt;span class=&quot;token comment&quot;&gt;// and try-with-resources closes both streams even if the copy fails&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;try&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;InputStream&lt;/span&gt; inputStream &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;getAssets&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;open&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;assetName&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
         &lt;span class=&quot;token class-name&quot;&gt;FileOutputStream&lt;/span&gt; fos &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;FileOutputStream&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;assetFile&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; buffer &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;byte&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;8192&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;int&lt;/span&gt; bytesRead&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;bytesRead &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; inputStream&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;buffer&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            fos&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;buffer&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; bytesRead&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The configuration in our &lt;code&gt;Spokestack&lt;/code&gt; class expects the files to be located in the cache directory (and named as listed above). If you change the files’ locations, you’ll need to update the file paths in &lt;code&gt;Spokestack.java&lt;/code&gt; as well.&lt;/p&gt;&lt;p&gt;Other than that, though, you’re all set with NLU. The utterance classification itself runs each time an ASR transcript is available, via the following code in &lt;code&gt;Spokestack&lt;/code&gt;. Note that it’s an overridden method; &lt;code&gt;Spokestack&lt;/code&gt; implements the Spokestack library’s &lt;code&gt;OnSpeechEventListener&lt;/code&gt; interface to receive events from the &lt;code&gt;SpeechPipeline&lt;/code&gt;.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;java&quot;&gt;&lt;pre class=&quot;language-java&quot;&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;token annotation punctuation&quot;&gt;@Override&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;onEvent&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;SpeechContext&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;Event&lt;/span&gt; event&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;SpeechContext&lt;/span&gt; context&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;switch&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;event&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token comment&quot;&gt;// other events&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;case&lt;/span&gt; RECOGNIZE&lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;// the RECOGNIZE event signifies that a result is available from the selected ASR.&lt;/span&gt;
            &lt;span class=&quot;token class-name&quot;&gt;String&lt;/span&gt; utterance &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; context&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getTranscript&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
            &lt;span class=&quot;token class-name&quot;&gt;NLUResult&lt;/span&gt; result &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;nlu&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;classify&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;utterance&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
            &lt;span class=&quot;token class-name&quot;&gt;Response&lt;/span&gt; response &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;dialogManager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;handleIntent&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;result&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
            &lt;span class=&quot;token function&quot;&gt;speak&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;response&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;break&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Notice what comes &lt;em&gt;after&lt;/em&gt; the NLU does its thing. That’s our next step: putting another small piece of Amazon’s backend right in the app.&lt;/p&gt;&lt;h2 id=&quot;recreating-dialog-management&quot;&gt;&lt;a href=&quot;#recreating-dialog-management&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Recreating Dialog Management&lt;/h2&gt;&lt;p&gt;A dialog manager takes the intent and slots from the NLU along with the current context (or “state”) of the conversation and decides what sort of response the system should give. 
In the Alexa SDK, this means that the system consults, &lt;a href=&quot;https://developer.amazon.com/en-US/docs/alexa/alexa-skills-kit-sdk-for-java/handle-requests.html#handler-processing-order&quot;&gt;in order&lt;/a&gt;, a series of &lt;a href=&quot;https://developer.amazon.com/en-US/docs/alexa/alexa-skills-kit-sdk-for-java/develop-your-first-skill.html#implementing-request-handlers&quot;&gt;&lt;code&gt;RequestHandler&lt;/code&gt;&lt;/a&gt;s that have been registered to a &lt;a href=&quot;https://developer.amazon.com/en-US/docs/alexa/alexa-skills-kit-sdk-for-java/develop-your-first-skill.html#implementing-the-skillstreamhandler&quot;&gt;&lt;code&gt;SkillStreamHandler&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;In Java terms, this means you have a series of classes that all implement an interface for handling user intents, and a manager class somewhere that has an ordered list of these handlers. When a request comes in, the first handler capable of responding to the request is chosen. That’s all easy enough to replicate without the help of Lambda or the SDK, so let’s do it.&lt;/p&gt;&lt;p&gt;First, the handler interface. We’ll stick closely to the names from the Alexa SDK for the sake of analogy, even though our objects will be slightly different.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;java&quot;&gt;&lt;pre class=&quot;language-java&quot;&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;interface&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;RequestHandler&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;boolean&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;canHandle&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;HandlerInput&lt;/span&gt; handlerInput&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;Response&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;handle&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;HandlerInput&lt;/span&gt; handlerInput&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Unlike Amazon, we don’t return an &lt;code&gt;Optional&lt;/code&gt; from &lt;code&gt;handle&lt;/code&gt;; this maintains compatibility with a wider range of Android devices (&lt;code&gt;Optional&lt;/code&gt; wasn’t introduced until &lt;a href=&quot;https://developer.android.com/reference/java/util/Optional&quot;&gt;API 24&lt;/a&gt;).&lt;/p&gt;&lt;p&gt;And now the manager itself. It’s not much to look at either.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;java&quot;&gt;&lt;pre class=&quot;language-java&quot;&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;DialogManager&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;token generics&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;RequestHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt; requestHandlers&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;Session&lt;/span&gt; session&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;DialogManager&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;token generics&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;RequestHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt; handlers&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
                         &lt;span class=&quot;token class-name&quot;&gt;Cookbook&lt;/span&gt; cookbook&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;super&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;session &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;Session&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;cookbook&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;requestHandlers &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;ArrayList&lt;/span&gt;&lt;span class=&quot;token generics&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;handlers&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;Response&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;handleIntent&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;NLUResult&lt;/span&gt; nluResult&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token class-name&quot;&gt;HandlerInput&lt;/span&gt; handlerInput &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;HandlerInput&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;nluResult&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;session&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token class-name&quot;&gt;Response&lt;/span&gt; response &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;RequestHandler&lt;/span&gt; handler &lt;span class=&quot;token operator&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;requestHandlers&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;handler&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;canHandle&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;handlerInput&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
                response &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; handler&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;handle&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;handlerInput&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
                &lt;span class=&quot;token keyword&quot;&gt;break&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
            &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;token function&quot;&gt;updateSession&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;nluResult&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; response&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; response&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The main addition here is the &lt;code&gt;Cookbook&lt;/code&gt;, which we use to look up Minecraft recipes. It’s related to the conversation state, so we’re putting it right in the &lt;code&gt;Session&lt;/code&gt;, but if we had a more complex response generation system, it would probably belong in that system instead. In the Alexa SDK, the &lt;code&gt;Session&lt;/code&gt; object is nested inside a &lt;code&gt;RequestEnvelope&lt;/code&gt;, but we don’t need most of the other things the SDK exposes via objects like &lt;code&gt;RequestEnvelope&lt;/code&gt; and &lt;code&gt;AttributesManager&lt;/code&gt;, so we’ve flattened out the API.&lt;/p&gt;&lt;p&gt;That’s it for the guts of the dialog management system. Setup is done, once again, in the &lt;code&gt;Spokestack&lt;/code&gt; class:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;java&quot;&gt;&lt;pre class=&quot;language-java&quot;&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;buildDialogManager&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token class-name&quot;&gt;String&lt;/span&gt; cacheDir &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;appContext&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getCacheDir&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getAbsolutePath&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;token class-name&quot;&gt;List&lt;/span&gt;&lt;span class=&quot;token generics&quot;&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;extends&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;RequestHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt; handlers &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;Arrays&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;asList&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;
          &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;LaunchHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
          &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;RecipeHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;cacheDir&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
          &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;HelpHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
          &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;RepeatHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
          &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;ExitHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
          &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;ErrorHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;dialogManager &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;DialogManager&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;handlers&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;You can tell what most of those handlers do from their names; their rather simple code is available in the demo project. Remember that this is a Minecraft skill whose main job is to look up “recipes” for different in-game items. Hence, most of the interesting work is done in &lt;code&gt;RecipeHandler&lt;/code&gt;. A discussion of its business logic is outside the scope of this tutorial, but do take a look through the source code. It’s heavily commented to offer some tips for voice search, an important consideration for many apps.&lt;/p&gt;&lt;p&gt;We’ve now dealt with the input side of the equation, so all that’s left is to make the app talk back. It’s not an easy task, but Spokestack makes the implementation simple.&lt;/p&gt;&lt;h2 id=&quot;speaking-our-mind&quot;&gt;&lt;a href=&quot;#speaking-our-mind&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Speaking our mind&lt;/h2&gt;&lt;p&gt;Before the app speaks, we have to figure out what it should say. This example skill has a simple call-and-response-style conversation, so there aren’t many different states the conversation can be in. 
An app designed to carry on a longer conversation will probably want a different structure, but we can get away with putting our prompts in a simple enum and selecting from it based on intent at runtime:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;java&quot;&gt;&lt;pre class=&quot;language-java&quot;&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;enum&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;Responses&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;

    &lt;span class=&quot;token function&quot;&gt;WELCOME&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;Welcome to %s. You can ask a question like, what&amp;#x27;s the recipe for a %s? ... Now, what can I help you with?&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;ERROR&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;Sorry, I can&amp;#x27;t understand the command. Please say it again.&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;

    &lt;span class=&quot;token comment&quot;&gt;// other responses...&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;String&lt;/span&gt; prompt&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;token class-name&quot;&gt;Responses&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;String&lt;/span&gt; prompt&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;prompt &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; prompt&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;formatPrompt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt; params&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;format&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;prompt&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; params&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Notice that the responses are templates, allowing us to inject data at runtime. Again, we’re taking advantage of our app’s simplicity; you might want a more robust templating solution than plain &lt;code&gt;String.format()&lt;/code&gt; for yours. We’ve also gotten rid of Alexa-style reprompts: additional prompts that can be delivered when the app asks a question but doesn’t receive an answer within a preset amount of time. There’s nothing stopping you from implementing reprompts in Spokestack; that’s just outside the scope of this guide. We have &lt;a href=&quot;/blog/reprompting-with-spokestack&quot;&gt;a separate tutorial&lt;/a&gt; with more information if you’re interested in including reprompts.&lt;/p&gt;&lt;p&gt;The &lt;code&gt;Responses&lt;/code&gt; enum covers looking up and formatting our responses; all that’s left is to turn them into audio and play them to the user. Unfortunately, dealing with media players on mobile platforms isn’t very straightforward. If your app already deals with audio playback, you may wish to approach TTS differently, but Spokestack can handle both generation and playback. 
This is how the TTS component is set up in our trusty &lt;code&gt;Spokestack&lt;/code&gt; class:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;java&quot;&gt;&lt;pre class=&quot;language-java&quot;&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;buildTTS&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tts &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;null&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tts &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;TTSManager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;Builder&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
              &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;setTTSServiceClass&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;io.spokestack.spokestack.tts.SpokestackTTSService&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
              &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;setOutputClass&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;io.spokestack.spokestack.tts.SpokestackTTSOutput&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
              &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;setProperty&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;spokestack-id&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;YOUR-ID-HERE&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
              &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;setProperty&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;spokestack-secret&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;YOUR-SECRET-HERE&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
              &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;addTTSListener&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
              &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;setAndroidContext&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;appContext&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
              &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;It’s the &lt;code&gt;SpokestackTTSOutput&lt;/code&gt; class that’s responsible for playback; if you want to manage that yourself, you’ll want to configure a &lt;code&gt;TTSListener&lt;/code&gt; to receive and play the audio URLs that are returned by the TTS service. Here, &lt;code&gt;Spokestack&lt;/code&gt; is set up as a listener, but it only recognizes error events so they can be logged appropriately.&lt;/p&gt;&lt;p&gt;Note also the &lt;code&gt;spokestack-id&lt;/code&gt; and &lt;code&gt;spokestack-secret&lt;/code&gt; configuration properties: these credentials are available from the account section of the Spokestack website.&lt;/p&gt;&lt;p&gt;With the TTS component established, generating audio is as simple as turning our prompt text into a &lt;code&gt;SynthesisRequest&lt;/code&gt; and sending it along:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;java&quot;&gt;&lt;pre class=&quot;language-java&quot;&gt;&lt;code class=&quot;language-java&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;speak&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token class-name&quot;&gt;String&lt;/span&gt; text&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token class-name&quot;&gt;SynthesisRequest&lt;/span&gt; request &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;SynthesisRequest&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;Builder&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;text&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;build&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tts&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;synthesize&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;request&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Each request handler retrieves a prompt template from the &lt;code&gt;Responses&lt;/code&gt; enum and calls &lt;code&gt;formatPrompt&lt;/code&gt; on it to fill it with dynamic data, so by the time the prompt is in a &lt;code&gt;Response&lt;/code&gt; object, it’s already in its final form, and all we need to do is pull it out and send it through TTS.&lt;/p&gt;&lt;p&gt;Some of the prompts end in questions, which implies that the microphone should be left open (and ASR activated) after the prompt plays. This is done in &lt;code&gt;Spokestack&lt;/code&gt; by listening for the &lt;code&gt;PLAYBACK_COMPLETE&lt;/code&gt; event from the TTS component and calling &lt;code&gt;pipeline.activate()&lt;/code&gt; if the last response in the &lt;code&gt;Session&lt;/code&gt; indicates expected user input.&lt;/p&gt;&lt;h2 id=&quot;youre-done&quot;&gt;&lt;a href=&quot;#youre-done&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;You’re done!&lt;/h2&gt;&lt;p&gt;With ASR, NLU, and TTS in place, you’ve effectively recreated the Alexa experience…without Alexa! 
On a mobile device, you’ll have much more freedom to make your touch-based UI match your voice experience, and of course you can manage user accounts your way instead of Amazon’s.&lt;/p&gt;&lt;p&gt;This tutorial has been a little involved, and we know we haven’t gone over every aspect of the experience with a fine-tooth comb, so if you run into any issues, don’t hesitate to reach out!&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Porting the Alexa Minecraft Skill to iOS Using Spokestack]]></title><description><![CDATA[Make a voice-based app for smart speakers. Spokestack makes it easy to convert a smart speaker voice app to a mobile app. Follow our process.]]></description><link>https://www.spokestack.io/blog/porting-the-alexa-minecraft-skill-to-ios-using-spokestack</link><guid isPermaLink="false">https://www.spokestack.io/blog/porting-the-alexa-minecraft-skill-to-ios-using-spokestack</guid><pubDate>Mon, 08 Jun 2020 00:00:04 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/eaa2c47d2bcc378dec239243b4e72cba/8537d/minecraft-skill-to-ios-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACHElEQVQoz3WTXU8TQRSG97eQiu1+zH7Mfm+rm1gptGgpBCgFIpSoiR+RxMQLjdFoTCo3XqgxBpAU6oU3/iLv/CGPmSULjdGLJ5Nz9px33jMzq1mWhcI0zWK1hChwXJc0y5CBTxCGZPU6WZaRJAmu617WT/daFtp0wjAM9NlZajMzWNUqkefhmiaOYRA6DkkcE0URtm0XtYqyt1y16V08zyPrdmkMBuTDIdd2dsj39qj310lXlhHSw7FtpPTwpSQI/P84VBgGQggWRu9ZPpvQOz5h6eshveMxS8en3D46JV9ZZ3OwzfrGNou3lrmzMyTPc3RdvxRU6kpIBY1Gg/7uLr2tLXqbA+ZW1rh3v8vk0xo/D/tMDtp8e9Pk5O1Nfhw0ef6whSdjTEO/cKoFQVCci+/7tFotNjc26K+u0l3ssDDfYXc15/OLJkev5xi/m+ds1ObLq3kmozb7wxuYwsUW1oUpLQzDQkzhOE7xwXEdTEugGxa2oxoMTKNGFPmEgcT1fFwZEYTnl6RMlRqalLK4DPUUFCq2TMHjJ3c5+PCSOIzxlo4I18YIS1CpVAp8X6LMxHFcPKXSmKZElLMSFVerOs+ePuL7+CNpkuM/+EWy/5urswZZllCr1YrzUvXqCSlDpUtNJf5GnYXretTr1xHCQMYdvLCNlA6VypVCsBRTlEbUdJqy+y/UKGEYkKQpSSzPSZLib0nT9GLUaVT+D+LMfEIb2WpUAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Porting the Alexa Minecraft Skill to iOS using Spokestack&quot; title=&quot;Porting the Alexa Minecraft Skill to iOS using Spokestack&quot; src=&quot;/static/eaa2c47d2bcc378dec239243b4e72cba/05162/minecraft-skill-to-ios-hero.png&quot; srcSet=&quot;/static/eaa2c47d2bcc378dec239243b4e72cba/2eeed/minecraft-skill-to-ios-hero.png 294w,/static/eaa2c47d2bcc378dec239243b4e72cba/0d6a1/minecraft-skill-to-ios-hero.png 588w,/static/eaa2c47d2bcc378dec239243b4e72cba/05162/minecraft-skill-to-ios-hero.png 1175w,/static/eaa2c47d2bcc378dec239243b4e72cba/8537d/minecraft-skill-to-ios-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;This tutorial is part of the &lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-1&quot;&gt;Porting a Smart Speaker Voice App to Mobile&lt;/a&gt; series.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;Spokestack makes it easy to convert a smart speaker voice app like a Google Action or Alexa Skill to a mobile app. In this tutorial, we’ll go through the entire process of converting a sample Alexa skill into an iOS app using SwiftUI.&lt;/p&gt;&lt;p&gt;We’ve provided some components and boilerplate code so you can follow along more easily. You can download the starting and final versions of the &lt;a href=&quot;https://d3dmqd7cy685il.cloudfront.net/docs/minecraft-ios-tutorial.zip&quot;&gt;code here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;This tutorial is a direct port of the &lt;a href=&quot;https://github.com/alexa/skill-sample-nodejs-howto&quot;&gt;Alexa sample skill&lt;/a&gt; from Alexa’s GitHub repository. We stayed as true as possible to the original and didn’t try to improve on it too much at this point.&lt;/p&gt;&lt;h2 id=&quot;installation-and-setup&quot;&gt;&lt;a href=&quot;#installation-and-setup&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Installation and Setup&lt;/h2&gt;&lt;p&gt;The easiest way to install Spokestack on iOS is using CocoaPods as described in more detail in the &lt;a href=&quot;/docs/ios/getting-started&quot;&gt;getting started 
docs&lt;/a&gt;. Download and unzip the &lt;a href=&quot;https://d3dmqd7cy685il.cloudfront.net/docs/minecraft-ios-tutorial.zip&quot;&gt;example code&lt;/a&gt;. In the “start” folder, you’ll see a &lt;code&gt;Podfile&lt;/code&gt; with the following contents:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;ruby&quot;&gt;&lt;pre class=&quot;language-ruby&quot;&gt;&lt;code class=&quot;language-ruby&quot;&gt;platform &lt;span class=&quot;token symbol&quot;&gt;:ios&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;#x27;13.0&amp;#x27;&lt;/span&gt;

target &lt;span class=&quot;token string&quot;&gt;&amp;#x27;Minecraft Skill Demo&amp;#x27;&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;do&lt;/span&gt;
    use_frameworks&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;
    pod &lt;span class=&quot;token string&quot;&gt;&amp;#x27;Spokestack-iOS&amp;#x27;&lt;/span&gt;
    pod &lt;span class=&quot;token string&quot;&gt;&amp;#x27;Fuse&amp;#x27;&lt;/span&gt;
&lt;span class=&quot;token keyword&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Run the command &lt;code&gt;pod install&lt;/code&gt; to download and install the dependencies. Next, open &lt;code&gt;Minecraft Skill.xcworkspace&lt;/code&gt; using Xcode.&lt;/p&gt;&lt;h2 id=&quot;make-the-app-listen&quot;&gt;&lt;a href=&quot;#make-the-app-listen&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Make the App Listen&lt;/h2&gt;&lt;p&gt;If you run the app now, you’ll see it doesn’t do much. It just displays a screen with a microphone button.&lt;/p&gt;&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:372px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/71b04344453babb0410770a5a700c3e0/00cb3/minecraft-skill-to-ios-screen-1.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:177.55102040816325%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAkCAIAAAAGkY33AAAACXBIWXMAAAsTAAALEwEAmpwYAAAAxklEQVRIx+2VPwrCMBSHewiPIAjewCN4Bg/jKsXZwcnFpVfQEwgukYSAQ6FgVbpVSkL+/aStYEGQqINLPt6SwPd+700vyvM8TVPOeZZlAFwDACEEY4wQUhRF+48Gay0ASmmSJJFSqqoqpZQxBh2cc1JKIYTWGi9IKcuyjPADQQ5ykIMc5CD/U3ad+ibZuro+S24v8eWG8QqjBQ7nRyMv2dTXG7MtelMMY0zW9VNbT7kJ2RwxmKMfY7l7dvTauZ2cXbE/vdv5Dip0V46FgwsKAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;App with microphone button&quot; title=&quot;App with microphone button&quot; src=&quot;/static/71b04344453babb0410770a5a700c3e0/00cb3/minecraft-skill-to-ios-screen-1.png&quot; srcSet=&quot;/static/71b04344453babb0410770a5a700c3e0/2eeed/minecraft-skill-to-ios-screen-1.png 294w,/static/71b04344453babb0410770a5a700c3e0/00cb3/minecraft-skill-to-ios-screen-1.png 372w&quot; sizes=&quot;(max-width: 372px) 100vw, 372px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Next, you’re going to start the Spokestack speech pipeline so that your app can hear you. You’re going to configure the app so that it starts listening when you press a button.&lt;/p&gt;&lt;p&gt;The example code contains a class &lt;code&gt;PipelineStore&lt;/code&gt; that implements &lt;a href=&quot;/docs/ios/speech-pipeline&quot;&gt;the delegates&lt;/a&gt; in the Spokestack speech pipeline. The &lt;code&gt;init&lt;/code&gt; method of our &lt;code&gt;ContentView&lt;/code&gt; already calls the &lt;code&gt;pipelineStore.start()&lt;/code&gt; method to start background processing. To begin actually listening, add a new function to &lt;code&gt;PipelineStore&lt;/code&gt;:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;activatePipeline&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;[&lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;mode&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt;] manually activating pipeline&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pipeline&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;activate&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token builtin&quot;&gt;DispatchQueue&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;main&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;async &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;isListening &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token boolean&quot;&gt;true&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This new function does two things. First, it calls the &lt;code&gt;activate()&lt;/code&gt; method on the pipeline to start actively listening for voice input. Second, it sets the variable &lt;code&gt;isListening&lt;/code&gt; to &lt;code&gt;true&lt;/code&gt;, dispatching to the main queue because state that drives the UI should be updated on the main thread. Because &lt;code&gt;isListening&lt;/code&gt; is &lt;code&gt;@Published&lt;/code&gt;, changes to the variable will automatically be reflected in the UI.&lt;/p&gt;&lt;p&gt;Let’s also modify the &lt;code&gt;SpeechEventListener&lt;/code&gt; implementation in &lt;code&gt;PipelineStore&lt;/code&gt; so that you can update the UI with what the user said:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;didRecognize&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;_&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;SpeechContext&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;

    &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;[&lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;mode&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt;] didRecognize &lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;isSpeech&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt; and transcript &lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;transcript&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token builtin&quot;&gt;DispatchQueue&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;main&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;async &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;appSays &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token constant&quot;&gt;nil&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;userSays &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;transcript
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Using the &lt;code&gt;ContentView&lt;/code&gt; class, you can modify your button so it will toggle between listening and not listening when pressed:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;    &lt;span class=&quot;token function&quot;&gt;Button&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;action&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pipelineStore&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;isListening&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pipelineStore&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;deactivatePipeline&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pipelineStore&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;activatePipeline&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;ListeningIcon&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;isListening&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; $pipelineStore&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;isListening&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;background&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;Color&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;blue&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;foregroundColor&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;Color&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;white&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;cornerRadius&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;40&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;padding&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Try running the app now. At this point, you should be able to press the button, say something, and see what you said appear in the UI.&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/45c3eef4f754351e1f24e84e6d5e75eb/minecraft-skill-to-ios-screen-2.gif&quot; alt=&quot;Animation of app listening to speech&quot;/&gt;&lt;/p&gt;&lt;h3 id=&quot;make-the-app-understand&quot;&gt;&lt;a href=&quot;#make-the-app-understand&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Make the App Understand&lt;/h3&gt;&lt;p&gt;For this tutorial, we built an Alexa skill that allows you to ask for instructions on how to build an item in Minecraft. To do this, we needed to understand what the user is asking about and respond accordingly.&lt;/p&gt;&lt;p&gt;This tutorial includes Alexa’s &lt;a href=&quot;https://github.com/alexa/skill-sample-nodejs-howto/blob/master/models/en-US.json&quot;&gt;interaction model&lt;/a&gt; JSON file. To make our app understand, you’ll need to import this interaction model into your Spokestack account and get an on-device NLU model back. 
The NLU model consists of two files. For convenience, they are already included in the &lt;code&gt;Resources&lt;/code&gt; folder of the tutorial iOS project:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;nlu.tflite - a TensorFlow Lite model file&lt;/li&gt;&lt;li&gt;metadata.json - the model metadata&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;There is also a third file, &lt;code&gt;vocab.txt&lt;/code&gt;, which contains the BERT WordPiece vocabulary. This is the same for every NLU model and can be &lt;a href=&quot;https://d3dmqd7cy685il.cloudfront.net/nlu/vocab.txt&quot;&gt;downloaded here&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;The sample code has a convenience class &lt;code&gt;NLUService&lt;/code&gt; that loads these files and creates an &lt;code&gt;NLUTensorflow&lt;/code&gt; instance. Now you can add an instance of the NLU service to &lt;code&gt;PipelineStore&lt;/code&gt;:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;    &lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;var&lt;/span&gt; nluService &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;NLUService&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;nlu&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Let’s use this service to update our &lt;code&gt;didRecognize()&lt;/code&gt; function again. The following code will pass the results from speech recognition to your NLU model for classification. 
For now, you should see results in your console.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;didRecognize&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;_&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;SpeechContext&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;

    &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;[&lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;mode&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt;] didRecognize &lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;isSpeech&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt; and transcript &lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;transcript&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token builtin&quot;&gt;DispatchQueue&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;main&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;async &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;appSays &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token constant&quot;&gt;nil&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;userSays &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;transcript
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; nluService&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;nlu&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;classify&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;utterances&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;result&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;transcript&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;subscribe&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;on&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;DispatchQueue&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;global&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;qos&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;userInitiated&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;sink&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;receiveCompletion&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; completion &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;switch&lt;/span&gt; completion &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;failure&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; error&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;// respond appropriately to an error in classification&lt;/span&gt;
            &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;nlu failure &lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;error&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;break&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;finished&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;token comment&quot;&gt;// respond appropriately to finished classification&lt;/span&gt;
            &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;nlu finished&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;break&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; receiveValue&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; results &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt;
        results&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;forEach&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; result &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt;
            &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;nlu result&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;result&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&quot;make-the-app-respond&quot;&gt;&lt;a href=&quot;#make-the-app-respond&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Make the App Respond&lt;/h3&gt;&lt;p&gt;The process of taking classification results and converting them into a response is the job of a dialog manager. Previously, we discussed how to &lt;a href=&quot;/blog/create-an-alexa-compatible-dialog-manager-in-swift&quot;&gt;create a dialog manager in Swift&lt;/a&gt; that mimics the Alexa SDK for Node.js syntax. The example code includes a partially complete dialog manager &lt;code&gt;SkillDialogManager&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;If you compare the code for the &lt;code&gt;SkillDialogManager&lt;/code&gt; to the &lt;a href=&quot;https://github.com/alexa/skill-sample-nodejs-howto/blob/master/lambda/custom/index.js&quot;&gt;Alexa Node.js handler&lt;/a&gt;, you will see it is shockingly similar. Our dialog manager is just missing the &lt;code&gt;RecipeHandler&lt;/code&gt;. 
Let’s add that to &lt;code&gt;SkillDialogManager&lt;/code&gt; now.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;RecipeHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;RequestHandler&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;canHandle&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;handlerInput&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;HandlerInput&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;Bool&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;handlerInput&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;requestEnvelope&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;request&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;type &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;IntentRequest&amp;quot;&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; handlerInput&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;requestEnvelope&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;request&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;intent&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;name &lt;span class=&quot;token operator&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;RecipeIntent&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;token keyword&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;handle&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;handlerInput&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;HandlerInput&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;Response&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token comment&quot;&gt;// Look up the optional Item slot; it is nil when the request carries no item name&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; item&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;Slot&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; handlerInput&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;requestEnvelope&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;request&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;intent&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;slots&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;Item&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;

        &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;got item &lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;item&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; repromptSpeech &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; responses&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;RECIPE_NOT_FOUND_REPROMPT&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;

        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;item &lt;span class=&quot;token operator&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;token constant&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; itemValue&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;String&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; item&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;value&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;as&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;String&lt;/span&gt;
            &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;got value &lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;itemValue&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

            &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; fuse &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;Fuse&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;threshold&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; keys &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;recipes&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;keys&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

            &lt;span class=&quot;token comment&quot;&gt;//let&amp;#x27;s do a fuzzy match to deal with an imperfect ASR&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; results &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; fuse&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;search&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;itemValue&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; keys&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

            results&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;forEach &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; item &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt;
                &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;score: &lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;item&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;score&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt; - &lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;keys&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;item&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;index&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

            &lt;span class=&quot;token keyword&quot;&gt;var&lt;/span&gt; recipe&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;?&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token constant&quot;&gt;nil&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;results&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;count&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
                recipe &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; recipes&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;keys&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;results&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;index&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
            &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

            &lt;span class=&quot;token keyword&quot;&gt;guard&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; recipeContent &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; recipe &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; speak &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;format&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; responses&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;RECIPE_NOT_FOUND_WITH_ITEM_NAME&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;itemValue&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; handlerInput&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;responseBuilder&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;speak&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;speak &lt;span class=&quot;token operator&quot;&gt;+&lt;/span&gt; repromptSpeech&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;reprompt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;repromptSpeech&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getResponse&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; handlerInput&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;responseBuilder&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;speak&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;recipeContent&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getResponse&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; handlerInput&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;responseBuilder&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;speak&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;responses&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;RECIPE_NOT_FOUND_WITHOUT_ITEM_NAME&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;+&lt;/span&gt; repromptSpeech&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;reprompt&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;repromptSpeech&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;getResponse&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;code&gt;RecipeHandler&lt;/code&gt; is set up to handle &lt;code&gt;RecipeIntent&lt;/code&gt;. The code checks for an &lt;code&gt;Item&lt;/code&gt; slot that holds the name of the item we want to return a recipe for. The Alexa skill fulfillment logic is very straightforward. It just looks up the item name in a &lt;a href=&quot;https://github.com/alexa/skill-sample-nodejs-howto/blob/master/lambda/custom/recipes.js&quot;&gt;JSON dictionary&lt;/a&gt; and returns the recipe that matches. We can do the same in Swift. In fact, we copied the JSON file straight into Xcode as &lt;code&gt;recipe.json&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;However, we improved on this experience. ASR is not perfect, and in some cases, such as homophones, it really can’t be. For example, if I ask for a “recipe for red dye,” ASR might register that as “recipe for red die.” While “red dye” is in the dictionary, “red die” is not. There are a few strategies for resolving this that are beyond the scope of this tutorial, but a simple one is to implement a fuzzy match. Our &lt;code&gt;RecipeHandler&lt;/code&gt; uses the &lt;a href=&quot;https://github.com/krisk/fuse-swift&quot;&gt;Swift Fuse library&lt;/a&gt; to find the closest match instead of requiring an exact match.&lt;/p&gt;&lt;p&gt;The next step is to register the new &lt;code&gt;RecipeHandler&lt;/code&gt; with the dialog manager:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;token class-name&quot;&gt;SkillDialogManager&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;var&lt;/span&gt; session&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;Any&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;var&lt;/span&gt; handlers&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;RequestHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;LaunchRequestHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;RecipeHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;HelpHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;RepeatHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;ExitHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;ErrorHandler&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Lastly, update &lt;code&gt;didRecognize()&lt;/code&gt; again to run NLU results through your dialog manager:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; results&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; result &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;nlu result&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;result&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;currentResponse &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;dialogManager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;turn&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;type&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;IntentRequest&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; nluResult&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;currentResponse&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;speak&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token builtin&quot;&gt;DispatchQueue&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;main&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;async &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;appSays &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;currentResponse&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;speak
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;At this point your app should be able to understand and answer questions.&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/1ce57bc7f4f06fd647d8fd71f5166627/minecraft-skill-to-ios-screen-3.gif&quot; alt=&quot;Animation of app responding to a question&quot;/&gt;&lt;/p&gt;&lt;h3 id=&quot;make-the-app-speak&quot;&gt;&lt;a href=&quot;#make-the-app-speak&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Make the App Speak&lt;/h3&gt;&lt;p&gt;Up to this point, your app has recognized user speech, interpreted intent using an on-device NLU, and handled dialog just like the Alexa skill. 
Now, you want to make your app speak its response aloud.&lt;/p&gt;&lt;p&gt;To add TTS capabilities, add this to your &lt;code&gt;PipelineStore&lt;/code&gt;:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;lazy&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;var&lt;/span&gt; tts&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;TextToSpeech&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;TextToSpeech&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; configuration&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;SpeechConfiguration&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;token keyword&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;speak&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;_&lt;/span&gt; request&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;Response&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token builtin&quot;&gt;DispatchQueue&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;main&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;async &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;appSays &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; request&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;speak
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;isSpeaking &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token boolean&quot;&gt;true&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;currentResponse &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; request
    &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tts&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;speak&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;TextToSpeechInput&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;request&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;speak&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then, make a final modification to &lt;code&gt;didRecognize()&lt;/code&gt;:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;token number&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; results&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt; result &lt;span class=&quot;token keyword&quot;&gt;in&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;nlu result&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;result&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;currentResponse &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;dialogManager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;turn&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;type&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;IntentRequest&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;,&lt;/span&gt; nluResult&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; result&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;currentResponse&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;speak&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token builtin&quot;&gt;DispatchQueue&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;main&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;async &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;appSays &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;currentResponse&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;speak
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;isSpeaking &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token boolean&quot;&gt;true&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tts&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;speak&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;TextToSpeechInput&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;currentResponse&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;speak&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;These changes make the app respond with recipe instructions and update the UI accordingly while the app is speaking.&lt;/p&gt;&lt;p&gt;To make the app behave more like a smart speaker skill, you can also handle a &lt;code&gt;LaunchRequest&lt;/code&gt; when the app starts. The app calls the dialog manager to get the intro text and reads it to the user when the app launches.&lt;/p&gt;&lt;p&gt;To achieve this, modify &lt;code&gt;ContentView&lt;/code&gt; as follows:&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;var&lt;/span&gt; body&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;some&lt;/span&gt; &lt;span class=&quot;token builtin&quot;&gt;View&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token builtin&quot;&gt;VStack&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt;pipelineStore&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;isSpeaking &lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token function&quot;&gt;Button&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;action&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;token keyword&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pipelineStore&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;isListening&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
                    &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pipelineStore&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;deactivatePipeline&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
                    &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pipelineStore&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;activatePipeline&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

            &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;token function&quot;&gt;ListeningIcon&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;isListening&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; $pipelineStore&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;isListening&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;background&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;Color&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;blue&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;foregroundColor&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token builtin&quot;&gt;Color&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;white&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;cornerRadius&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token number&quot;&gt;40&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;padding&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;onAppear&lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;let&lt;/span&gt; speak &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;dialogManager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;turn&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;type&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;LaunchRequest&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;pipelineStore&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;speak&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;speak&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;onAppear&lt;/code&gt; event handler fires when the view first appears. The &lt;code&gt;!pipelineStore.isSpeaking&lt;/code&gt; logic hides the mic button while the app is speaking.&lt;/p&gt;&lt;h3 id=&quot;handling-reprompts&quot;&gt;&lt;a href=&quot;#handling-reprompts&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Handling Reprompts&lt;/h3&gt;&lt;p&gt;If the user doesn’t say anything while the app is listening, we should reprompt. Again in &lt;code&gt;SpeechPipeline&lt;/code&gt; modify the &lt;code&gt;didTimeout()&lt;/code&gt; function.&lt;/p&gt;&lt;div class=&quot;gatsby-highlight&quot; data-language=&quot;swift&quot;&gt;&lt;pre class=&quot;language-swift&quot;&gt;&lt;code class=&quot;language-swift&quot;&gt;&lt;span class=&quot;token keyword&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;token function&quot;&gt;didTimeout&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token string&quot;&gt;&amp;quot;[&lt;span class=&quot;token interpolation&quot;&gt;&lt;span class=&quot;token delimiter variable&quot;&gt;\(&lt;/span&gt;mode&lt;span class=&quot;token delimiter variable&quot;&gt;)&lt;/span&gt;&lt;/span&gt;] didTimeout&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    currentResponse &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;try&lt;/span&gt;&lt;span class=&quot;token operator&quot;&gt;!&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;dialogManager&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;turn&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;type&lt;span class=&quot;token punctuation&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;token string&quot;&gt;&amp;quot;TimeoutRequest&amp;quot;&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token function&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;currentResponse&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;speak&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;token builtin&quot;&gt;DispatchQueue&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;main&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;async &lt;span class=&quot;token punctuation&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;userSays &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token constant&quot;&gt;nil&lt;/span&gt;
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;appSays &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;currentResponse&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;speak
        &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;isSpeaking &lt;span class=&quot;token operator&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;token boolean&quot;&gt;true&lt;/span&gt;
    &lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;token keyword&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;tts&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;speak&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;token function&quot;&gt;TextToSpeechInput&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;(&lt;/span&gt;currentResponse&lt;span class=&quot;token punctuation&quot;&gt;.&lt;/span&gt;speak&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;token punctuation&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;token punctuation&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This implementation is not very clever and will continue reprompting forever or until the user taps the mic button to deactivate the pipeline. A smarter implementation might only reprompt three times before giving up.&lt;/p&gt;&lt;h3 id=&quot;wrapping-up&quot;&gt;&lt;a href=&quot;#wrapping-up&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Wrapping Up&lt;/h3&gt;&lt;p&gt;This tutorial gives you a blueprint for converting any Alexa skill to a mobile app. Of course, mobile apps also have a screen. This only scratches the surface of what you could do to customize a skill. You can also display images, videos, or complex visual user interfaces as part of a response to a voice request.&lt;/p&gt;&lt;p&gt;If you get stuck at any point during this tutorial, please reach out to us on the Spokestack community &lt;a href=&quot;https://forum.spokestack.io&quot;&gt;forum&lt;/a&gt;. And if you build something cool, please let us know!&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Porting a Smart Speaker Voice App - Part 3]]></title><description><![CDATA[Spokestack introduces a new way to build and access voice apps independent from major virtual assistant platforms. 
Take your smart speaker app mobile.]]></description><link>https://www.spokestack.io/blog/porting-a-smart-speaker-voice-app-to-mobile/part-3</link><guid isPermaLink="false">https://www.spokestack.io/blog/porting-a-smart-speaker-voice-app-to-mobile/part-3</guid><pubDate>Mon, 08 Jun 2020 00:00:02 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/053cd837b819e6d09663045bdc0aa117/8537d/porting-a-smart-speaker-voice-app-to-mobile.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACTUlEQVQoz3WTzU8TQRjGe/FijFEjMWoEAqW0+70z+9mlUBUpELAUYltKiVASOXgw0RgNB0lEE+CMUYOIAhcvJl7847z8zCzycfHwy7s7mefZZ955N2OaJgrDMNJqWlaK47oUkwTpe/hBQDI0RJIkxHGM67pn+89rTZPM+QVd19FyOQp9fZj5PKEQuIaBo+sEjkMcRYRhiG3b6V7FifakZs5/RQhBUq1SarUodzoMLy1RXllhqNmgODeLJQWObSOlwJMS3/f+k1Ch61iWxcTmFrOHR9T29pn5vEtt7zszewc8/HJAea7BQmuRxvwiU9OzPF7qUC6X0TTtzFC5KyP1UiqVaC4vU2u3qS20GJur82S1ytFOnZ+7TY62J/n6dpT9jQf82B5l7WkFISMMXTtNmvF9P+2L53lUKhXarRbN+iOqU1OMj0+zXL/Lh7V77K6P8u39OIebk3xcP67POvcxLBfbsk5DZYIgSM08z0+bnS+oRlvYtolj5HBsA0f42I5PHIUEvsAVPtKLCIOIMAr/6Y/JSCnTy3AcByk8NjZf8eLlKjIYQ87/xp94g6ENcPVmF909PeRyg2T7+7nT28utnl7CIKBYLHISLKNmSqHSua7k084W79af40QN/Nd/iNq/GOjr5kL2OpeuXGZkeITB3AAXr3Vx8cZttHw+NVKtSw2VkUIlVHMVhjFBEGIYGrZfw5XDCGFTyBcQriCbzVLQNHRNS42V7jiMizptRk2+IoqitIZhkF5SHBeJQ5c4CtLnk79EVXXElCQ51SnU2l/KYnl2fPmzIQAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Porting a Smart Speaker Voice App to Mobile&quot; title=&quot;Porting a Smart Speaker Voice App to Mobile&quot; src=&quot;/static/053cd837b819e6d09663045bdc0aa117/05162/porting-a-smart-speaker-voice-app-to-mobile.png&quot; srcSet=&quot;/static/053cd837b819e6d09663045bdc0aa117/2eeed/porting-a-smart-speaker-voice-app-to-mobile.png 294w,/static/053cd837b819e6d09663045bdc0aa117/0d6a1/porting-a-smart-speaker-voice-app-to-mobile.png 588w,/static/053cd837b819e6d09663045bdc0aa117/05162/porting-a-smart-speaker-voice-app-to-mobile.png 1175w,/static/053cd837b819e6d09663045bdc0aa117/8537d/porting-a-smart-speaker-voice-app-to-mobile.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;h2 id=&quot;import-an-alexa-or-dialogflow-interaction-model&quot;&gt;&lt;a href=&quot;#import-an-alexa-or-dialogflow-interaction-model&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Import an Alexa or Dialogflow Interaction Model&lt;/h2&gt;&lt;p&gt;In &lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-1&quot;&gt;Part 1&lt;/a&gt; and &lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-2&quot;&gt;Part 2&lt;/a&gt; of this series we covered what smart speaker voice apps are and how you can port them to mobile using Spokestack. This post describes how you can easily bring your voice app’s natural language understanding over to a smartphone.&lt;/p&gt;&lt;p&gt;Voice apps running on smart speakers use the platform’s natural language understanding (NLU) to infer intent from a user’s utterance. When moving an Alexa Skill or a Google Action to mobile, you’ll need a different way to perform NLU. Fortunately, Spokestack provides custom NLU models that run on device using &lt;a href=&quot;https://www.tensorflow.org/lite&quot;&gt;TensorFlow Lite’s interpreter&lt;/a&gt;. Even better, you can export your existing Alexa or Dialogflow interaction model and let our system build a custom on device model for you.&lt;/p&gt;&lt;p&gt;NLU is a critical component of modern voice user interfaces (VUI). 
This process takes what the user says (the &lt;code&gt;utterance&lt;/code&gt;) and classifies it as something the voice application can understand. For smart speaker voice apps, this classification results in an &lt;code&gt;intent&lt;/code&gt; with optional &lt;code&gt;slots&lt;/code&gt;. For example, the utterance “Turn off the hallway lights” might result in an intent called &lt;code&gt;TOGGLE_LIGHTS&lt;/code&gt; with slots for &lt;code&gt;mode:off&lt;/code&gt; and &lt;code&gt;location:hallway&lt;/code&gt;. A similar utterance, “Turn on the bedroom lights”, could also give the intent &lt;code&gt;TOGGLE_LIGHTS&lt;/code&gt;, but with slots &lt;code&gt;mode:on&lt;/code&gt; and &lt;code&gt;location:bedroom&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;The exact name of an intent is not important except as a mnemonic for the programmer. The same goes for the slot names and values. The effect is to narrow the huge domain of natural language down to a few actionable intents, each with a predefined set of possible slot values. As a voice interface designer, it’s your job to define which intents the interface can respond to, just as a GUI designer chooses which visual elements a user can interact with. When an utterance doesn’t match a known intent, the voice app can either ignore it or prompt for further clarification. For more technical information on Spokestack’s NLU system, take a look at our &lt;a href=&quot;/docs/concepts/nlu&quot;&gt;NLU concepts guide&lt;/a&gt; in the documentation.&lt;/p&gt;&lt;p&gt;If you’ve built a Google Action or Alexa Skill, you’ve already defined a natural language interaction model using Dialogflow or Alexa Skills Kit. Behind the scenes, these services generate an NLU model for you and make it available to your skill using their cloud services. Spokestack can also generate an NLU model for you, except you get to run it in your app!&lt;/p&gt;&lt;p&gt;I want to pause to point out that you don’t have to use the Spokestack on-device NLU for your embedded assistant. 
Spokestack’s speech pipeline provides you with text from the ASR service. You could choose to classify this text using a hosted NLU service’s API like Google’s Dialogflow or Amazon Lex. However, running on device provides some advantages. First, it eliminates a network round trip, which can make responses faster depending on connection speeds. Second, it allows NLU to work even offline or with a poor signal. Finally, on-device NLU enhances privacy for your users by not sharing transcripts of their voice interactions with third-party services.&lt;/p&gt;&lt;p&gt;To get started with a free NLU model that’s compatible with Spokestack, &lt;a href=&quot;/account/create&quot;&gt;create a free account&lt;/a&gt; and upload your existing Alexa or Dialogflow interaction model. The following instructions for that process are paraphrased from our &lt;a href=&quot;/docs/integrations/export&quot;&gt;export guide&lt;/a&gt;:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Log in to the Amazon developer console, find your skill, click the “Build” tab at the top, and look for “JSON Editor” listed under your intents and slots on the left side (at the time of this writing).&lt;/li&gt;&lt;li&gt;Copy the entire contents of the JSON editor and paste them into a new file on your computer. 
Save it as &lt;code&gt;&amp;lt;YOUR-MODEL-NAME-HERE&amp;gt;.json&lt;/code&gt;.&lt;/li&gt;&lt;li&gt;Log in to your Spokestack account, click on “Language Understanding” in the navigation, and upload your JSON file using the “Import” button.&lt;/li&gt;&lt;li&gt;Watch the email address you used to sign up for the account; we’ll email you when your files are ready, and you can download them from your account page.&lt;/li&gt;&lt;/ol&gt;&lt;h2 id=&quot;next-steps-tutorials&quot;&gt;&lt;a href=&quot;#next-steps-tutorials&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Next Steps: Tutorials&lt;/h2&gt;&lt;p&gt;Spokestack has libraries for iOS, Android, and React Native.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Tutorial: &lt;a href=&quot;/blog/create-an-alexa-compatible-dialog-manager-in-swift&quot;&gt;Create an Alexa-Compatible Dialog Manager in Swift&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Tutorial: &lt;a href=&quot;/blog/porting-the-alexa-minecraft-skill-to-ios-using-spokestack&quot;&gt;Porting the Alexa Minecraft Skill to iOS Using Spokestack&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Tutorial: Porting the Alexa Minecraft Skill to Android Using Spokestack (Coming Soon)&lt;/li&gt;&lt;li&gt;Tutorial: Porting the Alexa Minecraft Skill to React Native Using Spokestack (Coming Soon)&lt;/li&gt;&lt;/ul&gt;</content:encoded></item><item><title><![CDATA[Porting a Smart Speaker Voice App - Part 2]]></title><description><![CDATA[Spokestack 
introduces a new way to build and access voice apps independent from major virtual assistant platforms. Take your smart speaker app mobile.]]></description><link>https://www.spokestack.io/blog/porting-a-smart-speaker-voice-app-to-mobile/part-2</link><guid isPermaLink="false">https://www.spokestack.io/blog/porting-a-smart-speaker-voice-app-to-mobile/part-2</guid><pubDate>Mon, 08 Jun 2020 00:00:01 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/053cd837b819e6d09663045bdc0aa117/8537d/porting-a-smart-speaker-voice-app-to-mobile.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACTUlEQVQoz3WTzU8TQRjGe/FijFEjMWoEAqW0+70z+9mlUBUpELAUYltKiVASOXgw0RgNB0lEE+CMUYOIAhcvJl7847z8zCzycfHwy7s7mefZZ955N2OaJgrDMNJqWlaK47oUkwTpe/hBQDI0RJIkxHGM67pn+89rTZPM+QVd19FyOQp9fZj5PKEQuIaBo+sEjkMcRYRhiG3b6V7FifakZs5/RQhBUq1SarUodzoMLy1RXllhqNmgODeLJQWObSOlwJMS3/f+k1Ch61iWxcTmFrOHR9T29pn5vEtt7zszewc8/HJAea7BQmuRxvwiU9OzPF7qUC6X0TTtzFC5KyP1UiqVaC4vU2u3qS20GJur82S1ytFOnZ+7TY62J/n6dpT9jQf82B5l7WkFISMMXTtNmvF9P+2L53lUKhXarRbN+iOqU1OMj0+zXL/Lh7V77K6P8u39OIebk3xcP67POvcxLBfbsk5DZYIgSM08z0+bnS+oRlvYtolj5HBsA0f42I5PHIUEvsAVPtKLCIOIMAr/6Y/JSCnTy3AcByk8NjZf8eLlKjIYQ87/xp94g6ENcPVmF909PeRyg2T7+7nT28utnl7CIKBYLHISLKNmSqHSua7k084W79af40QN/Nd/iNq/GOjr5kL2OpeuXGZkeITB3AAXr3Vx8cZttHw+NVKtSw2VkUIlVHMVhjFBEGIYGrZfw5XDCGFTyBcQriCbzVLQNHRNS42V7jiMizptRk2+IoqitIZhkF5SHBeJQ5c4CtLnk79EVXXElCQ51SnU2l/KYnl2fPmzIQAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Porting a Smart Speaker Voice App to Mobile&quot; title=&quot;Porting a Smart Speaker Voice App to Mobile&quot; src=&quot;/static/053cd837b819e6d09663045bdc0aa117/05162/porting-a-smart-speaker-voice-app-to-mobile.png&quot; srcSet=&quot;/static/053cd837b819e6d09663045bdc0aa117/2eeed/porting-a-smart-speaker-voice-app-to-mobile.png 294w,/static/053cd837b819e6d09663045bdc0aa117/0d6a1/porting-a-smart-speaker-voice-app-to-mobile.png 588w,/static/053cd837b819e6d09663045bdc0aa117/05162/porting-a-smart-speaker-voice-app-to-mobile.png 1175w,/static/053cd837b819e6d09663045bdc0aa117/8537d/porting-a-smart-speaker-voice-app-to-mobile.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;h2 id=&quot;voice-apps-on-mobile&quot;&gt;&lt;a href=&quot;#voice-apps-on-mobile&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Voice Apps on Mobile&lt;/h2&gt;&lt;p&gt;In &lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-1&quot;&gt;Part 1&lt;/a&gt; of this series we covered what smart speaker voice apps are and some of their limitations. Part 2 describes what we need to do to port a smart speaker voice app to a smartphone.&lt;/p&gt;&lt;p&gt;If you’ve authored an Alexa Skill or Google Action, you’ve already mixed the ingredients needed to produce a voice app. The smart speaker hardware and virtual assistant software provide the ASR, NLU, and TTS. 
You create the fulfillment and dialog management with the help of the platform SDKs.&lt;/p&gt;&lt;p&gt;To move your smart speaker skill or action to a mobile device, you need to mix the same five ingredients we introduced in Part 1.&lt;/p&gt;&lt;h3 id=&quot;without-spokestack&quot;&gt;&lt;a href=&quot;#without-spokestack&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Without Spokestack&lt;/h3&gt;&lt;ol&gt;&lt;li&gt;Modern smartphones have integrated microphones and speakers you can access. Both Android and iOS devices allow apps to access audio input and output using their development kits.&lt;/li&gt;&lt;li&gt;Android provides programmatic access to Google’s ASR. iOS apps have access to Apple’s ASR.&lt;/li&gt;&lt;li&gt;You can write a dialog manager to handle conversation turns in your app.&lt;/li&gt;&lt;li&gt;You can train an NLU model using a cloud service like Amazon Lex or Google’s Dialogflow.&lt;/li&gt;&lt;li&gt;You can integrate a TTS service like Amazon Polly or Google Cloud’s Text-to-Speech.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;If you’re highly competent with native Android and iOS SDKs as well as with integrating third-party web services, you could write code to mix four of the five ingredients necessary for a mobile voice app. But it’s not easy. We’ve done it. This experience is what drove us to create Spokestack. 
Our libraries for iOS, Android, and React Native consolidate mobile ASR, NLU, and TTS into a single, easy-to-use API.&lt;/p&gt;&lt;h3 id=&quot;with-spokestack&quot;&gt;&lt;a href=&quot;#with-spokestack&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;With Spokestack&lt;/h3&gt;&lt;ol&gt;&lt;li&gt;✅ Spokestack managed hardware control&lt;/li&gt;&lt;li&gt;✅ Spokestack managed ASR&lt;/li&gt;&lt;li&gt;Custom Dialog Manager&lt;/li&gt;&lt;li&gt;✅ Import Alexa or Dialogflow conversation model for Spokestack NLU&lt;/li&gt;&lt;li&gt;✅ Spokestack TTS&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;The Spokestack libraries make it easier for you to focus on the custom portions of your voice assistant experience, like the conversation flow and the content. You have to worry much less about platform-specific implementation details.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-3&quot;&gt;Part 3&lt;/a&gt; of this series will cover how to import an interaction model from Alexa or Dialogflow to create an on-device Spokestack NLU model. Following that we have a tutorial on how to &lt;a href=&quot;/blog/create-an-alexa-compatible-dialog-manager-in-swift&quot;&gt;write a dialog manager&lt;/a&gt; that mimics how your Alexa Skill or Google Action works on a smart speaker. 
Finally, we’ll go through some complete tutorials on how to add Spokestack to a mobile app on various platforms to recreate an Alexa Skill as an embedded assistant.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Porting a Smart Speaker Voice App to Mobile - Part 1]]></title><description><![CDATA[Spokestack introduces a new way to build and access voice apps independent from major virtual assistant platforms. Take your smart speaker app mobile.]]></description><link>https://www.spokestack.io/blog/porting-a-smart-speaker-voice-app-to-mobile/part-1</link><guid isPermaLink="false">https://www.spokestack.io/blog/porting-a-smart-speaker-voice-app-to-mobile/part-1</guid><pubDate>Mon, 08 Jun 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/053cd837b819e6d09663045bdc0aa117/8537d/porting-a-smart-speaker-voice-app-to-mobile.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACTUlEQVQoz3WTzU8TQRjGe/FijFEjMWoEAqW0+70z+9mlUBUpELAUYltKiVASOXgw0RgNB0lEE+CMUYOIAhcvJl7847z8zCzycfHwy7s7mefZZ955N2OaJgrDMNJqWlaK47oUkwTpe/hBQDI0RJIkxHGM67pn+89rTZPM+QVd19FyOQp9fZj5PKEQuIaBo+sEjkMcRYRhiG3b6V7FifakZs5/RQhBUq1SarUodzoMLy1RXllhqNmgODeLJQWObSOlwJMS3/f+k1Ch61iWxcTmFrOHR9T29pn5vEtt7zszewc8/HJAea7BQmuRxvwiU9OzPF7qUC6X0TTtzFC5KyP1UiqVaC4vU2u3qS20GJur82S1ytFOnZ+7TY62J/n6dpT9jQf82B5l7WkFISMMXTtNmvF9P+2L53lUKhXarRbN+iOqU1OMj0+zXL/Lh7V77K6P8u39OIebk3xcP67POvcxLBfbsk5DZYIgSM08z0+bnS+oRlvYtolj5HBsA0f42I5PHIUEvsAVPtKLCIOIMAr/6Y/JSCnTy3AcByk8NjZf8eLlKjIYQ87/xp94g6ENcPVmF909PeRyg2T7+7nT28utnl7CIKBYLHISLKNmSqHSua7k084W79af40QN/Nd/iNq/GOjr5kL2OpeuXGZkeITB3AAXr3Vx8cZttHw+NVKtSw2VkUIlVHMVhjFBEGIYGrZfw5XDCGFTyBcQriCbzVLQNHRNS42V7jiMizptRk2+IoqitIZhkF5SHBeJQ5c4CtLnk79EVXXElCQ51SnU2l/KYnl2fPmzIQAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Porting a Smart Speaker Voice App to Mobile&quot; title=&quot;Porting a Smart Speaker Voice App to Mobile&quot; src=&quot;/static/053cd837b819e6d09663045bdc0aa117/05162/porting-a-smart-speaker-voice-app-to-mobile.png&quot; srcSet=&quot;/static/053cd837b819e6d09663045bdc0aa117/2eeed/porting-a-smart-speaker-voice-app-to-mobile.png 294w,/static/053cd837b819e6d09663045bdc0aa117/0d6a1/porting-a-smart-speaker-voice-app-to-mobile.png 588w,/static/053cd837b819e6d09663045bdc0aa117/05162/porting-a-smart-speaker-voice-app-to-mobile.png 1175w,/static/053cd837b819e6d09663045bdc0aa117/8537d/porting-a-smart-speaker-voice-app-to-mobile.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;h2 id=&quot;voice-apps-on-smart-speakers&quot;&gt;&lt;a href=&quot;#voice-apps-on-smart-speakers&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Voice Apps on Smart Speakers&lt;/h2&gt;&lt;p&gt;Over the last several years, intelligent virtual assistants such as Siri, Alexa, and Google Assistant have become increasingly important in our everyday lives. Smart speakers like Amazon Echo and Google Home brought convenient, hands-free access to assistants into our homes. One way independent developers have improved these assistants is by creating add-ons collectively known as voice apps. Users access these voice apps through virtual assistant platforms on smart speakers.&lt;/p&gt;&lt;p&gt;Spokestack introduces a new way to build and access voice apps independent from major virtual assistant platforms. 
These articles cover what you need to move a smart speaker voice app onto a mobile device.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Part 1: Voice Apps on Smart Speakers&lt;/li&gt;&lt;li&gt;Part 2: &lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-2&quot;&gt;Voice Apps on Mobile&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Part 3: &lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-3&quot;&gt;Import an Alexa or Dialogflow Interaction Model&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Tutorial: &lt;a href=&quot;/blog/create-an-alexa-compatible-dialog-manager-in-swift&quot;&gt;Create an Alexa-Compatible Dialog Manager in Swift&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Tutorial: &lt;a href=&quot;/blog/porting-the-alexa-minecraft-skill-to-ios-using-spokestack&quot;&gt;Porting the Alexa Minecraft Skill to iOS Using Spokestack&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Tutorial: &lt;a href=&quot;/blog/porting-the-alexa-minecraft-skill-to-android-using-spokestack&quot;&gt;Porting the Alexa Minecraft Skill to Android Using Spokestack&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Tutorial: Porting the Alexa Minecraft Skill to React Native Using Spokestack (Coming Soon)&lt;/li&gt;&lt;/ul&gt;&lt;h3 id=&quot;whats-wrong-with-smart-speaker-voice-apps&quot;&gt;&lt;a href=&quot;#whats-wrong-with-smart-speaker-voice-apps&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;What’s Wrong With Smart Speaker Voice Apps?&lt;/h3&gt;&lt;p&gt;Voice apps like Alexa 
Skills and Google Actions have been an exciting new frontier for developers over the last few years. Voice apps let virtual assistants delegate some interactions to third-party applications running on the platform. Voice apps are to smart speakers what mobile apps are to smartphones.&lt;/p&gt;&lt;p&gt;Unfortunately, consumer adoption of voice apps has not been as successful as adoption of mobile apps. Discovery and retention have been a &lt;a href=&quot;https://voicebot.ai/smart-speaker-consumer-adoption-report-2019/&quot;&gt;long-standing challenge&lt;/a&gt;. Monetization remains elusive for most skill developers. As a result, 2019 saw the &lt;a href=&quot;https://www.adweek.com/digital/new-voice-apps-are-declining-as-a-breakout-hit-remains-elusive/&quot;&gt;lowest number&lt;/a&gt; of new Alexa skills released since 2016.&lt;/p&gt;&lt;p&gt;This doesn’t mean consumers are moving away from voice as an interface. Voice assistant usage is on track to &lt;a href=&quot;https://techcrunch.com/2019/02/12/report-voice-assistants-in-use-to-triple-to-8-billion-by-2023/&quot;&gt;triple&lt;/a&gt; in the next few years. Consumers have embraced voice as an interface. However, the model for third-party skills and actions has not been a success. 
Fortunately, there’s a way to free these voice apps and promote them as independent voice assistants embedded in smartphone apps.&lt;/p&gt;&lt;h3 id=&quot;transform-your-skill-to-a-mobile-voice-assistant&quot;&gt;&lt;a href=&quot;#transform-your-skill-to-a-mobile-voice-assistant&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Transform Your Skill to a Mobile Voice Assistant&lt;/h3&gt;&lt;p&gt;Moving voice apps off of smart speakers and into an app on a smartphone resets the conversation. You gain full control of your app’s branding and visual UI. This gives you the flexibility to integrate images, videos, and traditional GUI components as you see fit. You’re free from the commerce limitations Amazon and Google place on smart speaker voice apps.&lt;/p&gt;&lt;p&gt;Smart speakers and virtual assistants provide key components that are used to build voice apps. In order to move a voice app to a smartphone, you have to make sure you still have all the ingredients. 
Let’s take a look at the ingredients needed to make a general voice app, regardless of platform.&lt;/p&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;Spokestack Overview&quot; src=&quot;https://www.youtube-nocookie.com/embed/MW2cYSQhbZE?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;h3 id=&quot;universal-voice-app-ingredients&quot;&gt;&lt;a href=&quot;#universal-voice-app-ingredients&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Universal Voice App Ingredients&lt;/h3&gt;&lt;ol&gt;&lt;li&gt;A device with a microphone and speaker&lt;/li&gt;&lt;li&gt;A way to recognize human speech using the device’s microphone and a way to convert that speech to text&lt;/li&gt;&lt;li&gt;A dialog management system that takes the speech transcript and formulates a text response&lt;/li&gt;&lt;li&gt;A way to infer computer-readable meaning from the speech transcript&lt;/li&gt;&lt;li&gt;A way to convert the text back to speech and play it through the 
device’s speaker&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;If you are coming from a background in voice apps, chat bots, or virtual assistants, these ingredients will be very familiar to you. For those who aren’t, some of these topics are worth a quick review.&lt;/p&gt;&lt;h4 id=&quot;speech-recognition--text-to-speech&quot;&gt;&lt;a href=&quot;#speech-recognition--text-to-speech&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Speech Recognition &amp;amp; Text-to-Speech&lt;/h4&gt;&lt;p&gt;Converting speech to text is often called automatic speech recognition (ASR). Converting text back to speech is known as text-to-speech (TTS). Both of these are memory-intensive operations usually performed using a cloud service. 
Smart speakers have built-in access to cloud services to perform both ASR and TTS.&lt;/p&gt;&lt;h4 id=&quot;dialog-management-system&quot;&gt;&lt;a href=&quot;#dialog-management-system&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Dialog Management System&lt;/h4&gt;&lt;p&gt;A dialog management system takes the text transcript as input and formulates a response. This is known as a dialog turn. Developers are responsible for providing content for the responses in a process called fulfillment. 
Some dialog management systems will automatically perform tasks such as slot filling or re-prompting for misunderstood utterances.&lt;/p&gt;&lt;h4 id=&quot;natural-language-understanding&quot;&gt;&lt;a href=&quot;#natural-language-understanding&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Natural Language Understanding&lt;/h4&gt;&lt;p&gt;In order to write logic for turns, developers need a way to infer meaning from text. This is known as natural language understanding (NLU). The state-of-the-art way of performing NLU is using deep learning models to infer “intent” from “utterances.” Both Amazon and Google have cloud-based services where a developer can train an NLU model by giving example utterances for various intents. The developer decides which types of intents a voice app should respond to. Then he or she provides example phrases for each. 
Google and Amazon make their resulting NLU services available for voice apps to use.&lt;/p&gt;&lt;h3 id=&quot;voice-apps-on-smart-speakers-1&quot;&gt;&lt;a href=&quot;#voice-apps-on-smart-speakers-1&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Voice Apps on Smart Speakers&lt;/h3&gt;&lt;p&gt;If you’ve authored an Alexa Skill or Google Action, you’ve already mixed the ingredients needed to produce a voice app. The smart speaker hardware provides the microphone and speaker. Each smart speaker is integrated behind the scenes with a cloud service to perform ASR, NLU, and TTS. Amazon and Google provide SDKs to create a fulfillment web service that includes dialog management.&lt;/p&gt;&lt;p&gt;In this introduction we covered what smart speaker voice apps are and some of their limitations. We also covered what components are needed to build one and how the smart speaker hardware provides them. In &lt;a href=&quot;/blog/porting-a-smart-speaker-voice-app-to-mobile-part-2&quot;&gt;Part 2&lt;/a&gt; of this series we explore how to find these ingredients on a different device, a mobile phone.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Test Your Voice Assistant with Real People]]></title><description><![CDATA[Testing your voice assistant with actual users is a worthwhile investment in usability. 
Learn to test voice assistants during the development process.]]></description><link>https://www.spokestack.io/blog/test-your-voice-assistant-with-real-people</link><guid isPermaLink="false">https://www.spokestack.io/blog/test-your-voice-assistant-with-real-people</guid><pubDate>Fri, 05 Jun 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/15cfbebd20807abfcbd6ca54d6b1611e/8537d/user-testing-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACAUlEQVQoz3WSTWgTURRGZyuCYgRX6kbRXRUr/qylroIVosX0L5m8eTNpIRSlUYkt2oUillI37lzFdCEqrhRBTYqFQiioMUWlBqqitHHVNlU0k8mRN0nLJNaBjzvz4J537mU0TJO1VE1JVapqghT8HIgzHphg7lYSXmeoTE3ydWSYrBHie0Sv9UjpZo2hNQClqhJHWiB0StHzaL737DhUIHrlG8nHH3mYukNb4DRPDQW0qPwDrB+4qRsqKDGLmZtJ2vQCx4MFth74REf4BeMjg7QcPMqbCzEwJE7dch3o/ahaJoRM/owa5DPXST14x1CqyNiTIrHby+xsuUH7ySO0Hm5lwn8K+vqwvUJS1kb2GhKR/L5ksjKqs2IOsH1Pji178/j2f2bfsTGEfo4T7R3cD5yFaNQFetemqR2sxzBqNSIhpLPYf5lNu3Jovjzatlk2754m13+ReSFZMiQ02bmGCuB44gLVbUKH4WtkZ0pMZWuZzq7C0FWIhF0bWwkYRkO/5jQB1S1lKZnt6uJtfBCn8ovaY1NaXuSeZXI3cIaXwSAYhjuyXZ/MBXp1nfo+VoUg4fcT7+4m/+UHzxcc0vMOZWAhkaDY2cmSEA19zkbAhp9UCIjHqb6aZO5Zmg+P0tiZjHumzLAsmmVcoNLdKGrsshCUe3og3AuRXvfdFoL/9aj8BSywXxxnxIcMAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Test Your Voice Assistant with Real People&quot; title=&quot;Test Your Voice Assistant with Real People&quot; src=&quot;/static/15cfbebd20807abfcbd6ca54d6b1611e/05162/user-testing-hero.png&quot; srcSet=&quot;/static/15cfbebd20807abfcbd6ca54d6b1611e/2eeed/user-testing-hero.png 294w,/static/15cfbebd20807abfcbd6ca54d6b1611e/0d6a1/user-testing-hero.png 588w,/static/15cfbebd20807abfcbd6ca54d6b1611e/05162/user-testing-hero.png 1175w,/static/15cfbebd20807abfcbd6ca54d6b1611e/8537d/user-testing-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Setting aside time to test your voice assistant(s) with actual users is a worthwhile investment in usability. At Spokestack, I’ve conducted tests using prototypes and fully realized voice user interfaces (VUI). How I test products is similar in execution to user interviews with a few exceptions. Your goal should be to gauge whether users are able to complete tasks using their voice. Refer to my &lt;a href=&quot;/blog/user-research-for-voice-experiences&quot;&gt;previous article&lt;/a&gt; on how to find candidates within your network. What will change is what you ask and look for. Here’s how I test voice assistants throughout the development process.&lt;/p&gt;&lt;h2 id=&quot;come-prepared-with-research-questions&quot;&gt;&lt;a href=&quot;#come-prepared-with-research-questions&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Come prepared with research questions&lt;/h2&gt;&lt;p&gt;Write a test script to guide usability testing that includes instructions, tasks, and questions. Here’s a &lt;a href=&quot;https://docs.google.com/document/d/1291FI3KTP8ycVwqcaTJITAe5jQHfnt_t5XaLnMBbD6E/edit&quot;&gt;sample test script&lt;/a&gt; I adapted from Steve Krug’s &lt;a href=&quot;https://www.amazon.com/Rocket-Surgery-Made-Easy-Yourself/dp/0321657292&quot;&gt;&lt;em&gt;Rocket Surgery Made Easy&lt;/em&gt;&lt;/a&gt; that incorporates my suggestions. 
I start every test script with a list of research questions. Think of these as internal measurement tools for your observations. These will include any uncertainties you want to put to rest without directly asking users these questions. Here are some examples.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Do users know what to ask or say?&lt;/li&gt;&lt;li&gt;Are they running into any errors?&lt;/li&gt;&lt;li&gt;(If applicable) Do they understand that your app is wake word enabled? Did they use this feature? If not, why?&lt;/li&gt;&lt;/ul&gt;&lt;h2 id=&quot;keep-context-in-mind-when-setting-up-your-test-environment&quot;&gt;&lt;a href=&quot;#keep-context-in-mind-when-setting-up-your-test-environment&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Keep context in mind when setting up your test environment&lt;/h2&gt;&lt;p&gt;This is where survey demographics will come into play. Decide where you want candidates to test usability. For example, if your voice assistant will be used primarily outside where there’s lots of background noise, make sure you test outside. If you’re unable to replicate the environment, start each task by framing context. For example, if your target demographic uses voice commands frequently while driving, start a task with something like “Imagine you’re commuting to work.” Remind them of context as it changes. 
With the previous example, if they’re supposed to complete a task while they’re no longer commuting, include something like “You’ve reached your destination and are out of the car.”&lt;/p&gt;&lt;h2 id=&quot;write-tasks-focused-on-usability-not-opinions&quot;&gt;&lt;a href=&quot;#write-tasks-focused-on-usability-not-opinions&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Write tasks focused on usability, not opinions&lt;/h2&gt;&lt;p&gt;Refer back to your research questions for these. You’ll want to test features that are likely going to be most used as well as features you’re the most unsure about. Here are my tips for writing tasks that effectively test voice assistants:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Remind subjects to share what they’re doing and thinking out loud as much as possible.&lt;/li&gt;&lt;li&gt;Ask users for their first impression after they’ve completed their first task. Encourage constructive feedback when warranted and avoid asking if they “like” or “dislike” something.&lt;/li&gt;&lt;li&gt;If you included the last research question from above in your script, don’t explicitly tell users your app is voice-enabled. You want to understand whether or not they can determine this on their own. Avoid including trigger or wake words in your tasks. 
If a user mistakenly activates their device, include phrasing for how to stop an interaction and continue.&lt;/li&gt;&lt;li&gt;Don’t use technical terms.&lt;/li&gt;&lt;li&gt;Include a few negative questions to reframe their thinking.&lt;/li&gt;&lt;/ul&gt;&lt;h2 id=&quot;moderate-some-but-mostly-listen--observe&quot;&gt;&lt;a href=&quot;#moderate-some-but-mostly-listen--observe&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Moderate some, but mostly listen &amp;amp; observe&lt;/h2&gt;&lt;p&gt;If you see a user struggle to complete a task, try not to interrupt them unless they’re unable to continue. Ask follow-up questions directly after a related task has been completed. Otherwise, subjects are likely to forget any insights they had in the moment. Try to answer these by observing rather than explicitly asking them. Their actions will speak volumes. Here are some things to be on the lookout for:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Do they know your app is voice-enabled?&lt;/li&gt;&lt;li&gt;Do they understand what they can say? What they can access using gestures?&lt;/li&gt;&lt;li&gt;Do they understand how to use voice commands moving forward?&lt;/li&gt;&lt;li&gt;Are they asking the right questions? If not, what &lt;em&gt;are&lt;/em&gt; they asking?&lt;/li&gt;&lt;li&gt;Are they ever at a loss for words?&lt;/li&gt;&lt;li&gt;Do they default to touch instead of voice? If so, how often? 
It could be out of habit or personal preference. It could also be a sign that there were too many speech recognition or understanding errors.&lt;/li&gt;&lt;li&gt;If they use the wake word more than once, what is their reaction after repeated usage? Did they find it useful or cumbersome to use?&lt;/li&gt;&lt;li&gt;Do they understand when the app is listening to them?&lt;/li&gt;&lt;li&gt;If your app supports voice output, how are they reacting to different responses? Are you providing too much information or just enough for them to move forward?&lt;/li&gt;&lt;li&gt;What is their reaction to your synthetic voice and any sounds used? Are they too robotic? Too personable? If their reaction is negative, what would they prefer?&lt;/li&gt;&lt;/ul&gt;&lt;h2 id=&quot;incorporate-what-you-observed-into-your-product-road-map&quot;&gt;&lt;a href=&quot;#incorporate-what-you-observed-into-your-product-road-map&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Incorporate what you observed into your product road map&lt;/h2&gt;&lt;p&gt;Your colleagues will have likely written brief notes during the test. After testing, I go back and watch recordings to document takeaways and discuss with fellow observers by doing the following:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Write down insights from your observations without providing a solution. 
Include direct quotes that stood out to you after listening to your discussion a second time.&lt;/li&gt;&lt;li&gt;Discuss and group overlapping ideas with your team.&lt;/li&gt;&lt;li&gt;Write a brief problem statement that accurately captures each group. Discuss and document possible solutions. This could include things like bug fixes, tweaks to existing features, new features, etc.&lt;/li&gt;&lt;li&gt;Look back at your solutions list and let your observations guide the level of priority for each. Incorporate these into your existing product road map.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/5c71700e63c7a4fc87fdeb83917dfe00/8537d/user-testing-feature-ideas.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:54.761904761904766%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAABwUlEQVQoz4VS/YsTMRDd//8P8QcRweopFs+PFpEi4tfdtSdUj5Nrt7ubZLKb3exHM08yvSvaCjfwmCGTvMy8mWQ2fYnF/AxkNLS2WC5/4/p6jSxTuLq6gVaE+4yZbz2QjE8e4cflHMZoONdgtcqR5warmwxZZqAKQpoW2GyUwJgSRTxbF+JjEdttiHRCnExePcXH2QfkeQbvO+S5Rlk6IdPKyifx4t+IBCHsMAzbf3LJ8ycPcX72DdZatG0vP95VoVSsoERZ1lCFBVG1i5WVnPftvuU94fs3J/jy+ROybIOuG5BnevegIDRNi7bt5KMYN43f+3h+WN0t4QtcLi5gtBHC9boQsqjJ/cbHhLPpGIv5BdI0FW3SVElrznnRtKoa9P0gz0PYPbrT75BMCN+ORzj//hWWSFqI+sVhRA2jXptUg0wluf+tyxHh5PQZ//q5ZDKGhyFwkRsmqtj7luu6ZWMqgbWOu67npm7ZuYbr2nMIgQ8tGT1+gMm717I2cTHjQKx1qKpa2o0rFCcdEaWoay+5OBjm47aT6emItFLkXE19P1BZOjKmoqIgcs6TJSexUpa6ro/T2oOZKYRAgYPEEX8AH09Jv478b+EAAAAASUVORK5CYII=&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Feature ideas during user testing&quot; title=&quot;Feature ideas during user testing&quot; src=&quot;/static/5c71700e63c7a4fc87fdeb83917dfe00/05162/user-testing-feature-ideas.png&quot; srcSet=&quot;/static/5c71700e63c7a4fc87fdeb83917dfe00/2eeed/user-testing-feature-ideas.png 294w,/static/5c71700e63c7a4fc87fdeb83917dfe00/0d6a1/user-testing-feature-ideas.png 588w,/static/5c71700e63c7a4fc87fdeb83917dfe00/05162/user-testing-feature-ideas.png 1175w,/static/5c71700e63c7a4fc87fdeb83917dfe00/8537d/user-testing-feature-ideas.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Continue to iterate and test throughout your product’s evolution. For more help designing independent voice assistants, visit the &lt;a href=&quot;/docs/design/getting-started&quot;&gt;design&lt;/a&gt; section of our documentation.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[What to Include in a Survey]]></title><description><![CDATA[Ask the right questions up front and save time later. Get examples of new voice UI product survey questions and see our sample TTS product survey.]]></description><link>https://www.spokestack.io/blog/what-to-include-in-a-survey</link><guid isPermaLink="false">https://www.spokestack.io/blog/what-to-include-in-a-survey</guid><pubDate>Fri, 29 May 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/14c9e6ebaae62c94ab3200393352919f/8537d/survey-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAABf0lEQVQoz62RzW7aQBhF/VTBRIF3rH/oPsssK8+QRdKKbSOqhiQKbSlKqClWKIgQTDGDfaIZsOOIqt3U1tH83e/6zmfrsOmjqTa90rid/4uq3NdZxULuBNLHlj4HwqWSIzVesdZntnQL07KxpRPlG7b0qDUbnHz7zPl9l7NBl9aoz4ewx/vwK62f32lFPYLOLcedj9RO91Na5Q1tWD99Sz+awBxUrMiSNaw2pL8TyIAN5qwfTY3W3tXnSa2iF9Iz16g3G1wNIuYPCbPHGLVRpGSkWYZ+4+WKx4cVnUH0yvAloXDJqQQONelz9SPiabxmNouJl0uStWKtNqg0ZbFImI8To9EfN72UpYS52aHwsIVLXTboDSekU1ALxd6TQDqB3vAXNeGbELo297Gqwnu5snDNTwm+3NK+G3IR3tMebbkcD7mejvg0Cmnfhbzr3nAkfOzAoRq4JcNd1DIV4XCgCTRvdmNpLhyjKWpeJfyDoe7JFu8v+Ht1W0Md97/gGJ4BPiZxylTHz6AAAAAASUVORK5CYII=&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;What to Include in a Survey&quot; title=&quot;What to Include in a Survey&quot; src=&quot;/static/14c9e6ebaae62c94ab3200393352919f/05162/survey-hero.png&quot; srcSet=&quot;/static/14c9e6ebaae62c94ab3200393352919f/2eeed/survey-hero.png 294w,/static/14c9e6ebaae62c94ab3200393352919f/0d6a1/survey-hero.png 588w,/static/14c9e6ebaae62c94ab3200393352919f/05162/survey-hero.png 1175w,/static/14c9e6ebaae62c94ab3200393352919f/8537d/survey-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Recruiting folks to test your Voice User Interface (VUI) or &lt;a href=&quot;/blog/user-research-for-voice-experiences&quot;&gt;give insights on a concept&lt;/a&gt; is no easy task. Finding candidates in your target demographic takes time and prowess. You need to know where your users are in order to reach them in the first place. Relying on your network is a great way to quickly get feedback. However, even with old acquaintances, I’ve found myself (maybe not wanting to, but) needing a tool to capture basic demographic information. For this, I recommend using a survey.&lt;/p&gt;&lt;h2 id=&quot;reallyyou-want-me-to-use-a-survey&quot;&gt;&lt;a href=&quot;#reallyyou-want-me-to-use-a-survey&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Really…You Want Me to Use a Survey?&lt;/h2&gt;&lt;p&gt;The value of surveys has &lt;a href=&quot;https://medium.com/mule-design/on-surveys-5a73dda5e9a0&quot;&gt;come into question&lt;/a&gt; in the greater design community, and with good reason. I urge you to take your results with a grain of salt. For one thing, they can be inherently biased. Use them as a starting point rather than a means to an end. These results are not definitive nor are they statistically accurate.&lt;/p&gt;&lt;p&gt;Leverage them to help you gain &lt;em&gt;some&lt;/em&gt; perspective, but don’t let them dictate the whole picture. 
For example, in one study, I wanted to interview people who used voice assistants while driving. One user said in the survey he used Google Assistant on Android daily and spent a decent amount of time commuting in his car. However, during our interview, he admitted he wouldn’t use voice commands while he had a sleeping baby in the back seat, which accounted for most of his time in the car. The survey results provided a good starting point for our discussion, which led me to discover a new demographic I hadn’t considered.&lt;/p&gt;&lt;h2 id=&quot;ask-the-right-questions-up-front--save-time-later&quot;&gt;&lt;a href=&quot;#ask-the-right-questions-up-front--save-time-later&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Ask the &lt;em&gt;Right&lt;/em&gt; Questions Up Front &amp;amp; Save Time Later&lt;/h2&gt;&lt;p&gt;Start by letting your subjects know what to expect. Be transparent about the length. Use a mix of &lt;a href=&quot;https://www.nngroup.com/articles/qualitative-surveys/&quot;&gt;quantitative and qualitative research questions&lt;/a&gt;. Limit your questions to no more than ten to increase engagement. In general, stick with multiple choice or questions that can be answered with a short response.&lt;/p&gt;&lt;p&gt;Decide what information is most critical to your research. Understand how your audience already uses voice in their daily routines. 
For example, when testing a voice-enabled mobile audio service with users, I surveyed fifty or so candidates from my community. I wanted to test with a mix of voice assistant power users and people who didn’t use voice assistants at all. I found this was a good way to understand and, as a result, better design for both perspectives. I also asked a mix of questions about their listening habits - what they usually listen to, how, etc. Knowing these answers helped me better prepare my conversation, saving time for both me and my subjects.&lt;/p&gt;&lt;figure&gt;&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/0bc13209718d5662bb9fa03354f76957/8537d/survey-google.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:55.78231292517006%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAABRklEQVQoz4WSfU7DMAzFe/8TcQSuwJ9IILGuJM537bR5KG67MW0DS0+uGvnF+dlDSQIignMEcoSUIpgZrTX0WNeGpa5YlkXVo5+t6woRQZkLXl8+8P5mYeiMIQdBCrzJM3ISMF/lKcNMEcZYlXCFSNVL5iLIURBor42CwZ4KzKlA81fRw2maYIyB9wEhBqSUEEJAiluutQINCCHBnrPWHj4DnWccsmNB8oKZZ+Sc9Vl/RQwJ9J1B49VjOIw0nwqiE+3qMOu87oSNr3cRdkq7YXlmyPqkfSaa+2C62i79bg3R86X2seFY4O0MRx5kHax1KIV1AFUW1J539X+RWJv47fGwwxj76sy6Jq0959h5/2uYOsPgb3bu0F+Gh8/DDvvC4r7+wvHC0N126M58zzDsDO3OsLM0hnRFNpZXhn1ne81Wzxg/HX4AfOtVTM0f9CwAAAAASUVORK5CYII=&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Survey example&quot; title=&quot;Survey example&quot; src=&quot;/static/0bc13209718d5662bb9fa03354f76957/05162/survey-google.png&quot; srcSet=&quot;/static/0bc13209718d5662bb9fa03354f76957/2eeed/survey-google.png 294w,/static/0bc13209718d5662bb9fa03354f76957/0d6a1/survey-google.png 588w,/static/0bc13209718d5662bb9fa03354f76957/05162/survey-google.png 1175w,/static/0bc13209718d5662bb9fa03354f76957/8537d/survey-google.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;figcaption&gt;Here’s an example of some survey questions used when recruiting folks to usability test Radiobrain.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;Limit open-ended feedback. You’ll be able to further probe for detail when conducting user interviews. Draft your survey and get feedback from those you trust. Here’s a &lt;a href=&quot;https://docs.google.com/forms/d/1faU7-M5zxTLjpTrweklUl1AELeaKsPMzcY7ipENxL60/edit&quot;&gt;sample survey&lt;/a&gt; I created to get you started.&lt;/p&gt;&lt;p&gt;For more help designing independent voice assistants, visit the &lt;a href=&quot;/docs/design/getting-started&quot;&gt;design&lt;/a&gt; section of our documentation.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[TTS, Outside the (Black) Box]]></title><description><![CDATA[Avoid text to speech pitfalls and learn more about TTS output. Josh Ziegler shares some of the edge cases we've encountered while building our system.]]></description><link>https://www.spokestack.io/blog/tts-outside-the-black-box</link><guid isPermaLink="false">https://www.spokestack.io/blog/tts-outside-the-black-box</guid><pubDate>Wed, 27 May 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/5ffc394003fd492270ce472f783274b8/8537d/tts-outside-the-black-box.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAABfklEQVQoz52QW1PCMBCF+38UxAK2Mo6++OJlwMvgQKFc1dHxDQVUrIKM42/Npkl/wnE2BcWiD/pwZrPJ2S8nsdZqBFbai2ty/VdZi4C0J/4FWZyz1uawBUOqKpDyhDF+W8/61epsvyqWoNZn4wmsVASKdwrNiUb5UaF4q1AZaRz1FQ57IY4HCu3XCNUnjdN7BS/QWPcTCbnJ1OOEO1cSjbFGsadwNtRojSOUBwqlnkJnEsF71Kg8aLReIlxMI+MtdKVJPA9m5VuEjY5Evk3YvpZoTyOcjRWKkxAnE4X9IMTBKIT3rnH4FKL0rFB+i3UyVHA70oThUCYhg3ItAoPTdcLuTYi9UQgnEHAfBDYDgjMg5PsCzpDgDAXcQGDrhVC4lMjyLDOaMcfKNgl240sZn5CpEXJ+LLtGsOuErE/IJiqD3K6Eey6xMQtm2T6Zj11Sg8CXcQJTE+LL517u+YUGyBv8/p/EZ/YMnEuAFj1zH3ssE7kr4fwiN6FfPeexPgD/NSqyEtKlawAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;TTS, Outside the (Black) Box&quot; title=&quot;TTS, Outside the (Black) Box&quot; src=&quot;/static/5ffc394003fd492270ce472f783274b8/05162/tts-outside-the-black-box.png&quot; srcSet=&quot;/static/5ffc394003fd492270ce472f783274b8/2eeed/tts-outside-the-black-box.png 294w,/static/5ffc394003fd492270ce472f783274b8/0d6a1/tts-outside-the-black-box.png 588w,/static/5ffc394003fd492270ce472f783274b8/05162/tts-outside-the-black-box.png 1175w,/static/5ffc394003fd492270ce472f783274b8/8537d/tts-outside-the-black-box.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;At Spokestack, we process a lot of, well, speech. Our libraries help you both listen to and talk to your users. Both parts are tricky, requiring sophisticated machine learning models and lots of data. On the text-to-speech side, we’re constantly working to improve our models to give our voices higher fidelity and smoother prosody, and do it all even faster.&lt;/p&gt;&lt;p&gt;It’s easy to focus on that sort of improvement exclusively and forget that there’s more to reading text than a left-to-right (or right-to-left) translation of characters into sounds. I recently posted on &lt;a href=&quot;https://medium.com/@josh_z/tts-outside-the-black-box-81ed0b96553b&quot;&gt;Medium&lt;/a&gt; about some of the edge cases we’ve encountered while building our system. There are quite a few ways to trip over yourself, so if you’re interested in responding to your users naturally — with a voice unique to your brand — &lt;a href=&quot;mailto:hello@spokestack.io&quot;&gt;get in touch&lt;/a&gt;!&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Tools and Tips for Mapping Multimodal Products Remotely]]></title><description><![CDATA[Spokestack uses experience mapping tools to formulate multimodal flows for our products based on researched observations and ideas. Explore our toolset.]]></description><link>https://www.spokestack.io/blog/tools-and-tips-for-mapping-multimodal-products-remotely</link><guid isPermaLink="false">https://www.spokestack.io/blog/tools-and-tips-for-mapping-multimodal-products-remotely</guid><pubDate>Tue, 19 May 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/cf5d19b776d5f4a27dbfbfae5c2dbcab/8537d/ux-mapping-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAABwklEQVQoz62STWsTURSG829cuHZdu0gMmeh/cONKsIt0ISLdSTcKIjXm5rMgFE1TrRoMSJFCV8E2sUi0aZUm2uYLMncyJqHB8ZE7k0zHVF154eXew7nvM+ecub5eKoySMdmTmn02Upf/Ia9HY8JQ8rmBAiWdpJm4hCn8mCLg2QNubCRC6Alt7Am7Pgc4BtkJGxziy+PrVJ/dpJqLsL82TzU3z14uwt5qhP3sHO/XF3mzfIdOLISZcrwu0EvvqXZEgOb2Ou1KgdZOlnZ5jVb5Oc3tLEfFRfTdJV6vPOTGtas0oiH66bBbpZLPnVvSmZsUfnofClgtk9FhjZPaIaN6HavZ5ej+XcwH98jcWuDC+XMcK2BmqmWZ1DhVGD0WQN99iX6cZzT8hlo/sZCNPJ23ccxPm9RaFTZjc3RF8PRHTiqUnkDNUMb8dCsbbDTqtId9G/gDGA4OWPn8kfzXA75jUt7J0RN+u4jfgMYU0IgHMQoLnJRWGRaXGRTTDIoZrNITtl4J3hXiWKWndF7cRsaDeDs8C5zMUn05OoN8NOtKj15kIGbpC3Wese8YU9X9Bag5jzd95cyDlmO5uT94fTKh8T/1C2u8jbz3KcJBAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Tools and Tips for Mapping Multimodal Products Remotely&quot; title=&quot;Tools and Tips for Mapping Multimodal Products Remotely&quot; src=&quot;/static/cf5d19b776d5f4a27dbfbfae5c2dbcab/05162/ux-mapping-hero.png&quot; srcSet=&quot;/static/cf5d19b776d5f4a27dbfbfae5c2dbcab/2eeed/ux-mapping-hero.png 294w,/static/cf5d19b776d5f4a27dbfbfae5c2dbcab/0d6a1/ux-mapping-hero.png 588w,/static/cf5d19b776d5f4a27dbfbfae5c2dbcab/05162/ux-mapping-hero.png 1175w,/static/cf5d19b776d5f4a27dbfbfae5c2dbcab/8537d/ux-mapping-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;At Spokestack, we use &lt;a href=&quot;/docs/design/map-out-integration&quot;&gt;experience mapping&lt;/a&gt; as a tool to formulate multimodal flows for our products based on &lt;a href=&quot;/blog/user-research-for-voice-experiences&quot;&gt;researched observations and ideas&lt;/a&gt;. It’s a great way to get everyone on the same page before designing or building anything. Plus, it forces us to consider &lt;em&gt;what&lt;/em&gt; exactly we want users to ask and &lt;em&gt;how&lt;/em&gt; we’ll respond, both conversationally and visually. When our team committed to being 100% distributed over two years ago, I wanted to recreate this exercise digitally. What I came up with follows many of the same steps outlined in our &lt;a href=&quot;/docs/design/getting-started&quot;&gt;design docs&lt;/a&gt;. However, the tools and set-up are a bit different. Here are a few considerations to keep in mind when adopting this exercise.&lt;/p&gt;&lt;h2 id=&quot;tools-we-use&quot;&gt;&lt;a href=&quot;#tools-we-use&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Tools We Use&lt;/h2&gt;&lt;p&gt;I use &lt;a href=&quot;https://zoom.us/&quot;&gt;Zoom&lt;/a&gt; to screen share and record conversations with my team. Meeting recordings are a good reference point if you or a teammate needs to be reminded of how decisions were made and why. 
You’ll still need to designate someone to lead and record the map. That person will need to screen share documentation while mapping and record the meeting. For our team, I’m usually the designated leader. I use one of two tools (both free) for mapping:&lt;/p&gt;&lt;h3 id=&quot;stickiesio&quot;&gt;&lt;a href=&quot;#stickiesio&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Stickies.io&lt;/h3&gt;&lt;p&gt;&lt;a href=&quot;https://stickies.io/&quot;&gt;Stickies.io&lt;/a&gt; does a great job mimicking the tactile nature of mapping with post-it notes (without the cleanup). Create a new “board” for each project. If your project requires more than one experience map (e.g., if you’re mapping for new and returning users), you’ll need to create a new sheet for each in the top nav. Use stickies here like you would post-it notes in a conference room. List actors along the left axis. The bottom axis will represent time. Pick a starting point and list each action on a separate sticky in the same color as the corresponding actor.&lt;/p&gt;&lt;figure&gt;&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/787ef05eaf297a1532aada4d47d259c3/8537d/ux-mapping-stickies.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:51.70068027210885%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAKCAYAAAC0VX7mAAAACXBIWXMAAAsTAAALEwEAmpwYAAACQ0lEQVQoz5WPy0tUARTG76J/olW7lv0BSY9FiyBIsIxMLa1WRZtIyEqMlB5KBfmoJMNIpSKU7EFqk1guBJ95HeM6ozPqOI/r6Iwzc++d1z2/yBmMoE0HPvg45/A9FJc3BNk3YJyEZBUkH4FRDqkasBrAOAPpO2BcBasSrGpYL4boYUhUQfQiGJfAqIfIWRQ9GiPpm8YY68JU32PODxGf6MZ0fiLxc4DEWCfWXD8JtQ9jsgdj5jHhrzuJOXZgzF7DXBzBcn/G8gxjan0owVgcY3iE8M3bRJqfEul8TaiugciTdiIvXxGub2Szo4twSxvrdx+w0dTKSmUFgdISgpev4DlRhq+0gkBVNcvl51CC+hr/M6klD76yEvzFRazfb2Sjo53oi+fot2pZOlaIElxbQwA7D7EFyWYR284hz+28YNKzgqfoKN4DBcTf9W4bhVubWNi/FyUYTZJdfIsMFiLfzyNmcMtAxEZEctzOkBmvg2+nsScfYs26SU6q2KGNbcGML4Q5rqKEI2n8/k7GJ/Yxox7Hsla3HrYFBWw7jTpVyujwHpxzFajBe8zoNczrLUy4bzDtvc683owaqEWJb0Skd8UvxaNTcmHKKSErmQslf088syamHRBXbEVOORxyqP+LdC04JZtySSbtlm7XDzniGBIlqOtspmHJgFUTMrb8jpdP9wdb3QGvAQUDfnZ9DNC2YG1XDlkw6F5HSSQSGqCBrYFoiGjyD2Tt3H0hZmkHPzi13T1O7dlidGuXsrMaktWW5+e0XyGqvMEBmH0XAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Stickies.io&quot; title=&quot;Stickies.io&quot; src=&quot;/static/787ef05eaf297a1532aada4d47d259c3/05162/ux-mapping-stickies.png&quot; srcSet=&quot;/static/787ef05eaf297a1532aada4d47d259c3/2eeed/ux-mapping-stickies.png 294w,/static/787ef05eaf297a1532aada4d47d259c3/0d6a1/ux-mapping-stickies.png 588w,/static/787ef05eaf297a1532aada4d47d259c3/05162/ux-mapping-stickies.png 1175w,/static/787ef05eaf297a1532aada4d47d259c3/8537d/ux-mapping-stickies.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;figcaption&gt;Here&amp;#x27;s an example of an experience map for Radiobrain created using Stickies.io.&lt;/figcaption&gt;&lt;/figure&gt;&lt;h3 id=&quot;shared-spreadsheets&quot;&gt;&lt;a href=&quot;#shared-spreadsheets&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Shared Spreadsheets&lt;/h3&gt;&lt;p&gt;These work equally well, if not better, for complex maps that capture lots of actions and actors. Create a new spreadsheet for each project. Use tabs to keep track of multiple experience maps where necessary. Use cells here the same way you would use post-it notes above, and color-code them by actor. Unlike stickies.io, if an action recurs along the x-axis, you won’t need to repeat that cell.&lt;/p&gt;&lt;figure&gt;&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/51b1c4c6a3b65929e57c694bce7b407c/a815f/ux-mapping-google-sheet.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:55.78231292517006%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACYklEQVQozz2L2ZKaYBSEef8nSVVSlatkklmizowbAiIoo4KisskiyO6Cp1P+qczFV919Tje3ywJy/T0FYUhZnlNZVVSUJdWnE+Pu77fz9UrXpqHmdmOcLxdq6EYgkBP79GXygz7MJXF5liM5xDiVFU5liSLNUOU5TlWFMs9RFQWKNEWWHlEWOevcu/kxxbmucalrXE8nNOczU85Y7Mjlx+QJMjmDIentDkXSmLwhT0arTXZ/QLtuj3l/JNCeH5HV7f2j16dV55UCQaRMUSmRJ8R5YQZpEUKa7yHOLPDqDvLchajZ6Eo6epKBt9EcnaGG/thg/OkqeOyIeOpI4NUt5LmH6SqEagTg/GMJ0S7wvvDxNt9jsDqA3xzRXQZ4EJZoz2y8KFv8EnTm2zOH5WfZxNN4zejqASSnhGDl4IKogKyEULUYyuwAdRZDmR4gKwF4wYI88SFKLoYji/1U1onQHZhovc4xnvise78r0whcmsSwF1ME5hKursE3l9iv5gg3OhLbhKN/wNE1JPYa0dZAtNWxnsrYaDLWUxHh1kCw0dn+DldnPq5eHxTwaPZD3HweCEe4eH3ExgvOTg/puo299hvF9hXl7g1nt49o+Qxb+cl2n/g8uDoN0dgiLpaI66dKKFYDWNIjzjsB9YZHpL3CkZ8Z2E/gqy2Y/AMaW2Kb/3D1MUBt9tBYI1wtnnFzRsiNdyy631Fv+rg5AuBJmHW+Ydr5yvJBa8GSHphv7BHIFtiWy44JQkdHEm6QhxbSYIcs2DGNPZNpEdmoEheRs2bUiYvy4KA42KgTD0VsIz5sUMUu/gLoFR1HAohAPAAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Google Spreadsheets&quot; title=&quot;Google Spreadsheets&quot; src=&quot;/static/51b1c4c6a3b65929e57c694bce7b407c/05162/ux-mapping-google-sheet.png&quot; srcSet=&quot;/static/51b1c4c6a3b65929e57c694bce7b407c/2eeed/ux-mapping-google-sheet.png 294w,/static/51b1c4c6a3b65929e57c694bce7b407c/0d6a1/ux-mapping-google-sheet.png 588w,/static/51b1c4c6a3b65929e57c694bce7b407c/05162/ux-mapping-google-sheet.png 1175w,/static/51b1c4c6a3b65929e57c694bce7b407c/a815f/ux-mapping-google-sheet.png 1199w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;figcaption&gt;Here&amp;#x27;s an example of the same experience map from Stickies.io recreated using Google Sheets.&lt;/figcaption&gt;&lt;/figure&gt;&lt;blockquote&gt;&lt;p&gt;Not sure where to start? Here’s a &lt;a href=&quot;https://docs.google.com/spreadsheets/d/1epKA1i_2Cbb8sCEnV_D1mHl4VHfdJhY7-EXZGIrPjbM/edit?usp=sharing&quot;&gt;template&lt;/a&gt; using Google Sheets to get you started. Make sure to duplicate this document before making any changes.&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;For more tips on designing multimodal voice experiences, visit the &lt;a href=&quot;/docs/design/getting-started&quot;&gt;design&lt;/a&gt; section of our documentation.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[User Research for Voice Experiences]]></title><description><![CDATA[Design for verbal input is a specific UX problem if designing voice chat conversations in mobile UIs. Learn more about designing voice user interfaces.]]></description><link>https://www.spokestack.io/blog/user-research-for-voice-experiences</link><guid isPermaLink="false">https://www.spokestack.io/blog/user-research-for-voice-experiences</guid><pubDate>Fri, 15 May 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/db64f3f1ac76760496021266b8aa54f1/8537d/user-research-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAABcklEQVQoz62Ry07CQBSG+zh4KUW5lZiwMUYXSkARBMq9aOItyEVkoyQiCxN9z5kyVJ/gN6dDsTZl5+LLmZmc+fudjqJVOP4T5XfDEC7Lqhls1RDxoAXUcJmtCSwvG4ocap5DLTJsFBhC5wyhglxvFuXerVsXbBXqBiurgwpzQuJdhr1XhsQtx+H9HJm+QHYokBsK7N/McdKT9bgnkL6yoJZ8hm7yTpVDK3HoA470B0O0z3A6EjCnNmqThcPZSMB4XqAwFric2c6HyNZrqcQbHIkWYSHZthCrceh1Cw+f3zDfbFy/f6Ezs9Ga2jBeFk4lSGC7xBAxfI8Sb3LEGh6aHLppITcSyAwEqpMFso8C+bHA6ZPAUXeOg7s5UqaFVMdCsmX9ua9E6xy7tSVVOTqt1YqELNSyJEyjkYnBQfdIhqYigcRSTHFD/FBorC6b6HLMgyvh9jrhriEd0H8Iwg2OBgR5e7wCik7KbfkgQeg+1vaYkh/JCChtgTM82QAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;User Research for Voice Experiences&quot; title=&quot;User Research for Voice Experiences&quot; src=&quot;/static/db64f3f1ac76760496021266b8aa54f1/05162/user-research-hero.png&quot; srcSet=&quot;/static/db64f3f1ac76760496021266b8aa54f1/2eeed/user-research-hero.png 294w,/static/db64f3f1ac76760496021266b8aa54f1/0d6a1/user-research-hero.png 588w,/static/db64f3f1ac76760496021266b8aa54f1/05162/user-research-hero.png 1175w,/static/db64f3f1ac76760496021266b8aa54f1/8537d/user-research-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Over the last three years, I’ve worked on designing voice and chat conversations paired with mobile UIs including &lt;a href=&quot;https://www.thebartender.io&quot;&gt;The Bartender&lt;/a&gt; and &lt;a href=&quot;https://www.tasted.com&quot;&gt;Tasted&lt;/a&gt;. One thing has remained clear: design for verbal input is a very different UX problem. The number of actions a user can take is not constrained by buttons and navigational menus. That’s why it’s important to start every project by identifying who exactly you’re designing for. This will help you better understand how voice can improve your overall experience.&lt;/p&gt;&lt;h2 id=&quot;what-can-i-help-you-with&quot;&gt;&lt;a href=&quot;#what-can-i-help-you-with&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;“What Can I Help You With?”&lt;/h2&gt;&lt;p&gt;If you’ve ever said “Hey Siri” to an iOS device, you’re likely familiar with the assistant’s opening screen. “What can I help you with?” can be intimidating. You might not be sure what or how Siri can help you, if at all. Siri serves a broad audience (every iPhone user in the world) and an even broader domain of knowledge similar to a search query, which can be overwhelming. &lt;a href=&quot;/blog/what-is-an-independent-voice-assistant&quot;&gt;Independent Voice Assistants&lt;/a&gt; (IVA) serve a more limited domain of knowledge. 
The actions a user can take, or “intents,” focus on specific tasks in your app like buying movie tickets, recording a run, or finding a cocktail recipe.&lt;/p&gt;&lt;p&gt;With fewer intents, your assistant can provide a better experience without having to know how to respond if a user asks something unrelated like “How many miles to the moon?” But what intents &lt;em&gt;do&lt;/em&gt; your users care about? To answer that question, you need to start by understanding who you’re building for. Even if you already have an idea of who your target audience is, adding a voice assistant to your mobile app may shift your product’s reach.&lt;/p&gt;&lt;h2 id=&quot;identify-how-users-talk-about-whats-important-to-them&quot;&gt;&lt;a href=&quot;#identify-how-users-talk-about-whats-important-to-them&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Identify How Users Talk About What’s Important to Them&lt;/h2&gt;&lt;p&gt;Take a step back at the start of a project and consider what’s most important to your intended audience. What are they trying to accomplish? How can voice make tasks easier? This will help you decide how a voice assistant can improve your app’s experience.&lt;/p&gt;&lt;p&gt;If you don’t know the answers to these questions, it’s ok. I’m going to help you find them by gathering anecdotal insights before diving into product specifics. 
And, if you’re starting a new project that hasn’t been built yet, that’s ok too. This process can be adapted to gain insights into concepts, prototypes, and real-world examples. It’s important to start this discussion early in the process and keep it ongoing.&lt;/p&gt;&lt;figure&gt;&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/50bd7fc820e49429dbe5a8405998fe72/8537d/user-research-questions.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:57.14285714285714%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAAByElEQVQoz41SXUtVQRS9P7If0Ls/oecgRCIwAgslQ0kiU0QCX4QCjbiE3ZBbYqndj6Ndzzn3fMyZMx/nzIq9xxHzxTYMe8/smTVrzZoOANRSIs8yxJcTpEkCpRS0UqjrGlUloOoaoix5XxrHsNaibVs6iv7hIV6/WsbM/Xs4+t5HhxbpcJHnqKoKd4Vz7p/55OICJ8dHWH0+i2maekBrDCohYIyB0ZoZNdbCWsNM6ELK1NdaM3NSQRGNhtj7sIsnDx/g1/FPD8hSkgRlUSCbTlkeDZrTrfQcYZ4mMe8lcIrfZ2f4vP8RGyvzyLLMAxIjOsiyhYAjVlr7d5QSTdNwTZnY0RsG6b2Dr9jeXMfi01mMhsMrydZeb/zfCKb86PexvfEWL5/NIRqPPaCUErKqIIRgYKoJXF65T6xJLmXqkdzA8DyKcND9hHcrCyiK0gNeTv6wW8ZoBtLamxCM8AZZroPcAPil28XW+husvphHNI7Qcb5zPW7Ob/dujqZpOA8GA7e7895trS25shSOGZIs72iCbJryf0zimHNBzsYxS6UfEP4smUjxrdfD2vISFh4/wunJKUvOnXN5yG3b5rfX7uoppfLz8Yjrv2qtQ9csRpNdAAAAAElFTkSuQmCC&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Proof of concept&quot; title=&quot;Proof of concept&quot; src=&quot;/static/50bd7fc820e49429dbe5a8405998fe72/05162/user-research-questions.png&quot; srcSet=&quot;/static/50bd7fc820e49429dbe5a8405998fe72/2eeed/user-research-questions.png 294w,/static/50bd7fc820e49429dbe5a8405998fe72/0d6a1/user-research-questions.png 588w,/static/50bd7fc820e49429dbe5a8405998fe72/05162/user-research-questions.png 1175w,/static/50bd7fc820e49429dbe5a8405998fe72/8537d/user-research-questions.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;figcaption&gt;Above are some questions I brainstormed with my team for an initial proof of concept. We wanted to better understand the pain points of managing Airbnbs.&lt;/figcaption&gt;&lt;/figure&gt;&lt;h2 id=&quot;recruit-from-your-network&quot;&gt;&lt;a href=&quot;#recruit-from-your-network&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Recruit from Your Network&lt;/h2&gt;&lt;p&gt;Relying on your network can be a great way to quickly get initial feedback. Use a survey to gather demographic information on candidates, even old friends. Here are some tips that have helped me during recruitment:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Allow yourself at least a week to schedule one-on-one time with each person. Following &lt;a href=&quot;https://www.nngroup.com/articles/how-many-test-users/&quot;&gt;Jakob Nielsen’s advice&lt;/a&gt;, I’ve found success limiting this to around five people.&lt;/li&gt;&lt;li&gt;Share a calendar link. I’ve found &lt;a href=&quot;https://calendly.com/&quot;&gt;Calendly&lt;/a&gt; and &lt;a href=&quot;https://calendar.google.com/&quot;&gt;Google Calendar&lt;/a&gt; to work well for this. Be sensitive to their time and schedule. Make it easy for them to talk to you.&lt;/li&gt;&lt;li&gt;Offer your subjects an incentive if you’re able. 
For example, we often give our participants gift cards.&lt;/li&gt;&lt;/ul&gt;&lt;h2 id=&quot;how-to-guide-the-discussion&quot;&gt;&lt;a href=&quot;#how-to-guide-the-discussion&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How to Guide the Discussion&lt;/h2&gt;&lt;p&gt;This is a guided discussion, and you are in charge of it. It’s important to come prepared. Here are three things I make sure to do prior to an interview:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Create a document for each person you interview that includes background information from email correspondence or survey results. Leave room to jot down questions during your interview. Getting feedback from your users usually happens fast during an interview or usability test. You’ll have time to take more in-depth notes later. Here’s a &lt;a href=&quot;https://docs.google.com/document/d/15YtXuLhlOKrNa6m9ElFBuG-O2ppCc4obVlivl4Dhqfk/edit?usp=sharing&quot;&gt;template&lt;/a&gt; I usually use for this.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Write a script that provides a framework for your conversation, but improvise as you see fit (see the next section). Include instructions and questions. 
Here’s a &lt;a href=&quot;https://docs.google.com/document/d/1KdaeVRv1nlvMvZTMyglkvuv8EG8z6Lcv8Ca9-oLUkjY/edit?usp=sharing&quot;&gt;sample script&lt;/a&gt; I adapted from Steve Krug’s &lt;a href=&quot;https://www.amazon.com/Rocket-Surgery-Made-Easy-Yourself/dp/0321657292&quot;&gt;&lt;em&gt;Rocket Surgery Made Easy&lt;/em&gt;&lt;/a&gt; that helps me stay on task during interviews.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;It may feel silly, but it really helps to rehearse your script by doing a dry run. Have someone pretend they’re the user and read the script aloud to them. This will help iron out any kinks that may remain (and ease any pre-interview jitters).&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h2 id=&quot;dont-forget-who-your-real-users-are-psst-its-not-you&quot;&gt;&lt;a href=&quot;#dont-forget-who-your-real-users-are-psst-its-not-you&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Don’t Forget Who Your Real Users Are (Psst, It’s Not You!)&lt;/h2&gt;&lt;p&gt;You may be more savvy about voice assistants or tech in general than your audience. It’s hard to step back and really hear what users are saying when you’re eager to start designing or building a product. Ask interviewees how they would normally complete certain tasks. Observe their language, including nouns, verbs, and sentiment. 
Your users’ natural speech will inform which utterances and intents your assistant supports and how you’ll prompt these. Be on the lookout for any contextual clues that might reveal how or where they’re completing these tasks. Voice interfaces allow certain tasks to be completed while multitasking. Use this to your advantage.&lt;/p&gt;&lt;p&gt;Here’s an example from a user interview I did for a virtual host concept for Airbnb hosts. An owner I spoke with had a complicated coffee maker in one of his units. He had made a binder full of useful tips for his guests that explained how to use his appliances. However, housekeeping often misplaced the binder, making it hard for his guests to find. A simple “how does the coffee maker work?” to a voice assistant followed by step-by-step instructions could solve the problems of both the owner and guest.&lt;/p&gt;&lt;p&gt;If candidates feel comfortable, stream and record the discussion (we use &lt;a href=&quot;https://zoom.us/&quot;&gt;Zoom&lt;/a&gt;). 
This will allow your team to ask questions in real time.&lt;/p&gt;&lt;h2 id=&quot;post-interview&quot;&gt;&lt;a href=&quot;#post-interview&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Post-Interview&lt;/h2&gt;&lt;p&gt;After the interview, I like to use the recording to document takeaways by doing the following:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Write down observations and transcribe any statements that stood out to you after listening to your discussion a second time.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Discuss and group recurring insights with your team.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Label each group to reflect a product feature or intent.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Let your observations guide the level of priority for each. Which intents will most ease the user’s experience? This will dictate focus when you start to map your multimodal flow.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;figure&gt;&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/c366ee8a38d45a1612c151b69ab45098/8537d/user-research-trends.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:54.761904761904766%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACIElEQVQoz01S207bQBD1/39B+1LBQ8tDC1RtEaBCBVQRqASIWkKEUDANgVx82fFl1/ba3lPNOJE60si7szNnfM6M935jAzf9PtZmjEEYBAiWS5BSSNMURApaa4kvFwvEUSTnLMug8xyubaXWRvfwfg9u4I/HEmiaBkVRYDqZ4OX5GQmRFL5Op4jCEJOnJ9zd3mIxn0tu27biznWAtRrD+7C5ifNeT5LevXmL0+MfODk+xuHeHr7u7uK630fv7Azf9/dxdHCAm6srlGWJ/20NaJUP79fFBZ58H1VZ4s9ggMH1NV6nL9IgXAYd7SRBmqQojEFVVXDOCc3aWrkDrgOkCbzPn7alMxvTZY3MqtBaCyJClqYwRmMxm63ORrTlXBXHqKpSIG0yhXd0eIj70Z0AdklzKcqzTAbDwFzEzu9hsJShsHOMdWaNK1ujzmbwri4v8fPkRACzFQgnMT0umj5PoJTqaLqOGsvAg+OBcY7OMxhTwmYLeF92dtA7O+1WRmuhISuRpkgSQhxHMoT57FUkEK2qShis72uzeQDvfjRyo+GQW7s8z5yKY/GEyEVh6IwxbqW6eNu23bdpXFPXrigKp3Uusdoo533c2sK33W3pkOe57BtTjKJQNGtXS8vGfzoaDvHX9+E/Pspejh8ehI3scZXDI6VoMZ8RS0OkSMUxaa0pDALKspSapiHnnLzzuSpLKsuS6roma63E+M0B1NaW/gHFLDtBnbinIAAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Airbnb hosts&amp;#x27; observations&quot; title=&quot;Airbnb hosts&amp;#x27; observations&quot; src=&quot;/static/c366ee8a38d45a1612c151b69ab45098/05162/user-research-trends.png&quot; srcSet=&quot;/static/c366ee8a38d45a1612c151b69ab45098/2eeed/user-research-trends.png 294w,/static/c366ee8a38d45a1612c151b69ab45098/0d6a1/user-research-trends.png 588w,/static/c366ee8a38d45a1612c151b69ab45098/05162/user-research-trends.png 1175w,/static/c366ee8a38d45a1612c151b69ab45098/8537d/user-research-trends.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;figcaption&gt;After sharing my observations from Airbnb hosts with the team, here are some trends we identified.&lt;/figcaption&gt;&lt;/figure&gt;&lt;p&gt;Don’t forget to revisit this discussion as needed throughout the design process, especially when testing a product’s usability, to determine if what you’re building remains relevant to your intended audience. In future posts, I will show you how to take this research and turn it into a great voice user experience.&lt;/p&gt;&lt;p&gt;For more help designing independent voice assistants, visit the &lt;a href=&quot;/docs/design/getting-started&quot;&gt;design&lt;/a&gt; section of our documentation.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[What Is an Independent Voice Assistant?]]></title><description><![CDATA[A voice assistant is a digital assistant using voice recognition, speech synthesis, and natural language processing (NLP) for service via an app.]]></description><link>https://www.spokestack.io/blog/what-is-an-independent-voice-assistant</link><guid isPermaLink="false">https://www.spokestack.io/blog/what-is-an-independent-voice-assistant</guid><pubDate>Tue, 12 May 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/39d05e958352c94f7219aec87094c0e6/8537d/iva-hero.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAACV0lEQVQoz3WSTUhUURSAHxE0QRBGmlZgRS0ESwlqXZvQTduiDCdLLbKFhBUEoUSrBGlhixJaBG4SctEiQSFMZ+b9zJvnzLvvvZnJnyYrKnGKQOf/i/dmSCFcfJzDvYfvnnvPlRJobCZeUomXNJKomGs6/SOC1yHBWEgwMCIw1yIkULDzile7gMHo9AtGp0dYIIK0lXAeFW01gq/eQtprIe022bZfoK7oLKGTwiJR0jxhV6+fjp62DWFZspmyUE9HqDlhsf2gYPjlMk7yByu/08wYkwy96sfKyywjuPWom5sDnXzGRHIF/3dZFobTBnsaLKRawamWOBe6HJayKZ4866f1zGnMTNAT+u+1caXvEp+IbnRoFxUPx40FhQUU9LRObZNNTaPDrqM2O48ItF86VnGGuT8zJAmTIkpH31Xa7/i9XIrlQsxlg/8wsiHmChqxosL7LwZ1zTbVjQ51TXEOnLQJfNdwkLGKKiIvkyLGjYfddD7o8nIpmgvhSs18iFhexswFWFwa5q32gaGpbxw7G0faVx5IVYNF6GcYsxhAX3OZxSKMv/cy13sv4qAjmXkZUZCxCm5USRSDOPpjoqkxpr7GqW0W+A4JfIcF1cctlJWw12EsJ3s3cohwd/A29wd7SGAgWQUFq/J+ZVTsksZHZNRVjR31NlKV5eGrt5E9YQiRV7wmFoly/loLre3nSLrfZmMYaoWy1CkpiEyYFxMxno5HPZ6/i2JmNG/PO7iokERncn6cieQbbDSkSCaAsRXZAAkCJCu4ubtmZIKVmiCR9QACzUNfn+Uv7vzIx4Zkxs8AAAAASUVORK5CYII=&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;What Is an Independent Voice Assistant?&quot; title=&quot;What Is an Independent Voice Assistant?&quot; src=&quot;/static/39d05e958352c94f7219aec87094c0e6/05162/iva-hero.png&quot; srcSet=&quot;/static/39d05e958352c94f7219aec87094c0e6/2eeed/iva-hero.png 294w,/static/39d05e958352c94f7219aec87094c0e6/0d6a1/iva-hero.png 588w,/static/39d05e958352c94f7219aec87094c0e6/05162/iva-hero.png 1175w,/static/39d05e958352c94f7219aec87094c0e6/8537d/iva-hero.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;“A voice assistant is a digital assistant that uses voice recognition, speech synthesis, and natural language processing (NLP) to provide a service through a particular application.”&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;If you search “&lt;a href=&quot;https://www.google.com/search?q=what+is+a+voice+assistant&quot;&gt;what is a voice assistant?&lt;/a&gt;” on Google, the definition above appears as the first result. This definition works because it lists the technologies required to provide human-to-computer voice interactions, a.k.a. a “voice interface”. Voice recognition, speech synthesis, and natural language processing technologies together create a &lt;a href=&quot;https://en.wikipedia.org/wiki/Solution_stack&quot;&gt;software stack&lt;/a&gt; that enables voice assistants to operate (a “Spokestack” you might say).&lt;/p&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;Voice is just another interface&quot; src=&quot;https://www.youtube-nocookie.com/embed/wbJ8fZh-iQw?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;p&gt;So does providing a voice interface to a company’s digital services make it a voice assistant?&lt;/p&gt;&lt;p&gt;The answer is yes. Now let’s discuss the differences between a smart speaker (e.g. Alexa), smart speaker apps (e.g. 
&lt;a href=&quot;https://www.amazon.com/gp/product/B019G0M2WS&quot;&gt;The Jeopardy Skill&lt;/a&gt;), and independent voice assistants.&lt;/p&gt;&lt;p&gt;An independent voice assistant, sometimes referred to as an “owned assistant,” is a type of voice interface that enables companies to directly communicate with their customers. With independent assistants, there are no intermediaries (e.g. Alexa, Siri, Google Assistant) standing between companies and their customers. The conversation is heard, understood, and responded to directly by the company. This provides multiple benefits for both the consumer and the company.&lt;/p&gt;&lt;h2 id=&quot;mutual-benefits-for-customers-and-companies&quot;&gt;&lt;a href=&quot;#mutual-benefits-for-customers-and-companies&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Mutual Benefits for Customers and Companies&lt;/h2&gt;&lt;p&gt;Building an independent voice assistant offers more control over the customer experience, thereby creating the experience a company wants their customer to have with their brand. Benefits include:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Easier access to customer account and payment information.&lt;/li&gt;&lt;li&gt;A voice that best represents their brand over the platform’s voice (e.g. 
Siri’s voice)&lt;/li&gt;&lt;li&gt;Data on intents and experiences that the company can use to inform product and service decisions.&lt;/li&gt;&lt;/ol&gt;&lt;h2 id=&quot;how-are-independent-voice-assistants-used&quot;&gt;&lt;a href=&quot;#how-are-independent-voice-assistants-used&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;How Are Independent Voice Assistants Used?&lt;/h2&gt;&lt;div class=&quot;floating-image--left&quot; style=&quot;width:300px&quot;&gt;&lt;img src=&quot;../assets/blog/what-is-an-independent-voice-assistant/iva-what-can-i-play-for-you.png&quot; width=&quot;300&quot; alt=&quot;What can I play for you?&quot;/&gt;&lt;/div&gt;&lt;p&gt;Voice assistants can be integrated with existing mobile apps and/or websites. These can entail something as simple as offering voice search using a touch-to-talk mic button. They can also be more dynamic experiences that involve things like wake words to activate listening and content that is broken out into conversational pieces like step-by-step instructions.&lt;/p&gt;&lt;div class=&quot;floating-image--right&quot; style=&quot;width:100px&quot;&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:200px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/a245c094ff3d5dc51a2f09b936ccc4a5/36ca5/iva-add-to-siri.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:200.99999999999997%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAoCAYAAAD+MdrbAAAACXBIWXMAAAsTAAALEwEAmpwYAAAEe0lEQVRIx5VX20tjRxg/f0XpPtSnsg+ljxWxL/uyPnhBtOJ1Y9m0WNQtFKsVNV5WfOiDWqGC1YiL9ZJFRSOFqmxQqusNyvZCWdrVtrtbYxJNdD3JSUxyTvIr3xzncExONA58mTDzzW9+891mjgAANpsNJpMJVVVVKC0tRXl5Oaqrq5OEdEyme6wvKytjQmtqamowPT1NUBBWVlaQlZWFvr4+LNjtWF5eRkZGBgRBuFLevnUL8/PzWFhYQFdXJzIzM7G0tAShq6sLg4ODDP3g9SvW0yZm833U1dWhtraW9XW19L8O9fUPYDabsbi4CI/HA6fTiUOnEwMDA+ju7obQ3t6OyckpBhSIxBCMKLhJi8fjrJ+bm0NraysES5sFExMTEAE8dwdxLClAXIEsy5CjMlOmLSJQF7JxWYaiKIjFYohGo2x8ZmYGLS0tEJqbmzE5OXmxnaLtSsrMDL868eRTG57UjuLPn39jYzTHmREwB2QMGxsbMTWlHjkqy0yRK4tvwvjyzhC++/AR5j+2ovmjz/Da5dRA9YCzs7Mqw7a2No2hfAGoxFTAnV9ceP/zH+DY9cLx17/44KsGLKw5VN0LID0gY0g/iYCcofdUwh3rDu4+fIovTN/i9r37eOG+hqHFYrkEqPcctd9fnuGTgQ1Y2r7Hs2fPkzzMAcnLDQ0NyYAczON24/jYize+I3g9h4goEZyc+LC3t4eTkxMNiJMghk1NTRCIJncKTfKjUHD39/ejv/8bTE/b8Nj2GGtra+jt7YXdbtfCJQkw0SncPiSSJMHpcsPt8cDtduPo6AihUOiSnfU2ZIBqphg7hVgc/PMCL//eZ2m2v7+PcDis2c/QhkYMoQPFj48Q2fsD4nkY4VAoyWlJXk5kqF/AMiYcRiyq2jamKJfmDOPQCJAtoNzVMUksBkYMUwJqLHS20ts2lVOuZJhuS5vhdZKYKRpgR0eHYaakW1ivBeTKkUiECQXy+fk5gsGgJjxL0mLIM+Xg4ACHh4c4PT2Fz+eDy+VimULjlDUcUJ96Vx6ZWG1ubmJ9fR0OhwO7u7vsVtva2sLZ2dnNbMhBKZeJFTEiIZZer1crIImAlHWGRzYK4HScYgiYGLw3CZtrGepZpgJNi+FN5Eovc2PfpPE1hoCiKLJwoWCmAKaehLwtBQKsp3Hq+VwgEEgNSNWYbEKgfr9fFVGEJAUQlCT2nwBoYwIjdpRNKQG5gROdEopGEQqHk8pZWrmcWJGDPi+kxgcI1psR+O+VdkWknXoau5i6s/j0J8Tfewd49y2I9jnEVFrpF9gkhn4/xK8fwt/RDOn4KOnO0b++2CXV2dmpXfRkYFLgbz/eiHc0IVS4nt4pLLB7enowNDSUMsYUek0oMrvxFPbQNI7VkZEREDmB3srZ2dkYHx/Hzs4ONjY2WImisjU2Ngar1YpRLqOjTFZXV7G9vc10qazRCQmDLnuB7ECLcnJykJubi4KCAhQWFqK4uBgVFRXss0H9nFA/OyorK1FSUoKioiKml5+fj7y8PAwPD7MT/Q+IYN3mHZ8XfAAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Add to Siri&quot; title=&quot;Add to Siri&quot; src=&quot;/static/a245c094ff3d5dc51a2f09b936ccc4a5/36ca5/iva-add-to-siri.png&quot; srcSet=&quot;/static/a245c094ff3d5dc51a2f09b936ccc4a5/36ca5/iva-add-to-siri.png 200w&quot; sizes=&quot;(max-width: 200px) 100vw, 200px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/div&gt;&lt;p&gt;Independent voice assistants can also co-exist with third-party voice platforms (e.g. Siri, Google, Alexa). Currently, iOS and Android prefer using deep links (&lt;a href=&quot;https://support.apple.com/en-us/HT209055&quot;&gt;Siri Shortcuts&lt;/a&gt; for iOS; &lt;a href=&quot;https://developers.google.com/assistant/app/overview&quot;&gt;App Actions&lt;/a&gt; for Android) to interface with Siri and Google Assistant. Deep links require developers to link to parts of their application based on what the user says to Siri and Google Assistant. Developers who add voice to their app can have deep links resolve to a listening screen that asks the user “How may I help you?” and take over the conversation.&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;../integrating-spokestack-google-app-actions-3/images/app-actions-demo.gif&quot; alt=&quot;screen capture of Google Assistant handing voice control to an app&quot;/&gt;&lt;/p&gt;&lt;p&gt;There is another scenario where companies can retain more control of the conversation on smart speakers like Alexa, but that’s a bit more complicated. 
We’ll save that for another post.&lt;/p&gt;&lt;h2 id=&quot;different-types-of-voice-assistants&quot;&gt;&lt;a href=&quot;#different-types-of-voice-assistants&quot; aria-hidden=&quot;true&quot; tabindex=&quot;-1&quot;&gt;&lt;svg aria-hidden=&quot;true&quot; focusable=&quot;false&quot; width=&quot;16&quot; height=&quot;16&quot; viewBox=&quot;0 0 16 16&quot;&gt;&lt;path fill-rule=&quot;evenodd&quot; d=&quot;M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z&quot;&gt;&lt;/path&gt;&lt;/svg&gt;&lt;/a&gt;Different Types of Voice Assistants&lt;/h2&gt;&lt;p&gt;The difference between the types of voice assistants comes down to who controls the experience and how the voice assistant is accessed. The following is how we think about the different types of voice assistants based on control and access.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Platform Assistants&lt;/strong&gt; handle natural language processing, speech synthesis, and a wake word that activates listening (“Hey Siri”), &lt;em&gt;and&lt;/em&gt; are tightly integrated with hardware such as smart speakers and mobile devices. These assistants are usually &lt;a href=&quot;https://www.springboard.com/blog/narrow-vs-general-ai/&quot;&gt;general AIs&lt;/a&gt; that try to answer every question a user may have. They also provide marketplaces for third party “voice apps” that make platform assistants smarter.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Examples&lt;/em&gt;: Siri, Alexa, Google Assistant, Bixby, &lt;a href=&quot;https://mycroft.ai/&quot;&gt;Mycroft&lt;/a&gt;, &lt;a href=&quot;https://snips.ai/&quot;&gt;Snips&lt;/a&gt; (now Sonos).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Voice Apps&lt;/strong&gt; exist within a platform assistant. 
They wrap content and services into an API that platform assistants can access and distribute. These apps offload natural language processing, speech synthesis, and wake word control to platform assistants. They’re usually described in relation to the platforms where they operate, such as &lt;a href=&quot;https://www.amazon.com/alexa-skills/b?ie=UTF8&amp;amp;node=13727921011&quot;&gt;Alexa Skills&lt;/a&gt;, &lt;a href=&quot;https://assistant.google.com/explore&quot;&gt;Google Actions&lt;/a&gt;, &lt;a href=&quot;https://www.samsung.com/us/explore/bixby/&quot;&gt;Bixby Capsules&lt;/a&gt;, etc.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Examples&lt;/em&gt;: Jeopardy! Skill, &lt;a href=&quot;https://assistant.google.com/services/a/uid/000000eab80a7f99&quot;&gt;Pizza Hut Action&lt;/a&gt;, &lt;a href=&quot;https://www.itranslate.com/bixby&quot;&gt;iTranslate Capsule&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Independent Assistants&lt;/strong&gt; are voice assistants independent of platforms, meaning that customer conversations are controlled by companies and developers without help from a platform assistant. This means companies are able to control natural language processing, speech synthesis, and wake word(s). 
These assistants can be integrated within existing products such as mobile apps, websites, and/or proprietary hardware.&lt;/p&gt;&lt;p&gt;&lt;div class=&quot;gatsby-resp-iframe-wrapper&quot; style=&quot;padding-bottom:56.42857142857143%;position:relative;height:0;overflow:hidden;margin-bottom:25px&quot;&gt; &lt;div class=&quot;embedVideo-container&quot;&gt; &lt;iframe title=&quot;Spokestack Overview&quot; src=&quot;https://www.youtube-nocookie.com/embed/MW2cYSQhbZE?rel=0&quot; class=&quot;embedVideo-iframe&quot; style=&quot;border:0;position:absolute;top:0;left:0;width:100%;height:100%&quot; loading=&quot;eager&quot; allowfullscreen=&quot;&quot; sandbox=&quot;allow-same-origin allow-scripts allow-popups&quot;&gt;&lt;/iframe&gt; &lt;/div&gt; &lt;/div&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Examples:&lt;/em&gt; &lt;a href=&quot;https://support.spotify.com/us/listen_everywhere/voice_assistants/spotify-voice/&quot;&gt;Spotify&lt;/a&gt; (voice search), &lt;a href=&quot;https://corporate.homedepot.com/newsroom/5-technologies-changing-how-we-shop&quot;&gt;Home Depot&lt;/a&gt; (powered by Google’s Dialogflow for voice search), &lt;a href=&quot;https://blog.soundhound.com/pandora-launches-voice-mode-in-mobile-app-powered-by-houndify-7d9091c66817&quot;&gt;Pandora&lt;/a&gt; (powered by Soundhound).&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://spokestack.io/&quot;&gt;Spokestack&lt;/a&gt; provides technology and services to enable developers to build independent voice assistants. Please email us at &lt;a href=&quot;mailto:hello@spokestack.io&quot;&gt;hello@spokestack.io&lt;/a&gt; if you are interested in hearing more about how we can help you build an independent voice assistant.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Keeping Perspective]]></title><description><![CDATA[Learn more about perspective while developing new voice products. 
Marshall McLuhan reminds us, "We shape our tools, and thereafter our tools shape us."]]></description><link>https://www.spokestack.io/blog/keeping-perspective</link><guid isPermaLink="false">https://www.spokestack.io/blog/keeping-perspective</guid><pubDate>Fri, 20 Mar 2020 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;I watch &lt;a href=&quot;https://vimeo.com/34017777&quot;&gt;this presentation&lt;/a&gt; by Wilson Miner at the 2011 Build Conference at least once a year. Last week, I rewatched it while trying to refocus my mind from the pandemic to product development work. Watching Wilson’s talk reminds me why we make things, and of both their impact and their consequences. If you can spare 38 minutes, it’s worth your time.&lt;/p&gt;&lt;p&gt;Nine years ago, before we knew the full impact mobile and social apps would have on our lives, Wilson used a quote from Marshall McLuhan, adapted from Winston Churchill, to explain the future:&lt;/p&gt;&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/d17ccc6039be717a4c5ae3257f272928/8537d/keeping-perspective.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAABS0lEQVQoz62QzU7CQBSF53n8F/kpCJi41ZXBQBGwpQVFExdslCAQkZ2J78lcO/AGx9yZDqkNLkxcfLm9M+eee6bi8EbiPxHJ5qC5/LNBekZsE+zUl9hzl9htmLrvmmrPdB/XtKmwBwwLjtsS5w+EUkCo9AnFgOB0pa6lkFDuEfKe1DXnyY3pxtC6s9FRS2pRa6bQGCu0Zwr+fIWr5wjXowi1lwjem8Ll8AutqUL1jn6kZAQbFLqEk1uJi2GEp881/MUKjx9rNKcKzYmCO1Hw5iu4rwr+u7mrjxVqo0inZiMOoxPmfImsJ7Vh3pco9QjlvqEYku556DSu3J8NCJV7QnVAKASk5y2CjTIdg3021+SZphMTa/iODdjQCUmH0YaZdmIoBS/TdBLfMcll3NuXCrtxGzYpD2RTRkmN1bFGOBw5MP9iG06KXzWh4RsCHSeXZJzLvgAAAABJRU5ErkJggg==&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Keeping Perspective&quot; title=&quot;Keeping Perspective&quot; src=&quot;/static/d17ccc6039be717a4c5ae3257f272928/05162/keeping-perspective.png&quot; srcSet=&quot;/static/d17ccc6039be717a4c5ae3257f272928/2eeed/keeping-perspective.png 294w,/static/d17ccc6039be717a4c5ae3257f272928/0d6a1/keeping-perspective.png 588w,/static/d17ccc6039be717a4c5ae3257f272928/05162/keeping-perspective.png 1175w,/static/d17ccc6039be717a4c5ae3257f272928/8537d/keeping-perspective.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;This quote is a reminder that the things we build impact the world, even if they &lt;a href=&quot;https://cdixon.org/2010/01/03/the-next-big-thing-will-start-out-looking-like-a-toy&quot;&gt;look like toys at first&lt;/a&gt;, as &lt;a href=&quot;https://twitter.com/cdixon&quot;&gt;Chris Dixon&lt;/a&gt; said 10 years ago. At the beginning of a project, you may not understand all the potential uses for what you are building. But if you keep your eyes and ears open, some of those uses may find you.&lt;/p&gt;&lt;p&gt;At &lt;a href=&quot;https://www.spokestack.io&quot;&gt;Spokestack&lt;/a&gt;, we build tools and services that make your app fully voice-enabled. The idea started with “wouldn’t it be cool if you could talk to apps?” That came from frustration: if you ask Siri a question, you’ll likely hear, “this is what we found on the web.” The answers are already on our phones but locked inside our apps. Why can’t they answer our voice?&lt;/p&gt;&lt;p&gt;One common request we hear when talking to customers is improved mobile accessibility, especially for those who need more than a touch-only experience. Existing companies already build hardware for the visually impaired. There are also companies like &lt;a href=&quot;https://gettecla.com/pages/tecla-e&quot;&gt;Tecla&lt;/a&gt; that build devices to help those with limited upper-body mobility use mobile devices. Voice technology has the power to increase usability for millions of people.&lt;/p&gt;&lt;p&gt;Startups like ours are always looking for product-market fit, but times like these compel us to reexamine how we think of the “market.” What we make should provide for societal needs in addition to building a great business.&lt;/p&gt;&lt;p&gt;If you have specific problems a voice-enabled interface could solve, please reach out. 
Help us shape Spokestack’s tools for everyone.&lt;/p&gt;</content:encoded></item><item><title><![CDATA[Why We’re Building Spokestack]]></title><description><![CDATA[At Spokestack, we build tools and services that make your app fully voice-enabled. 58% of Americans already use their smartphone as a voice assistant.]]></description><link>https://www.spokestack.io/blog/why-were-building-spokestack</link><guid isPermaLink="false">https://www.spokestack.io/blog/why-were-building-spokestack</guid><pubDate>Tue, 10 Dec 2019 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;&lt;span class=&quot;gatsby-resp-image-wrapper&quot; style=&quot;position:relative;display:block;margin-left:auto;margin-right:auto;max-width:1175px&quot;&gt;
      &lt;a class=&quot;gatsby-resp-image-link&quot; href=&quot;/static/1ffbda3a59e1d65d15f4ba751a8166f3/8537d/why-were-building-spokestack.png&quot; style=&quot;display:block&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;
    &lt;span class=&quot;gatsby-resp-image-background-image&quot; style=&quot;padding-bottom:56.4625850340136%;position:relative;bottom:0;left:0;background-image:url(&amp;#x27;data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAALCAYAAAB/Ca1DAAAACXBIWXMAAAsTAAALEwEAmpwYAAABSUlEQVQoz62R604CMRCF+z6gyG1vLDHxAUy8JCLKctnuwgOoICEhEeK7dia7PMIx7VIFhD/GH18605k5PZOKyhPhPxE2OO+qP4vszopKd79Q7iicPSqUOkVc2uY6LttzBytmT2ETexEmDHdIuJwwWpLRThneiEzsx2zqgSxqml8ObXLxTKj2CNcvGW7fMsj1BjevGe5nGTrz3OR30wwP8xy9ZY7uIke0zFHrkdnACovmgOCMGM0hoT1hyM+NIV5tcDVjBGuFYKEQfCj4CwX/nRCuFLypMo4b/YNP0UKNAaHeJyPaStkQpgw/YTgJwU3JnI4kODHBTQj+mBGO2ZjR8xahhWpRgV7Zrm7v9Eq1I3k9KgTcmOFJhrM1JnSxegL9mHmw/xN/30X7fU3r0Lo6hnWqBxoHQrs9tk/3CE9bjtlYP4Z3wMkeWfAF3H4mNwVzm18AAAAASUVORK5CYII=&amp;#x27;);background-size:cover;display:block&quot;&gt;&lt;/span&gt;
  &lt;img class=&quot;gatsby-resp-image-image&quot; alt=&quot;Why We&amp;#x27;re Building Spokestack&quot; title=&quot;Why We&amp;#x27;re Building Spokestack&quot; src=&quot;/static/1ffbda3a59e1d65d15f4ba751a8166f3/05162/why-were-building-spokestack.png&quot; srcSet=&quot;/static/1ffbda3a59e1d65d15f4ba751a8166f3/2eeed/why-were-building-spokestack.png 294w,/static/1ffbda3a59e1d65d15f4ba751a8166f3/0d6a1/why-were-building-spokestack.png 588w,/static/1ffbda3a59e1d65d15f4ba751a8166f3/05162/why-were-building-spokestack.png 1175w,/static/1ffbda3a59e1d65d15f4ba751a8166f3/8537d/why-were-building-spokestack.png 1200w&quot; sizes=&quot;(max-width: 1175px) 100vw, 1175px&quot; style=&quot;width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0&quot; loading=&quot;lazy&quot; decoding=&quot;async&quot;/&gt;
  &lt;/a&gt;
    &lt;/span&gt;&lt;/p&gt;&lt;p&gt;Over &lt;a href=&quot;https://voicebot.ai/2019/01/15/twice-the-number-of-u-s-adults-have-tried-in-car-voice-assistants-as-smart-speakers/&quot;&gt;58% of Americans&lt;/a&gt; already use their smartphone as a voice assistant, more than smart speaker, smartwatch, and desktop voice assistant users combined. That’s because we have our phones with us at all times.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Smartphones are smart because apps like yours make them so. For apps to become smarter, they need a voice.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Here’s an example: try saying “Hey Siri, open {your app name}” or “OK, Google, open {your app name}” depending on your phone.&lt;/p&gt;&lt;p&gt;Did your app say anything? Did it ask what you needed? If not, Spokestack can help.
At Spokestack, it’s our mission to give every company a unique voice, one that reinforces its brand’s relationship with customers. We believe every company should have direct conversations with its customers rather than going through intermediaries. Customers should know who they’re talking to when they ask for your brand. Voice enables a new way to interact with customers that will grow your business by making your app more useful.&lt;/p&gt;&lt;p&gt;Finally, we believe in open source software, best practices around privacy, and making sure you have full control over conversations with your customers. We’re at the forefront of the next great leap in user experiences. We’re excited to partner with you to build that future together.&lt;/p&gt;</content:encoded></item></channel></rss>