In the video game series Halo, the relationship between the protagonist Master Chief and his AI companion Cortana is famously close. On the new Windows platforms, users now have the chance to interact with Cortana as a personal assistant themselves. But to achieve the same intense relationship between Cortana and the user, one important thing is still missing: the integration of your app. Read on to learn how to close that gap.
Define voice commands
First of all, you have to define what the user can say so that Cortana can connect these sentences with your app. For this, you need to create a new XML file that describes the different scenarios in the VCD (Voice Command Definition) format. A good start is defining a CommandSet node that contains the commands for each language you want to support. Within these nodes you need to define the CommandPrefix. This is the keyword that lets Cortana put the spoken words in context with your app. Usually the CommandPrefix is the name of your app, but you can choose whatever you like to make the interaction more natural. To my ears, the sentence “Switch on the light!” sounds way better than “LightSwitcher, turn the light on!”. You are also advised to add an Example to your command set, which will appear in the list of commands when you ask Cortana what you can say.
Now you define the commands that the user should be able to use. Make sure that they build a logical sentence together with the preceding CommandPrefix. Each Command node needs a Name property and some self-explanatory children. The Example works similarly to the previous one, and the Feedback is the sentence that Cortana answers the user with after activating your app. Beyond doubt, the most important node is ListenFor. Here you define the sentence that is mapped to your command. You can use two types of variables inside of it: words written in square brackets are optional and the user can say or leave them out, while variables in curly braces point to one of the PhraseList elements that you define below. The user has to say one of the words or phrases that you define there. Please note that you are not able to change the commands once the app is running on the user’s device. What you can manipulate are the PhraseList items. If the user adds a new room, for example, you are able to extend the room list with this entry. We will see how to do this in the next chapter. Last, you can choose what should happen when a user uses the command. The simplest case is navigating to your app, which is why we use the Navigate property here. You can also start a background service or let Cortana interact more deeply with the user, but we will discuss these scenarios in one of the following posts.
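To make this concrete, here is a sketch of what such a VCD file might look like for the hypothetical LightSwitcher example above (the app name, command set name, command, and phrase list items are all invented for illustration):

```xml
<?xml version="1.0" encoding="utf-8"?>
<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.1">
  <CommandSet xml:lang="en-us" Name="LightSwitcherCommands_en-us">
    <CommandPrefix>LightSwitcher</CommandPrefix>
    <Example>Switch on the light in the kitchen</Example>

    <Command Name="SwitchLight">
      <Example>Switch on the light in the kitchen</Example>
      <!-- [the] is optional; {status} and {room} refer to the PhraseLists below -->
      <ListenFor>Switch {status} [the] light in the {room}</ListenFor>
      <Feedback>Switching {status} the light in the {room}</Feedback>
      <Navigate />
    </Command>

    <PhraseList Label="status">
      <Item>on</Item>
      <Item>off</Item>
    </PhraseList>
    <PhraseList Label="room">
      <Item>kitchen</Item>
      <Item>living room</Item>
    </PhraseList>
  </CommandSet>
</VoiceCommands>
```

With this definition, a sentence like “LightSwitcher, switch on the light in the kitchen” maps to the SwitchLight command, with “on” and “kitchen” available as the recognized variable values.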
Let Cortana learn them
Once you have defined all your commands, you can add them to Cortana’s vocabulary. Usually you do this when your app starts, which implies that the user has to start the app at least once. Add the following line to the OnLaunched event handler of your central App class to load the XML file and register it with Cortana.
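The registration could look roughly like this, assuming the VCD file is packaged in the app root as VoiceCommands.xml (the file name is an assumption for this sketch):

```csharp
using Windows.ApplicationModel;
using Windows.Media.SpeechRecognition;
using Windows.Storage;

protected override async void OnLaunched(LaunchActivatedEventArgs e)
{
    // ... the usual frame and navigation setup ...

    // Load the VCD file from the app package and register it with Cortana.
    StorageFile vcdFile = await Package.Current.InstalledLocation
        .GetFileAsync("VoiceCommands.xml");
    await VoiceCommandManager.InstallCommandSetsFromStorageFileAsync(vcdFile);
}
```

Installing the command sets is idempotent, so calling it on every launch is fine and also picks up changes after an app update.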
As mentioned above, you will sometimes need to edit the list of options that you defined in one of these PhraseList nodes. You do this by calling SetPhraseListAsync() on a VoiceCommandSet that you extract from the installed command sets.
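A sketch of such an update, reusing the command set and phrase list names from the hypothetical VCD file above:

```csharp
using Windows.Media.SpeechRecognition;

// Update the "room" phrase list at runtime, e.g. after the user added a room.
// The key must match the Name attribute of the CommandSet in the VCD file.
VoiceCommandSet commandSet;
if (VoiceCommandManager.InstalledCommandSets.TryGetValue(
        "LightSwitcherCommands_en-us", out commandSet))
{
    await commandSet.SetPhraseListAsync(
        "room", new[] { "kitchen", "living room", "bedroom" });
}
```

Note that SetPhraseListAsync replaces the whole list, so pass all items the user should be able to say, not just the new one.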
React to app activation by Cortana
Whenever your app gets activated by Cortana, she tells you as a developer that she did so and passes you some additional information about the activation context and the spoken command. To access this, you need to override the OnActivated event handler inside the App class. Here you can differentiate between the different ActivationKinds. If it is VoiceCommand, you know that your app has been started via Cortana and needs to react. Access the SemanticInterpretation of the spoken text to extract the variables, like room and status, that you defined in the VCD file before. You can wrap this information in a single separated string and pass it to your MainPage while navigating to it.
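Put together, this could look roughly as follows (the phrase list keys and the separator format are assumptions carried over from the sketch above):

```csharp
using Windows.ApplicationModel.Activation;
using Windows.Media.SpeechRecognition;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;

protected override void OnActivated(IActivatedEventArgs args)
{
    if (args.Kind == ActivationKind.VoiceCommand)
    {
        var voiceArgs = (VoiceCommandActivatedEventArgs)args;
        SpeechRecognitionResult result = voiceArgs.Result;

        // Pull the phrase list values out of the semantic interpretation
        // of the recognized sentence. The keys match the PhraseList labels.
        string room = result.SemanticInterpretation.Properties["room"][0];
        string status = result.SemanticInterpretation.Properties["status"][0];

        // Pass them on to the MainPage as one separated string.
        // In a real app, create and activate the root frame first
        // if the app was cold-started by the voice command.
        var rootFrame = (Frame)Window.Current.Content;
        rootFrame.Navigate(typeof(MainPage), room + ";" + status);
    }
}
```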
At last, we need to teach our MainPage to deal with those navigation arguments and to react to them properly. Inside the OnNavigatedTo event handler we can access the navigation arguments and split them into their single parts. Now we can proceed according to the spoken command.
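A minimal sketch of that handler, assuming the “room;status” string format from the OnActivated example (SwitchLight is a hypothetical helper of the page, not a framework method):

```csharp
using Windows.UI.Xaml.Navigation;

protected override void OnNavigatedTo(NavigationEventArgs e)
{
    base.OnNavigatedTo(e);

    var payload = e.Parameter as string;
    if (!string.IsNullOrEmpty(payload))
    {
        // Split the string that OnActivated put together, e.g. "kitchen;on".
        string[] parts = payload.Split(';');
        string room = parts[0];
        string status = parts[1];

        // React to the spoken command, e.g. toggle the light in that room.
        SwitchLight(room, status == "on");
    }
}
```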
As you can see, teaching Cortana to work with your app is not that hard, but it can get very complex if you have to consider every scenario and create the right sentences and phrase lists. Cortana is very picky when it comes to recognizing that the spoken words belong to your app. This means that if the user says one word too many, or leaves out something you have not marked as optional with square brackets, it won’t work and Cortana opens the Bing search instead. And let’s be honest: who wants that?
So make sure that the commands you create cover all usage scenarios. You are also free to create multiple commands for the same use case but with different wordings. It can also get a little harder when you need to handle different languages in your code.
If you want to achieve a solid Cortana integration, make sure to also read these useful articles and documentation:
- MSDN Cortana integration docs
- Support natural language voice commands
- Discussion on how to handle multiple languages in code
In the next Cortana posts, we will dive deeper into the other use cases you can use the personal assistant for. Stay tuned.