Cortana Integration #1: Hey Cortana, work with my Windows app!
In the video game series Halo, the relationship between the protagonist and his AI companion Cortana is very close and intense. On the new Windows platforms, users now also have the chance to interact with Cortana as a personal assistant. But to achieve the same intense relationship with the user that Master Chief and Cortana already have, one important thing is missing: the integration of your app. Read on to learn how to close that gap.
Define voice commands
First of all, you have to define what the user can say so that Cortana can connect these sentences with your app. For this, you need to create a new XML file that describes the different scenarios in the VCD format. Defining CommandSet nodes that contain the commands for each language you want to support is a good start. Within these nodes you need to define the CommandPrefix. This is the keyword that lets Cortana put the spoken words in context with your app. Usually this CommandPrefix is the name of your app, but you can choose whatever you like to make the interaction more natural. To my ears, the sentence “Switch on the light!” sounds way better than “LightSwitcher, turn the light on!”. You are also advised to add an Example to your command set that will appear in the list of commands when you ask Cortana what you can say.
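What follows is a minimal sketch of such a file, assuming the VCD 1.1 schema used on Windows Phone 8.1 and the LightSwitcher scenario from above; the file name VoiceCommandDefinition.xml and the command set name englishCommands are just placeholders:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- VoiceCommandDefinition.xml: one CommandSet per supported language -->
<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.1">
  <CommandSet xml:lang="en-US" Name="englishCommands">
    <!-- The keyword that lets Cortana put the spoken words in context with the app -->
    <CommandPrefix>LightSwitcher</CommandPrefix>
    <!-- Shown when the user asks Cortana what they can say -->
    <Example>Turn on the light in the kitchen</Example>
    <!-- Command and PhraseList nodes go here, see below -->
  </CommandSet>
</VoiceCommands>
```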
Now you define the commands that the user should be able to use. Make sure that they build a logical sentence together with the preceding CommandPrefix. A Command node needs a Name property and some self-explanatory children. The Example works similarly to the previous one, and the Feedback is the sentence with which Cortana answers the user after activating your app. Beyond doubt, the most important node is ListenFor. Here you define the sentence that is mapped to your command. You can use two types of variables inside of it: words in square brackets are optional, so the user can say them or leave them out, while variables in curly braces point to one of the PhraseList elements that you define below; the user has to say one of the words or phrases defined there.

Please note that you are not able to change the commands once the app is running on the user's device. What you can manipulate are the PhraseList items. If the user adds a new room, for example, you are able to extend the room list with this entry. We will see how to do this in the next chapter. Finally, you can choose what should happen when a user uses the command. The simplest case is navigating to your app, which is why we use the Navigate property in the example below. You can also start a background service or let Cortana interact more deeply with the user, but we will discuss these scenarios in one of the following posts.
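Continuing the sketch above, a Command node and the matching PhraseList nodes placed inside the CommandSet could look roughly like this; the switchLight name and the status and room phrase lists are illustrative, not prescribed:

```xml
<Command Name="switchLight">
  <Example>turn on the light in the kitchen</Example>
  <!-- [please] is optional; {status} and {room} refer to the PhraseLists below -->
  <ListenFor>[please] turn {status} the light in the {room}</ListenFor>
  <!-- Spoken by Cortana before she activates the app -->
  <Feedback>Turning {status} the light in the {room}.</Feedback>
  <!-- The simplest case: launch the app in the foreground -->
  <Navigate />
</Command>

<PhraseList Label="status">
  <Item>on</Item>
  <Item>off</Item>
</PhraseList>
<PhraseList Label="room">
  <Item>kitchen</Item>
  <Item>living room</Item>
</PhraseList>
```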
Let Cortana learn them
Once you have defined all your commands, you can add them to Cortana's vocabulary. Usually you do this when your app starts, which implies that the user has to start the app at least once. Add the following lines to the OnLaunched event handler of your central App class to load the XML file and register it with Cortana.
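Here is a minimal sketch of that registration, assuming the Windows Phone 8.1 VoiceCommandManager API and the file name from the VCD sketch above:

```csharp
// App.xaml.cs (needs Windows.Storage, Windows.Media.SpeechRecognition and System for Uri)
protected override async void OnLaunched(LaunchActivatedEventArgs e)
{
    // ... your usual frame and navigation setup ...

    // Load the VCD file from the app package and hand it over to Cortana.
    StorageFile vcdFile = await StorageFile.GetFileFromApplicationUriAsync(
        new Uri("ms-appx:///VoiceCommandDefinition.xml"));
    await VoiceCommandManager.InstallCommandSetsFromFileAsync(vcdFile);
}
```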
As mentioned above, you will sometimes need to edit the list of options that you defined in one of the PhraseList nodes. You do this by calling SetPhraseListAsync() on a VoiceCommandSet that you extract from the installed command sets.
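For example, to extend the room phrase list after the user has added a new room, something along these lines should work inside any async method, assuming the command set is named englishCommands as in the sketch above:

```csharp
// Look up the command set by the Name attribute from the VCD file.
VoiceCommandSet commandSet;
if (VoiceCommandManager.InstalledCommandSets.TryGetValue("englishCommands", out commandSet))
{
    // Replace the items of the "room" PhraseList with the updated list.
    await commandSet.SetPhraseListAsync("room",
        new[] { "kitchen", "living room", "bedroom" });
}
```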
React to app activation by Cortana
Whenever your app gets activated by Cortana, she tells you as a developer that she did so and passes you some additional information about the activation context and the spoken commands. To access this, you need to override the OnActivated event handler inside the App class. Here you can differentiate between the different ActivationKinds. If it is VoiceCommand, you know that your app has been started via Cortana and you need to react. Access the SemanticInterpretation of the spoken text to extract the variables, like room and status, that you defined in the VCD file before. You can wrap this information in a single separated string and pass it to your MainPage while navigating to it.
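A sketch of such an OnActivated override could look like this; the pipe-separated argument format and the status and room property names come from the example VCD above and are, of course, up to you:

```csharp
// App.xaml.cs (needs Windows.ApplicationModel.Activation, Windows.UI.Xaml, Windows.UI.Xaml.Controls)
protected override void OnActivated(IActivatedEventArgs args)
{
    base.OnActivated(args);

    // Only react if the app has been started via a Cortana voice command.
    if (args.Kind != ActivationKind.VoiceCommand)
        return;

    var voiceArgs = (VoiceCommandActivatedEventArgs)args;
    var result = voiceArgs.Result;

    // Name attribute of the matched Command node, e.g. "switchLight".
    string commandName = result.RulePath[0];

    // Values of the {status} and {room} variables defined in the VCD file.
    var properties = result.SemanticInterpretation.Properties;
    string status = properties["status"][0];
    string room = properties["room"][0];

    // Wrap everything in a single separated string and pass it to the MainPage.
    string navigationArgs = commandName + "|" + status + "|" + room;

    // Assumes the root Frame already exists; otherwise create it as in OnLaunched.
    var rootFrame = Window.Current.Content as Frame;
    if (rootFrame != null)
    {
        rootFrame.Navigate(typeof(MainPage), navigationArgs);
    }
    Window.Current.Activate();
}
```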
Finally, we need to teach our MainPage to deal with these navigation arguments and react to them properly. Inside the OnNavigatedTo event handler we can access the navigation arguments and split them into their single parts. Now we can proceed according to the spoken command.
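A rough version of that handler, matching the pipe-separated string from the sketch above, could look like this; SwitchLight is a hypothetical helper that does the actual work:

```csharp
// MainPage.xaml.cs (needs Windows.UI.Xaml.Navigation)
protected override void OnNavigatedTo(NavigationEventArgs e)
{
    base.OnNavigatedTo(e);

    var parameter = e.Parameter as string;
    if (string.IsNullOrEmpty(parameter))
        return;

    // Split the string built in OnActivated: "commandName|status|room".
    string[] parts = parameter.Split('|');
    string commandName = parts[0];
    string status = parts[1];
    string room = parts[2];

    if (commandName == "switchLight")
    {
        // Hypothetical helper that actually switches the light in the given room.
        SwitchLight(room, status == "on");
    }
}
```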
Next steps
As you can see, teaching Cortana to work with your app is not that hard, but it can get quite complex when you have to consider every scenario and create the right sentences and phrase lists. Cortana is very picky when it comes to recognizing that the spoken words belong to your app. This means that if the user says one word too many or leaves out something you have not marked as optional, the command won't be recognized and Cortana opens a Bing search instead. And let's be honest: who wants that?
So make sure that the commands you create cover all usage scenarios. You are also free to create multiple commands for the same use case but with different wordings. It can also get a bit more involved when you need to handle different languages in your code.
If you want to achieve a solid Cortana integration, make sure to also read these useful articles and documentation pages:
- MSDN Cortana integration docs
- Support natural language voice commands
- Discussion on how to handle multiple languages in code
In the next Cortana posts, we will dive deeper into the other use cases you can use the personal assistant for. Stay tuned.