As part of our Clever WMS Devices app, you can use your voice to input data on your handheld device and receive spoken prompts and instructions. This functionality can be used with either the Google or Azure speech services.

The voice functionality combines text-to-speech, where the device speaks prompts or confirmation of actions back to the warehouse user, with speech-to-text, which allows the warehouse user to speak input into the device and work more seamlessly.

Clever Device Framework Setup

Choose whether to use the Google or Azure services on the Clever Device Framework Setup page. Navigate to the Clever Device Framework Setup card from the menu. In the Voice Setting FastTab you will need to set the following fields. The first is Voice Service: use the drop-down to select either Azure or Google.

Then set the Voice Confirmation Phrase. This is the default phrase used by the device user when confirming an action with the voice service.

The Voice Confirmation Phrase can be set up in various languages. Click the ellipsis (the three dots) next to the field to bring up the translations list, then enter the language ID and the corresponding translation for the phrase.

Device User

You can set up the voice functionality for individual device users. Search for Device Users in the menu and select the user you wish to edit from the list; this will open the Device User card.

In the General FastTab of the Device User card you will see fields relating to language and voice.

The Language ID indicates the language this device user will be using. Type in the Language ID if you know it, or use the drop-down arrow to select it from the list.

Once the Language ID has been selected, the app automatically populates the Language field based on your selection.

The last field is the Enable Voice option. This setting indicates whether to enable voice recognition services when this device user is logged on to the device.

Device Functions

Voice functionality can be enabled per Device Function. Search for Device Functions in the menu and open the card for the chosen device function.

In the General FastTab of the Device Function card there are a few fields relating to Voice.

Toggling the Enable Voice field determines whether voice recognition services are enabled while this function is being carried out on the device.

If enabled, you should populate the Confirmation Text. The text in this field is spoken by the device to confirm posting of the transaction.

If you populate the Posting Notification Text, the text you enter will be spoken by the device at the time of posting. This can be configured as an instruction for the next activity.

The Confirmation and Posting Notification Texts can be translated by clicking the ellipsis next to the fields. Enter the language ID and corresponding translation for the texts in the list.

The text fields can also make use of the data items linked to the device function. Any data the device already has can be spoken back to the user. Enter the device function column name in double square brackets ([[…]]) so the app knows which column it needs to retrieve the data from. For example, the Confirmation Text could be: Picking [[ITEMNO]] from bin [[TAKEBINCODE]], do you want to continue? The Item No. and Take Bin Code will be read from the Data Items for this pick and spoken back to the user as part of the confirmation text.
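To picture how the [[…]] placeholder substitution behaves, here is a minimal sketch in Python. This is purely illustrative — the app does not run Python, and the column names and values below are example data, not taken from a real configuration:

```python
import re

def speak_text(template: str, data_items: dict) -> str:
    """Replace each [[COLUMN]] placeholder with the current
    value of that device function column."""
    def lookup(match: re.Match) -> str:
        column = match.group(1)
        # Fall back to the raw placeholder if the column is unknown
        return str(data_items.get(column, match.group(0)))
    return re.sub(r"\[\[(\w+)\]\]", lookup, template)

# Hypothetical data items the device already holds for this pick
items = {"ITEMNO": "1896-S", "TAKEBINCODE": "W-05-0003"}
text = speak_text(
    "Picking [[ITEMNO]] from bin [[TAKEBINCODE]], do you want to continue?",
    items,
)
print(text)  # → Picking 1896-S from bin W-05-0003, do you want to continue?
```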

Device Function Columns

Device Functions of type Transaction have Device Columns. These consist of data items either requested from or shown on the device. Voice functionality can be applied against each of the Device Function Columns.

From the Device Function card, select a Column of type Data Item. On the Columns subpage select the Voice Setting action.

This will open a new window with the voice settings for that particular data item.

When setting the Data Type, you have two options in the drop-down: Text or Number. This helps the service understand what type of data is being requested; for example, it allows the voice recognition service to distinguish whether the user said ‘two’ or ‘too’.
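The benefit of the Data Type hint can be sketched as follows. This is an illustration of the principle only, not the app's or the speech services' actual logic, and the word list is a hypothetical example:

```python
# Illustrative only: how a Number data type hint could disambiguate
# homophones such as "two" / "too" in recognised speech.
WORD_NUMBERS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}

def interpret(spoken: str, data_type: str):
    """Interpret a recognised word according to the column's data type."""
    word = spoken.strip().lower()
    if data_type == "Number":
        if word in WORD_NUMBERS:   # homophone resolved to a digit
            return WORD_NUMBERS[word]
        return int(word)           # e.g. the service already returned "2"
    return spoken                  # Text: keep the recognised text as-is

print(interpret("two", "Number"))  # → 2
print(interpret("too", "Text"))    # → too
```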

Populating the Spoken Prompt Text box sets the text spoken by the device when prompting for this data item.

Enabling Allow Voice Input allows this data item to be entered by the device user using the speech-to-text functionality.

The next field down is Confirm Recognised Speech. If enabled, the device will confirm that it has recognised the spoken text.

The final field is Confirmation Text. The text you enter here will be spoken by the device as confirmation after this data item has been entered.

Like the Device Function Confirmation and Posting Notification Text fields, the Voice Setting texts can also make use of the data from Device Function Columns. Any data the device already has can be spoken back to the user. Enter %1 in the text where the value of the device function column should be spoken, so the app knows which data it needs to retrieve.
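The %1 substitution can be pictured as a simple replacement of the placeholder with the value just captured for the column. Again, this is only a sketch of the idea, not the app's implementation:

```python
def confirm_text(template: str, value) -> str:
    """Substitute %1 with the value entered for this column."""
    return template.replace("%1", str(value))

# Hypothetical: the user has just spoken a quantity of 12
print(confirm_text("Confirm %1 entered", 12))  # → Confirm 12 entered
```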


  1. Setup Voice Setting on Framework Setup with Azure.

  2. Enable Voice on Device User DEMO

  3. Enable Voice on Device Function PICK

    • General Tab

      • Enable Voice on PICK

      • Set Confirmation Text to Confirm pick of [[ITEMNO]] for quantity [[QUANTITY]] from bin [[TAKEBINCODE]]

      • Set Posting Notification Text to Posting Pick [[SOURCENO]]

  4. On the Device Function PICK card

    • Select Data Item Quantity on the Columns subpage

    • Select Voice Setting action for Data Item Quantity

    • Set Data Type to Number

    • Set Spoken Prompt Text to Quantity to handle

    • Enable Allow Voice Input and Confirm Recognised Speech

    • Set Confirmation Text to Confirm %1 entered

NB: Anything in [[…]] is the current value of that data item
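The worked example above can be pictured end to end with a small simulation. This is purely illustrative — the configuration is held as plain Python data, and the data item values (item number, bin, source number) are hypothetical:

```python
import re

# Hypothetical configuration mirroring the example steps above
function_config = {
    "confirmation_text": "Confirm pick of [[ITEMNO]] for quantity [[QUANTITY]] from bin [[TAKEBINCODE]]",
    "posting_notification_text": "Posting Pick [[SOURCENO]]",
}
quantity_voice = {
    "data_type": "Number",
    "spoken_prompt_text": "Quantity to handle",
    "confirmation_text": "Confirm %1 entered",
}

def fill(template: str, data: dict) -> str:
    """Replace each [[COLUMN]] placeholder with its current value."""
    return re.sub(r"\[\[(\w+)\]\]", lambda m: str(data[m.group(1)]), template)

# Data items the device already holds for this pick (example values)
data_items = {"ITEMNO": "1896-S", "TAKEBINCODE": "W-05-0003",
              "SOURCENO": "WHSE-PICK-00001"}

# Device prompts for the quantity; assume the user speaks "twelve"
print(quantity_voice["spoken_prompt_text"])                       # Quantity to handle
entered = 12
print(quantity_voice["confirmation_text"].replace("%1", str(entered)))  # Confirm 12 entered
data_items["QUANTITY"] = entered

# Confirmation and posting notification read from the data items
print(fill(function_config["confirmation_text"], data_items))
print(fill(function_config["posting_notification_text"], data_items))
```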

Handheld Device Changes

Changes have also been made to the Clever WMS Devices app (device side) to distinguish whether a data item can be entered using voice services.

When the app requests data input, a microphone symbol appears next to the data items that have voice enabled. The microphone symbol pulsates with animation when it is listening for input.

The microphone symbol changes to a loading circle when the device is processing what has been input. The loading circle is also shown when the device is calculating what to say next.

When using the Google services the user will see the Google dialog and symbols when speaking to the device.


Further settings are also available on the device itself.

From the Login screen, navigate to the Settings by clicking the Settings cog action. Along the top navigation bar select More, then Speech Options. This presents the user with the following fields in the app:

  • Preferred gender – The voice used by the app when speaking to the Device User. Options are Female or Male

  • Microphone Sensitivity – The level of sensitivity the microphone will use on the device with the app installed. Default value: 0.150

  • Timeout for confirmation (secs) – How long the app will wait for input from the user before timing out

  • Trim the audio data – When using the Azure voice services, the audio can be trimmed to exclude any silences at the beginning or end of speech. This speeds up the requests and responses between Azure and the app

  • Use Bluetooth headset if available – When enabled the device will always check for a Bluetooth headset when the device user logs in.
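To give a sense of what trimming the audio data means in principle, here is a simple sketch of dropping leading and trailing low-amplitude samples. This is not the app's or Azure's actual algorithm, and the threshold and sample values are hypothetical:

```python
def trim_silence(samples, threshold=0.15):
    """Drop leading/trailing samples whose absolute amplitude is
    below the threshold, leaving only the spoken portion."""
    start, end = 0, len(samples)
    while start < end and abs(samples[start]) < threshold:
        start += 1
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]

# Quiet room noise surrounds a short utterance (example amplitudes)
audio = [0.01, 0.02, 0.6, 0.4, -0.5, 0.03, 0.01]
print(trim_silence(audio))  # → [0.6, 0.4, -0.5]
```

Sending only the trimmed span is what speeds up the round trip between the app and the speech service, since less audio is uploaded and processed.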