With the advent of machine learning and artificial intelligence, the iOS SDK already comes with a number of frameworks that let developers build apps with machine learning-related features. In this tutorial, let's explore two built-in ML APIs for converting text to speech and performing language detection.
Using Text to Speech in AVFoundation
Let's say we're building an app that reads users' input messages aloud. We need to implement some form of text-to-speech capability.
The AVFoundation framework comes with text-to-speech APIs. To use these APIs, we first have to import the framework:
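```swift
import AVFoundation
```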
Next, create an instance of AVSpeechSynthesizer:
```swift
let speechSynthesizer = AVSpeechSynthesizer()
```
To convert the text message to speech, you can write the code like this:
```swift
let utterance = AVSpeechUtterance(string: inputMessage)
utterance.pitchMultiplier = 1.0
utterance.rate = 0.5
utterance.voice = AVSpeechSynthesisVoice(language: "en-US")

speechSynthesizer.speak(utterance)
```
You create an instance of AVSpeechUtterance with the text for the synthesizer to speak. Optionally, you can configure the pitch, rate, and voice. For the voice parameter, we set the language to English (U.S.). Finally, you pass the utterance object to the speech synthesizer to read the text in English.
The built-in speech synthesizer is capable of speaking multiple languages such as Chinese, Japanese, and French. To tell the synthesizer which language to speak, you have to pass the correct language code when creating the instance of AVSpeechSynthesisVoice.
To find out all the language codes that the system supports, you can call the speechVoices() method of AVSpeechSynthesisVoice:
```swift
let voices = AVSpeechSynthesisVoice.speechVoices()

for voice in voices {
    print(voice.language)
}
```
Here are some of the supported language codes:
- Japanese – ja-JP
- Korean – ko-KR
- French – fr-FR
- Italian – it-IT
- Cantonese – zh-HK
- Mandarin – zh-TW
- Putonghua – zh-CN
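For example, to have the synthesizer speak a Japanese message, you can create the voice with the ja-JP code. Here is a minimal sketch that reuses the speechSynthesizer instance created earlier:

```swift
// Speak a Japanese message by choosing a voice with the ja-JP language code
let japaneseUtterance = AVSpeechUtterance(string: "これはテストです")
japaneseUtterance.voice = AVSpeechSynthesisVoice(language: "ja-JP")

speechSynthesizer.speak(japaneseUtterance)
```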
In some cases, you may need to interrupt the speech synthesizer. You can call the stopSpeaking(at:) method to stop the synthesizer:
```swift
speechSynthesizer.stopSpeaking(at: .immediate)
```
Performing Language Identification Using the Natural Language Framework
As you can see in the code above, we have to figure out the language of the input message before the speech synthesizer can convert the text to speech correctly. Wouldn't it be great if the app could automatically detect the language of the input message?
The NaturalLanguage framework provides a variety of natural language processing (NLP) functionality, including language identification.
To use the NLP APIs, first import the NaturalLanguage framework:
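```swift
import NaturalLanguage
```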
You just need a couple of lines of code to detect the language of a text message:
```swift
let languageRecognizer = NLLanguageRecognizer()
languageRecognizer.processString(inputMessage)
```
The code above creates an instance of NLLanguageRecognizer and then invokes processString(_:) to process the input message. Once processed, the identified language is stored in the dominantLanguage property:
```swift
if let dominantLanguage = languageRecognizer.dominantLanguage {
    print(dominantLanguage.rawValue)
}
```
Here is a quick example:
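```swift
// An English sample string (assumed here for illustration)
let inputMessage = "This is a test message."

let languageRecognizer = NLLanguageRecognizer()
languageRecognizer.processString(inputMessage)

if let dominantLanguage = languageRecognizer.dominantLanguage {
    print(dominantLanguage.rawValue)   // prints "en"
}
```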
For this sample, NLLanguageRecognizer recognizes the language as English (i.e. en). If you change the inputMessage to Japanese like the one below, the dominantLanguage becomes ja:
```swift
let inputMessage = "これはテストです"
```
The dominantLanguage property may have no value if the input message looks like this:
```swift
let inputMessage = "1234949485🎃😸💩🥸"
```
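Putting the two frameworks together, here is a minimal sketch of how you might detect the language of a message and pick a matching voice before speaking it. The helper function and the prefix-matching heuristic below are illustrative assumptions, not part of the built-in APIs:

```swift
import AVFoundation
import NaturalLanguage

// Detect the dominant language of the message and speak it with a matching
// voice. The prefix matching below is a rough heuristic for illustration.
func speak(_ inputMessage: String, using synthesizer: AVSpeechSynthesizer) {
    let recognizer = NLLanguageRecognizer()
    recognizer.processString(inputMessage)

    let utterance = AVSpeechUtterance(string: inputMessage)

    if let language = recognizer.dominantLanguage {
        // Match the primary language subtag (e.g. "ja" in "ja-JP",
        // "zh" in "zh-Hans") against the installed voices.
        let code = language.rawValue.split(separator: "-").first.map(String.init) ?? language.rawValue
        utterance.voice = AVSpeechSynthesisVoice.speechVoices()
            .first { $0.language.hasPrefix(code) }
    }

    synthesizer.speak(utterance)
}
```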
Wrap Up
In this tutorial, we walked you through two of the built-in machine learning APIs for converting text to speech and identifying the language of a text message. With these ML APIs, you can easily incorporate text-to-speech features into your iOS apps.