Tuesday, October 11, 2022

Blog To Speech - In My Voice


Recently my Internet friend Terence Eden wrote a blog post titled Blog To Speech, which you might also want to read. It serves as the inspiration for this post.

In short, there’s a trend in blogging (and on some news sites) to add an audio transcription of the page you’re reading, usually at the top of the article. Mostly this is done semi-automatically, using a bot to read the text in an “AI generated” voice such as Amazon Polly.

I haven’t seen many of these articles where the author themselves transcribed the text.

We can do that though, but using AI (kinda). Well, I can.

There exists online a synthesized version of my own voice, built into mimic3 (licensed under the AGPL v3) by MycroftAI.

Here’s (hopefully) that voice, my (synthetic) voice, reading the rest of this post.

Origin Story

Back in 2015 the “Mycroft – AI for everyone” project was started, which aimed to be a fully functional Open Source digital assistant. Think of it as an open alternative to Amazon Alexa, Apple Siri, Microsoft’s Cortana and Google Assistant. They’ve run a series of crowdfunding campaigns to raise money to build the software and a device.

When Mycroft were first getting started, their Community Manager asked if I wanted to be the “voice of Mycroft”. I was a little perplexed by this, but flattered. I think this simply came about as a result of them hearing my dulcet tones on the (now defunct) Ubuntu Podcast, along with my co-presenters.

It sounded fun so I agreed to submit my voice, via a third party company, VocalID. This involved using a web interface to record short snippets of text which covered all the possible phonemes my voice might make.

I spent some hours recording (and re-recording) 3488 short sentences which could later be processed and used to synthesize my voice. Here’s an unflattering selfie, taken on 1st April 2016, of me in my young son’s bedroom recording my voice using a Blue Snowball.

The Mycroft team took these snippets and did something to turn them into a dataset which could be used with their assistant software. You could essentially make it (me) say anything. The initial results weren’t fantastic. It was clearly “my” voice, but it sounded robotic, bass-heavy and very much in the uncanny valley.

popey in a box

The initial Mycroft Mark I hardware units did eventually ship. I got sent one as a “Thanks” for my voice (no, I wasn’t paid in any other way).

The big box it shipped in…

Mycroft AI box

The scary warning you got when opening the box…

Mycroft AI scary

The front of the Mycroft Mark I…

Mycroft AI front

Mycroft’s rear

Mycroft AI back

What are beans

Once I’d unpacked the Mark I and set up the software, I tried asking it/them/me some questions. One which had a fun response was the classic “What are beans?”. Enjoy:

Blame popey

Some people took the software and did fun things with it. For a while there was a website called “Blame popey” in which you could type any text (yes, anything) and have it turned into a WAV file of “me” saying it. How we laughed.

Blame popey

popey sings

Someone else took it a step further and set some of this to music as “popey sings”. There exists audio of “me” singing Elton John’s “Rocket Man”, Eminem’s “Lose Yourself”, “Jerusalem” and even Elvis’s “Love Me Tender”. They’re here in OGG and MP3 format if you’re really interested in hearing what a robot me sounds like “singing”. Here’s a sample of “me” singing Mr Sandman. I apologise.

Pressed popey

We even went as far as pressing the songs above onto vinyl as a 9-track “Limited Edition EP” to give away as a “prize” at a live Ubuntu Podcast event. Someone, somewhere has one of these. I don’t know if they’ve ever actually been played on a real turntable. Probably not.

popey EP

popey Mark III

While the original audio used in the Mark I device was robotic and sometimes hard to understand, improvements have been made. The new “Mimic 3” engine, which is available on GitHub, sounds more like me, is less robotic, and is more pleasant to listen to (assuming you don’t mind it being my voice, of course!).

It’s quite trivial to get the Mimic 3 engine running on a modern laptop, or even a Raspberry Pi, to play with it.

Mimic 3 web UI

Conclusion

I pasted a version of all this text into the Mimic 3 web interface, to see how it sounds. It’s not great, for sure. It has trouble saying things like GitHub, “Mark I” and the headings in this post.

I could re-word parts to work around this, for example changing “Mark I” to “Mark 1” or “Mark One”. The additional overhead of editing, re-editing and re-creating the audio might be time consuming and not worthwhile if nobody listens.
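
One way to avoid re-editing the post by hand would be to apply the re-wording automatically, just before the text is sent for synthesis. The following is a minimal sketch of that idea, not something taken from Mimic 3 itself: the `PRONUNCIATION_FIXES` table and the `prepare_for_speech` function are hypothetical names, and the “Git Hub” spelling is only a guess at what might read better.

```python
import re

# Hypothetical table: text as written -> spelling that may be easier to pronounce.
# "Mark One" comes from the post; "Git Hub" is an untested guess.
PRONUNCIATION_FIXES = {
    "Mark I": "Mark One",
    "GitHub": "Git Hub",
}


def prepare_for_speech(text, fixes=PRONUNCIATION_FIXES):
    """Return a copy of text with each written form replaced by its spoken form."""
    # Handle longer phrases first so a short key cannot clobber part of a longer one.
    for written in sorted(fixes, key=len, reverse=True):
        spoken = fixes[written]
        # Only match whole words, so "Mark I" does not match inside "Mark III".
        pattern = r"(?<!\w)" + re.escape(written) + r"(?!\w)"
        # A function replacement avoids backslash processing in the spoken form.
        text = re.sub(pattern, lambda match, spoken=spoken: spoken, text)
    return text


if __name__ == "__main__":
    print(prepare_for_speech("The Mark I engine is on GitHub."))
```

The published page keeps the original wording; only the copy handed to the speech engine is changed.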

While I did generate the audio manually using the web interface, there’s also an API, so I could programmatically generate a new MP3 each time the text on the page changes.
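
A rough sketch of what that automation might look like is below. It rests on several assumptions that should be checked against the Mimic 3 documentation: that a Mimic 3 server is running locally on port 59125, that it accepts the text as a POST body at `/api/tts` with a `voice` query parameter, that it returns WAV audio, and that the voice is named `en_UK/apope_low`. The function and file names are hypothetical, and converting the result to MP3 would be a separate step that is not shown.

```python
import hashlib
import urllib.parse
import urllib.request
from pathlib import Path

# Assumed values; adjust to match the actual Mimic 3 server setup.
TTS_URL = "http://localhost:59125/api/tts"
VOICE = "en_UK/apope_low"


def text_digest(text):
    """Fingerprint the post text so a change can be detected cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def synthesize(text, voice=VOICE, url=TTS_URL):
    """POST the text to the (assumed) TTS endpoint and return the audio bytes."""
    query = urllib.parse.urlencode({"voice": voice})
    request = urllib.request.Request(
        f"{url}?{query}", data=text.encode("utf-8"), method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def regenerate_if_changed(text, audio_file, digest_file, synth=synthesize):
    """Re-create the audio only when the text differs from the last run.

    Returns True if new audio was written, False if nothing changed.
    """
    digest_path = Path(digest_file)
    digest = text_digest(text)
    if digest_path.exists() and digest_path.read_text().strip() == digest:
        return False
    Path(audio_file).write_bytes(synth(text))
    # Record the digest only after the audio has been written successfully.
    digest_path.write_text(digest)
    return True
```

Calling `regenerate_if_changed(post_text, "post.wav", "post.sha256")` from a site build script would then skip synthesis for posts whose text has not changed since the previous build.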

Overall it was a fun exercise, but I don’t think I’ll be doing this for every post!


