By Charles Williamson in Artificial Intelligence — Jun 19, 2023

Voice Input

I just read Fred Wilson's post on Voice Input. Voice input has been around for many years, but I've only recently started using it myself, for two reasons:

Improved transcription: I'm using a few products that have already integrated with Whisper, which has been much more reliable and accurate for me.
Improved value of transcriptions: when I talk, I'm usually less coherent than when I type. There are lots of dead-ends and non sequiturs. That is both the benefit and the downside of voice: the transcript is messier, but occasionally you end up going in a direction you wouldn't have reached while typing. I've found that AI completely negates the downside - now I just summarise the transcription, and the result is just as good as (if not better than) what I'd have written myself.

Here are a few examples:

Blog Posts: At the start of any longer blog post, I have an assortment of ideas but nothing that is finished. Now I just speak for as long as I have something to say, and then summarise the output using AI.
To Do Lists: Most days I speak into my note-taking app. It transcribes the audio, and I can then run a command within the same application to convert the brain dump into a list of action items, which I then enter into my project management tool.

I personally do this with Reflect. I expect other tools can achieve a similar outcome, though I haven't come across any.
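
For anyone who would rather wire up the same flow themselves, here is a rough sketch in Python. It is not how Reflect does it. It assumes the open-source openai-whisper package for the transcription step, and the function names and the `ask_model` hook are made up for illustration: `ask_model` stands in for whichever AI model you use to do the summarising.

```python
from typing import Callable, List


def transcribe(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file with the open-source Whisper model.

    Assumes `pip install openai-whisper` and ffmpeg are available. The import
    is done lazily so the rest of this sketch works without them.
    """
    import whisper  # assumption: the openai-whisper package

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["text"]


def build_prompt(transcript: str) -> str:
    """Wrap a rambling transcript in an instruction for a language model."""
    return (
        "Turn this voice-note transcript into a short list of action items, "
        "one per line, each starting with '- '.\n\n" + transcript.strip()
    )


def to_action_items(transcript: str, ask_model: Callable[[str], str]) -> List[str]:
    """Send the prompt to a model and parse its reply into a list of items.

    `ask_model` is a hypothetical hook: any function that takes a prompt
    string and returns the model's text reply.
    """
    reply = ask_model(build_prompt(transcript))
    items = []
    for line in reply.splitlines():
        line = line.strip()
        if line.startswith("- "):  # keep only the bulleted lines
            items.append(line[2:].strip())
    return items
```

In use, you would call `transcribe("brain_dump.m4a")` (a placeholder filename) and pass the result to `to_action_items` along with your own `ask_model` function. Keeping the model behind a plain function means you can swap providers without touching the rest.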

I wonder why chatbots don't use this as a method of input more often. Perhaps they will in the future? Or perhaps the social perception of speaking into your phone with no-one on the other end is still a barrier for most people when they're around others?
