Iris Classon
Iris Classon - In Love with Code

PDF to audio in Windows Runtime Apps in C#, for when I don’t want to read

I’m pretty much done unpacking all my things from my UK move and have finally been able to find more time to code after work, which is great, as I have a long list of apps I want to make for my own use. On that list was an app that reads books out to me when an audiobook is not available.

With the help of the excellent library iTextSharp and the text to speech APIs it’s rather simple to get this working. It will of course take some more time to handle the logic around this. For example, I would like to add play, pause and stop controls, adjust the playback rate, view the PDF page that is being read, have bookmarks, add voice commands and a few more things.

Here is some code for letting a user select a PDF, use a slider to select which page to start at, and then parse the PDF. It parses 3 pages starting from that location, as text to speech doesn’t like massive texts. It would be an idea to add a setting that lets the user convert the PDF to audio and then play it with some meta tags. However, make sure it’s not copyright infringement. For PDF books on tech topics you can find many free books by Microsoft. I used the latest Azure book to test, and even a few books with code, and it’s actually not too bad to have code read out: special characters are ignored and it flows quite well.
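
The "a few pages at a time" idea can be sketched on its own, without any PDF or UI code. This is only a minimal illustration: the `PageWindow` class and its `GetPages` method are hypothetical names, not part of iTextSharp or of the app's code. It assumes pages are numbered from 1, as the slider is.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of iTextSharp or the app's code).
public static class PageWindow
{
    // Returns the 1-based page numbers to read: at most pagesPerRun pages,
    // starting at startPage and never running past the last page.
    public static List<int> GetPages(int startPage, int pageCount, int pagesPerRun = 3)
    {
        var pages = new List<int>();
        if (pageCount < 1 || pagesPerRun < 1) return pages;

        // Clamp the start page into the valid range [1, pageCount].
        var first = Math.Max(1, Math.Min(startPage, pageCount));
        var last = Math.Min(pageCount, first + pagesPerRun - 1);

        for (var page = first; page <= last; page++)
            pages.Add(page);

        return pages;
    }
}
```

With a 100-page book, `PageWindow.GetPages(5, 100)` gives pages 5, 6 and 7, while `PageWindow.GetPages(99, 100)` gives only 99 and 100, since the window stops at the last page.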

The full code will be at the end of this post. I’m going to skip breaking it up for now since I need to head home before it gets dark (no lights on my bike and I had an accident yesterday which I’d like to not repeat :D ).

First, after creating a new project, grab the iTextSharp NuGet package.

Let the user select a file using the FileOpenPicker and verify they did indeed pick a file. Use the PdfReader provided by the third-party library iTextSharp to get the number of pages by passing in the buffer from the file, and use that to set the maximum value for the slider (the minimum value is set to 1 in the view).

I’ve disabled the play button and slider until the user has selected a file.

Once the user has selected a file you can loop through the pages and use PdfTextExtractor.GetTextFromPage to get the actual text. This will take a little while, so provide some feedback to the user that you are working on it hehe. The text will contain line breaks, and the audio will sound strange with a break at the end of every line, so I’ve simply replaced \n with an empty string. I’m not 100% happy with that quite yet, but I need to think about it a little bit.
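
Replacing \n with an empty string can glue the last word of one line to the first word of the next. One possible alternative is sketched below. It is a minimal example, not a definitive fix: the `TextCleaner` class and `NormalizeLineBreaks` method are hypothetical names, and treating a hyphen at the end of a line as a split word is an assumption that will occasionally be wrong for genuinely hyphenated terms.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of iTextSharp or the app's code).
public static class TextCleaner
{
    public static string NormalizeLineBreaks(string text)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;

        // Treat Windows and old Mac line endings the same as "\n".
        var result = text.Replace("\r\n", "\n").Replace('\r', '\n');

        // Assumption: a hyphen right before a line break, followed by a
        // lowercase letter, is a word split across lines ("synthe-\nsizer"),
        // so the two halves are joined again.
        result = Regex.Replace(result, @"(?<=\p{L})-\n(?=\p{Ll})", "");

        // Every remaining line break (plus any spaces around it) becomes a
        // single space, so words on adjacent lines stay separate.
        result = Regex.Replace(result, @"[ \t]*\n+[ \t]*", " ");

        // Collapse leftover runs of spaces or tabs.
        result = Regex.Replace(result, @"[ \t]{2,}", " ");

        return result.Trim();
    }
}
```

For example, `TextCleaner.NormalizeLineBreaks("Text to\nspeech synthe-\nsizer")` returns `"Text to speech synthesizer"`.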

Once that is done you can provide the text to the speech synthesizer: new one up, select your preferred voice if you don’t want the default, and create a media element. I would recommend turning up the playback rate a tiny bit, or even better, letting the user decide.

Simple really. Have fun with it. I’ll post the end result with the things I mentioned earlier in the post, with some nicely refactored and error-handled code. I hope to have some time over the next few weeks, but for now this is enough for my bike ride home :)

The code-behind file for this (the full version won’t be using code-behind, but I seem to get a lot of MVVM-related questions when I do that, so I try to separate the topics so readers can focus on one thing at a time):

[sourcecode language="csharp"]
using System;
using System.Linq;
using System.Runtime.InteropServices.WindowsRuntime;
using System.Text;
using System.Threading.Tasks;
using Windows.Media.SpeechSynthesis;
using Windows.Storage;
using Windows.Storage.Pickers;
using Windows.UI.Popups;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

namespace Mail
{
    public sealed partial class MainPage
    {
        // Pages to read per run; text to speech doesn't like massive texts.
        private const int PagesPerRun = 3;

        private StorageFile _selectedFile;

        public MainPage()
        {
            InitializeComponent();
        }

        private async void OnSelect(object sender, RoutedEventArgs e)
        {
            var picker = new FileOpenPicker();
            picker.SuggestedStartLocation = PickerLocationId.DocumentsLibrary;
            picker.FileTypeFilter.Add(".pdf");

            // PickSingleFileAsync returns null if the user cancels the picker.
            var file = await picker.PickSingleFileAsync();
            if (file != null)
            {
                _selectedFile = file;
                play.IsEnabled = true;
                slider.IsEnabled = true;

                var buffer = await FileIO.ReadBufferAsync(_selectedFile);

                // The page count becomes the upper bound of the start-page slider.
                using (var reader = new PdfReader(buffer.ToArray()))
                    slider.Maximum = reader.NumberOfPages;
            }
            else
            {
                await new MessageDialog("Could not open file").ShowAsync();
            }
        }

        private async void OnPlay(object sender, RoutedEventArgs e)
        {
            if (_selectedFile == null) return;

            var buffer = await FileIO.ReadBufferAsync(_selectedFile);

            using (var reader = new PdfReader(buffer.ToArray()))
            {
                var text = new StringBuilder();
                var selectedPageNr = (int) slider.Value;

                // Read PagesPerRun pages from the selected page, stopping at the last page.
                for (var i = selectedPageNr; i <= reader.NumberOfPages && i < selectedPageNr + PagesPerRun; i++)
                    text.Append(PdfTextExtractor.GetTextFromPage(reader, i));

                // Strip line breaks so the speech doesn't break at the end of every line.
                var cleaned = text.Replace("\n", "");

                await SynthesizeTextToSpeechAsync(cleaned.ToString());
            }
        }

        public async Task SynthesizeTextToSpeechAsync(string text)
        {
            using (var speechSynthesizer = new SpeechSynthesizer())
            {
                // Prefer a male voice, but fall back to the default voice if none is installed.
                speechSynthesizer.Voice =
                    SpeechSynthesizer.AllVoices.FirstOrDefault(x => x.Gender == VoiceGender.Male)
                    ?? SpeechSynthesizer.DefaultVoice;

                var stream = await speechSynthesizer.SynthesizeTextToStreamAsync(text);

                // AutoPlay is true by default, so playback starts once the source is set.
                var mediaElement = new MediaElement { DefaultPlaybackRate = 1.5 };

                mediaElement.SetSource(stream, stream.ContentType);
            }
        }
    }
}
[/sourcecode]

Comments

Stanley
2/10/2015 12:27:05 PM
Now where is the Iris voice add-on so we can all feel like you are reading to us? 
ollie
2/10/2015 2:46:49 PM
Really cool idea! Would make a nice companion to my spritzing research papers...Any plans to make available?! anything similar already existing?
Ollie 
sandeep
10/29/2015 5:56:43 PM
Is there any chance to change the voice format? 
sandeep
10/29/2015 6:02:27 PM
is there any chance to change the voice? 


Last modified on 2015-02-10
