09-27-2011, 02:11 PM
therealtomlapp
Problem with long delay when playing text-to-speech from a buffer to DirectSound

I have a program that I use as an automated caller ID app; it announces who is calling over a secondary sound card in my PC. Usually one would just use the speech synthesizer by itself, but I had to go a few steps further because I need to direct the speech to a particular sound device. So far it works, but at seemingly random times there is a really long delay before you hear anything. Sometimes the delay is so long that by the time you hear the announcement, the caller has given up and hung up.

Here is my code:
Code:
Imports Microsoft.DirectX
Imports Microsoft.DirectX.DirectSound
Imports System.Speech
Imports System.IO
Imports System.Speech.Synthesis

Public Class Form1

#Region "Declarations"

    Dim SpeechSynth As New SpeechSynthesizer()
    Dim Stream_Voice As MemoryStream
    Dim myDevices As New DevicesCollection
    Dim DirectSoundAudioDevice As DirectSound.Device
    Dim DirectSoundAudioBuffer As SecondaryBuffer
    Dim DirectSoundAudioStream As MemoryStream

    Public Structure DirectSoundDevStruct
        Public DeviceIndex As Integer
        Public DeviceDescription As String
    End Structure

    Public Enum VoiceEnum
        Paul
        Kate
        Julie
    End Enum

#End Region

    Public Sub SpeakText(ByVal TextToSpeak As String, ByVal AudioVolume As Integer, ByVal AudioPan As Integer, ByVal AudioPlaybackFrequency As Integer, ByVal SAPIVoice As VoiceEnum)

        'Not too sure, should the 2 following lines execute in the Load event, or should they execute each time I utilize the device?
        DirectSoundAudioDevice = New DirectSound.Device(myDevices(2).DriverGuid)
        DirectSoundAudioDevice.SetCooperativeLevel(Me, CooperativeLevel.Priority)

        Dim bufferDesc As New Microsoft.DirectX.DirectSound.BufferDescription

        With bufferDesc
            .GlobalFocus = True
            .ControlVolume = True
            .ControlPan = True
            .ControlFrequency = True
            .ControlEffects = True
        End With

        'Start with a fresh, empty stream for each announcement (a new MemoryStream is already at position 0)
        Stream_Voice = New MemoryStream

        Dim _SelectedVoice As String = ""

        'Map the requested voice to the name fragment used to find the installed voice
        Select Case SAPIVoice
            Case VoiceEnum.Julie
                _SelectedVoice = "Julie"
            Case VoiceEnum.Paul
                _SelectedVoice = "Paul"
            Case VoiceEnum.Kate
                _SelectedVoice = "Kate"
        End Select

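        With SpeechSynth
            'Select the installed voice whose name contains the requested voice name
            For Each _Voice As System.Speech.Synthesis.InstalledVoice In .GetInstalledVoices()
                If _Voice.VoiceInfo.Name.Contains(_SelectedVoice) Then
                    .SelectVoice(_Voice.VoiceInfo.Name)
                End If
            Next

            'Render the speech into the memory stream instead of the default audio device.
            'Speak is synchronous, so it returns only after the whole phrase has been rendered.
            .SetOutputToWaveStream(Stream_Voice)
            .Speak(TextToSpeak)
        End With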

        Stream_Voice.Position = 0

        DirectSoundAudioBuffer = New SecondaryBuffer(Stream_Voice, bufferDesc, DirectSoundAudioDevice)

        Dim effects(0) As EffectDescription

        effects(0).GuidEffectClass = DSoundHelper.StandardChorusGuid

        With DirectSoundAudioBuffer
            .Volume = AudioVolume
            .Pan = AudioPan
            .Frequency = AudioPlaybackFrequency
            .SetEffects(effects)
            .Play(0, BufferPlayFlags.Default)
        End With

    End Sub

    Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click

        'Just for testing purposes
        SpeakText(TextBox1.Text, -500, -8999, 22000, VoiceEnum.Julie)

    End Sub


End Class
The sub that runs this code is SpeakText, so it is executed every time a new phone call comes in. I also have another sub that I left out (it would just be redundant) that plays the phone ringing sound. Its code is basically the same, except that the audio does not come from the speech synthesizer. The first few lines of the sub are where I create a new DirectSound device and set the cooperative level. I am wondering whether I should run these lines only once when the program loads (by putting them in the Load event), but I don't know if that would cause memory problems down the line.

As you can see, I'm somewhat of a beginner in DirectX.