Best samples for digitised speech

On the creation of AY or Beeper music, including the packages used to do so.
R-Tape
Site Admin
Posts: 6409
Joined: Thu Nov 09, 2017 11:46 am

Best samples for digitised speech

Post by R-Tape »

Can anyone give any suggestions how best to clean up a speech sample (e.g. a WAV) to:

1 - make it sound cleaner/better when digitised to TAP.

2 - reduce the memory footprint when digitised to TAP.

and can you do both at the same time?

I'm an audio duffer, so please try and bring any technical knowledge down to my level. Detail will help, such as mentioning utilities and techniques. At the moment I'm using BeepFX, which is excellent and easy to use, but has yielded some surprisingly large TAP files compared to others and I can't understand why.
PROSM
Manic Miner
Posts: 476
Joined: Fri Nov 17, 2017 7:18 pm
Location: Sunderland, England

Re: Best samples for digitised speech

Post by PROSM »

R-Tape wrote: Wed Sep 15, 2021 8:57 pm 1 - make it sound cleaner/better when digitised to TAP.
For this, you can try applying a compression filter to the audio (that's dynamic range compression, not file compression). In simple terms, this evens out the volume - the quiet bits are brought up and the loud peaks are brought down, so everything ends up at around the same level. You can do this using a free tool called Audacity - there are plenty of tutorials on the net for that.
R-Tape wrote: Wed Sep 15, 2021 8:57 pm 2 - reduce the memory footprint when digitised to TAP.

[...]

At the moment I'm using BeepFX, which is excellent and easy to use, but has yielded some surprisingly large TAP files compared to others and I can't understand why.
You're in a bit of a rabbit-hole here, I'm afraid :?

BeepFX, as far as I'm aware, stores its audio in PCM (pulse-code modulation). This essentially means that, many times a second, the amplitude (volume) of the audio signal is sampled and recorded. We call the number of times per second the sampling rate. The higher the sampling rate, the more accurate the recording will be to the original audio signal, and the more memory the recording will require. To give context, the sampling rate of a CD is 44,100Hz (44,100 samples of the audio signal's volume per second). I would suspect that most PCM digitised samples on the Spectrum would hover around the 8,000-12,000Hz mark, due to the limitations of both the CPU and the available memory.

In BeepFX's case, it is storing these samples as 1-bit values, which are sent to the ULA's speaker bit in a steady stream. I'm not too sure what sampling rates BeepFX uses (I know you can adjust the sampling rate via the quality slider in the WAV import tool), but it seems to use some ludicrously high ones at the top end of the scale, when the hardware specs of the Spectrum are taken into consideration. Try fiddling with the quality slider and see what results you get. Since each sample is a single bit, eight of them fit in a byte, so the number of bytes required for a clip is going to be approximately equal to this:

memory taken (bytes) = (sampling rate in Hz / 8) * length of clip in seconds
You can't really have your cake and eat it when it comes to sample quality and sample size - you just have to find the right compromise.
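
To put some numbers on the formula above, here is a minimal Python sketch. The function names are made up for illustration, and the most-significant-bit-first packing order is an assumption, not necessarily how BeepFX lays out its data.

```python
def pcm_1bit_bytes(sample_rate_hz: int, seconds: float) -> int:
    """Approximate bytes for a 1-bit PCM clip: 8 samples pack into one byte."""
    total_bits = int(sample_rate_hz * seconds)
    return (total_bits + 7) // 8  # round up to whole bytes


def pack_bits(bits):
    """Pack a sequence of 0/1 values into bytes, most significant bit first.

    The bit order is an assumption for illustration only.
    """
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        byte = 0
        for b in chunk:
            byte = (byte << 1) | (b & 1)
        byte <<= 8 - len(chunk)  # pad a short final chunk with zeros
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    # Compare a few sampling rates for a two-second clip.
    for rate in (8000, 12000, 44100):
        print(rate, "Hz, 2 s ->", pcm_1bit_bytes(rate, 2.0), "bytes")
```

By this estimate a two-second clip needs about 2,000 bytes at 8,000Hz and about 11,000 bytes at 44,100Hz, which is why the quality slider has such a large effect on the size of the result.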

There is another way of storing audio, called PWM (pulse-width modulation). This is based solely on 1-bit signals. Instead of recording the state of the speaker bit at a fixed rate, we instead record the time between state transitions of the speaker (i.e. how long does it take to go from 1 to 0 and vice-versa). This is the technique I've used in some of my games, and it has always given me consistently good results.
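
The idea of recording the time between transitions can be sketched in Python as follows. This is only an illustration of the principle, not the format pcm2pwm produces: the function names are hypothetical, and the run lengths here are counted in samples, whereas a real Spectrum player would more likely store delay-loop counts that fit in a byte.

```python
def bits_to_run_lengths(bits):
    """Return (first_level, runs): the starting speaker level and how many
    samples the speaker stays at each level before the next transition."""
    if not bits:
        return 0, []
    runs = []
    current = bits[0]
    count = 0
    for b in bits:
        if b == current:
            count += 1
        else:
            runs.append(count)  # a transition ends the current run
            current = b
            count = 1
    runs.append(count)  # close the final run
    return bits[0], runs


def run_lengths_to_bits(first_level, runs):
    """Rebuild the 1-bit stream by toggling the level after each run."""
    level = first_level
    bits = []
    for n in runs:
        bits.extend([level] * n)
        level ^= 1
    return bits


if __name__ == "__main__":
    stream = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
    first, runs = bits_to_run_lengths(stream)
    print(first, runs)
    assert run_lengths_to_bits(first, runs) == stream
```

A stream with long stretches at one level produces few runs, while one that toggles constantly produces many, which is one reason the memory use is hard to predict in advance.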

BeepFX doesn't support PWM samples, but there is a utility called pcm2pwm which can do the conversion for you. I don't think there are any publicly available binaries, only the source code, but I can email you a Windows executable that I compiled on my system if you'd like.

It's harder to estimate the amount of memory a PWM sample will take, but it seems to be about that of an equivalent PCM sample, perhaps slightly more. You'll just have to experiment with different settings, I'm afraid.
R-Tape
Site Admin
Posts: 6409
Joined: Thu Nov 09, 2017 11:46 am

Re: Best samples for digitised speech

Post by R-Tape »

PROSM wrote: Wed Sep 15, 2021 9:47 pm you can try applying a compression filter on the audio.
Ta for the excellent answer. I already have Audacity, so ^this^ sounds like an easy thing to start with.
You can't really have your cake and eat it when it comes to sample quality and sample size - you just have to find the right compromise.
Curses! I actually thought this might have been a rare example where they did go hand in hand.

You pitched the explanations well, thanks. I'll need to think more about the second part. If I decide to investigate pcm2pwm I'll give you a shout. If mucking about in Audacity gives significant results I might just leave it there.
Joefish
Rick Dangerous
Posts: 2059
Joined: Tue Nov 14, 2017 10:26 am

Re: Best samples for digitised speech

Post by Joefish »

Is BeepFX simply doing a crude conversion of high/low sample points into 1s and 0s, or is it using a smart progressive 1-bit pulse-coding algorithm to account for a range of values from e.g. an 8-bit WAV sample file? (Which would need a high replay rate on the Speccy to exploit the speaker smoothing circuit).
utz
Microbot
Posts: 116
Joined: Wed Nov 15, 2017 9:04 am

Re: Best samples for digitised speech

Post by utz »

In addition to what PROSM wrote, you can also try to overdrive the sample slightly, i.e. amplify it so it clips a bit (Audacity will let you know how much your sample clips when amplifying; going 6-8 dB over the limit usually gives good results). Also, applying a low-pass filter with a cut-off slightly below 1/2 of the target sample rate is a good idea.
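
For anyone who wants to see what those two steps do to the numbers, here is a rough Python sketch. It assumes samples are floats in the range -1.0 to 1.0. The function names are hypothetical, and the one-pole filter is a deliberately simple stand-in that is much gentler than the filters Audacity offers. The 0.45 factor for "slightly below 1/2" and the order of the two steps are arbitrary choices for the demo.

```python
import math


def overdrive(samples, gain_db):
    """Amplify by gain_db decibels and hard-clip to the range [-1.0, 1.0]."""
    gain = 10 ** (gain_db / 20)  # convert decibels to a linear factor
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


def low_pass(samples, source_rate_hz, cutoff_hz):
    """Simple one-pole (RC-style) low-pass filter."""
    dt = 1.0 / source_rate_hz
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)  # smoothing factor between 0 and 1
    out = []
    y = 0.0
    for s in samples:
        y += alpha * (s - y)  # move part of the way towards the input
        out.append(y)
    return out


if __name__ == "__main__":
    source_rate = 44100
    target_rate = 8000
    cutoff = 0.45 * target_rate  # "slightly below half"; 0.45 is arbitrary
    # One second of a 440Hz test tone standing in for the speech clip.
    tone = [0.6 * math.sin(2 * math.pi * 440 * n / source_rate)
            for n in range(source_rate)]
    processed = overdrive(low_pass(tone, source_rate, cutoff), 6.0)
    print(min(processed), max(processed))
```

A gain of 6 dB roughly doubles the amplitude, so anything above about half of full scale gets clipped.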
Lee Bee
Dynamite Dan
Posts: 1297
Joined: Sat Nov 16, 2019 11:01 pm
Location: Devon, England

Re: Best samples for digitised speech

Post by Lee Bee »

I agree about compression, that's essential here.

On top of that, spend time experimenting to get just the right master volume level. Too quiet and it won't pick up, too loud and it will be distorted. You could also experiment with EQ to optimise it.

If you want to quickly simulate the 1-bit beeper sound in Audacity, what I do is use a Nyquist script (Nyquist is a scripting engine built into Audacity) called Broadcast limited III (you can find it online). Just select your audio, run the script, turn the Threshold down to 0, and Bob's your uncle! :-) FANTASY WORLD DIZZY!
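
Outside Audacity, the basic idea of a 1-bit sound can be sketched in a few lines of Python. This is not what the Nyquist script does internally; it is just the simplest possible high/low decision per sample, with hypothetical function names and samples assumed to be floats between -1.0 and 1.0.

```python
def to_one_bit(samples, threshold=0.0):
    """Map each sample to 1 if it is above the threshold, else 0."""
    return [1 if s > threshold else 0 for s in samples]


def to_square(samples, threshold=0.0):
    """Same decision, but as +1.0/-1.0 audio for auditioning on a PC."""
    return [1.0 if s > threshold else -1.0 for s in samples]


if __name__ == "__main__":
    clip = [0.2, -0.1, 0.0, 0.5, -0.4]
    print(to_one_bit(clip))
    # With a raised threshold, a quiet passage never crosses it at all.
    quiet = [0.05, -0.05, 0.03]
    print(to_one_bit(quiet, threshold=0.1))
```

The second example shows why level matters: if the decision point sits above the quiet parts of the clip, they come out as silence.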