Instantaneous (analogue) compression of speech signals

Ban · Jan 7, 2005

Ken Smith wrote:

If you take a sine wave and run it through a circuit that does:

Y = X ^(17/19)

the sine wave's RMS amplitude will be compressed towards about 0.98V
RMS and there will be some distortion. The 3rd harmonic will be
about 2.7%.
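
A quick numerical check of the figure above, offered as a sketch only. It assumes the circuit implements the odd-symmetric law Y = sign(X)·|X|^(17/19), so that negative half-cycles are handled, and that the input is a 1 V peak sine. The third harmonic is measured by direct Fourier projection over one period. The function names and the sample count are illustrative and do not come from the thread.

```python
import math

P = 17 / 19          # exponent quoted in the post
N = 4096             # samples per period (assumption)

def soft_clip(x, p=P):
    """Odd-symmetric power law: sign(x) * |x|**p."""
    return math.copysign(abs(x) ** p, x)

def sine_harmonic(samples, n):
    """Amplitude of the sin(n*theta) term, samples covering exactly one period."""
    count = len(samples)
    acc = sum(v * math.sin(2 * math.pi * n * k / count)
              for k, v in enumerate(samples))
    return 2 * acc / count

def third_harmonic_ratio(p=P, n_samples=N):
    """3rd harmonic relative to the fundamental for a clipped unit sine."""
    wave = [soft_clip(math.sin(2 * math.pi * k / n_samples), p)
            for k in range(n_samples)]
    return abs(sine_harmonic(wave, 3)) / abs(sine_harmonic(wave, 1))

if __name__ == "__main__":
    print(f"3rd harmonic / fundamental = {100 * third_harmonic_ratio():.2f} %")
```

The result can be compared with the "about 2.7%" quoted in the post; a different input level changes the output level but not this ratio, because a pure power law scales all harmonics together.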

Absurd. Look at how a sine wave changes its value: it goes to zero, then
negative. Try your formula then. How do you want to compress, analogue or
digital? And how do you want to get the envelope signal? With which time
constants?

Assume that the sine wave we start with is 300Hz.

A phase shifter (all pass filter) can be made with a Q such that the
900Hz, 3rd harmonic is shifted by 180 degree relative to the 300Hz
sinewave.

If we take this shifted signal and do another X^(17/19) operation on
it, the 3rd harmonic will only be about 0.2%

You don't need the phase shift to be exactly 180 degrees. Any
non-zero phase shift and two steps of (17/19) soft clipping will
result in less harmonic content than one step of (17/19)^2 clipping
would produce.
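
The two-stage idea can be tried numerically. This is a minimal sketch under strong assumptions: the clipper is the odd-symmetric law sign(X)·|X|^(17/19), and the all-pass filter is idealised as something that inverts only the third harmonic and leaves every other component alone. A real all-pass network would also shift the fundamental and the higher harmonics, so the numbers are indicative only. All names are hypothetical.

```python
import math

P = 17 / 19
N = 4096             # samples per period (assumption)

def soft_clip(x, p=P):
    """Odd-symmetric power law: sign(x) * |x|**p."""
    return math.copysign(abs(x) ** p, x)

def sine_harmonic(samples, n):
    """Amplitude of the sin(n*theta) term over exactly one period."""
    count = len(samples)
    acc = sum(v * math.sin(2 * math.pi * n * k / count)
              for k, v in enumerate(samples))
    return 2 * acc / count

def two_stage_h3(flip_third):
    """3rd harmonic ratio after two clippers, with or without the idealised shift."""
    theta = [2 * math.pi * k / N for k in range(N)]
    stage1 = [soft_clip(math.sin(t)) for t in theta]
    if flip_third:
        b3 = sine_harmonic(stage1, 3)
        # Idealised all-pass: subtracting twice the 3rd-harmonic term
        # inverts it (180 degrees) and touches nothing else.
        stage1 = [v - 2 * b3 * math.sin(3 * t) for v, t in zip(stage1, theta)]
    stage2 = [soft_clip(v) for v in stage1]
    return abs(sine_harmonic(stage2, 3)) / abs(sine_harmonic(stage2, 1))

if __name__ == "__main__":
    print(f"no phase shift : {100 * two_stage_h3(False):.2f} %")
    print(f"180 deg on 3rd : {100 * two_stage_h3(True):.2f} %")
```

Without the shift, the two stages are equivalent to a single (17/19)^2 clipper, which is the comparison made in the post.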

If more distortion can be lived with, a lower power such as (11/13)
could be used.

Since the band of interest is 300Hz to 3KHz, we don't have to worry
about the harmonics of the frequencies above 1KHz. Those can be
removed with a simple low pass filter. I haven't verified it yet but
it seems to me that 3 stages of phase shifter and 4 clippers should
be able to make a significant compression of amplitude but make less
than 5% distortion on a sine wave.

The intermodulation distortion will not be made zero by this method.
If the input has more than one frequency component, the distortion
will be much higher.

Tell me which drugs are you using? Better use a u-law A/D or something
analog like THAT4301(better than 0.1%).
--
ciao Ban
Bordighera, Italy

Reg Edwards · Jan 5, 2005

John,

The filter preceding the clipping circuit can be a high-pass (first)
followed by a low-pass.

In a speech waveform, especially the male voice, most of the signal amplitude
is contained in the lower frequencies, say 400 Hz and below. So the
high-pass filter section tends to level the amplitude even before clipping.

Nearly all of the information is contained in the band 400 Hz to 3.3 KHz.

The clipper turns the large low frequency components which get through the
filter into square waves. There's a peculiar effect. The human ear and
brain partially succeed in re-constituting the missing low frequencies.
Imagination? Dunno about dogs' ears.

The higher frequencies are distorted into much higher frequency odd
harmonics. The function of the low-pass filter section is to minimise the
effect.

A second low-pass filter can be inserted after (not immediately after)
clipping to limit the RF bandwidth actually transmitted.

You can adjust filter frequencies to suit your own application. There is no
interaction between the various cascaded circuit sections.

Oscilloscope patterns obtained with speech, music and sinewave inputs are
very interesting while varying the two gain controls and listening on a loud
speaker. Despite the sharp clipping action, the onset of clipping and
distortion is quite soft.

An alternative clipping circuit is a cathode or emitter-coupled pair.

In my younger days I did a lot of testing and fault-locating on GPO music
circuit transmission lines into BBC broadcasting and TV stations. Used
headphones as the detector to balance impedance bridges up to 20 KHz. Found
it possible to train one's ears to hear up to 20 KHz. Saved time changing
from headphones to amplifier-plus-meter at 10 KHz where most people got
stuck.
----
Reg, G4FGQ

martin griffith · Jan 7, 2005

On Mon, 3 Jan 2005 21:21:14 +0000, in sci.electronics.design John
Woodgate <jmw@jmwa.demon.contraspam.yuk> wrote:

Does anyone here have any experience of instantaneous (analogue)
compression (aka soft clipping) of speech signals? I've been doing a
little work on it but I'm unable to judge the resulting sound quality.
Why do treble boost controls no longer have any audible effect for me?
(;-)
ISTR the Datong RF clipper from the 70's, which gave a 6dB improvement.

I've done a quick google, but nobody seems to have the circuit

martin

Serious error.
All shortcuts have disappeared.
Screen. Mind. Both are blank.

John Woodgate · Jan 7, 2005

I read in sci.electronics.design that martin griffith
<martingriffith@yahoo.co.uk> wrote (in
<ujust0djuia1dq0uk3lj3hds0hta25q7t0@4ax.com>) about 'Instantaneous
(analogue) compression of speech signals', on Fri, 7 Jan 2005:

On Mon, 3 Jan 2005 21:21:14 +0000, in sci.electronics.design John
Woodgate <jmw@jmwa.demon.contraspam.yuk> wrote:

Does anyone here have any experience of instantaneous (analogue)
compression (aka soft clipping) of speech signals? I've been doing a
little work on it but I'm unable to judge the resulting sound quality.
Why do treble boost controls no longer have any audible effect for me?
(;-)
ISTR the Datong RF clipper from the 70's, which gave a 6dB improvement.
I've done a quick google, but nobody seems to have the circuit

It uses SSB clipping. It's far too complicated for what I want.

--
Regards, John Woodgate, OOO - Own Opinions Only.
The good news is that nothing is compulsory.
The bad news is that everything is prohibited.
http://www.jmwa.demon.co.uk Also see http://www.isce.org.uk

John Larkin · Jan 7, 2005

On Thu, 6 Jan 2005 04:37:45 +0000, John Woodgate
<jmw@jmwa.demon.contraspam.yuk> wrote:

I read in sci.electronics.design that John Larkin <john@spamless.usa>
wrote (in <dvept093udvtbifbflkkhn80hmtlfsjn9v@4ax.com>) about
'Instantaneous (analogue) compression of speech signals', on Wed, 5 Jan
2005:
On Mon, 3 Jan 2005 21:21:14 +0000, John Woodgate
<jmw@jmwa.demon.contraspam.yuk> wrote:

Does anyone here have any experience of instantaneous (analogue)
compression (aka soft clipping) of speech signals? I've been doing a
little work on it but I'm unable to judge the resulting sound quality.
Why do treble boost controls no longer have any audible effect for me?
(;-)

If you could tolerate a time delay, you could do a very nice smooth
AGC thing without the clipping problem that results from a fast-attack
signal.

True, this technique is well-known, but it's costly. I'm looking for an
ingenious low-cost solution.

Sounds ideal for a cheap DSP chip.

John

Ken Smith · Jan 7, 2005

In article <1vctt05f8af8l0afjjijlanplfiluqjqff@4ax.com>,
John Larkin <john@spamless.usa> wrote:
[... DSP based AGC ...]

Sounds ideal for a cheap DSP chip.

For one channel of voice grade signal, I'd bet a PIC or 8051 based circuit
could do it. The tricky bit is the dynamic range of the ADC. It is easy
to get 24bits worth of analog dynamic range and harder to get that in an
ADC.

--
--
kensmith@rahul.net forging knowledge

gwhite · Jan 7, 2005

John Woodgate wrote:

I read in sci.electronics.design that gwhite <gwhite@deadend.com> wrote
(in <41DE136A.AED6EBC8@deadend.com>) about 'Instantaneous (analogue)
compression of speech signals', on Fri, 7 Jan 2005:

With regard to the other question of *increasing* articulation index
from straight linear performance given a system that is otherwise non-
peak power limited, I think this is quite dubious.

It isn't dubious, and the application is not radio. There is strong
evidence for the increase in intelligibility, but one particular piece
of work is unpublished for commercial reasons.

I realize it is not radio (and that's actually why my doubt of the application
exists). However, much of the work on articulation index and speech
intelligibility has occurred in the radio/telecom field. More specifically,
radio/telecom work has dealt quite directly with the concept of speech
clippers--probably more so than other fields since it tends to be a distinctly
peak power limited environment (especially radio). It is a source of info
so-to-speak. That is, the effects of clipping on speech intelligibility have
been dealt with directly.

For all my references, clipping does not *increase* intelligibility. On the
other hand it does *not harm* the intelligibility for rather high peak clipping
levels (up to 20 dB or so). It does however degrade the subjective quality.
Again, this is according to my sources. One of which is (ch2):

http://www.noblepub.com/shopexd.asp?id=11

I think the discussion in it is pretty good. "RF/IF clippers" are the best.

I would naturally be interested in technical discussion and evidence to the
contrary.

http://www.dstan.mod.uk/data/00/025/16000100.pdf
(6.1.3.17 Peak Clipping with Noise
"If the speech is clipped before the noise mixes with it there can be an
improvement in intelligibility." Of course, that is exactly the _radio_
problem solved.

6.1.4.10 Peak Clipping
"Peak clipping can improve intelligibility of a relatively noise-free speech
signal from a microphone, in situations where high levels of noise mix with
the speech before it reaches the listener. The clipping must occur before the
noise mixes. Although peak clipping the audio waveform can improve
intelligibility, a better approach is to use Radio Frequency clipping. With AF
clipping the distortion products are spread throughout the speech band. When
the RF waveform is clipped, the distortion products do not overlap the
transposed speech frequencies, and can be filtered out before the RF signal is
transposed back. [Not generally true as I pointed out earlier. Harmonics are
gone, but not odd-order intermod products.] The final audio waveform has
smoothly rounded peaks rather than flattened peaks." [True])

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bucketloads of free-bee's on speech/hearing:

http://www.vard.org/prog/98/98prch13.htm
"In backgrounds of noise, both intelligibility and quality were more adversely
affected by peak clipping than by compression or linear amplification. ... These
results indicated that the output of a hearing aid should be limited with
compression rather than peak clipping."

http://www.kemt.fei.tuke.sk/Predmety/KEMT320_EA/_web/Online_Course_on_Acoustics/intelligibility.html

http://www.jblpro.com/pub/technote/spch_intl_1.pdf
http://www.jblpro.com/pub/technote/spch_intl_2.pdf

http://www.icsi.berkeley.edu/~steveng/PDF/Spectral_Slits.pdf
http://www.icsi.berkeley.edu/~steveng/PDF/Bandshift.pdf
http://www.icsi.berkeley.edu/ftp/global/pub/speech/papers/icassp98-uttcomb.pdf
http://www.wramc.amedd.army.mil/departments/aasc/avlab/Eurospeech2003-R.pdf

http://soma.crl.mcmaster.ca/~jeff/NIPS/NIPS_Submission.pdf

http://www.gold-line.com/pdf/articles/p_sti01e14.pdf

http://www.icsi.berkeley.edu/ftp/global/pub/speech/papers/thesis-bedk98.pdf

http://www.phonak.com/com_1998proceedings_6.pdf
http://www.phonak.com/com_1998proceedings_9.pdf

http://www.acoustics-engineering.com/files/TN002.pdf

http://www.svconline.com/mag/avinstall_measuring_intelligibility/

http://fonsg3.let.uva.nl/Proceedings/Proceedings20/ShuzhenWu/ShuzhenWu.html#Heading6

http://www.frye.com/library/acrobat/hrarticle.pdf
http://www.frye.com/library/acrobat/hrarticle2.pdf

http://www.auditory.org/mhonarc/2002/msg00326.html

http://www.cnel.ufl.edu/~markskow/papers/mdsThesisMain.pdf

http://ieeexplore.ieee.org/iel5/8159/23791/01091628.pdf

http://www.eng.uwo.ca/people/vparsa/Audiology/Compression_Tutorial.pdf
"It also seems to be established that compression limiting gives superior
quality to peak clipping, although the hearing aid needs to be sufficiently
saturated for this advantage to occur and it is not known how often this
degree of saturation occurs in practice for various degrees of hearing loss
and for various maximum power output settings. There appears, however, to be
no reason not to use output controlled compression limiting over peak
clipping, except for the most profoundly impaired listeners."

gwhite · Jan 7, 2005

Ken Smith wrote:

I haven't verified it yet but it seems to me that
3 stages of phase shifter and 4 clippers should be able to make a
significant compression of amplitude but make less than 5% distortion on a
sine wave.

As far as a sine wave goes, the RF clipper eliminates harmonics entirely. A
baseband version has been given. The patent expired.

The intermodulation distortion will not be made zero by this method.

Nor by any method.

Ken Smith · Jan 7, 2005

In article <TNKBVFHjku3BFwke@jmwa.demon.co.uk>,
John Woodgate <noone@yuk.yuk> wrote:

I read in sci.electronics.design that Ken Smith
<kensmith@green.rahul.net> wrote (in <crmo28$5ie$2@blue.rahul.net>)
about '"all pass" thought about (analogue) compression', on Fri, 7 Jan
2005:
Perhaps there is still something in the idea worth considering.
Whatever clipping curve you apply to the signal could be broken into 2
parts and a simple all pass filter used. The result should be no worse
than the one stage of clipping and may in fact sound better. Instead of
trying to zero the 3rd harmonic, higher harmonics could be targeted.

I don't immediately see how that would work for a broadband input
signal. Splitting the signal into octave bands and processing as you
propose would indeed work, because the third and higher harmonics are
out-of-band, if the all-pass maintains its 180 degree phase-shift,
relative to that at f to 2f, from 3f to 6f, where f is the lower band-
edge frequency of an octave-band filter.

Imagine that we have broken the clipping operation into two steps.
Further imagine that we have made these steps such that if the first step
adds NmV of 5th harmonic to the signal, the second does also. This means
that the second step is a little harder than the first.

For purposes of thinking about it, assume we first pass the signal through
just the clippers with no phase shifter between them and record the
spectrum of the result. Then we do this:

An all pass filter with a modest Q can shift, let's say, the 1KHz to 5KHz
band. The phase curve suddenly starts adding delay at about the
1KHz point.

Any harmonic, made from a signal well below 1KHz, that is above the 1KHz
point will be shifted in phase relative to its fundamental.

If this signal is again clipped, new harmonic components will be created
in the clipping process. These new components will be at some phase angle
to the shifted ones that have passed through the all pass filter.

The sum of two vectors is at its maximum when the vectors are aligned.
Any phase difference between the new harmonics and the ones from the all
pass means that the amplitude of the sum will be less than if there was no
phase shift.

Over some band of frequencies, the phase shift will be between 120 and 240
degrees and the harmonics will tend to cancel.

Since none of the harmonics can be greater than the case where there was
no shifter but some are smaller, the THD is less for the circuit with the
phase shifter.
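
The vector-sum argument can be checked with a few lines. This sketch assumes the two harmonic components have equal amplitude (the post's premise that both clipping steps add the same N mV) and simply evaluates the magnitude of their sum at a given phase difference. The function name is illustrative.

```python
import cmath
import math

def combined_amplitude(a, b, phase_deg):
    """Magnitude of the sum of two same-frequency components a and b*exp(j*phase)."""
    return abs(a + cmath.rect(b, math.radians(phase_deg)))

if __name__ == "__main__":
    for deg in (0, 60, 120, 180, 240, 300):
        print(f"{deg:3d} deg -> {combined_amplitude(1.0, 1.0, deg):.3f}")
```

For equal amplitudes the sum is 2·|cos(phase/2)|, so between 120 and 240 degrees the combined harmonic is no larger than either contribution alone, and at 180 degrees it vanishes.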

That sounded clear to me, but I already know what I was thinking.

--
--
kensmith@rahul.net forging knowledge

Ken Smith · Jan 7, 2005

In article <41DEF448.D8A84031@deadend.com>, gwhite <gwhite@deadend.com> wrote:

Ken Smith wrote:

I haven't verified it yet but it seems to me that
3 stages of phase shifter and 4 clippers should be able to make a
significant compression of amplitude but make less than 5% distortion on a
sine wave.

As far as a sine wave goes, the RF clipper eliminates harmonics entirely. A
baseband version has been given. The patent expired.

The intermodulation distortion will not be made zero by this method.

Nor by any method.

FT the signal

Raise each amplitude to the 5/7th power but don't change the phase

iFT the new spectrum.

No new frequencies are created and no interaction between the amplitudes
has happened. This method has neither harmonic nor IM distortion.
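
The three steps above can be sketched for a single block of samples. This is only an illustration under stated assumptions: a plain O(n^2) DFT, a 64-sample block, and test tones that fall exactly on DFT bins. It says nothing about what happens when real speech is processed block by block, where windowing and block edges matter. All function names are hypothetical.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * math.pi * m * k / n)
                for m in range(n)).real / n
            for k in range(n)]

def tone_amplitudes(x):
    """Amplitude per bin 0..n/2, scaled so a unit sine reads 1.0."""
    n = len(x)
    return [2 * abs(c) / n for c in dft(x)[: n // 2 + 1]]

def compress_spectrum(x, power=5 / 7):
    """FT, raise each amplitude to `power`, keep each phase, inverse FT."""
    n = len(x)
    out = []
    for c in dft(x):
        amp = 2 * abs(c) / n          # tone amplitude (DC/Nyquist scaling ignored)
        out.append(cmath.rect((amp ** power) * n / 2, cmath.phase(c)))
    return idft(out)

def two_tone(n=64):
    """Unit tone in bin 3 plus a quarter-amplitude tone in bin 5 (assumed test input)."""
    return [math.sin(2 * math.pi * 3 * k / n)
            + 0.25 * math.sin(2 * math.pi * 5 * k / n + 0.7)
            for k in range(n)]

if __name__ == "__main__":
    for i, a in enumerate(tone_amplitudes(compress_spectrum(two_tone()))):
        if a > 1e-6:
            print(f"bin {i}: amplitude {a:.4f}")
```

In this idealised case the 4:1 amplitude ratio between the tones shrinks to 4^(5/7):1 and the other bins stay empty.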

--
--
kensmith@rahul.net forging knowledge

Ken Smith · Jan 7, 2005

In article <rp0ut0567a2snj680r34esn95mv2ilav1u@4ax.com>,
Jim Thompson <thegreatone@example.com> wrote:
[...]

I'm puzzled by "RF/IF" clipping. How does that work to improve the
demodulated audio?

In a single side band receiver:

If you've added 1MHz to all the frequencies, and clipped the signal in
that form, the results are quite different than clipping the detected
audio. For a single input frequency, all the distortion products end up
above 2MHz and thus get removed by the filtering.
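
A small bookkeeping sketch of where the products land. It assumes a symmetric clipper (odd-order products only, up to third order), the 1 MHz offset from the post, an upper-sideband passband of 300 Hz to 3 kHz above the carrier, and audio tones of 1000 Hz and 1600 Hz chosen purely for illustration.

```python
from itertools import product

CARRIER_HZ = 1_000_000                               # "added 1MHz" in the post
PASSBAND = (CARRIER_HZ + 300, CARRIER_HZ + 3000)     # assumed filter passband

def odd_products(tones_hz, max_order=3):
    """Positive frequencies m1*f1 + m2*f2 + ... whose total order is odd."""
    found = set()
    span = range(-max_order, max_order + 1)
    for coeffs in product(span, repeat=len(tones_hz)):
        order = sum(abs(c) for c in coeffs)
        if order % 2 == 1 and order <= max_order:
            freq = sum(c * f for c, f in zip(coeffs, tones_hz))
            if freq > 0:
                found.add(freq)
    return found

def in_band(freqs):
    lo, hi = PASSBAND
    return sorted(f for f in freqs if lo <= f <= hi)

if __name__ == "__main__":
    one = odd_products([CARRIER_HZ + 1000])
    two = odd_products([CARRIER_HZ + 1000, CARRIER_HZ + 1600])
    print("single tone, in band:", [f - CARRIER_HZ for f in in_band(one)])
    print("two tones, in band  :", [f - CARRIER_HZ for f in in_band(two)])
```

With one tone only the original survives the filter. With two tones the third-order products 2f1-f2 and 2f2-f1 fall back inside the passband, which is the intermodulation the thread refers to.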

The distortion ends up being heavy on the IM distortion effects and light
on harmonics. For some reason, this seems to be easier on the ear.

--
--
kensmith@rahul.net forging knowledge

John Woodgate · Jan 7, 2005

I read in sci.electronics.design that Ken Smith
<kensmith@green.rahul.net> wrote (in <crn2a2$cdi$3@blue.rahul.net>)
about 'Instantaneous (analogue) compression of speech signals', on Fri,
7 Jan 2005:

The distortion ends up being heavy on the IM distortion effects and
light on harmonics. For some reason, this seems to be easier on the
ear.

Not in my experience, but that is with systems having wider bandwidth.
One would expect IM to sound worse, because the IM products are not even
vaguely harmonically-related to the fundamentals.
--
Regards, John Woodgate, OOO - Own Opinions Only.
The good news is that nothing is compulsory.
The bad news is that everything is prohibited.
http://www.jmwa.demon.co.uk Also see http://www.isce.org.uk

John Woodgate · Jan 7, 2005

I read in sci.electronics.design that Ken Smith
<kensmith@green.rahul.net> wrote (in <crn1ni$cdi$1@blue.rahul.net>)
about '"all pass" thought about (analogue) compression', on Fri, 7 Jan
2005:

That sounded clear to me, but I already know what I was thinking.

It sounds clear to me, as well, and I expect it would work, at least to
some extent.
--
Regards, John Woodgate, OOO - Own Opinions Only.
The good news is that nothing is compulsory.
The bad news is that everything is prohibited.
http://www.jmwa.demon.co.uk Also see http://www.isce.org.uk

John Larkin · Jan 8, 2005

On Fri, 7 Jan 2005 16:51:34 +0000 (UTC), kensmith@green.rahul.net (Ken
Smith) wrote:

In article <1vctt05f8af8l0afjjijlanplfiluqjqff@4ax.com>,
John Larkin <john@spamless.usa> wrote:
[... DSP based AGC ...]
Sounds ideal for a cheap DSP chip.

For one channel of voice grade signal, I'd bet a PIC or 8051 based circuit
could do it. The tricky bit is the dynamic range of the ADC. It is easy
to get 24bits worth of analog dynamic range and harder to get that in an
ADC.

You're not likely to see much more dynamic range than 60 or so dB for
any real-world audio signal. So a 12-16 bit ADC should be good for
most apps. A DSP, or even a decent uP, could delay the data stream, do
an average or quasi-peak detection, envelope delay that some clever
smooth way, and multiply the delayed samples to compress the dynamic
range without bad artifacts. I'm sure it's being done already.
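
A minimal sketch of the delay-and-multiply idea, written as a look-ahead limiter rather than a full compressor. Everything specific here is an assumption: the threshold, the 32-sample look-ahead, the peak-hold followed by a moving average as the "smooth" envelope, and the function name. It is not taken from any product or from the thread.

```python
import math

def lookahead_limit(samples, threshold=0.5, lookahead=32):
    """Delay the audio by `lookahead` samples and scale it by a smoothed gain.

    The last `lookahead` input samples are not flushed in this sketch.
    """
    n = len(samples)
    # Gain each sample would need on its own to stay under the threshold.
    raw = [min(1.0, threshold / abs(s)) if s else 1.0 for s in samples]
    # Hold the lowest required gain over the most recent lookahead+1 samples.
    held = [min(raw[max(0, i - lookahead): i + 1]) for i in range(n)]
    out = []
    for i in range(n):
        window = held[max(0, i - lookahead): i + 1]
        gain = sum(window) / len(window)      # moving average gives a smooth ramp
        delayed = samples[i - lookahead] if i >= lookahead else 0.0
        out.append(delayed * gain)
    return out

def burst(n=600, start=200, stop=400):
    """Quiet tone with a loud section in the middle (assumed test signal)."""
    return [(1.0 if start <= k < stop else 0.1) * math.sin(0.3 * k)
            for k in range(n)]

if __name__ == "__main__":
    y = lookahead_limit(burst())
    print("peak out:", max(abs(v) for v in y))
```

Because the gain is computed from samples that have not yet left the delay line, it starts to fall before a loud sample reaches the output, which avoids the clipping that a fast-attack loop without delay would produce.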

John


John Woodgate · Jan 4, 2005

I read in sci.electronics.design that Jim Thompson
<thegreatone@example.com> wrote (in
<n7olt0pimqncoidcohk7mi2jgs8jsjfa82@4ax.com>) about 'Instantaneous
(analogue) compression of speech signals', on Tue, 4 Jan 2005:

http://www.semiconductors.philips.com/acrobat_download/applicationnotes/AN176.pdf

This is helpful for theory but the devices are not now available, I
think.

http://www.portset.co.uk/compand.htm

I have concerns about the 'direct' mode, which is clearly non-linear!

http://www.toko.co.jp/products/ctlg/ic/com_compandor_e.htm

http://www.chipdocs.com/datasheets/datasheet-pdf/Philips-Semiconductors/NE570.html

Not 'instantaneous' and data only in Japanese :-(

http://ieeexplore.ieee.org/Xplore/Toclogin.jsp?url=/iel5/4/22551/01050814.pdf

Not accessible to me.

http://www.onsemi.com/site/products/parts/0,4454,62,00.html

Not instantaneous; these use a rectifier and thus involve at least one
time-constant.

Thanks for your help.
--
Regards, John Woodgate, OOO - Own Opinions Only.
The good news is that nothing is compulsory.
The bad news is that everything is prohibited.
http://www.jmwa.demon.co.uk Also see http://www.isce.org.uk

Jim Thompson · Jan 4, 2005

On Tue, 4 Jan 2005 20:41:43 +0000, John Woodgate
<jmw@jmwa.demon.contraspam.yuk> wrote:

I read in sci.electronics.design that Jim Thompson
<thegreatone@example.com> wrote (in
<sgslt0t5thgqkd6hhc0kdfvl1ng9ov415p@4ax.com>) about 'Instantaneous
(analogue) compression of speech signals', on Tue, 4 Jan 2005:
What curve would you like and I'll create it in circuitry for you?

I couldn't afford your professional services.

I didn't note a fee attached to my offer ;-)

And as yet I don't know
what I want. I'm still at the breadboard stage. I have something that
'doesn't not work', but I want to know how far off optimum it is. And
the sound quality is important but I can't hear well enough to assess
it. I need 'ears-on' local assistance, and I have some colleagues
visiting on Friday. Maybe they will listen for me.

OK. I appreciate how hearing loss slips up on you. I've very little
high frequency response in my left ear... If I bury my right ear in
the pillow I don't even hear the phone ring in our bedroom
(high-pitched electronic "ringer").

...Jim Thompson
--
| James E.Thompson, P.E.                           |    mens     |
| Analog Innovations, Inc.                         |     et      |
| Analog/Mixed-Signal ASIC's and Discrete Systems  |    manus    |
| Phoenix, Arizona            Voice:(480)460-2350  |             |
| E-mail Address at Website     Fax

Rich Grise · Jan 5, 2005

On Tue, 04 Jan 2005 10:27:18 +0000, John Woodgate wrote:

I read in sci.electronics.design that Anthony C Smith
....
John- a very good soft clipper can be formed by placing a pair of back
to back zeners in parallel with the feedback resistor on an opamp- the
leakage from the zeners brakes the signal slowly before the Vz+0.7
point- this increases the THD on even harmonics only and sounds OK in
practice - if you want a better limiter a fet and rectifier would be the
way to go but more complex.

Thanks for that. Are you sure it's even harmonics? If the clipping is
precisely symmetrical the harmonics are all odd order.
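
The point about symmetry is easy to check numerically. In this sketch a tanh curve stands in for a symmetric soft clipper and a one-sided hard limit stands in for an asymmetric one; neither is a model of the zener circuit described above, and the names and drive levels are assumptions.

```python
import math

N = 2048             # samples per period (assumption)

def harmonic_amplitude(samples, n):
    """Amplitude of the n-th harmonic over exactly one period."""
    count = len(samples)
    s = sum(v * math.sin(2 * math.pi * n * k / count) for k, v in enumerate(samples))
    c = sum(v * math.cos(2 * math.pi * n * k / count) for k, v in enumerate(samples))
    return 2 * math.hypot(s, c) / count

def symmetric_clip(x):
    """Same curve for both polarities (stand-in soft clipper)."""
    return math.tanh(2.0 * x)

def asymmetric_clip(x):
    """Limits positive peaks only."""
    return min(x, 0.5)

def clipped_sine(clipper):
    return [clipper(math.sin(2 * math.pi * k / N)) for k in range(N)]

if __name__ == "__main__":
    for name, fn in (("symmetric", symmetric_clip), ("asymmetric", asymmetric_clip)):
        wave = clipped_sine(fn)
        levels = ", ".join(f"H{n}={harmonic_amplitude(wave, n):.4f}" for n in (2, 3, 4, 5))
        print(f"{name:10s}: {levels}")
```

A transfer curve with f(-x) = -f(x) produces only odd harmonics; any imbalance between the two polarities brings in even ones.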

Anything using a rectifier involves a time constant, and I want to avoid
that because it introduces an extra variable - the time constant.

If a diode clipper is unsatisfactory, would a log amp do?

Thanks,
Rich

Rich Grise · Jan 5, 2005

On Tue, 04 Jan 2005 13:32:43 +0000, John Woodgate wrote:

I read in sci.electronics.design that Keith Wootten
<keith@nononono.co.uk> wrote (in <Tzyj2HBo8o2BFwOk@clara.co.uk>) about
'Instantaneous (analogue) compression of speech signals', on Tue, 4 Jan
2005:
In message <2pRIs3CKdb2BFwac@jmwa.demon.co.uk>, John Woodgate
<jmw@jmwa.demon.contraspam.yuk> writes
Does anyone here have any experience of instantaneous (analogue)
compression (aka soft clipping) of speech signals? I've been doing a
little work on it but I'm unable to judge the resulting sound quality.
Why do treble boost controls no longer have any audible effect for me?
(;-)

Howzabout a potential divider with the top leg being a small
incandescent lamp. Also acts as an emergency beacon where the louder
you call for help, the brighter is the lamp.

Interesting, because it has potentially less distortion than diode
clipping. It's not quite 'instantaneous', though, and it's not so easy
to find suitable lamps.

Wayne Bridges is the expert with this method of course.

Does he have a web site?

He will, as soon as they finish that span to the grapes.

;-)
Rich

John S. Dyson · Jan 5, 2005

In article <pan.2005.01.04.07.41.29.674632@example.net>,
Rich Grise <richgrise@example.net> writes:

On Tue, 04 Jan 2005 01:43:49 +0000, John S. Dyson wrote:

In article <pan.2005.01.03.22.37.33.248842@example.net>,
Rich Grise <richgrise@example.net> writes:
On Mon, 03 Jan 2005 21:21:14 +0000, John Woodgate wrote:

Does anyone here have any experience of instantaneous (analogue)
compression (aka soft clipping) of speech signals?

Intentionally? ;-)

I've been doing a
little work on it but I'm unable to judge the resulting sound quality.

Well, isn't that a pretty good indication that your clipper isn't
introducing objectionable distortion? ;-) With speech, AIUI, you can do a

One thing about audio 'clipping' is that more clipping can be done if
intermod is controlled. So, one trick has been to modulate the audio
onto an SSB type carrier, and then clip it there, and demod it (that is
a long way to do it.) Another possibility is to chop up the spectrum,
and apply soft clipping to each chunk.

Thank you for bringing this up. I now remember, the context of "'Deep'
clipping" was modulated RF, although my mental dredge is coming up AM, as
opposed to SSB, but the point is entirely the same.

Not actually playing with that application myself, it appears that it
would be a damned good approach for speech applications. There have
been some other people responding who know more about this specific
application than I do, but it might be good enough for a lot of applications.

I did something similar by doing an fft of audio (music, in fact), and
by using a carefully chosen window function, I was able to do evil,
nonlinear processing of the FFT'd signal, and then to reverse FFT the
result.

I'm afraid you're out of my league here, although I do want to say that
it's only the implementation I'm ignorant about - I get the _point_ of
what you're saying about transforming signals quite clearly, thanks.

It isn't beyond your abilities AT ALL, I am sure. However, basically,
using an FFT is conceptually to do a lot of filters, then apply a nonlinear
operation (e.g. clipping or some other, perhaps more gentle math operation)
that does the 'limiting' or 'compression' operation. A 'gentle' operation,
though perhaps inadequate for speech, might be 'sqrt.' This would
have the effect of doing a 2:1 compression on the signal. After the
nonlinear operation, then the signal is rebuilt by doing an inverse FFT.
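
The link between a power-law exponent and a compression ratio in dB can be written down directly. The sketch below assumes an ideal law y = x^p applied to amplitudes; the exponents listed are the ones mentioned at various points in the thread, and the function names are illustrative.

```python
import math
from fractions import Fraction

def compression_ratio(exponent):
    """Input dB per output dB for y = x**exponent (2 means '2:1')."""
    return 1 / Fraction(exponent)

def output_change_db(input_change_db, exponent):
    """Level change at the output, measured numerically from two amplitudes."""
    x1 = 1.0
    x2 = 10 ** (input_change_db / 20)
    p = float(exponent)
    return 20 * math.log10(x2 ** p / x1 ** p)

if __name__ == "__main__":
    for p in (Fraction(1, 2), Fraction(5, 7), Fraction(11, 13), Fraction(17, 19)):
        ratio = float(compression_ratio(p))
        print(f"exponent {p}: {ratio:.3f}:1, 2 dB in -> {output_change_db(2.0, p):.3f} dB out")
```

Since 20·log10(x^p) = p·20·log10(x), a square root (p = 1/2) turns every 2 dB of input change into 1 dB of output change.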

When doing the fft method (any dsp engineer can help you with this), the
key is to use the correct windowing method (used to meld the chunks of
FFT samples together) and make sure that the math operation doesn't screw
with the phase of the signal. The math operations should only play
with the amplitude, unless there is some kind of much more fancy operation.
It is amazing that the magnitude of the transformed audio can be
really severely damaged, but if the phase is kept the same, then the
reconstructed audio is still recognizable.

For the window, when looking at my (perhaps incorrect) source code,
the comments say that I used the 'hann window.' If someone really
needs to know, I can probably resurrect the code. I haven't looked
at it seriously in the last 3-5yrs.
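
One property that makes the Hann window convenient for melding FFT chunks is that copies spaced half a window apart add up to a constant. The sketch below checks this for the periodic form of the window; the 50% overlap and the frame length are assumptions, since the post does not say what overlap was used.

```python
import math

def periodic_hann(length):
    """Periodic Hann window: 0.5 - 0.5*cos(2*pi*n/length)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / length) for n in range(length)]

def overlap_sum(length=64, frames=6):
    """Sum of Hann windows placed every length/2 samples."""
    hop = length // 2
    window = periodic_hann(length)
    total = [0.0] * (hop * (frames - 1) + length)
    for f in range(frames):
        for n, w in enumerate(window):
            total[f * hop + n] += w
    return total

if __name__ == "__main__":
    total = overlap_sum()
    steady = total[32:-32]          # skip the first and last half-window
    print("min", min(steady), "max", max(steady))
```

Away from the first and last half-frame the windows sum to 1, so an unmodified signal passes through the chop-and-overlap process unchanged; any choppiness then comes from what is done to the spectrum, not from the framing itself.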

The method that I used to overlap the FFTs and do the necessary windowing
did a pretty good job of avoiding the expected 'choppiness' in the signal.
The FFT method of signal processing was just too fancy and too aggressive
for my own needs. In fact, I didn't like the sound of multi-band audio
agc in general, and instead developed a very fancy complex attack/decay
time scheme that does low distortion for fast effective attack/decay times.
(LF modulation of other signal components and various other kinds of
LF distortion are audibly mitigated by doing a super intelligent control
of the gain... The attack and decay times are totally undefinable except
in an instantaneous sense.) On my desktop machine, my most fancy single
band algorithm (which is probably more complex than many multi-band schemes)
takes about 1/100 of the CPU. Trivial AGC algorithms can probably be
1000X faster than that... My straightforward implementation of the
complex algorithm does use limited numbers of exp and log type operations.

Even a multiband scheme will produce short term distortion
products simply because of the physics and mathematics that define
the limitations of real world frequency domain filters. So, I designed
a single band gain control scheme that confines most of the intermod problems
to short transients -- probably worse than a multiband
scheme, but damned good for single band. It isn't perfect, but is about
as good as a single band scheme can be (in fact, it is probably better
than many multi-band schemes.) The multi-band schemes are still limited
by the phasing effects on the sound (for deep/fast compression.) Both
the pumping avoidance and intermod avoidance are fairly well achieved
in my single band scheme. When the single band agc is used sanely
(e.g. 1.4:1 through 3:1 compression, in dB) and the
compression is set to be gentle, the audio still sounds really good.
If the compression is aggressively applied, it still sounds 'good',
and still doesn't audibly pump, and maintains a very high density
of the audio, but has little purpose other than perhaps to process
audio for advertisements, shortwave or AM station transmitter.

Of course, this is fairly far off topic WRT speech processing, so I won't
bother you with more off topic info (unless someone is interested in
my latest version of my audio AGC code -- very old versions are used in
some free and probably commercial software.) The new stuff (developed in
the last several years)
is far far better than anything else that I have played with (or developed
myself.) It is still ugly, but could be cleaned up if there would be
any demand. (It is written in sane C++, and happily uses inline asms
for P4 SSE math operations, some take advantage of the SIMD capabilities.)

John

John Woodgate · Jan 5, 2005

I read in sci.electronics.design that Jim Thompson
<thegreatone@example.com> wrote (in
<kpunt05gl9dn5v862g759dbb033sbpf2fc@4ax.com>) about 'Instantaneous
(analogue) compression of speech signals', on Wed, 5 Jan 2005:

Recalling an ancient project... isn't the ideal clipping case something
like (for example)... a 2dB change in the input produces a 1dB change in
the output?

There was a time when anything more than that introduced unpleasant
artefacts into the signal, but that doesn't happen with modern designs.
--
Regards, John Woodgate, OOO - Own Opinions Only.
The good news is that nothing is compulsory.
The bad news is that everything is prohibited.
http://www.jmwa.demon.co.uk Also see http://www.isce.org.uk
