Speech is and remains our most natural and most important means of communication, even in the age of the Internet and wireless data transmission. From this book you will learn the key principles of wireless speech transmission, including compression, channel coding, and transmission, from both a historical and a current perspective. It covers proprietary and standardized codecs, including the speech codecs of nearly all wireless and wireline systems. These are foundations you can build on!
LAJOS HANZO has coauthored five books on mobile radio communications and published more than 300 research papers on a variety of topics in wireless multimedia communications. He holds a chair in telecommunications at the Department of Electronics and Computer Science, University of Southampton, UK, and he is an IEEE Distinguished Lecturer.
F. CLARE A. SOMERVILLE is with the Global Wireless Systems Research Department, Bell Laboratories, Swindon, UK. His current research involves real-time techniques for transmission of voice over GPRS and the resultant speech quality attained.
JASON P. WOODARD is with UbiNetics Ltd., where he is responsible for the development and implementation of various algorithms for third-generation mobile communications products.
Preface xxiii
Acknowledgments xxix
Part I Speech Signals and Waveform Coding 1
Chapter 1 Speech Signals and Introduction to Speech Coding 3
1.1 Motivation of Speech Compression 3
1.2 Basic Characterization of Speech Signals 4
1.3 Classification of Speech Codecs 7
1.4 Waveform Coding 11
1.5 Chapter Summary 26
Chapter 2 Predictive Coding 27
2.1 Forward Predictive Coding 27
2.2 DPCM Codec Schematic 28
2.3 Predictor Design 29
2.4 Adaptive One-Word-Memory Quantization 36
2.5 DPCM Performance 37
2.6 Backward-Adaptive Prediction 39
2.7 The 32 kbps G.721 ADPCM Codec 43
2.8 Subjective and Objective Speech Quality 49
2.9 Variable-Rate G.726 and Embedded G.727 ADPCM 50
2.10 Rate-Distortion in Predictive Coding 58
2.11 Chapter Summary 62
Part II Analysis-by-Synthesis Coding 63
Chapter 3 Analysis-by-Synthesis Principles 65
3.1 Motivation 65
3.2 Analysis-by-Synthesis Codec Structure 66
3.3 The Short-Term Synthesis Filter 67
3.4 Long-Term Prediction 70
3.5 Excitation Models 78
3.6 Adaptive Short-Term and Long-Term Post-Filtering 81
3.7 Lattice-Based Linear Prediction 83
3.8 Chapter Summary 89
Chapter 4 Speech Spectral Quantization 90
4.1 Log-Area Ratios 90
4.2 Line Spectral Frequencies 95
4.3 Vector Quantization of Spectral Parameters 105
4.4 Spectral Quantizers for Wideband Speech Coding 113
4.5 Chapter Summary 126
Chapter 5 Regular Pulse Excited Coding 127
5.1 Theoretical Background 127
5.2 The 13 kbps RPE-LTP GSM Speech Encoder 133
5.3 The 13 kbps RPE-LTP GSM Speech Decoder 137
5.4 Bit Sensitivity of the 13 kbps GSM RPE-LTP Codec 140
5.5 Application Example: A Toolbox-Based Speech Transceiver 142
5.6 Chapter Summary 144
Chapter 6 Forward-Adaptive Code Excited Linear Prediction 145
6.1 Background 145
6.2 The Original CELP Approach 146
6.3 Fixed Codebook Search 149
6.4 CELP Excitation Models 151
6.5 Optimization of the CELP Codec Parameters 160
6.6 The Error-Sensitivity of CELP Codecs 175
6.7 Application Example: A Dual-Mode 3.1 kBd Speech Transceiver 187
6.8 Multi-Slot PRMA Transceiver 200
6.9 Chapter Summary 206
Chapter 7 Standard Forward-Adaptive CELP Codecs 207
7.1 Background 207
7.2 The U.S. DoD FS-1016 4.8 kbits/s CELP Codec 207
7.3 The IS-54 DAMPS kbps Pan American Speech Codec 213
7.4 The 6.7 kbps Japanese Digital Cellular System's Speech Codec 216
7.5 The Qualcomm Variable-Rate CELP Codec 218
7.6 Japanese Half-Rate Speech Codec 225
7.7 The Half-Rate GSM Codec 233
7.8 The 8 kbits/s G.729 Codec 237
7.9 The Reduced Complexity G.729 Annex A Codec 256
7.10 The 12.2 kbps Enhanced Full-Rate GSM Speech Codec 259
7.11 The Enhanced Full-Rate 7.4 kbps IS-136 Speech Codec 264
7.12 The ITU G.723.1 Dual-Rate Codec 268
7.13 Chapter Summary 277
Chapter 8 Backward-Adaptive Code Excited Linear Prediction 279
8.1 Introduction 279
8.2 Motivation and Background 279
8.3 Backward-Adaptive G.728 Codec Schematic 282
8.4 Backward-Adaptive G.728 Coding Algorithm 284
8.5 Reduced-Rate G.728-Like Codec: Variable-Length Excitation Vector 298
8.6 The Effects of Long-Term Prediction 300
8.7 Closed-Loop Codebook Training 305
8.8 Reduced-Rate G.728-Like Codec II: Constant-Length Excitation Vector 309
8.9 Programmable-Rate 8-4 kbps Low-Delay CELP Codecs 310
8.10 Backward-Adaptive Error Sensitivity Issues 327
8.11 A Low-Delay Multimode Speech Transceiver 333
8.12 Chapter Summary 338
Part III Wideband Coding and Transmission 339
Chapter 9 Wideband Speech Coding 341
9.1 Sub-band-ADPCM Wideband Coding at 64 kbps 341
9.2 Wideband Transform Coding at 32 kbps 357
9.3 Sub-Band-Split Wideband CELP Codecs 360
9.4 Fullband Wideband ACELP Coding 363
9.5 A Turbo-Coded Burst-by-Burst Adaptive Wideband Speech Transceiver 368
9.6 Chapter Summary 384
Part IV Very Low-Rate Coding and Transmission 385
Chapter 10 Overview of Low-Rate Speech Coding 387
10.1 Low-Bitrate Speech Coding 387
10.2 Linear Predictive Coding Model 400
10.3 Speech Quality Measurements 403
10.4 Speech Database 406
10.5 Chapter Summary 409
Chapter 11 Linear Predictive Vocoder 411
11.1 Overview of a Linear Predictive Vocoder 411
11.2 Line Spectrum Frequencies Quantization 412
11.3 Pitch Detection 417
11.4 Unvoiced Frames 428
11.5 Voiced Frames 429
11.6 Adaptive Post-Filter 430
11.7 Pulse Dispersion Filter 432
11.8 Results for Linear Predictive Vocoder 437
11.9 Chapter Summary 440
Chapter 12 Wavelets and Pitch Detection 441
12.1 Conceptual Introduction to Wavelets 441
12.2 Introduction to Wavelet Mathematics 444
12.3 Pre-Processing the Wavelet Transform Signal 449
12.4 Voiced-Unvoiced Decision 452
12.5 Wavelet-Based Pitch Detector 453
12.6 Summary and Conclusions 460
Chapter 13 Zinc Function Excitation 461
13.1 Introduction 461
13.2 Overview of Prototype Waveform Interpolation Zinc Function Excitation 462
13.3 Zinc Function Modeling 466
13.4 Pitch Detection 470
13.5 Voiced Speech 473
13.6 Excitation Interpolation Between Prototype Segments 477
13.7 Unvoiced Speech 483
13.8 Adaptive Post-Filter 483
13.9 Results for Single Zinc Function Excitation 483
13.10 Error Sensitivity of the 1.9 kbps PWI-ZFE Coder 486
13.11 Multiple Zinc Function Excitation 490
13.12 A Sixth-Rate, 3.8 kbps GSM-Like Speech Transceiver 496
13.13 Chapter Summary 500
Chapter 14 Mixed-Multiband Excitation 501
14.1 Introduction 501
14.2 Overview of Mixed-Multiband Excitation 502
14.3 Finite Impulse Response Filter 504
14.4 Mixed-Multiband Excitation Encoder 507
14.5 Mixed-Multiband Excitation Decoder 510
14.6 Performance of the Mixed-Multiband Excitation Coder 513
14.7 A Higher Rate 3.85 kbps Mixed-Multiband Excitation Scheme 520
14.8 A 2.35 kbit/s Joint-Detection-Based CDMA Speech Transceiver 523
14.9 Chapter Summary 530
Chapter 15 Sinusoidal Transform Coding Below 4 kbps 531
15.1 Introduction 531
15.2 Sinusoidal Analysis of Speech Signals 532
15.3 Sinusoidal Synthesis of Speech Signals 534
15.4 Low-Bitrate Sinusoidal Coders 536
15.5 Incorporating Prototype Waveform Interpolation 539
15.6 Encoding the Sinusoidal Frequency Component 541
15.7 Determining the Excitation Components 543
15.8 Quantizing the Excitation Parameters 548
15.9 Sinusoidal Transform Decoder 556
15.10 Speech Coder Performance 558
15.11 Chapter Summary 563
Chapter 16 Conclusions on Low-Rate Coding 565
16.1 Overview 565
16.2 Listening Tests 565
16.3 Summary of Very Low-Rate Coding
16.4 Further Research 568
Chapter 17 Comparison of Speech Codecs and Transceivers 569
17.1 Background to Speech Quality Evaluation 569
17.2 Objective Speech Quality Measures 570
17.3 Subjective Measures 577
17.4 Comparison of Subjective and Objective Measures 578
17.5 Subjective Speech Quality of Various Codecs 580
17.6 Error Sensitivity Comparison of Various Codecs 582
17.7 Objective Speech Performance of Various Transceivers 583
Appendix A Constructing the Quadratic Spline Wavelets 589
Appendix B Zinc Function Excitation 593
Appendix C Probability Density Function for Amplitudes 597
Bibliography 601
Index 623
Author Index 631