-
Notifications
You must be signed in to change notification settings - Fork 2
/
soxformat.7
827 lines (826 loc) · 27.1 KB
/
soxformat.7
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
'\" t
'\" The line above instructs most `man' programs to invoke tbl
'\"
'\" Separate paragraphs; not the same as PP which resets indent level.
.de SP
.if t .sp .5
.if n .sp
..
'\"
'\" Replacement em-dash for nroff (default is too short).
.ie n .ds m " -
.el .ds m \(em
'\"
'\" Placeholder macro for if longer nroff arrow is needed.
.ds RA \(->
'\"
'\" Decimal point set slightly raised
.if t .ds d \v'-.15m'.\v'+.15m'
.if n .ds d .
'\"
'\" Enclosure macro for examples
.de EX
.SP
.nf
.ft CW
..
.de EE
.ft R
.SP
.fi
..
.TH SoX 7 "February 19, 2011" "soxformat" "Sound eXchange"
.SH NAME
SoX \- Sound eXchange, the Swiss Army knife of audio manipulation
.SH DESCRIPTION
This manual describes SoX supported file formats and audio device types;
the SoX manual set starts with
.BR sox (1).
.SP
Format types that can SoX can determine by a filename
extension are listed with their names preceded by a dot.
Format types that are optionally built into SoX
are marked `(optional)'.
.SP
Format types that can be handled by an
external library via an optional pseudo file type (currently
.B sndfile
or
.BR ffmpeg )
are marked e.g. `(also with \fB\-t sndfile\fR)'. This might be
useful if you have a file that doesn't work with SoX's default format
readers and writers, and there's an external reader or writer for that
format.
.SP
To see if SoX has support for an optional format or device, enter
.B sox \-h
and look for its name under the list:
`AUDIO FILE FORMATS' or `AUDIO DEVICE DRIVERS'.
.SS SOX FORMATS & DEVICE DRIVERS
\&\fB.raw\fR (also with \fB\-t sndfile\fR),
\&\fB.f32\fR, \fB.f64\fR,
\&\fB.s8\fR, \fB.s16\fR, \fB.s24\fR, \fB.s32\fR,
.br
\&\fB.u8\fR, \fB.u16\fR, \fB.u24\fR, \fB.u32\fR,
\&\fB.ul\fR, \fB.al\fR, \fB.lu\fR, \fB.la\fR
.if t .sp -.5
.if n .sp -1
.TP
\
Raw (headerless) audio files. For
.BR raw ,
the sample rate and the data encoding must be given using command-line
format options; for the other listed types, the sample rate defaults to
8kHz (but may be overridden), and the data encoding is defined by the
given suffix. Thus \fBf32\fR and \fBf64\fR indicate files encoded as 32
and 64-bit (IEEE single and double precision) floating point PCM
respectively; \fBs8\fR, \fBs16\fR, \fBs24\fR, and \fBs32\fR indicate 8,
16, 24, and 32-bit signed integer PCM respectively; \fBu8\fR, \fBu16\fR,
\fBu24\fR, and \fBu32\fR indicate 8, 16, 24, and 32-bit unsigned integer
PCM respectively; \fBul\fR indicates `\(*m-law' (8-bit), \fBal\fR
indicates `A-law' (8-bit), and \fBlu\fR and \fBla\fR are inverse bit
order `\(*m-law' and inverse bit order `A-law' respectively. For all raw
formats, the number of channels defaults to 1 (but may be overridden).
.SP
Headerless audio files on a SPARC computer are likely to be of format
\fBul\fR; on a Mac, they're likely to be \fBu8\fR but with a
sample rate of 11025 or 22050\ Hz.
.SP
See
.B .ima
and
.B .vox
for raw ADPCM formats, and
.B .cdda
for raw CD digital audio.
.PP
\&\fB.f4\fR, \fB.f8\fR,
\&\fB.s1\fR, \fB.s2\fR, \fB.s3\fR, \fB.s4\fR,
.br
\&\fB.u1\fR, \fB.u2\fR, \fB.u3\fR, \fB.u4\fR,
\&\fB.sb\fR, \fB.sw\fR, \fB.sl\fR, \fB.ub\fR, \fB.uw\fR
.if t .sp -.5
.if n .sp -1
.TP
\
Deprecated aliases for
\fBf32\fR, \fBf64\fR, \fBs8\fR, \fBs16\fR, \fBs24\fR, \fBs32\fR,
.br
\fBu8\fR, \fBu16\fR, \fBu24\fR, \fBu32\fR,
\fBs8\fR, \fBs16\fR, \fBs32\fR, \fBu8\fR, and \fBu16\fR
respectively.
.TP
\&\fB.8svx\fR (also with \fB\-t sndfile\fR)
Amiga 8SVX musical instrument description format.
.TP
\&\fB.aiff\fR, \fB.aif\fR (also with \fB\-t sndfile\fR)
AIFF files as used on old Apple Macs, Apple IIc/IIgs and SGI.
SoX's AIFF support does not include multiple audio chunks,
or the 8SVX musical instrument description format.
AIFF files are multimedia archives and
can have multiple audio and picture chunks\*m
you may need a separate archiver to work with them.
With Mac OS X, AIFF has been superseded by CAF.
.TP
\&\fB.aiffc\fR, \fB.aifc\fR (also with \fB\-t sndfile\fR)
AIFF-C is a format based on AIFF that was created to allow
handling compressed audio. It can also handle little
endian uncompressed linear data that is often referred to
as
.B sowt
encoding. This encoding has also become the defacto format produced by modern
Macs as well as iTunes on any platform. AIFF-C files produced
by other applications typically have the file extension .aif and
require looking at its header to detect the true format.
The
.B sowt
encoding is the only encoding that SoX can handle with this format.
.SP
AIFF-C is defined in DAVIC 1.4 Part 9 Annex B.
This format is referred from ARIB STD-B24, which is specified for
Japanese data broadcasting. Any private chunks are not supported.
.TP
\fBalsa\fR (optional)
Advanced Linux Sound Architecture device driver; supports both playing and
recording audio. ALSA is only used in Linux-based operating systems, though
these often support OSS (see below) as well. Examples:
.EX
sox infile \-t alsa
sox infile \-t alsa default
sox infile \-t alsa plughw:0,0
sox \-2 \-t alsa hw:1 outfile
.EE
See also
.BR play (1),
.BR rec (1),
and
.BR sox (1)
.BR \-d .
.TP
.B .amb
Ambisonic B-Format: a specialisation of
.B .wav
with between 3 and 16 channels of audio for use with an Ambisonic decoder.
See http://www.ambisonia.com/Members/mleese/file-format-for-b-format for
details. It is up to the user to get the channels together in the right
order and at the correct amplitude.
.TP
\&\fB.amr\-nb\fR (optional)
Adaptive Multi Rate\*mNarrow Band speech codec; a lossy format used in 3rd
generation mobile telephony and defined in 3GPP TS 26.071 et al.
.SP
AMR-NB audio has a fixed sampling rate of 8 kHz and supports encoding
to the following bit-rates (as selected by the
.B \-C
option): 0 = 4\*d75 kbit/s, 1 = 5\*d15 kbit/s, 2 = 5\*d9 kbit/s, 3 =
6\*d7 kbit/s, 4 = 7\*d4 kbit/s 5 = 7\*d95 kbit/s, 6 = 10\*d2
kbit/s, 7 = 12\*d2 kbit/s.
.TP
\&\fB.amr\-wb\fR (optional)
Adaptive Multi Rate\*mWide Band speech codec; a lossy format used in 3rd
generation mobile telephony and defined in 3GPP TS 26.171 et al.
.SP
AMR-WB audio has a fixed sampling rate of 16 kHz and supports encoding
to the following bit-rates (as selected by the
.B \-C
option): 0 = 6\*d6 kbit/s, 1 = 8\*d85 kbit/s, 2 = 12\*d65 kbit/s, 3 =
14\*d25 kbit/s, 4 = 15\*d85 kbit/s 5 = 18\*d25 kbit/s, 6 = 19\*d85
kbit/s, 7 = 23\*d05 kbit/s, 8 = 23\*d85 kbit/s.
.TP
\fBao\fR (optional)
Xiph.org's Audio Output device driver; works only for playing audio. It
supports a wide range of devices and sound systems\*msee its documentation
for the full range. For the most part, SoX's use of libao cannot be
configured directly; instead, libao configuration files must be used.
.SP
The filename specified is used to determine which libao plugin to
use. Normally, you should specify `default' as the filename. If that
doesn't give the desired behavior then you can specify the short name
for a given plugin (such as \fBpulse\fR for pulse audio plugin).
Examples:
.EX
sox infile \-t ao
sox infile \-t ao default
sox infile \-t ao pulse
.EE
See also
.BR play (1)
and
.BR sox (1)
.BR \-d .
.TP
\&\fB.au\fR, \fB.snd\fR (also with \fB\-t sndfile\fR)
Sun Microsystems AU files.
There are many types of AU file;
DEC has invented its own with a different magic number
and byte order. To write a DEC file, use the
.B \-L
option with the output file options.
.SP
Some .au files are known to have invalid AU headers; these
are probably original Sun \(*m-law 8000\ Hz files and
can be dealt with using the
.B .ul
format (see below).
.SP
It is possible to override AU file header information
with the
.B \-r
and
.B \-c
options, in which case SoX will issue a warning to that effect.
.TP
.B .avr
Audio Visual Research format;
used by a number of commercial packages
on the Mac.
.TP
\&\fB.caf\fR (optional)
Apple's Core Audio File format.
.TP
\&\fB.cdda\fR, \fB.cdr\fR
`Red Book' Compact Disc Digital Audio (raw audio). CDDA has two audio
channels formatted as 16-bit signed integers (big endian)at a sample
rate of 44\*d1\ kHz. The number of (stereo) samples in each CDDA
track is always a multiple of 588.
.TP
\fBcoreaudio\fR (optional)
Mac OSX CoreAudio device driver: supports both playing and recording
audio. If a filename is not specific or if the name is "default" then
the default audio device is selected. Any other name will be used
to select a specific device. The valid names can be seen in the
System Preferences->Sound menu and then under the Output and Input tabs.
Examples:
.EX
sox infile \-t coreaudio
sox infile \-t coreaudio default
sox infile \-t coreaudio "Internal Speakers"
.EE
See also
.BR play (1),
.BR rec (1),
and
.BR sox (1)
.BR \-d .
.TP
\&\fB.cvsd\fR, \fB.cvs\fR
Continuously Variable Slope Delta modulation.
A headerless format used to compress speech audio for applications such as voice mail.
This format is sometimes used with bit-reversed samples\*mthe
.B \-X
format option can be used to set the bit-order.
.TP
\&\fB.cvu\fR
Continuously Variable Slope Delta modulation (unfiltered).
This is an alternative handler for CVSD that is unfiltered but can
be used with any bit-rate. E.g.
.EX
sox infile outfile.cvu rate 28k
play \-r 28k outfile.cvu sinc \-3.4k
.EE
.TP
.B .dat
Text Data files.
These files contain a textual representation of the
sample data. There is one line at the beginning
that contains the sample rate, and one line that
contains the number of channels.
Subsequent lines contain two or more numeric data intems:
the time since the beginning of the first sample and the sample value
for each channel.
.SP
Values are normalized so that the maximum and minimum
are 1 and \-1. This file format can be used to
create data files for external programs such as
FFT analysers or graph routines. SoX can also convert
a file in this format back into one of the other file
formats.
.SP
Example containing only 2 stereo samples of silence:
.SP
.EX
; Sample Rate 8012
; Channels 2
0 0 0
0.00012481278 0 0
.EE
.TP
\&\fB.dvms\fR, \fB.vms\fR
Used in Germany to compress speech audio for voice mail.
A self-describing variant of
.BR cvsd .
.TP
\&\fB.fap\fR (optional)
See
.BR .paf .
.TP
\fBffmpeg\fR (optional)
This is a pseudo-type that forces ffmpeg to be used. The actual file
type is deduced from the file name (it cannot be used on stdio).
It can read a wide range of audio files, not all of which are
documented here, and also the audio track of many video files
(including AVI, WMV and MPEG). At present only the first audio track
of a file can be read.
.TP
\&\fB.flac\fR (optional; also with \fB\-t sndfile\fR)
Xiph.org's Free Lossless Audio CODEC compressed audio.
FLAC is an open, patent-free CODEC designed for compressing
music. It is similar to MP3 and Ogg Vorbis, but lossless,
meaning that audio is compressed in FLAC without any loss in
quality.
.SP
SoX can read native FLAC files (.flac) but not Ogg FLAC files (.ogg).
[But see
.B .ogg
below for information relating to support for Ogg
Vorbis files.]
.SP
SoX can write native FLAC files according to a given or default
compression level. 8 is the default compression level and gives the
best (but slowest) compression; 0 gives the least (but fastest)
compression. The compression level is selected using the
.B \-C
option [see
.BR sox (1)]
with a whole number from 0 to 8.
.TP
.B .fssd
An alias for the
.B .u8
format.
.TP
.B .gsrt
Grandstream ring-tone files.
Whilst this file format can contain A-Law, \(*m-law, GSM, G.722,
G.723, G.726, G.728, or iLBC encoded audio, SoX supports reading and
writing only A-Law and \(*m-law. E.g.
.EX
sox music.wav \-t gsrt ring.bin
play ring.bin
.EE
.TP
\&\fB.gsm\fR (optional; also with \fB\-t sndfile\fR)
GSM 06.10 Lossy Speech Compression.
A lossy format for compressing speech which is used in the
Global Standard for Mobile telecommunications (GSM). It's good
for its purpose, shrinking audio data size, but it will introduce
lots of noise when a given audio signal is encoded and decoded
multiple times. This format is used by some voice mail applications.
It is rather CPU intensive.
.TP
.B .hcom
Macintosh HCOM files.
These are Mac FSSD files with Huffman compression.
.TP
.B .htk
Single channel 16-bit PCM format used by HTK,
a toolkit for building Hidden Markov Model speech processing tools.
.TP
\&\fB.ircam\fR (also with \fB\-t sndfile\fR)
Another name for
.BR .sf .
.TP
\&\fB.ima\fR (also with \fB\-t sndfile\fR)
A headerless file of IMA ADPCM audio data. IMA ADPCM claims 16-bit precision
packed into only 4 bits, but in fact sounds no better than
.BR .vox .
.TP
\&\fB.lpc\fR, \fB.lpc10\fR
LPC-10 is a compression scheme for speech developed in the United
States. See http://www.arl.wustl.edu/~jaf/lpc/ for details. There is
no associated file format, so SoX's implementation is headerless.
.TP
\&\fB.mat\fR, \fB.mat4\fR, \fB.mat5\fR (optional)
Matlab 4.2/5.0 (respectively GNU Octave 2.0/2.1) format (.mat is the same as .mat4).
.TP
.B .m3u
A
.I playlist
format; contains a list of audio files.
SoX can read, but not write this file format.
See [1] for details of this format.
.TP
.B .maud
An IFF-conforming audio file type, registered by
MS MacroSystem Computer GmbH, published along
with the `Toccata' sound-card on the Amiga.
Allows 8bit linear, 16bit linear, A-Law, \(*m-law
in mono and stereo.
.TP
\&\fB.mp3\fR, \fB.mp2\fR (optional read, optional write)
MP3 compressed audio; MP3 (MPEG Layer 3) is a part of the patent-encumbered
MPEG standards for audio and video compression. It is a lossy
compression format that achieves good compression rates with little
quality loss.
.SP
Because MP3 is patented, SoX cannot be distributed with MP3 support without
incurring the patent holder's fees. Users who require SoX with MP3 support
must currently compile and build SoX with the MP3 libraries (LAME & MAD)
from source code, or, in some cases, obtain pre-built dynamically loadable
libraries.
.SP
When reading MP3 files, up to 28 bits of precision is stored although
only 16 bits is reported to user. This is to allow default behavior
of writing 16 bit output files. A user can specify a higher precision
for the output file to prevent lossing this extra information. MP3
output files will use up to 24 bits of precision while encoding.
.SP
MP3 compression parameters can be selected using SoX's \fB\-C\fR option
as follows
(note that the current syntax is subject to change):
.SP
The primary parameter to the LAME encoder is the bit rate. If the
value of the \fB\-C\fR value is a positive integer, it's taken as
the bitrate in kbps (e.g. if you specify 128, it uses 128 kbps).
.SP
The second most important parameter is probably "quality" (really
performance), which allows balancing encoding speed vs. quality.
In LAME, 0 specifies highest quality but is very slow, while
9 selects poor quality, but is fast. (5 is the default and 2 is
recommended as a good trade-off for high quality encodes.)
.SP
Because the \fB\-C\fR value is a float, the fractional part is used
to select quality. 128.2 selects 128 kbps encoding with a quality
of 2. There is one problem with this approach. We need 128 to specify
128 kbps encoding with default quality, so 0 means use default. Instead
of 0 you have to use .01 (or .99) to specify the highest quality
(128.01 or 128.99).
.SP
LAME uses bitrate to specify a constant bitrate, but higher quality
can be achieved using Variable Bit Rate (VBR). VBR quality (really
size) is selected using a number from 0 to 9. Use a value of 0 for high
quality, larger files, and 9 for smaller files of lower quality. 4 is
the default.
.SP
In order to squeeze the selection of VBR into the the \fB\-C\fR value
float we use negative numbers to select VRR. -4.2 would select default
VBR encoding (size) with high quality (speed). One special case is 0,
which is a valid VBR encoding parameter but not a valid bitrate.
Compression value of 0 is always treated as a high quality vbr, as a
result both -0.2 and 0.2 are treated as highest quality VBR (size) and
high quality (speed).
.SP
See also
.B Ogg Vorbis
for a similar format.
.TP
\&\fB.mp4\fR, \fB.m4a\fR (optional)
MP4 compressed audio. MP3 (MPEG 4) is part of the
MPEG standards for audio and video compression. See
.B mp3
for more information.
.TP
\&\fB.nist\fR (also with \fB\-t sndfile\fR)
See \fB.sph\fR.
.TP
\&\fB.ogg\fR, \fB.vorbis\fR (optional)
Xiph.org's Ogg Vorbis compressed audio; an open, patent-free CODEC designed
for music and streaming audio. It is a lossy compression format (similar to
MP3, VQF & AAC) that achieves good compression rates with a minimum amount
of quality loss.
.SP
SoX can decode all types of Ogg Vorbis files, and can encode at different
compression levels/qualities given as a number from \-1 (highest
compression/lowest quality) to 10 (lowest compression, highest quality).
By default the encoding quality level is 3 (which gives an encoded rate
of approx. 112kbps), but this can be changed using the
.B \-C
option (see above) with a number from \-1 to 10; fractional numbers (e.g.
3\*d6) are also allowed.
Decoding is somewhat CPU intensive and encoding is very CPU intensive.
.SP
See also
.B .mp3
for a similar format.
.TP
\fBoss\fR (optional)
Open Sound System /dev/dsp device driver; supports both playing and
recording audio. OSS support is available in Unix-like operating systems,
sometimes together with alternative sound systems (such as ALSA). Examples:
.EX
sox infile \-t oss
sox infile \-t oss /dev/dsp
sox \-2 \-t oss /dev/dsp outfile
.EE
See also
.BR play (1),
.BR rec (1),
and
.BR sox (1)
.BR \-d .
.TP
\&\fB.paf\fR, \fB.fap\fR (optional)
Ensoniq PARIS file format (big and little-endian respectively).
.TP
.B .pls
A
.I playlist
format; contains a list of audio files.
SoX can read, but not write this file format.
See [2] for details of this format.
.SP
Note: SoX support for SHOUTcast PLS relies on
.BR wget (1)
and is only partially supported: it's necessary to
specify the audio type manually, e.g.
.EX
play \-t mp3 \(dqhttp://a.server/pls?rn=265&file=filename.pls\(dq
.EE
and SoX does not know about alternative servers\*mhit Ctrl-C twice in
quick succession to quit.
.TP
.B .prc
Psion Record. Used in Psion EPOC PDAs (Series 5, Revo and similar) for
System alarms and recordings made by the built-in Record application.
When writing, SoX defaults to A-law, which is recommended; if you must
use ADPCM, then use the \fB\-i\fR switch. The sound quality is poor
because Psion Record seems to insist on frames of 800 samples or
fewer, so that the ADPCM CODEC has to be reset at every 800 frames,
which causes the sound to glitch every tenth of a second.
.TP
\fBpulseaudio\fR (optional)
PulseAudio driver; supports both playing and recording of audio.
PulseAudio is a cross platform networked sound server.
If a file name is specified with this driver, it is ignored. Examples:
.EX
sox infile \-t pulseaudio
sox infile \-t pulseaudio default
.EE
See also
.BR play (1),
.BR rec (1),
and
.BR sox (1)
.BR \-d .
.TP
\&\fB.pvf\fR (optional)
Portable Voice Format.
.TP
\&\fB.sd2\fR (optional)
Sound Designer 2 format.
.TP
\&\fB.sds\fR (optional)
MIDI Sample Dump Standard.
.TP
\&\fB.sf\fR (also with \fB\-t sndfile\fR)
IRCAM SDIF (Institut de Recherche et Coordination Acoustique/Musique
Sound Description Interchange Format). Used by academic music software
such as the CSound package, and the MixView sound sample editor.
.TP
\&\fB.sln\fR
Asterisk PBX `signed linear' 8khz, 16-bit signed integer, little-endian
raw format.
.TP
\&\fB.sph\fR, \fB.nist\fR (also with \fB\-t sndfile\fR)
SPHERE (SPeech HEader Resources) is a file format defined by NIST
(National Institute of Standards and Technology) and is used with
speech audio. SoX can read these files when they contain
\(*m-law and PCM data. It will ignore any header information that
says the data is compressed using \fIshorten\fR compression and
will treat the data as either \(*m-law or PCM. This will allow SoX
and the command line \fIshorten\fR program to be run together using
pipes to encompasses the data and then pass the result to SoX for processing.
.TP
.B .smp
Turtle Beach SampleVision files.
SMP files are for use with the PC-DOS package SampleVision by Turtle Beach
Softworks. This package is for communication to several MIDI samplers. All
sample rates are supported by the package, although not all are supported by
the samplers themselves. Currently loop points are ignored.
.TP
.B .snd
See
.BR .au ,
.B .sndr
and
.BR .sndt .
.TP
\fBsndfile\fR (optional)
This is a pseudo-type that forces libsndfile to be used. For writing files, the
actual file type is then taken from the output file name; for reading
them, it is deduced from the file.
.TP
\fBsndio\fR (optional)
OpenBSD audio device driver; supports both playing and recording audio.
.EX
sox infile \-t sndio
.EE
See also
.BR play (1),
.BR rec (1),
and
.BR sox (1)
.BR \-d .
.TP
.B .sndr
Sounder files.
An MS-DOS/Windows format from the early '90s.
Sounder files usually have the extension `.SND'.
.TP
.B .sndt
SoundTool files.
An MS-DOS/Windows format from the early '90s.
SoundTool files usually have the extension `.SND'.
.TP
.B .sou
An alias for the
.B .u8
raw format.
.TP
.B .sox
SoX's native uncompressed PCM format, intended for storing (or piping)
audio at intermediate processing points (i.e. between SoX invocations).
It has much in common with the popular WAV, AIFF, and AU uncompressed PCM
formats, but has the following specific characteristics: the PCM samples
are always stored as 32 bit signed integers, the samples are stored (by
default) as `native endian', and the number of samples in the file is
recorded as a 64-bit integer. Comments are also supported.
.SP
See `Special Filenames' in
.BR sox (1)
for examples of using the
.B .sox
format with `pipes'.
.TP
\fBsunau\fR (optional)
Sun /dev/audio device driver; supports both playing and
recording audio. For example:
.EX
sox infile \-t sunau /dev/audio
.EE
or
.EX
sox infile \-t sunau \-U \-c 1 /dev/audio
.EE
for older sun equipment.
.SP
See also
.BR play (1),
.BR rec (1),
and
.BR sox (1)
.BR \-d .
.TP
.B .txw
Yamaha TX-16W sampler.
A file format from a Yamaha sampling keyboard which wrote IBM-PC
format 3\*d5\(dq floppies. Handles reading of files which do not have
the sample rate field set to one of the expected by looking at some
other bytes in the attack/loop length fields, and defaulting to
33\ kHz if the sample rate is still unknown.
.TP
.B .vms
See
.BR .dvms .
.TP
\&\fB.voc\fR (also with \fB\-t sndfile\fR)
Sound Blaster VOC files.
VOC files are multi-part and contain silence parts, looping, and
different sample rates for different chunks.
On input, the silence parts are filled out, loops are rejected,
and sample data with a new sample rate is rejected.
Silence with a different sample rate is generated appropriately.
On output, silence is not detected, nor are impossible sample rates.
SoX supports reading (but not writing) VOC files with multiple
blocks, and files containing \(*m-law, A-law, and 2/3/4-bit ADPCM samples.
.TP
.B .vorbis
See
.BR .ogg .
.TP
\&\fB.vox\fR (also with \fB\-t sndfile\fR)
A headerless file of Dialogic/OKI ADPCM audio data commonly comes with the
extension .vox. This ADPCM data has 12-bit precision packed into only 4-bits.
.SP
Note: some early Dialogic hardware does not always reset the ADPCM
encoder at the start of each vox file. This can result in clipping
and/or DC offset problems when it comes to decoding the audio. Whilst
little can be done about the clipping, a DC offset can be removed by
passing the decoded audio through a high-pass filter, e.g.:
.EX
sox input.vox output.wav highpass 10
.EE
.TP
\&\fB.w64\fR (optional)
Sonic Foundry's 64-bit RIFF/WAV format.
.TP
\&\fB.wav\fR (also with \fB\-t sndfile\fR)
Microsoft .WAV RIFF files.
This is the native audio file format of Windows, and widely used for uncompressed audio.
.SP
Normally \fB.wav\fR files have all formatting information
in their headers, and so do not need any format options
specified for an input file. If any are, they will
override the file header, and you will be warned to this effect.
You had better know what you are doing! Output format
options will cause a format conversion, and the \fB.wav\fR
will written appropriately.
.SP
SoX can read and write linear PCM, floating point, \(*m-law, A-law, MS ADPCM, and IMA (or DVI) ADPCM encoded samples.
WAV files can also contain audio encoded in many other ways (not currently
supported with SoX) e.g. MP3; in some cases such a file can still be
read by SoX by overriding the file type, e.g.
.EX
play -t mp3 mp3-encoded.wav
.EE
Big endian versions of RIFF files, called RIFX, are also supported.
To write a RIFX file, use the
.B \-B
option with the output file options.
.TP
\fBwaveaudio\fR (optional)
MS-Windows native audio device driver. Examples:
.EX
sox infile \-t waveaudio
sox infile \-t waveaudio default
sox infile \-t waveaudio 1
sox infile \-t waveaudio "High Definition Audio Device ("
.EE
If the device name is omitted, \fB-1\fR, or \fBdefault\fR, then you
get the `Microsoft Wave Mapper' device. Wave Mapper means `use the
system default audio devices'. You can control what `default' means
via the OS Control Panel.
.SP
If the device name given is some other number, you get that audio
device by index; so recording with device name \fB0\fR would get the
first input device (perhaps the microphone), \fB1\fR would get the
second (perhaps line in), etc. Playback using \fB0\fR will get the
first output device (usually the only audio device).
.SP
If the device name given is something other than a number, SoX tries
to match it (maximum 31 characters) against the names of the available
devices.
.SP
See also
.BR play (1),
.BR rec (1),
and
.BR sox (1)
.BR \-d .
.TP
.B .wavpcm
A non-standard, but widely used, variant of
.BR .wav .
Some applications cannot read a standard WAV file header for PCM-encoded
data with sample-size greater than 16-bits or with more than two
channels, but can read a non-standard
WAV header. It is likely that such applications will eventually be
updated to support the standard header, but in the mean time, this SoX
format can be used to create files with the non-standard header that
should work with these applications. (Note that SoX will automatically
detect and read WAV files with the non-standard header.)
.SP
The most common use of this file-type is likely to be along the following
lines:
.EX
sox infile.any \-t wavpcm \-s outfile.wav
.EE
.TP
\&\fB.wv\fR (optional)
WavPack lossless audio compression. Note that, when converting
.B .wav
to this format and back again,
the RIFF header is not necessarily preserved losslessly (though the audio is).
.TP
\&\fB.wve\fR (also with \fB\-t sndfile\fR)
Psion 8-bit A-law. Used on Psion SIBO PDAs (Series 3 and similar).
This format is deprecated in SoX, but will continue to be used in
libsndfile.
.TP
.B .xa
Maxis XA files.
These are 16-bit ADPCM audio files used by Maxis games. Writing .xa files is
currently not supported, although adding write support should not be very
difficult.
.TP
\&\fB.xi\fR (optional)
Fasttracker 2 Extended Instrument format.
.SH SEE ALSO
.BR sox (1),
.BR soxi (1),
.BR libsox (3),
.BR octave (1),
.BR wget (1)
.SP
The SoX web page at http://sox.sourceforge.net
.br
SoX scripting examples at http://sox.sourceforge.net/Docs/Scripts
.SS References
.TP
[1]
Wikipedia,
.IR "M3U" ,
http://en.wikipedia.org/wiki/M3U
.TP
[2]
Wikipedia,
.IR "PLS" ,
http://en.wikipedia.org/wiki/PLS_(file_format)
.SH LICENSE
Copyright 1998\-2011 Chris Bagwell and SoX Contributors.
.br
Copyright 1991 Lance Norskog and Sundry Contributors.
.SH AUTHORS
Chris Bagwell ([email protected]).
Other authors and contributors are listed in the ChangeLog file that
is distributed with the source code.