SoX

SoX is an audio converter and mixer. It's a handy command for quick conversion tasks, and possibly for scripted audio editing.

SoX reads and rewrites audio data. Whether it stores the rewritten audio data is up to you. There are use cases in which you don't need to store the converted data, for instance when you're sending the output directly to your speakers for playback, or directly to another command through a UNIX pipe.

Before doing any conversion, it's usually a good idea to determine exactly what you're dealing with in the first place.

To gather information about an audio file, use the soxi command (actually a symlink to sox --info).

$ soxi countdown.mp3
Input File     : '/home/seth/countdown.mp3'
Channels       : 1
Sample Rate    : 44100
Precision      : 16-bit
Duration       : 00:00:11.21 = 494185 samples...
File Size      : 179k
Bit Rate       : 128k
Sample Encoding: MPEG audio (layer I, II or III)

This gives you a good idea of what codec the audio file is encoded in, the length of the file, the size of the file, the sample rate, and the number of channels. You might think you already know these attributes, but it never hurts to verify media attributes with soxi.
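When a script needs only one or two attributes rather than the whole report, soxi can print single fields. The sketch below assumes the single-field flags described in the soxi man page (-t for file type, -c for channels, -r for sample rate, -D for duration in seconds). The function name audio_summary is invented for illustration.

```shell
# Hypothetical helper: print a one-line summary of an audio file.
# Assumes soxi's single-field flags: -t (type), -c (channels),
# -r (sample rate), -D (duration in seconds).
audio_summary() {
    file=$1
    printf '%s: %s, %s ch, %s Hz, %s s\n' \
        "$file" \
        "$(soxi -t "$file")" \
        "$(soxi -c "$file")" \
        "$(soxi -r "$file")" \
        "$(soxi -D "$file")"
}

# Usage: audio_summary countdown.mp3
```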

Converting files

In this example, the audio of a game show countdown has been delivered as an MP3 file. While nearly all editing applications accept compressed audio, none of them actually edit the compressed data. Conversion is happening somewhere, whether it's as a secret background task or a prompt for you to save a copy. When you do that conversion in advance yourself, you can control what format you're using, and you can do the work in batches during downtime rather than wasting valuable production time waiting for an editing application to churn through them as needed.

The sox command is meant for converting audio files. There are a few stages in the SoX pipeline:

  • input
  • combine
  • effects
  • output

In the sox command syntax though, the effects step is written last, making the actual syntax:

input → combine → output → effects

Encoding

The simplest conversion command involves only an input file and an output file. Here's the command to convert an MP3 file to a lossless FLAC file:

$ sox countdown.mp3 output.flac
$ soxi output.flac

Input File     : 'output.flac'
Channels       : 1
Sample Rate    : 44100
Precision      : 16-bit
Duration       : 00:00:11.18 = 493056 samples...
File Size      : 545k
Bit Rate       : 390k
Sample Encoding: 16-bit FLAC
Comment        : 'Comment=Processed by SoX'
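Because the conversion is a single command, doing it in batches is mostly a matter of wrapping it in a loop. This is a minimal sketch, not a polished tool: the function name mp3_to_flac_all and the choice to skip files that already have a FLAC copy are assumptions made for the example.

```shell
# Hypothetical helper: convert every MP3 in a directory to FLAC.
# Existing .flac files are skipped so the loop can be re-run safely.
mp3_to_flac_all() {
    dir=${1:-.}
    for src in "$dir"/*.mp3; do
        [ -e "$src" ] || continue      # no matches: the glob stays literal
        dst=${src%.mp3}.flac
        [ -e "$dst" ] && continue      # already converted
        sox "$src" "$dst" || echo "failed: $src" >&2
    done
}

# Usage: mp3_to_flac_all /path/to/audio
```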

Effects

The effects chain, specified at the end of a command, can alter audio prior to sending the data to its final destination. For instance, sometimes audio that's too loud can cause problems during conversion:

$ sox bad.wav bad.ogg
sox WARN sox: `bad.ogg' output clipped 126 samples; decrease volume?

Applying a gain effect can often solve this problem:

$ sox bad.wav bad.ogg gain -1
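Guessing the right gain value can take a few tries. One alternative, sketched here, is to let the gain effect normalize the audio first: this assumes the -n flag described in the sox man page, which normalizes to 0 dB full scale before the given dB value is applied. The function name and the default of -1 dB are illustrative only.

```shell
# Hypothetical helper: convert a file while normalizing its peak level.
# Assumes the gain effect's -n flag (normalize to 0 dB full scale); the
# dB value after it is then applied on top, leaving that much headroom.
convert_normalized() {
    src=$1 dst=$2 headroom_db=${3:--1}
    sox "$src" "$dst" gain -n "$headroom_db"
}

# Usage: convert_normalized bad.wav bad.ogg      (peaks at -1 dB)
#        convert_normalized bad.wav bad.ogg -3   (peaks at -3 dB)
```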

Fade

Another useful effect is the fade effect. This effect lets you define the shape of a fade-in or fade-out, along with how many seconds you want the fade to span.

Here's an example of a 6-second fade-in using an inverted parabola:

$ sox intro.ogg intro.flac fade p 6

This applies a 3-second fade-in to the head of the audio, and a fade-out that ends at the 8-second mark, where the audio is cut off. Because no fade-out length is given, it defaults to the fade-in length, so the fade-out is also 3 seconds long and starts at the 5-second mark:

$ sox intro.ogg intro.flac fade p 3 8

The different kinds of fades (sine, linear, inverted parabola, and so on), as well as the options fade offers (fade-in, fade-out), are listed in the sox man page.
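If you want the fade-out to finish exactly at the end of a file without working out its length first, the man page describes a stop position of 0 as meaning "end of the audio". The sketch below relies on that behavior. The function name fade_edges and its defaults are made up for the example, and the fade-out length is spelled out explicitly rather than left to default.

```shell
# Hypothetical helper: fade in and out by the same number of seconds.
# A stop position of 0 means "end of the audio"; newer SoX releases also
# accept -0 for this, so check your man page.
fade_edges() {
    src=$1 dst=$2 secs=${3:-3} shape=${4:-t}   # t = linear
    sox "$src" "$dst" fade "$shape" "$secs" 0 "$secs"
}

# Usage: fade_edges intro.ogg intro.flac 3 p
```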

Effects and syntax

Each effect plugin has its own syntax, so refer to the man page for details on how to invoke each one. These include:

  • band: apply a band-pass filter → band [-n] center[k] [width[h|k|o|q]]
  • bass: boost or cut the bass → bass|treble gain [frequency[k] [width[s|h|k|o|q]]]
  • channels: change the number of channels → channels CHANNELS
  • chorus: add a chorus effect → chorus gain-in gain-out <delay decay speed depth -s|-t>
  • equalizer: adjust EQ → equalizer frequency[k] width[q|o|h|k] gain
  • fade: fade in and out → fade [type] fade-in-length [stop-position(=) [fade-out-length]]
  • flanger: add a flanger effect → flanger [delay depth regen width speed shape phase interp]
  • gain: adjust gain → gain [-e|-B|-b|-r] [-n] [-l|-h] [gain-dB]
  • pitch: change the pitch but not tempo → pitch [-q] shift [segment [search [overlap]]]
  • reverb: add reverb → reverb [-w|--wet-only] [reverberance (50%) [HF-damping (50%)
  • reverse: reverse the audio → reverse
  • stretch: adjust the duration but not the pitch → stretch factor [window fade shift fading]

There are a few special-purpose effects, too:

  • stats: view the levels and adjustments applied to audio by sox
  • synth: generate tones by combining virtual oscillators
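As a small illustration of synth, the sketch below writes a sine tone to a file. It assumes the special file name -n (the "null file"), used here as an empty input so that synth generates all of the audio. The function name make_tone and its defaults are hypothetical.

```shell
# Hypothetical helper: write a test tone.
# -n is SoX's "null file", used here as a silent input so that the synth
# effect generates all of the audio.
make_tone() {
    dst=$1 secs=${2:-3} hz=${3:-440}
    sox -n "$dst" synth "$secs" sine "$hz"
}

# Usage: make_tone tone.flac 3 440
```

The null file works as an output, too, which is how stats is usually run: sox tone.flac -n stats reads the file, prints its report, and writes no audio.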

Effects can be daisy-chained in one command, at least to the extent that you want to combine them. In other words, there's no syntax to apply a flanger effect only during a 6-second fade-out. For something that precise, you need a graphical sound wave editor or a digital audio workstation. However, if you just have effects you want to apply at once, you can list them together in the same command.

For example, this command applies a gain of -1 dB, a stretch factor of 1.35 (making the audio 1.35 times longer without changing its pitch), and a fade-out:

$ sox intro.ogg output.flac gain -1 stretch 1.35 fade p 0 6 
$ soxi output.flac                                            

Input File     : 'output.flac'
Channels       : 1
Sample Rate    : 44100
Precision      : 16-bit
Duration       : 00:00:15.10 = 665808 samples...
[...]

Combining audio

SoX can also combine audio files, either by concatenating them or by mixing them.

To join (or concatenate) files into one, provide more than one input file in your command:

$ sox countdown.mp3 intro.ogg output.flac

In this example, output.flac now contains countdown audio, followed immediately by intro music.

If you want the two tracks to play over one another at the same time, though, you can use the --combine mix option:

$ sox --combine mix countdown.mp3 intro.ogg output.flac

Imagine, however, that the two input files differed in more than just their codecs. It's not uncommon for vocal tracks to be recorded in mono (1 channel), but for music to be recorded in at least stereo (2 channels). SoX won't default to a solution, so you must standardize the format of the two files yourself first.

Altering audio files

An option applies to the file name listed immediately after it. The --channels option in this command applies only to input.wav and NOT to example.ogg or output.flac:

$ sox --channels 2 input.wav example.ogg output.flac

This means that the position of an option is very significant in SoX. Should you specify an option at the start of your command, you're essentially only overriding what SoX gleans from the input files on its own. Options placed immediately before the output file, however, determine how SoX writes the audio data.
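The two sketches below show each position in turn. Both use format options from the sox man page (--type, --rate, --encoding, --bits, --channels). The function names and the specific values (48 kHz, 24-bit, and the layout of the raw file) are assumptions chosen for the example.

```shell
# Hypothetical helper: resample to 48 kHz, 24-bit on output.
# --rate and --bits sit directly before the output file name, so they
# describe what SoX writes; the input is still auto-detected.
to_48k_24bit() {
    src=$1 dst=$2
    sox "$src" --rate 48000 --bits 24 "$dst"
}

# Hypothetical helper: read headerless (raw) audio. Here the options come
# before the input, because SoX has no header to glean them from.
raw_to_wav() {
    src=$1 dst=$2
    sox --type raw --rate 44100 --encoding signed-integer --bits 16 \
        --channels 1 "$src" "$dst"
}

# Usage: to_48k_24bit input.wav output.flac
#        raw_to_wav capture.raw capture.wav
```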

To solve the previous problem of incompatible channels, you can first standardize your inputs, and then mix:

$ sox countdown.mp3 --channels 2 countdown-stereo.flac gain -1 
$ soxi countdown-stereo.flac 

Input File     : 'countdown-stereo.flac'
Channels       : 2
Sample Rate    : 44100
Precision      : 16-bit
Duration       : 00:00:11.18 = 493056 samples...
File Size      : 545k
Bit Rate       : 390k
Sample Encoding: 16-bit FLAC
Comment        : 'Comment=Processed by SoX'

$ sox --combine mix \
countdown-stereo.flac \
intro.ogg \
output.flac

SoX often requires multiple commands for complex actions, so it's normal to create temporary and intermediate files as needed.
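When an intermediate file is only needed once, a UNIX pipe between two sox commands can stand in for it. This sketch assumes the -p (--sox-pipe) option from the man page, which is an alias for "-t sox -", and uses the channels effect instead of the --channels output option. The function name mix_mono_with_stereo is hypothetical.

```shell
# Hypothetical helper: upmix a mono file to stereo and mix it with a
# stereo file without writing an intermediate file.
# Assumes -p (--sox-pipe), an alias for "-t sox -", which streams SoX's
# own uncompressed format over the pipe.
mix_mono_with_stereo() {
    mono=$1 stereo=$2 dst=$3
    sox "$mono" -p channels 2 gain -1 |
        sox --combine mix -t sox - "$stereo" "$dst"
}

# Usage: mix_mono_with_stereo countdown.mp3 intro.ogg output.flac
```

Keeping the intermediate file is still the better choice when you want to inspect it with soxi or reuse it in several mixes.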

Multichannel audio

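Not all audio is constrained to 1 or 2 channels. If you want to combine several audio channels into one file, you can do that with SoX and the --combine merge option. Each input keeps its own channels, so merging the mono countdown with the stereo intro music produces a 3-channel file: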

$ sox --combine merge countdown.mp3 intro.ogg output.flac
$ soxi output.flac 

Input File     : 'output.flac'
Channels       : 3
[...]
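Going the other way, a multichannel file can be split back into one file per channel. This is a rough sketch that assumes two things from the man pages: soxi -c prints the channel count, and the remix effect with a single channel number keeps only that channel. The function name and the -chN.flac naming scheme are invented for the example.

```shell
# Hypothetical helper: write each channel of a file to its own mono file.
# Assumes "soxi -c" (channel count) and the remix effect, where
# "remix N" keeps only input channel N. Output is always FLAC here.
split_channels() {
    src=$1
    base=${src%.*}
    count=$(soxi -c "$src") || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        sox "$src" "${base}-ch${i}.flac" remix "$i"
        i=$((i + 1))
    done
}

# Usage: split_channels output.flac
```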

Easy audio manipulation

It might seem strange to work with audio using no visual interface, and for some tasks SoX definitely isn't the best tool. However, for many tasks, SoX provides an easy and lightweight toolkit. SoX is a simple command with powerful potential. With it, you can convert audio, manipulate channels and waveforms, and even generate your own sounds. For more information, read its man page or visit online documentation.

Re-Compiling SoX

The default Slackware install of SoX once lacked MP3 support, but now that MP3 patents have expired, support is built in. Should you have a need to recompile SoX, however, to add recent codec support or for any reason, this section explains how to do that.

First, download the SoX source code from http://sox.sourceforge.net and Pat's SlackBuild script from your local Slackware mirror (for example, http://mirrors.slackware.com/slackware/slackware${ARCH}-XX/source/ap/sox/).

Rebuild with as much codec support as you can manage. For example:

 ./configure --with-distro='SlackermediaXX' \
  --with-ladspa-path='/usr/lib64/' --with-oggvorbis=dyn \
  --with-flac=dyn --with-amrwb=dyn --with-amrnb=dyn --with-wavpack=dyn \
  --with-alsa=dyn --with-ffmpeg=dyn --with-oss=dyn --with-sndfile=dyn \
  --with-mp3=dyn --with-gsm=dyn --with-lpc10=dyn --with-ao=dyn \
  --libdir='/usr/lib64' --mandir='/usr/man/'

Once your new version of SoX is compiled, install it using the upgradepkg command:

  • If you built a newer version of SoX than the one that shipped with Slackware, then issue this command (where x.x is the newer version number):
# upgradepkg /tmp/sox-x.x*t?z
  • If you built the same version of SoX as the one that shipped with Slackware, then issue the command:
# upgradepkg --reinstall /tmp/sox-x.x*t?z
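After installing, it's worth confirming that the codecs you wanted actually made it into the build. The sketch below assumes that sox --help prints a line beginning with "AUDIO FILE FORMATS:" followed by a space-separated list, which is how recent releases behave; if your build formats its help text differently, adjust the pattern. The function name sox_supports is hypothetical.

```shell
# Hypothetical helper: succeed if the installed sox lists a given format.
# Assumes "sox --help" prints a line starting with "AUDIO FILE FORMATS:".
sox_supports() {
    sox --help 2>&1 |
        sed -n 's/^AUDIO FILE FORMATS://p' |
        tr ' ' '\n' |
        grep -qxF -- "$1"
}

# Usage: sox_supports mp3 && echo "MP3 support is built in"
```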
