The pedalboard.io API
This module provides classes and functions for reading and writing audio files or streams.
Introduced in v0.5.1.
- class pedalboard.io.AudioFile(file_like: BinaryIO, mode: Literal['r'] = 'r')
- class pedalboard.io.AudioFile(file_like: BinaryIO, mode: Literal['w'], samplerate: Optional[float] = None, num_channels: int = 1, bit_depth: int = 16, quality: Optional[Union[str, float]] = None, format: Optional[str] = None)
- class pedalboard.io.AudioFile(filename: str, mode: Literal['r'] = 'r')
- class pedalboard.io.AudioFile(filename: str, mode: Literal['w'], samplerate: Optional[float] = None, num_channels: int = 1, bit_depth: int = 16, quality: Optional[Union[str, float]] = None)
A base class for readable and writeable audio files.
AudioFile may be used just like a regular Python open function call, to open an audio file for reading (with the default "r" mode) or for writing (with the "w" mode).
Unlike a typical open call:
- AudioFile objects can only be created in read ("r") or write ("w") mode. All audio files are binary (so a trailing b would be redundant) and appending to an existing audio file is not possible.
- If opening an audio file in write mode ("w"), one additional argument is required: the sample rate of the file.
- A file-like object can be provided to AudioFile, allowing for reading and writing to in-memory streams or buffers. The provided file-like object must be seekable and must be opened in binary mode (i.e.: io.BytesIO instead of io.StringIO, if using the io package).
Examples
Opening an audio file on disk:
    with AudioFile("my_file.mp3") as f:
        first_ten_seconds = f.read(int(f.samplerate * 10))
Opening a file-like object:
    ogg_buffer: io.BytesIO = get_audio_buffer(...)
    with AudioFile(ogg_buffer) as f:
        first_ten_seconds = f.read(int(f.samplerate * 10))
Opening an audio file on disk, while resampling on-the-fly:
    with AudioFile("my_file.mp3").resampled_to(22_050) as f:
        first_ten_seconds = f.read(int(f.samplerate * 10))
Writing an audio file on disk:
    with AudioFile("white_noise.wav", "w", samplerate=44100, num_channels=2) as f:
        f.write(np.random.rand(2, 44100))
Writing encoded audio to a file-like object:
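    wav_buffer = io.BytesIO()
    # A file-like object has no filename to infer the codec from,
    # so the format is passed explicitly.
    with AudioFile(wav_buffer, "w", samplerate=44100, num_channels=2, format="wav") as f:
        f.write(np.random.rand(2, 44100))
    wav_buffer.getvalue()  # do something with the file-like object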
Writing to an audio file while also specifying quality options for the codec:
    with AudioFile(
        "white_noise.mp3",
        "w",
        samplerate=44100,
        num_channels=2,
        quality=160,  # kilobits per second
    ) as f:
        f.write(np.random.rand(2, 44100))
Re-encoding a WAV file as an MP3 in four lines of Python:
    with AudioFile("input.wav") as i:
        with AudioFile("output.mp3", "w", i.samplerate, i.num_channels) as o:
            while i.tell() < i.frames:
                o.write(i.read(1024))
- class pedalboard.io.ReadableAudioFile(file_like: BinaryIO)
- class pedalboard.io.ReadableAudioFile(filename: str)
A class that wraps an audio file for reading, with native support for Ogg Vorbis, MP3, WAV, FLAC, and AIFF files on all operating systems. Other formats may also be readable depending on the operating system and installed system libraries:
macOS: .3g2, .3gp, .aac, .ac3, .adts, .aif, .aifc, .aiff, .amr, .au, .bwf, .caf, .ec3, .flac, .latm, .loas, .m4a, .m4b, .m4r, .mov, .mp1, .mp2, .mp3, .mp4, .mpa, .mpeg, .ogg, .qt, .sd2, .snd, .w64, .wav, .xhe
Windows: .aif, .aiff, .flac, .mp3, .ogg, .wav, .wma
Linux: .aif, .aiff, .flac, .mp3, .ogg, .wav
Use pedalboard.io.get_supported_read_formats() to see which formats or file extensions are supported on the current platform.
(Note that although an audio file may have a certain file extension, its contents may be encoded with a compression algorithm unsupported by Pedalboard.)
Note
You probably don’t want to use this class directly: passing the same arguments to AudioFile will work too, and allows using AudioFile just like you’d use open(...) in Python.
- close() → None
Close this file, rendering this object unusable.
- read(num_frames: int = 0) → numpy.ndarray[Any, numpy.dtype[numpy.float32]]
Read the given number of frames (samples in each channel) from this audio file at its current position.
num_frames is a required argument, as audio files can be deceptively large. (Consider that an hour-long .ogg file may be only a handful of megabytes on disk, but may decompress to nearly a gigabyte in memory.) Audio files should be read in chunks, rather than all at once, to avoid hard-to-debug memory problems and out-of-memory crashes.
Audio samples are returned as a multi-dimensional numpy.array with the shape (channels, samples); i.e.: a stereo audio file will have shape (2, <length>). Returned data is always in the float32 datatype.
For most (but not all) audio files, the minimum possible sample value will be -1.0f and the maximum sample value will be +1.0f.
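To make the memory concern concrete, the arithmetic can be sketched as follows. The helper decoded_size_bytes is a hypothetical name introduced here for illustration; it only assumes that decoded audio is held as float32, which takes 4 bytes per sample.

```python
def decoded_size_bytes(duration_seconds: float, samplerate: float, num_channels: int) -> int:
    """Bytes needed to hold the decoded audio as float32 (4 bytes per sample)."""
    frames = int(duration_seconds * samplerate)
    return frames * num_channels * 4


one_hour = 60 * 60
print(decoded_size_bytes(one_hour, 44_100, 1))  # 635040000 bytes, about 0.64 GB
print(decoded_size_bytes(one_hour, 44_100, 2))  # 1270080000 bytes, about 1.27 GB

# By contrast, one 1024-frame stereo chunk needs only a few kilobytes.
chunk_frames = 1024
print(chunk_frames * 2 * 4)  # 8192 bytes
```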
- read_raw(num_frames: int = 0) → numpy.ndarray
Read the given number of frames (samples in each channel) from this audio file at the current position.
Audio samples are returned as a multi-dimensional numpy.array with the shape (channels, samples); i.e.: a stereo audio file will have shape (2, <length>). Returned data is in the raw format stored by the underlying file (one of int8, int16, int32, or float32).
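For readers who want to normalize raw integer samples themselves, a minimal NumPy sketch follows. It assumes the common convention of dividing by 2 ** (bits - 1), which is not confirmed to match Pedalboard's own conversion bit for bit, and raw_to_float32 is a hypothetical helper name.

```python
import numpy as np


def raw_to_float32(raw: np.ndarray) -> np.ndarray:
    """Scale integer PCM samples into roughly [-1.0, 1.0) as float32."""
    if np.issubdtype(raw.dtype, np.floating):
        return raw.astype(np.float32)
    # Assumption: full scale is 2 ** (bits - 1), e.g. 32768 for int16.
    full_scale = float(2 ** (8 * raw.dtype.itemsize - 1))
    return (raw.astype(np.float64) / full_scale).astype(np.float32)


# A fake (channels, samples) buffer standing in for read_raw() output.
raw = np.array([[-32768, 0, 16384, 32767]], dtype=np.int16)
print(raw_to_float32(raw))
```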
- resampled_to(target_sample_rate: float, quality: pedalboard.Resample.Quality = Quality.WindowedSinc) → pedalboard.io.ResampledReadableAudioFile
Return a ResampledReadableAudioFile that will automatically resample this ReadableAudioFile to the provided target_sample_rate, using a constant amount of memory.
Introduced in v0.6.0.
- seek(position: int) → None
Seek this file to the provided location in frames. Future reads will start from this position.
- seekable() → bool
Returns True if this file is currently open and calls to seek() will work.
- tell() → int
Return the current position of the read pointer in this audio file, in frames. This value will increase as read() is called, and may decrease if seek() is called.
- property closed: bool
True iff this file is closed (and no longer usable), False otherwise.
- property duration: float
The duration of this file in seconds (frames divided by samplerate).
- property file_dtype: str
The data type ("int16", "float32", etc) stored natively by this file.
Note that read() will always return a float32 array, regardless of the value of this property. Use read_raw() to read data from the file in its file_dtype.
- property frames: int
The total number of frames (samples per channel) in this file.
For example, if this file contains 10 seconds of stereo audio at a sample rate of 44,100 Hz, frames will return 441,000.
- property name: Optional[str]
The name of this file.
If this ReadableAudioFile was opened from a file-like object, this will be None.
- property num_channels: int
The number of channels in this file.
- property samplerate: float
The sample rate of this file in samples (per channel) per second (Hz).
- class pedalboard.io.ResampledReadableAudioFile(audio_file: pedalboard.io.ReadableAudioFile, target_sample_rate: float, resampling_quality: pedalboard.Resample.Quality = Quality.WindowedSinc)
A class that wraps an audio file for reading, while resampling the audio stream on-the-fly to a new sample rate.
Introduced in v0.6.0.
Reading, seeking, and all other basic file I/O operations are supported (except for read_raw()).
ResampledReadableAudioFile should usually be used via the resampled_to() method on ReadableAudioFile:
    with AudioFile("my_file.mp3").resampled_to(22_050) as f:
        f.samplerate  # => 22050
        first_ten_seconds = f.read(int(f.samplerate * 10))
Fractional (real-valued, non-integer) sample rates are supported.
Under the hood, ResampledReadableAudioFile uses a stateful StreamResampler instance, which uses a constant amount of memory to resample potentially-unbounded streams of audio. The audio output by ResampledReadableAudioFile will always be identical to the result obtained by passing the entire audio file through a StreamResampler, with the added benefits of allowing chunked reads, seeking through files, and using a constant amount of memory.
- close() → None
Close this file, rendering this object unusable. Note that the ReadableAudioFile instance that is wrapped by this object will not be closed, and will remain usable.
- read(num_frames: int = 0) → numpy.ndarray[Any, numpy.dtype[numpy.float32]]
Read the given number of frames (samples in each channel, at the target sample rate) from this audio file at its current position, automatically resampling on-the-fly to target_sample_rate.
num_frames is a required argument, as audio files can be deceptively large. (Consider that an hour-long .ogg file may be only a handful of megabytes on disk, but may decompress to nearly a gigabyte in memory.) Audio files should be read in chunks, rather than all at once, to avoid hard-to-debug memory problems and out-of-memory crashes.
Audio samples are returned as a multi-dimensional numpy.array with the shape (channels, samples); i.e.: a stereo audio file will have shape (2, <length>). Returned data is always in the float32 datatype.
For most (but not all) audio files, the minimum possible sample value will be -1.0f and the maximum sample value will be +1.0f.
- seek(position: int) → None
Seek this file to the provided location in frames at the target sample rate. Future reads will start from this position.
As of version 0.6.1, this method operates in linear time with respect to the seek length (i.e.: the underlying file is rewound to the start and its audio is pushed through the resampler up to the requested position) to ensure that the resampled audio output is accurate. This may be optimized in a future version of Pedalboard.
- seekable() → bool
Returns True if this file is currently open and calls to seek() will work.
- tell() → int
Return the current position of the read pointer in this audio file, in frames at the target sample rate. This value will increase as read() is called, and may decrease if seek() is called.
- property closed: bool
True iff either this file or its wrapped ReadableAudioFile instance is closed (and no longer usable), False otherwise.
- property duration: float
The duration of this file in seconds (frames divided by samplerate).
- property file_dtype: str
The data type ("int16", "float32", etc) stored natively by this file.
Note that read() will always return a float32 array, regardless of the value of this property.
- property frames: int
The total number of frames (samples per channel) in this file, at the target sample rate.
For example, if this file contains 10 seconds of stereo audio at a sample rate of 44,100 Hz, and target_sample_rate is 22,050 Hz, frames will return 220,500.
Note that different resampling_quality values used for resampling may cause frames to differ by ± 1 from its expected value.
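The expected frame count after resampling is the original frame count scaled by the ratio of the two sample rates. A minimal sketch of that arithmetic, using a hypothetical helper name (expected_resampled_frames) that is not part of Pedalboard:

```python
def expected_resampled_frames(
    frames: int, samplerate: float, target_sample_rate: float
) -> int:
    """Approximate frame count after resampling (the real value may differ by 1)."""
    return round(frames * target_sample_rate / samplerate)


# Ten seconds of audio at 44,100 Hz is 441,000 frames.
print(expected_resampled_frames(441_000, 44_100, 22_050))  # 220500
print(expected_resampled_frames(441_000, 44_100, 48_000))  # 480000
```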
- property name: Optional[str]
The name of this file.
If the ReadableAudioFile wrapped by this ResampledReadableAudioFile was opened from a file-like object, this will be None.
- property num_channels: int
The number of channels in this file.
- property resampling_quality: pedalboard.Resample.Quality
The resampling algorithm used to resample from the original file’s sample rate to the target_sample_rate.
- property samplerate: float
The sample rate of this file in samples (per channel) per second (Hz). This will be equal to the target_sample_rate parameter passed when this object was created.
- class pedalboard.io.StreamResampler(source_sample_rate: float, target_sample_rate: float, num_channels: int, quality: pedalboard.Resample.Quality = Quality.WindowedSinc)
A streaming resampler that can change the sample rate of multiple chunks of audio in series, while using constant memory.
For a resampling plug-in that can be used in Pedalboard objects, see pedalboard.Resample.
Introduced in v0.6.0.
- process(input: Optional[numpy.ndarray[Any, numpy.dtype[numpy.float32]]] = None) → numpy.ndarray[Any, numpy.dtype[numpy.float32]]
Resample a 32-bit floating-point audio buffer. The returned buffer may be smaller than the provided buffer depending on the quality method used. Call process() without any arguments to flush the internal buffers and return all remaining audio.
- reset() → None
Used to reset the internal state of this resampler. Call this method when resampling a new audio stream to prevent audio from leaking between streams.
- property input_latency: float
The number of samples (in the input sample rate) that must be supplied before this resampler will begin returning output.
- property num_channels: int
The number of channels expected to be passed in every call to process().
- property quality: pedalboard.Resample.Quality
The resampling algorithm used by this resampler.
- class pedalboard.io.WriteableAudioFile(file_like: BinaryIO, samplerate: Optional[float] = None, num_channels: int = 1, bit_depth: int = 16, quality: Optional[Union[str, float]] = None, format: Optional[str] = None)
- class pedalboard.io.WriteableAudioFile(filename: str, samplerate: Optional[float] = None, num_channels: int = 1, bit_depth: int = 16, quality: Optional[Union[str, float]] = None)
A class that wraps an audio file for writing, with native support for Ogg Vorbis, MP3, WAV, FLAC, and AIFF files on all operating systems.
Use pedalboard.io.get_supported_write_formats() to see which formats or file extensions are supported on the current platform.
Parameters
- filename_or_file_like – The path to an output file to write to, or a seekable file-like binary object (like io.BytesIO) to write to.
- samplerate – The sample rate of the audio that will be written to this file. All calls to the write() method will assume this sample rate is used.
- num_channels – The number of channels in the audio that will be written to this file. All calls to the write() method will expect audio with this many channels, and will throw an exception if the audio does not contain this number of channels.
- bit_depth – The bit depth (number of bits per sample) that will be written to this file. Used for raw formats like WAV and AIFF. Will have no effect on compressed formats like MP3 or Ogg Vorbis.
- quality – An optional string or number that indicates the quality level to use for the given audio compression codec. Different codecs have different compression quality values; numeric values like 128 and 256 will usually indicate the number of kilobits per second used by the codec. Some formats, like MP3, support more advanced options like V2 (as specified by the LAME encoder) which may be passed as a string. The strings "best", "worst", "fastest", and "slowest" will also work for any codec.
Note
You probably don’t want to use this class directly: all of the parameters accepted by the WriteableAudioFile constructor will be accepted by AudioFile as well, as long as the "w" mode is passed as the second argument.
- close() → None
Close this file, flushing its contents to disk and rendering this object unusable for further writing.
- flush() → None
Attempt to flush this audio file’s contents to disk. Not all formats support flushing, so this may throw a RuntimeError. (If this happens, closing the file will reliably force a flush to occur.)
- write(samples: numpy.ndarray) → None
Encode an array of audio data and write it to this file. The number of channels in the array must match the number of channels used to open the file. The array may contain audio in any shape. If the file’s bit depth or format does not match the provided data type, the audio will be automatically converted.
Arrays of type int8, int16, int32, float32, and float64 are supported. If an array of an unsupported dtype is provided, a TypeError will be raised.
- property closed: bool
If this file has been closed, this property will be True.
- property file_dtype: str
The data type stored natively by this file. Note that write(…) will accept multiple datatypes, regardless of the value of this property.
- property frames: int
The total number of frames (samples per channel) written to this file so far.
- property num_channels: int
The number of channels in this file.
- property quality: Optional[str]
The quality setting used to write this file. For many formats, this may be None.
Quality options differ based on the audio codec used in the file. Most codecs specify a bit rate in 16- or 32-kilobit-per-second increments (128 kbps, 160 kbps, etc). Some codecs provide string-like options for variable bit-rate encoding (e.g. “V0” through “V9” for MP3). The strings "best", "worst", "fastest", and "slowest" will also work for any codec.
- property samplerate: float
The sample rate of this file in samples (per channel) per second (Hz).