@chapter Filtering Introduction
@c man begin FILTERING INTRODUCTION

Filtering in FFmpeg is enabled through the libavfilter library.

In libavfilter, a filter can have multiple inputs and multiple
outputs.
To illustrate the sorts of things that are possible, we consider the
following filtergraph.

@verbatim
                [main]
input --> split ---------------------> overlay --> output
            |                             ^
            |[tmp]                  [flip]|
            +-----> crop --> vflip -------+
@end verbatim

This filtergraph splits the input stream into two streams, then sends one
stream through the crop filter and the vflip filter, before merging it
back with the other stream by overlaying it on top. You can use the
following command to achieve this:

@example
ffmpeg -i INPUT -vf "split [main][tmp]; [tmp] crop=iw:ih/2:0:0, vflip [flip]; [main][flip] overlay=0:H/2" OUTPUT
@end example

The result will be that the top half of the video is mirrored
onto the bottom half of the output video.

Filters in the same linear chain are separated by commas, and distinct
linear chains of filters are separated by semicolons. In our example,
@var{crop,vflip} are in one linear chain, @var{split} and
@var{overlay} are separately in another. The points where the linear
chains join are labelled by names enclosed in square brackets. In the
example, the split filter generates two outputs that are associated to
the labels @var{[main]} and @var{[tmp]}.

The stream sent to the second output of @var{split}, labelled as
@var{[tmp]}, is processed through the @var{crop} filter, which crops
away the lower half of the video, and is then vertically flipped. The
@var{overlay} filter takes as input the first, unchanged output of the
split filter (which was labelled as @var{[main]}), and overlays on its
lower half the output generated by the @var{crop,vflip} filterchain.
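
As a rough sketch of how the pieces fit together, the same filtergraph can
be assembled in a shell script from its three linear chains, joining
filters with commas and chains with semicolons. The variable names are
illustrative only, and INPUT and OUTPUT are placeholders as in the command
above.

```shell
# Each variable holds one linear chain; filters inside a chain are comma-separated.
split_chain="split [main][tmp]"
flip_chain="[tmp] crop=iw:ih/2:0:0, vflip [flip]"
overlay_chain="[main][flip] overlay=0:H/2"

# Distinct chains are joined with semicolons to form the filtergraph.
graph="$split_chain; $flip_chain; $overlay_chain"
printf '%s\n' "$graph"

# To run it (requires ffmpeg and real file names):
# ffmpeg -i INPUT -vf "$graph" OUTPUT
```

The printed string is the same filtergraph description that is passed to
@option{-vf} in the command above.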

Some filters take a list of parameters as input: they are specified
after the filter name and an equal sign, and are separated from each other
by a colon.

There exist so-called @var{source filters} that do not have an
audio/video input, and @var{sink filters} that do not have an
audio/video output.

@c man end FILTERING INTRODUCTION

@chapter graph2dot
@c man begin GRAPH2DOT

The @file{graph2dot} program included in the FFmpeg @file{tools}
directory can be used to parse a filtergraph description and issue a
corresponding textual representation in the dot language.

Invoke the command:
@example
graph2dot -h
@end example

to see how to use @file{graph2dot}.

You can then pass the dot description to the @file{dot} program (from
the graphviz suite of programs) and obtain a graphical representation
of the filtergraph.

For example the sequence of commands:
@example
echo @var{GRAPH_DESCRIPTION} | \
tools/graph2dot -o graph.tmp && \
dot -Tpng graph.tmp -o graph.png && \
display graph.png
@end example

can be used to create and display an image representing the graph
described by the @var{GRAPH_DESCRIPTION} string. Note that this string must be
a complete self-contained graph, with its inputs and outputs explicitly defined.
For example if your command line is of the form:
@example
ffmpeg -i infile -vf scale=640:360 outfile
@end example
your @var{GRAPH_DESCRIPTION} string will need to be of the form:
@example
nullsrc,scale=640:360,nullsink
@end example
You may also need to set the @var{nullsrc} parameters and add a @var{format}
filter in order to simulate a specific input file.
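
As a minimal sketch, assuming the input to simulate is a 1280x720, 25 fps,
yuv420p video (illustrative values, not taken from any particular file),
the description could be built as follows. The @code{s} and @code{r}
settings are the size and rate options of @var{nullsrc}.

```shell
# Assumed properties of the input being simulated (illustrative values).
size="1280x720"
rate="25"
pix_fmt="yuv420p"

# Complete, self-contained graph: source, format constraint, the filter
# under study, and a sink.
graph="nullsrc=s=$size:r=$rate,format=$pix_fmt,scale=640:360,nullsink"
printf '%s\n' "$graph"

# With a built FFmpeg source tree and graphviz installed:
# printf '%s\n' "$graph" | tools/graph2dot -o graph.tmp && dot -Tpng graph.tmp -o graph.png
```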

@c man end GRAPH2DOT

@chapter Filtergraph description
@c man begin FILTERGRAPH DESCRIPTION

A filtergraph is a directed graph of connected filters. It can contain
cycles, and there can be multiple links between a pair of
filters. Each link has one input pad on one side connecting it to one
filter from which it takes its input, and one output pad on the other
side connecting it to one filter accepting its output.

Each filter in a filtergraph is an instance of a filter class
registered in the application, which defines the features and the
number of input and output pads of the filter.

A filter with no input pads is called a "source", and a filter with no
output pads is called a "sink".

@anchor{Filtergraph syntax}
@section Filtergraph syntax

A filtergraph has a textual representation, which is recognized by the
@option{-filter}/@option{-vf}/@option{-af} and
@option{-filter_complex} options in @command{ffmpeg} and
@option{-vf}/@option{-af} in @command{ffplay}, and by the
@code{avfilter_graph_parse_ptr()} function defined in
@file{libavfilter/avfilter.h}.

A filterchain consists of a sequence of connected filters, each one
connected to the previous one in the sequence. A filterchain is
represented by a list of ","-separated filter descriptions.

A filtergraph consists of a sequence of filterchains. A sequence of
filterchains is represented by a list of ";"-separated filterchain
descriptions.

A filter is represented by a string of the form:
[@var{in_link_1}]...[@var{in_link_N}]@var{filter_name}=@var{arguments}[@var{out_link_1}]...[@var{out_link_M}]

@var{filter_name} is the name of the filter class of which the
described filter is an instance, and has to be the name of one of
the filter classes registered in the program.
The name of the filter class is optionally followed by a string
"=@var{arguments}".

@var{arguments} is a string which contains the parameters used to
initialize the filter instance. It may have one of the following forms:
@itemize

@item
A ':'-separated list of @var{key=value} pairs.

@item
A ':'-separated list of @var{value}. In this case, the keys are assumed to be
the option names in the order they are declared. E.g. the @code{fade} filter
declares three options in this order -- @option{type}, @option{start_frame} and
@option{nb_frames}. Then the parameter list @var{in:0:30} means that the value
@var{in} is assigned to the option @option{type}, @var{0} to
@option{start_frame} and @var{30} to @option{nb_frames}.

@item
A ':'-separated list of mixed direct @var{value} and long @var{key=value}
pairs. The direct @var{value} must precede the @var{key=value} pairs, and
follow the same ordering constraints as in the previous point. The following
@var{key=value} pairs can be set in any preferred order.

@end itemize
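
The equivalence between the positional and the named forms can be sketched
in shell. The snippet below expands the positional list @var{in:0:30} into
@var{key=value} pairs, using the option order given above for the
@code{fade} filter. It only illustrates the mapping and is not how
libavfilter parses arguments; all variable names are illustrative.

```shell
# Option names of the fade filter, in declaration order (from the text above).
names="type start_frame nb_frames"
# Positional argument list to expand.
args="in:0:30"

# Split the ':'-separated values into the positional parameters.
old_ifs=$IFS
IFS=:
set -- $args
IFS=$old_ifs

# Pair each value with the option name at the same position.
out=""
for name in $names; do
    [ "$#" -gt 0 ] || break
    if [ -z "$out" ]; then
        out="$name=$1"
    else
        out="$out:$name=$1"
    fi
    shift
done

named="fade=$out"
printf '%s\n' "$named"
```

Following the third form, the mixed description
@code{fade=in:start_frame=0:nb_frames=30} would describe the same settings.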

If the option value itself is a list of items (e.g. the @code{format} filter
takes a list of pixel formats), the items in the list are usually separated by
@samp{|}.

The list of arguments can be quoted using the character @samp{'} as initial
and ending mark, and the character @samp{\} for escaping the characters
within the quoted text; otherwise the argument string is considered
terminated when the next special character (belonging to the set
@samp{[]=;,}) is encountered.

The name and arguments of the filter are optionally preceded and
followed by a list of link labels.
A link label allows one to name a link and associate it to a filter output
or input pad. The preceding labels @var{in_link_1}
... @var{in_link_N}, are associated to the filter input pads,
the following labels @var{out_link_1} ... @var{out_link_M}, are
associated to the output pads.

When two link labels with the same name are found in the
filtergraph, a link between the corresponding input and output pad is
created.

If an output pad is not labelled, it is linked by default to the first
unlabelled input pad of the next filter in the filterchain.
For example in the filterchain
@example
nullsrc, split[L1], [L2]overlay, nullsink
@end example
the split filter instance has two output pads, and the overlay filter
instance two input pads. The first output pad of split is labelled
"L1", the first input pad of overlay is labelled "L2", and the second
output pad of split is linked to the second input pad of overlay,
which are both unlabelled.

In a filter description, if the input label of the first filter is not
specified, "in" is assumed; if the output label of the last filter is not
specified, "out" is assumed.

In a complete filterchain all the unlabelled filter input and output
pads must be connected. A filtergraph is considered valid if all the
filter input and output pads of all the filterchains are connected.

Libavfilter will automatically insert @ref{scale} filters where format
conversion is required. It is possible to specify swscale flags
for those automatically inserted scalers by prepending
@code{sws_flags=@var{flags};}
to the filtergraph description.

Here is a BNF description of the filtergraph syntax:
@example
@var{NAME}             ::= sequence of alphanumeric characters and '_'
@var{LINKLABEL}        ::= "[" @var{NAME} "]"
@var{LINKLABELS}       ::= @var{LINKLABEL} [@var{LINKLABELS}]
@var{FILTER_ARGUMENTS} ::= sequence of chars (possibly quoted)
@var{FILTER}           ::= [@var{LINKLABELS}] @var{NAME} ["=" @var{FILTER_ARGUMENTS}] [@var{LINKLABELS}]
@var{FILTERCHAIN}      ::= @var{FILTER} [,@var{FILTERCHAIN}]
@var{FILTERGRAPH}      ::= [sws_flags=@var{flags};] @var{FILTERCHAIN} [;@var{FILTERGRAPH}]
@end example

@section Notes on filtergraph escaping

Filtergraph description composition entails several levels of
escaping. See @ref{quoting_and_escaping,,the "Quoting and escaping"
section in the ffmpeg-utils(1) manual,ffmpeg-utils} for more
information about the employed escaping procedure.

A first level escaping affects the content of each filter option
value, which may contain the special character @code{:} used to
separate values, or one of the escaping characters @code{\'}.

A second level escaping affects the whole filter description, which
may contain the escaping characters @code{\'} or the special
characters @code{[],;} used by the filtergraph description.

Finally, when you specify a filtergraph on a shell commandline, you
need to perform a third level escaping for the shell special
characters contained within it.

For example, consider the following string to be embedded in
the @ref{drawtext} filter description @option{text} value:
@example
this is a 'string': may contain one, or more, special characters
@end example

This string contains the @code{'} special escaping character, and the
@code{:} special character, so it needs to be escaped in this way:
@example
text=this is a \'string\'\: may contain one, or more, special characters
@end example

A second level of escaping is required when embedding the filter
description in a filtergraph description, in order to escape all the
filtergraph special characters. Thus the example above becomes:
@example
drawtext=text=this is a \\\'string\\\'\\: may contain one\, or more\, special characters
@end example
(note that in addition to the @code{\'} escaping special characters,
also @code{,} needs to be escaped).

Finally an additional level of escaping is needed when writing the
filtergraph description in a shell command, which depends on the
escaping rules of the adopted shell. For example, assuming that
@code{\} is special and needs to be escaped with another @code{\}, the
previous string will finally result in:
@example
-vf "drawtext=text=this is a \\\\\\'string\\\\\\'\\\\: may contain one\\, or more\\, special characters"
@end example
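
The first two levels can be reproduced mechanically. The following is a
minimal sketch using @command{sed}, applied to the same string as in the
example above; the variable names are illustrative, and the character sets
are the ones listed in this section rather than an exhaustive
implementation of the escaping rules.

```shell
# Raw text for the drawtext "text" option (same string as in the example above).
text="this is a 'string': may contain one, or more, special characters"

# Single quote held in a variable to keep the sed scripts readable.
sq="'"

# First level: escape \ ' and : inside the option value.
level1=$(printf '%s\n' "$text" | sed "s/[\\\\$sq:]/\\\\&/g")

# Second level: escape \ ' [ ] , and ; in the whole filter description.
level2=$(printf '%s\n' "text=$level1" | sed "s/[][\\\\$sq,;]/\\\\&/g")

filter="drawtext=$level2"
printf '%s\n' "$filter"
```

The third level is not applied here because it depends on the shell. When
the result is kept in a variable and passed as @code{-vf "$filter"}, a
POSIX shell hands the string over unchanged, so no further escaping is
needed in that case.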

@chapter Timeline editing

Some filters support a generic @option{enable} option. For the filters
supporting timeline editing, this option can be set to an expression which is
evaluated before sending a frame to the filter. If the evaluation is non-zero,
the filter will be enabled, otherwise the frame will be sent unchanged to the
next filter in the filtergraph.

The expression accepts the following values:
@table @samp
@item t
timestamp expressed in seconds, NAN if the input timestamp is unknown

@item n
sequential number of the input frame, starting from 0

@item pos
the position in the file of the input frame, NAN if unknown

@item w
@item h
width and height of the input frame if video
@end table

Additionally, these filters support an @option{enable} command that can be used
to re-define the expression.

The @option{enable} option follows the same syntax rules as any other
filtering option.

For example, to enable a blur filter (@ref{smartblur}) from 10 seconds to 3
minutes, and a @ref{curves} filter starting at 3 seconds:
@example
smartblur = enable='between(t,10,3*60)',
curves    = enable='gte(t,3)' : preset=cross_process
@end example
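
To make the timing concrete, the following shell sketch evaluates the two
conditions for a few whole-second timestamps, using integer arithmetic as
a stand-in for the expression evaluator. It assumes that
@code{between(x,min,max)} is 1 when @var{x} lies in the closed interval
from @var{min} to @var{max}, and that @code{gte(x,y)} is 1 when @var{x} is
greater than or equal to @var{y}. The variable names are illustrative.

```shell
# For each sample timestamp (in seconds), report which filters are enabled.
result=""
for t in 0 3 10 180 181; do
    # between(t,10,3*60): 1 when 10 <= t <= 180
    blur=$(( t >= 10 && t <= 3 * 60 ))
    # gte(t,3): 1 when t >= 3
    curves=$(( t >= 3 ))
    line="t=$t smartblur=$blur curves=$curves"
    printf '%s\n' "$line"
    result="$result$line;"
done
```

Frames for which a value is 0 are passed unchanged to the next filter, as
described above.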

@c man end FILTERGRAPH DESCRIPTION

@chapter Audio Filters
@c man begin AUDIO FILTERS

When you configure your FFmpeg build, you can disable any of the
existing filters using @code{--disable-filter=@var{NAME}}, or all of them
using @code{--disable-filters}.
The configure output will show the audio filters included in your
build.

Below is a description of the currently available audio filters.

@section acrossfade

Apply cross fade from one input audio stream to another input audio stream.
The cross fade is applied for the specified duration near the end of the first stream.

The filter accepts the following options:

@table @option
@item nb_samples, ns
Specify the number of samples for which the cross fade effect has to last.
At the end of the cross fade effect the first input audio will be completely
silent. Default is 44100.

@item duration, d
Specify the duration of the cross fade effect. See
@ref{time duration syntax,,the Time duration section in the ffmpeg-utils(1) manual,ffmpeg-utils}
for the accepted syntax.
By default the duration is determined by @var{nb_samples}.
If set, this option is used instead of @var{nb_samples}.

@item overlap, o
Set whether the end of the first stream should overlap with the start of
the second stream. Default is enabled.

@item curve1, c1
Set curve for cross fade transition for first stream.

@item curve2, c2
Set curve for cross fade transition for second stream.

For a description of the available curve types, see the @ref{afade} filter
description.
@end table

@subsection Examples

@itemize
@item
Cross fade from one input to another:
@example
ffmpeg -i first.flac -i second.flac -filter_complex acrossfade=d=10:c1=exp:c2=exp output.flac
@end example

@item
Cross fade from one input to another but without overlapping:
@example
ffmpeg -i first.flac -i second.flac -filter_complex acrossfade=d=10:o=0:c1=exp:c2=exp output.flac
@end example
@end itemize
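
When the fade length is more convenient to express in samples,
@option{nb_samples} can be derived from the duration and the sample rate.
The sketch below assumes a 44100 Hz stream, which is an assumption about
the input rather than something the filter requires; the variable names
are illustrative.

```shell
# Assumed sample rate of both inputs, in Hz.
sample_rate=44100
# Desired cross fade length, in seconds.
duration=10

# Number of samples covered by the cross fade.
nb_samples=$(( sample_rate * duration ))

filter="acrossfade=ns=$nb_samples:c1=exp:c2=exp"
printf '%s\n' "$filter"
```

For a 44100 Hz stream this requests the same fade length as
@code{acrossfade=d=10:c1=exp:c2=exp}.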

@section adelay

Delay one or more audio channels.

The gap introduced at the start of a delayed channel is filled with silence.

The filter accepts the following option:

@table @option
@item delays
Set list of delays in milliseconds for each channel separated by '|'.
At least one delay greater than 0 should be provided.
Unused delays will be silently ignored. If the number of given delays is
smaller than the number of channels, all remaining channels will not be delayed.
@end table

@subsection Examples

@itemize
@item
Delay first channel by 1.5 seconds, the third channel by 0.5 seconds and leave
the second channel (and any other channels that may be present) unchanged.
@example
adelay=1500|0|500
@end example
@end itemize
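
Because the delays are separated by '|', the filter description has to be
quoted when it is written in a shell, otherwise the shell treats '|' as a
pipe. A minimal sketch, with illustrative variable and file names:

```shell
# Per-channel delays in milliseconds (1.5 s, none, 0.5 s).
d1=1500
d2=0
d3=500

# Join with '|'; the double quotes keep the shell from seeing a pipeline.
filter="adelay=$d1|$d2|$d3"
printf '%s\n' "$filter"

# To apply it (requires ffmpeg; input.wav and output.wav are placeholders):
# ffmpeg -i input.wav -af "$filter" output.wav
```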

@section aecho

Apply echoing to the input audio.

Echoes are reflected sound and can occur naturally amongst mountains
(and sometimes large buildings) when talking or shouting; digital echo
effects emulate this behaviour and are often used to help fill out the
sound of a single instrument or vocal. The time difference between the
original signal and the reflection is the @code{delay}, and the
loudness of the reflected signal is the @code{decay}.
Multiple echoes can have different delays and decays.

A description of the accepted parameters follows.

@table @option
@item in_gain
Set input gain of reflected signal. Default is @code{0.6}.

@item out_gain
Set output gain of reflected signal. Default is @code{0.3}.

@item delays
Set list of time intervals in milliseconds between original signal and reflections
separated by '|'. Allowed range for each @code{delay} is @code{(0 - 90000.0]}.
Default is @code{1000}.

@item decays
Set list of loudnesses of reflected signals separated by '|'.
Allowed range for each @code{decay} is @code{(0 - 1.0]}.
Default is @code{0.5}.
@end table

@subsection Examples

@itemize
@item
Make it sound as if there are twice as many instruments as are actually playing:
@example
aecho=0.8:0.88:60:0.4
@end example

@item
If the delay is very short, then it sounds like a (metallic) robot playing music:
@example
aecho=0.8:0.88:6:0.4
@end example

@item
A longer delay will sound like an open air concert in the mountains:
@example
aecho=0.8:0.9:1000:0.3
@end example

@item
Same as above but with one more mountain:
@example
aecho=0.8:0.9:1000|1800:0.3|0.25
@end example
@end itemize

@section aeval

Modify an audio signal according to the specified expressions.

This filter accepts one or more expressions (one for each channel),
which are evaluated and used to modify a corresponding audio signal.

It accepts the following parameters:

@table @option
@item exprs
Set the '|'-separated expressions list for each separate channel. If
the number of input channels is greater than the number of
expressions, the last specified expression is used for the remaining
output channels.

@item channel_layout, c
Set output channel layout. If not specified, the channel layout is
determined by the number of expressions. If set to @samp{same}, the
input channel layout is used.
@end table

Each expression in @var{exprs} can contain the following constants and functions:

@table @option
@item ch
channel number of the current expression

@item n
number of the evaluated sample, starting from 0

@item s
sample rate

@item t
time of the evaluated sample expressed in seconds

@item nb_in_channels
@item nb_out_channels
input and output number of channels

@item val(CH)
the value of input channel with number @var{CH}
@end table

Note: this filter is slow. For faster processing you should use a
dedicated filter.

@subsection Examples

@itemize
@item
Half volume:
@example
aeval=val(ch)/2:c=same
@end example

@item
Invert phase of the second channel:
@example
aeval=val(0)|-val(1)
@end example
@end itemize

@anchor{afade}
@section afade

Apply fade-in/out effect to input audio.

A description of the accepted parameters follows.

@table @option
@item type, t
Specify the effect type; it can be either @code{in} for fade-in, or
@code{out} for a fade-out effect. Default is @code{in}.

@item start_sample, ss
Specify the number of the start sample for starting to apply the fade
effect. Default is 0.

@item nb_samples, ns
Specify the number of samples for which the fade effect has to last. At
the end of the fade-in effect the output audio will have the same
volume as the input audio; at the end of the fade-out transition
the output audio will be silence. Default is 44100.

@item start_time, st
Specify the start time of the fade effect. Default is 0.
The value must be specified as a time duration; see
@ref{time duration syntax,,the Time duration section in the ffmpeg-utils(1) manual,ffmpeg-utils}
for the accepted syntax.
If set, this option is used instead of @var{start_sample}.

@item duration, d
Specify the duration of the fade effect. See
@ref{time duration syntax,,the Time duration section in the ffmpeg-utils(1) manual,ffmpeg-utils}
for the accepted syntax.
At the end of the fade-in effect the output audio will have the same
volume as the input audio; at the end of the fade-out transition
the output audio will be silence.
By default the duration is determined by @var{nb_samples}.
If set, this option is used instead of @var{nb_samples}.

@item curve
Set curve for fade transition.

It accepts the following values:
@table @option
@item tri
select triangular, linear slope (default)
@item qsin
select quarter of sine wave
@item hsin
select half of sine wave
@item esin
select exponential sine wave
@item log
select logarithmic
@item ipar
select inverted parabola
@item qua
select quadratic
@item cub
select cubic
@item squ
select square root
@item cbr
select cubic root
@item par
select parabola
@item exp
select exponential
@item iqsin
select inverted quarter of sine wave
@item ihsin
select inverted half of sine wave
@item dese
select double-exponential seat
@item desi
select double-exponential sigmoid
@end table
@end table

@subsection Examples

@itemize
@item
Fade in first 15 seconds of audio:
@example
afade=t=in:ss=0:d=15
@end example

@item
Fade out last 25 seconds of a 900 seconds audio:
@example
afade=t=out:st=875:d=25
@end example
@end itemize
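
The start time of a fade-out is the total length minus the fade length. As
a small sketch, assuming the total duration is already known (900 seconds
here, as in the example above; the variable names are illustrative):

```shell
# Total length of the audio and length of the fade-out, in seconds.
total=900
fade=25

# The fade-out has to start this many seconds into the stream.
start=$(( total - fade ))

filter="afade=t=out:st=$start:d=$fade"
printf '%s\n' "$filter"
```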

@anchor{aformat}
@section aformat

Set output format constraints for the input audio. The framework will
negotiate the most appropriate format to minimize conversions.

It accepts the following parameters:
@table @option

@item sample_fmts
A '|'-separated list of requested sample formats.

@item sample_rates
A '|'-separated list of requested sample rates.

@item channel_layouts
A '|'-separated list of requested channel layouts.

See @ref{channel layout syntax,,the Channel Layout section in the ffmpeg-utils(1) manual,ffmpeg-utils}
for the required syntax.
@end table

If a parameter is omitted, all values are allowed.

Force the output to either unsigned 8-bit or signed 16-bit stereo:
@example
aformat=sample_fmts=u8|s16:channel_layouts=stereo
@end example

@section allpass

Apply a two-pole all-pass filter with central frequency (in Hz)
@var{frequency}, and filter-width @var{width}.
An all-pass filter changes the audio's frequency to phase relationship
without changing its frequency to amplitude relationship.

The filter accepts the following options:

@table @option
@item frequency, f
Set frequency in Hz.

@item width_type
Set method to specify band-width of filter.
@table @option
@item h
Hz
@item q
Q-Factor
@item o
octave
@item s
slope
@end table

@item width, w
Specify the band-width of a filter in width_type units.
@end table

@anchor{amerge}
@section amerge

Merge two or more audio streams into a single multi-channel stream.

The filter accepts the following options:

@table @option

@item inputs
Set the number of inputs. Default is 2.

@end table

If the channel layouts of the inputs are disjoint, and therefore compatible,
the channel layout of the output will be set accordingly and the channels
will be reordered as necessary. If the channel layouts of the inputs are not
disjoint, the output will have all the channels of the first input then all
the channels of the second input, in that order, and the channel layout of
the output will be the default value corresponding to the total number of
channels.

For example, if the first input is in 2.1 (FL+FR+LF) and the second input
is FC+BL+BR, then the output will be in 5.1, with the channels in the
following order: a1, a2, b1, a3, b2, b3 (a1 is the first channel of the
first input, b1 is the first channel of the second input).

On the other hand, if both inputs are in stereo, the output channels will be
in the default order: a1, a2, b1, b2, and the channel layout will be
arbitrarily set to 4.0, which may or may not be the expected value.

All inputs must have the same sample rate and format.

If inputs do not have the same duration, the output will stop with the
shortest.

@subsection Examples

@itemize
@item
Merge two mono files into a stereo stream:
@example
amovie=left.wav [l] ; amovie=right.mp3 [r] ; [l] [r] amerge
@end example

@item
Multiple merges assuming 1 video stream and 6 audio streams in @file{input.mkv}:
@example
ffmpeg -i input.mkv -filter_complex "[0:1][0:2][0:3][0:4][0:5][0:6] amerge=inputs=6" -c:a pcm_s16le output.mkv
@end example
@end itemize

@section amix

Mixes multiple audio inputs into a single output.

Note that this filter only supports float samples (the @var{amerge}
and @var{pan} audio filters support many formats). If the @var{amix}
input has integer samples then @ref{aresample} will be automatically
inserted to perform the conversion to float samples.

For example
@example
ffmpeg -i INPUT1 -i INPUT2 -i INPUT3 -filter_complex amix=inputs=3:duration=first:dropout_transition=3 OUTPUT
@end example
will mix 3 input audio streams to a single output with the same duration as the
first input and a dropout transition time of 3 seconds.

It accepts the following parameters:
@table @option

@item inputs
The number of inputs. If unspecified, it defaults to 2.

@item duration
How to determine the end-of-stream.
@table @option

@item longest
The duration of the longest input. (default)

@item shortest
The duration of the shortest input.

@item first
The duration of the first input.

@end table

@item dropout_transition
The transition time, in seconds, for volume renormalization when an input
stream ends. The default value is 2 seconds.

@end table

@section anull

Pass the audio source unchanged to the output.

@section apad

Pad the end of an audio stream with silence.

This can be used together with @command{ffmpeg} @option{-shortest} to
extend audio streams to the same length as the video stream.

A description of the accepted options follows.

@table @option
@item packet_size
Set silence packet size. Default value is 4096.

@item pad_len
Set the number of samples of silence to add to the end. After the
value is reached, the stream is terminated. This option is mutually
exclusive with @option{whole_len}.

@item whole_len
Set the minimum total number of samples in the output audio stream. If
the value is longer than the input audio length, silence is added to
the end, until the value is reached. This option is mutually exclusive
with @option{pad_len}.
@end table

If neither the @option{pad_len} nor the @option{whole_len} option is
set, the filter will add silence to the end of the input stream
indefinitely.

@subsection Examples

@itemize
@item
Add 1024 samples of silence to the end of the input:
@example
apad=pad_len=1024
@end example

@item
Make sure the audio output will contain at least 10000 samples, pad
the input with silence if required:
@example
apad=whole_len=10000
@end example

@item
Use @command{ffmpeg} to pad the audio input with silence, so that the
video stream is always the shortest one and is converted in full to the
output file when using the @option{shortest} option:
@example
ffmpeg -i VIDEO -i AUDIO -filter_complex "[1:0]apad" -shortest OUTPUT
@end example
@end itemize
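
Since @option{pad_len} and @option{whole_len} are expressed in samples, a
target length in seconds has to be multiplied by the sample rate first.
The sketch below assumes a 48000 Hz stream and a 10 second minimum length;
both values and the variable names are illustrative.

```shell
# Assumed sample rate of the audio stream, in Hz.
sample_rate=48000
# Desired minimum output length, in seconds.
seconds=10

# Minimum total number of samples in the output.
whole_len=$(( sample_rate * seconds ))

filter="apad=whole_len=$whole_len"
printf '%s\n' "$filter"
```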

@section aphaser
Add a phasing effect to the input audio.

A phaser filter creates a series of peaks and troughs in the frequency spectrum.
The positions of the peaks and troughs are modulated so that they vary over time, creating a sweeping effect.

A description of the accepted parameters follows.

@table @option
@item in_gain
Set input gain. Default is 0.4.

@item out_gain
Set output gain. Default is 0.74.

@item delay
Set delay in milliseconds. Default is 3.0.

@item decay
Set decay. Default is 0.4.

@item speed
Set modulation speed in Hz. Default is 0.5.

@item type
Set modulation type. Default is triangular.

It accepts the following values:
@table @samp
@item triangular, t
@item sinusoidal, s
@end table
@end table
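
As a minimal sketch, the following spells out the default values listed
above explicitly, so it should behave like a plain @code{aphaser}. It is
meant as a starting point for experimenting with individual values, not as
a recommended setting:
@example
aphaser=in_gain=0.4:out_gain=0.74:delay=3.0:decay=0.4:speed=0.5:type=t
@end example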

@anchor{aresample}
@section aresample

Resample the input audio to the specified parameters, using the
libswresample library. If none are specified then the filter will
automatically convert between its input and output.

This filter is also able to stretch/squeeze the audio data to make it match
the timestamps, to inject silence or cut out audio to make it match the
timestamps, to do a combination of both, or to do neither.

The filter accepts the syntax
[@var{sample_rate}:]@var{resampler_options}, where @var{sample_rate}
expresses a sample rate and @var{resampler_options} is a list of
@var{key}=@var{value} pairs, separated by ":". See the
ffmpeg-resampler manual for the complete list of supported options.

@subsection Examples

@itemize
@item
Resample the input audio to 44100Hz:
@example
aresample=44100
@end example

@item
Stretch/squeeze samples to the given timestamps, with a maximum of 1000
samples per second compensation:
@example
aresample=async=1000
@end example
@end itemize

@section asetnsamples

Set the number of samples per each output audio frame.

The last output packet may contain a different number of samples, as
the filter will flush all the remaining samples when the input audio
signals its end.

The filter accepts the following options:

@table @option

@item nb_out_samples, n
Set the number of samples per each output audio frame. The number is
intended as the number of samples @emph{per each channel}.
Default value is 1024.

@item pad, p
If set to 1, the filter will pad the last audio frame with zeroes, so
that the last frame will contain the same number of samples as the
previous ones. Default value is 1.
@end table

For example, to set the number of per-frame samples to 1234 and
disable padding for the last frame, use:
@example
asetnsamples=n=1234:p=0
@end example

@section asetrate

Set the sample rate without altering the PCM data.
This will result in a change of speed and pitch.

The filter accepts the following options:

@table @option
@item sample_rate, r
Set the output sample rate. Default is 44100 Hz.
@end table
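
As an illustrative sketch, assuming a 44100 Hz input (an assumption about
the source material), declaring half that rate makes the audio play at half
speed and one octave lower, since the samples themselves are left
untouched:
@example
asetrate=r=22050
@end example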

@section ashowinfo

Show a line containing various information for each input audio frame.
The input audio is not modified.

The shown line contains a sequence of key/value pairs of the form
@var{key}:@var{value}.

The following values are shown in the output:

@table @option
@item n
The (sequential) number of the input frame, starting from 0.

@item pts
The presentation timestamp of the input frame, in time base units; the time base
depends on the filter input pad, and is usually 1/@var{sample_rate}.

@item pts_time
The presentation timestamp of the input frame in seconds.

@item pos
The position of the frame in the input stream, or -1 if this information is
unavailable and/or meaningless (for example in case of synthetic audio).

@item fmt
The sample format.

@item chlayout
The channel layout.

@item rate
The sample rate for the audio frame.

@item nb_samples
The number of samples (per channel) in the frame.

@item checksum
The Adler-32 checksum (printed in hexadecimal) of the audio data. For planar
audio, the data is treated as if all the planes were concatenated.

@item plane_checksums
A list of Adler-32 checksums for each data plane.
@end table
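
Because the filter only prints information and leaves the audio unchanged,
it is typically paired with a discarded output. A minimal sketch, where
@file{INPUT} is a placeholder for any file with an audio stream:
@example
ffmpeg -i INPUT -af ashowinfo -f null -
@end example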

@anchor{astats}
@section astats

Display time domain statistical information about the audio channels.
Statistics are calculated and displayed for each audio channel and,
where applicable, an overall figure is also given.

It accepts the following options:
@table @option
@item length
Short window length in seconds, used for peak and trough RMS measurement.
Default is @code{0.05} (50 milliseconds). Allowed range is @code{[0.01 - 10]}.

@item metadata

Set metadata injection. All the metadata keys are prefixed with @code{lavfi.astats.X},
where @code{X} is the channel number starting from 1 or the string @code{Overall}. Default is
disabled.

Available keys for each channel are:
DC_offset
Min_level
Max_level
Min_difference
Max_difference
Mean_difference
Peak_level
RMS_peak
RMS_trough
Crest_factor
Flat_factor
Peak_count
Bit_depth

and for Overall:
DC_offset
Min_level
Max_level
Min_difference
Max_difference
Mean_difference
Peak_level
RMS_level
RMS_peak
RMS_trough
Flat_factor
Peak_count
Bit_depth
Number_of_samples

For example, a full key looks like @code{lavfi.astats.1.DC_offset} or
@code{lavfi.astats.Overall.Peak_count}.

For a description of what each key means, read below.

@item reset
Set the number of frames after which the statistics are recalculated.
Default is disabled.
@end table

A description of each shown parameter follows:

@table @option
@item DC offset
Mean amplitude displacement from zero.

@item Min level
Minimal sample level.

@item Max level
Maximal sample level.

@item Min difference
Minimal difference between two consecutive samples.

@item Max difference
Maximal difference between two consecutive samples.

@item Mean difference
Mean difference between two consecutive samples, i.e. the average of each
difference between two consecutive samples.

@item Peak level dB
@item RMS level dB
Standard peak and RMS level measured in dBFS.

@item RMS peak dB
@item RMS trough dB
Peak and trough values for RMS level measured over a short window.

@item Crest factor
Standard ratio of peak to RMS level (note: not in dB).

@item Flat factor
Flatness (i.e. consecutive samples with the same value) of the signal at its peak levels
(i.e. either @var{Min level} or @var{Max level}).

@item Peak count
Number of occasions (not the number of samples) that the signal attained either
@var{Min level} or @var{Max level}.

@item Bit depth
Overall bit depth of audio. Number of bits used for each sample.
@end table
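
A minimal sketch of a typical invocation, where @file{INPUT} is a
placeholder: since only the printed statistics are of interest here, the
filtered audio itself is discarded through the null muxer.
@example
ffmpeg -i INPUT -af astats -f null -
@end example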

@section astreamsync

Forward two audio streams and control the order in which the buffers are forwarded.

The filter accepts the following options:

@table @option
@item expr, e
Set the expression deciding which stream should be
forwarded next: if the result is negative, the first stream is forwarded; if
the result is positive or zero, the second stream is forwarded. It can use
the following variables:

@table @var
@item b1 b2
number of buffers forwarded so far on each stream
@item s1 s2
number of samples forwarded so far on each stream
@item t1 t2
current timestamp of each stream
@end table

The default value is @code{t1-t2}, which means to always forward the stream
that has a smaller timestamp.
@end table

@subsection Examples

Stress-test @code{amerge} by randomly sending buffers on the wrong
input, while avoiding too much of a desynchronization:
@example
amovie=file.ogg [a] ; amovie=file.mp3 [b] ;
[a] [b] astreamsync=(2*random(1))-1+tanh(5*(t1-t2)) [a2] [b2] ;
[a2] [b2] amerge
@end example

@section asyncts

Synchronize audio data with timestamps by squeezing/stretching it and/or
dropping samples/adding silence when needed.

This filter is not built by default; please use @ref{aresample} to do squeezing/stretching.

It accepts the following parameters:
@table @option

@item compensate
Enable stretching/squeezing the data to make it match the timestamps. Disabled
by default. When disabled, time gaps are covered with silence.

@item min_delta
The minimum difference between timestamps and audio data (in seconds) to trigger
adding/dropping samples. The default value is 0.1. If you get an imperfect
sync with this filter, try setting this parameter to 0.

@item max_comp
The maximum compensation in samples per second. Only relevant with compensate=1.
The default value is 500.

@item first_pts
Assume that the first PTS should be this value. The time base is 1 / sample
rate. This allows for padding/trimming at the start of the stream. By default,
no assumption is made about the first frame's expected PTS, so no padding or
trimming is done. For example, this could be set to 0 to pad the beginning with
silence if an audio stream starts after the video stream or to trim any samples
with a negative PTS due to encoder delay.

@end table
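
As an illustrative sketch built only from the options described above, the
following enables stretching/squeezing and pads or trims the start of the
stream so that it begins at PTS 0:
@example
asyncts=compensate=1:first_pts=0
@end example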

@section atempo

Adjust audio tempo.

The filter accepts exactly one parameter, the audio tempo. If not
specified then the filter will assume nominal 1.0 tempo. Tempo must
be in the [0.5, 2.0] range.

@subsection Examples

@itemize
@item
Slow down audio to 80% tempo:
@example
atempo=0.8
@end example

@item
Speed up audio to 125% tempo:
@example
atempo=1.25
@end example
@end itemize
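
Because a single instance is limited to the [0.5, 2.0] range, larger
changes can be obtained by chaining several instances, whose factors
multiply. As a sketch, the following speeds audio up to 400% tempo
(2.0 * 2.0 = 4.0):
@example
atempo=2.0,atempo=2.0
@end example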

@section atrim

Trim the input so that the output contains one continuous subpart of the input.

It accepts the following parameters:
@table @option
@item start
Timestamp (in seconds) of the start of the section to keep. I.e. the audio
sample with the timestamp @var{start} will be the first sample in the output.

@item end
Specify time of the first audio sample that will be dropped, i.e. the
audio sample immediately preceding the one with the timestamp @var{end} will be
the last sample in the output.

@item start_pts
Same as @var{start}, except this option sets the start timestamp in samples
instead of seconds.

@item end_pts
Same as @var{end}, except this option sets the end timestamp in samples instead
of seconds.

@item duration
The maximum duration of the output in seconds.

@item start_sample
The number of the first sample that should be output.

@item end_sample
The number of the first sample that should be dropped.
@end table

@option{start}, @option{end}, and @option{duration} are expressed as time
duration specifications; see
@ref{time duration syntax,,the Time duration section in the ffmpeg-utils(1) manual,ffmpeg-utils}.

Note that the first two sets of the start/end options and the @option{duration}
option look at the frame timestamp, while the _sample options simply count the
samples that pass through the filter. So start/end_pts and start/end_sample will
give different results when the timestamps are wrong, inexact or do not start at
zero. Also note that this filter does not modify the timestamps. If you wish
to have the output timestamps start at zero, insert the asetpts filter after the
atrim filter.

If multiple start or end options are set, this filter tries to be greedy and
keep all samples that match at least one of the specified constraints. To keep
only the part that matches all the constraints at once, chain multiple atrim
filters.

The defaults are such that all the input is kept. So it is possible to set e.g.
just the end values to keep everything before the specified time.

Examples:
@itemize
@item
Drop everything except the second minute of input:
@example
ffmpeg -i INPUT -af atrim=60:120 OUTPUT
@end example

@item
Keep only the first 1000 samples:

@example
ffmpeg -i INPUT -af atrim=end_sample=1000 OUTPUT
@end example

@end itemize
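
As noted above, atrim leaves the timestamps untouched. The following is a
sketch of the second-minute example with the timestamps rebased to start at
zero; it assumes the @code{asetpts} filter and its @code{PTS-STARTPTS}
expression are available in your build:
@example
atrim=start=60:end=120,asetpts=PTS-STARTPTS
@end example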

@section bandpass

Apply a two-pole Butterworth band-pass filter with central
frequency @var{frequency}, and (3dB-point) band-width @var{width}.
The @var{csg} option selects a constant skirt gain (peak gain = Q)
instead of the default: constant 0dB peak gain.
The filter rolls off at 6dB per octave (20dB per decade).

The filter accepts the following options:

@table @option
@item frequency, f
Set the filter's central frequency. Default is @code{3000}.

@item csg
Constant skirt gain if set to 1. Defaults to 0.

@item width_type
Set method to specify band-width of filter.
@table @option
@item h
Hz
@item q
Q-Factor
@item o
octave
@item s
slope
@end table

@item width, w
Specify the band-width of a filter in width_type units.
@end table
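
As an illustrative sketch using only the options above, the following keeps
a band 200 Hz wide centered on 1000 Hz (both values are arbitrary and chosen
for the example):
@example
bandpass=f=1000:width_type=h:w=200
@end example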

@section bandreject

Apply a two-pole Butterworth band-reject filter with central
frequency @var{frequency}, and (3dB-point) band-width @var{width}.
The filter rolls off at 6dB per octave (20dB per decade).

The filter accepts the following options:

@table @option
@item frequency, f
Set the filter's central frequency. Default is @code{3000}.

@item width_type
Set method to specify band-width of filter.
@table @option
@item h
Hz
@item q
Q-Factor
@item o
octave
@item s
slope
@end table

@item width, w
Specify the band-width of a filter in width_type units.
@end table
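
As an illustrative sketch, the following attenuates a band 10 Hz wide
around 50 Hz, as one might try against mains hum (the values are arbitrary
and depend on the recording):
@example
bandreject=f=50:width_type=h:w=10
@end example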

@section bass

Boost or cut the bass (lower) frequencies of the audio using a two-pole
shelving filter with a response similar to that of a standard
hi-fi's tone-controls. This is also known as shelving equalisation (EQ).

The filter accepts the following options:

@table @option
@item gain, g
Give the gain at 0 Hz. Its useful range is about -20
(for a large cut) to +20 (for a large boost).
Beware of clipping when using a positive gain.

@item frequency, f
Set the filter's central frequency; this can be used
to extend or reduce the frequency range to be boosted or cut.
The default value is @code{100} Hz.

@item width_type
Set method to specify band-width of filter.
@table @option
@item h
Hz
@item q
Q-Factor
@item o
octave
@item s
slope
@end table

@item width, w
Determine how steep the filter's shelf transition is.
@end table
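
As a minimal sketch, the following applies a moderate boost with a gain of
6 at the default frequency of 100 Hz, which is written out explicitly here.
The gain value is arbitrary; as noted above, positive gains can cause
clipping.
@example
bass=g=6:f=100
@end example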

@section biquad

Apply a biquad IIR filter with the given coefficients, where
@var{b0}, @var{b1}, @var{b2} and @var{a0}, @var{a1}, @var{a2}
are the numerator and denominator coefficients respectively.
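
For reference, these coefficients define the usual second-order transfer
function shown on the first line below. As a sketch, setting the numerator
and the denominator to the same trivial polynomial gives a pass-through
filter, which can serve as a starting point when experimenting:
@example
H(z) = (b0 + b1*z^-1 + b2*z^-2) / (a0 + a1*z^-1 + a2*z^-2)

biquad=b0=1:b1=0:b2=0:a0=1:a1=0:a2=0
@end example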

@section bs2b
Bauer stereo to binaural transformation, which improves headphone listening of
stereo audio records.

It accepts the following parameters:
@table @option

@item profile
Pre-defined crossfeed level.
@table @option

@item default
Default level (fcut=700, feed=50).

@item cmoy
Chu Moy circuit (fcut=700, feed=60).

@item jmeier
Jan Meier circuit (fcut=650, feed=95).

@end table

@item fcut
Cut frequency (in Hz).

@item feed
Feed level (in units of 0.1 dB).

@end table
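
A minimal sketch selecting one of the predefined profiles listed above
instead of setting @option{fcut} and @option{feed} by hand:
@example
bs2b=profile=jmeier
@end example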

@section channelmap

Remap input channels to new locations.

It accepts the following parameters:
@table @option
@item channel_layout
The channel layout of the output stream.

@item map
Map channels from input to output. The argument is a '|'-separated list of
mappings, each in the @code{@var{in_channel}-@var{out_channel}} or
@var{in_channel} form. @var{in_channel} can be either the name of the input
channel (e.g. FL for front left) or its index in the input channel layout.
@var{out_channel} is the name of the output channel or its index in the output
channel layout. If @var{out_channel} is not given then it is implicitly an
index, starting with zero and increasing by one for each mapping.
@end table

If no mapping is present, the filter will implicitly map input channels to
output channels, preserving indices.

For example, assuming a 5.1+downmix input MOV file,
@example
ffmpeg -i in.mov -filter 'channelmap=map=DL-FL|DR-FR' out.wav
@end example
will create an output WAV file tagged as stereo from the downmix channels of
the input.

To fix a 5.1 WAV improperly encoded in AAC's native channel order:
@example
ffmpeg -i in.wav -filter 'channelmap=1|2|0|5|3|4:5.1' out.wav
@end example
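
As a further sketch of the @code{@var{in_channel}-@var{out_channel}} form,
the following swaps the left and right channels of a stream, assuming the
input really is stereo:
@example
channelmap=map=FL-FR|FR-FL:channel_layout=stereo
@end example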

@section channelsplit

Split each channel from an input audio stream into a separate output stream.

It accepts the following parameters:
@table @option
@item channel_layout
The channel layout of the input stream. The default is "stereo".
@end table

For example, assuming a stereo input MP3 file,
@example
ffmpeg -i in.mp3 -filter_complex channelsplit out.mkv
@end example
will create an output Matroska file with two audio streams, one containing only
the left channel and the other the right channel.

Split a 5.1 WAV file into per-channel files:
@example
ffmpeg -i in.wav -filter_complex
'channelsplit=channel_layout=5.1[FL][FR][FC][LFE][SL][SR]'
-map '[FL]' front_left.wav -map '[FR]' front_right.wav -map '[FC]'
front_center.wav -map '[LFE]' lfe.wav -map '[SL]' side_left.wav -map '[SR]'
side_right.wav
@end example

@section chorus
Add a chorus effect to the audio.

It can make a single vocal sound like a chorus, but can also be applied to instrumentation.

Chorus resembles an echo effect with a short delay, but whereas with echo the delay is
constant, with chorus it is varied using sinusoidal or triangular modulation.
The modulation depth defines the range the modulated delay is played before or after
the delay. Hence the delayed sound will sound slower or faster; that is, the delayed
sound is tuned around the original one, as in a chorus where some vocals are slightly
off key.

It accepts the following parameters:
@table @option
@item in_gain
Set input gain. Default is 0.4.

@item out_gain
Set output gain. Default is 0.4.

@item delays
Set delays. A typical delay is around 40ms to 60ms.

@item decays
Set decays.

@item speeds
Set speeds.

@item depths
Set depths.
@end table

@subsection Examples

@itemize
@item
A single delay:
@example
chorus=0.7:0.9:55:0.4:0.25:2
@end example

@item
Two delays:
@example
chorus=0.6:0.9:50|60:0.4|0.32:0.25|0.4:2|1.3
@end example

@item
Fuller sounding chorus with three delays:
@example
chorus=0.5:0.9:50|60|40:0.4|0.32|0.3:0.25|0.4|0.3:2|2.3|1.3
@end example
@end itemize
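
The positional arguments in the examples above follow the order of the
options table. As a sketch, the single-delay example can also be written
with named options, which some readers may find easier to follow:
@example
chorus=in_gain=0.7:out_gain=0.9:delays=55:decays=0.4:speeds=0.25:depths=2
@end example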

@section compand
Compress or expand the audio's dynamic range.

It accepts the following parameters:

@table @option

@item attacks
@item decays
A list of times in seconds for each channel over which the instantaneous level
of the input signal is averaged to determine its volume. @var{attacks} refers to
increase of volume and @var{decays} refers to decrease of volume. For most
situations, the attack time (response to the audio getting louder) should be
shorter than the decay time, because the human ear is more sensitive to sudden
loud audio than sudden soft audio. A typical value for attack is 0.3 seconds and
a typical value for decay is 0.8 seconds.
If the specified number of attacks and decays is lower than the number of channels,
the last set attack/decay will be used for all remaining channels.

@item points
A list of points for the transfer function, specified in dB relative to the
maximum possible signal amplitude. Each key points list must be defined using
the following syntax: @code{x0/y0|x1/y1|x2/y2|....} or
@code{x0/y0 x1/y1 x2/y2 ....}

The input values must be in strictly increasing order but the transfer function
does not have to be monotonically rising. The point @code{0/0} is assumed but
may be overridden (by @code{0/out-dBn}). Typical values for the transfer
function are @code{-70/-70|-60/-20}.

@item soft-knee
Set the curve radius in dB for all joints. It defaults to 0.01.

@item gain
Set the additional gain in dB to be applied at all points on the transfer
function. This allows for easy adjustment of the overall gain.
It defaults to 0.

@item volume
Set an initial volume, in dB, to be assumed for each channel when filtering
starts. This permits the user to supply a nominal level initially, so that, for
example, a very large gain is not applied to initial signal levels before the
companding has begun to operate. A typical value for audio which is initially
quiet is -90 dB. It defaults to 0.

@item delay
Set a delay, in seconds. The input audio is analyzed immediately, but audio is
delayed before being fed to the volume adjuster. Specifying a delay
approximately equal to the attack/decay times allows the filter to effectively
operate in predictive rather than reactive mode. It defaults to 0.

@end table

@subsection Examples

@itemize
@item
Make music with both quiet and loud passages suitable for listening to in a
noisy environment:
@example
compand=.3|.3:1|1:-90/-60|-60/-40|-40/-30|-20/-20:6:0:-90:0.2
@end example

Another example for audio with whisper and explosion parts:
@example
compand=0|0:1|1:-90/-900|-70/-70|-30/-9|0/-3:6:0:0:0
@end example

@item
A noise gate for when the noise is at a lower level than the signal:
@example
compand=.1|.1:.2|.2:-900/-900|-50.1/-900|-50/-50:.01:0:-90:.1
@end example

@item
Here is another noise gate, this time for when the noise is at a higher level
than the signal (making it, in some ways, similar to squelch):
@example
compand=.1|.1:.1|.1:-45.1/-45.1|-45/-900|0/-900:.01:45:-90:.1
@end example
@end itemize
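
The positional arguments in these examples follow the order of the options
table. As a sketch, the first example can be spelled out with named
options, which makes it easier to see which value is the soft knee, the
gain, the initial volume and the delay:
@example
compand=attacks=.3|.3:decays=1|1:points=-90/-60|-60/-40|-40/-30|-20/-20:soft-knee=6:gain=0:volume=-90:delay=0.2
@end example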

@section dcshift
Apply a DC shift to the audio.

This can be useful to remove a DC offset (caused perhaps by a hardware problem
in the recording chain) from the audio. The effect of a DC offset is reduced
headroom and hence volume. The @ref{astats} filter can be used to determine if
a signal has a DC offset.

@table @option
@item shift
Set the DC shift; the allowed range is [-1, 1]. It indicates the amount to shift
the audio.

@item limitergain
Optional. It should have a value much less than 1 (e.g. 0.05 or 0.02) and is
used to prevent clipping.
@end table
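
As an illustrative sketch, suppose @ref{astats} reported a DC offset of
roughly 0.05 (a made-up figure for this example). Shifting by the opposite
amount, with a small limiter gain to guard against clipping, would look
like:
@example
dcshift=shift=-0.05:limitergain=0.02
@end example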

@section dynaudnorm
Dynamic Audio Normalizer.

This filter applies a certain amount of gain to the input audio in order
to bring its peak magnitude to a target level (e.g. 0 dBFS). However, in
contrast to more "simple" normalization algorithms, the Dynamic Audio
Normalizer @emph{dynamically} re-adjusts the gain factor to the input audio.
This allows for applying extra gain to the "quiet" sections of the audio
while avoiding distortions or clipping the "loud" sections. In other words:
The Dynamic Audio Normalizer will "even out" the volume of quiet and loud
sections, in the sense that the volume of each section is brought to the
same target level. Note, however, that the Dynamic Audio Normalizer achieves
this goal @emph{without} applying "dynamic range compressing". It will retain 100%
of the dynamic range @emph{within} each section of the audio file.

@table @option
@item f
Set the frame length in milliseconds. In range from 10 to 8000 milliseconds.
Default is 500 milliseconds.
The Dynamic Audio Normalizer processes the input audio in small chunks,
referred to as frames. This is required, because a peak magnitude has no
meaning for just a single sample value. Instead, we need to determine the
peak magnitude for a contiguous sequence of sample values. While a "standard"
normalizer would simply use the peak magnitude of the complete file, the
Dynamic Audio Normalizer determines the peak magnitude individually for each
frame. The length of a frame is specified in milliseconds. By default, the
Dynamic Audio Normalizer uses a frame length of 500 milliseconds, which has
been found to give good results with most files.
Note that the exact frame length, in number of samples, will be determined
automatically, based on the sampling rate of the individual input audio file.

@item g
Set the Gaussian filter window size. In range from 3 to 301, must be odd
number. Default is 31.
Probably the most important parameter of the Dynamic Audio Normalizer is the
@code{window size} of the Gaussian smoothing filter. The filter's window size
is specified in frames, centered around the current frame. For the sake of
simplicity, this must be an odd number. Consequently, the default value of 31
takes into account the current frame, as well as the 15 preceding frames and
the 15 subsequent frames. Using a larger window results in a stronger
smoothing effect and thus in less gain variation, i.e. slower gain
adaptation. Conversely, using a smaller window results in a weaker smoothing
effect and thus in more gain variation, i.e. faster gain adaptation.
In other words, the more you increase this value, the more the Dynamic Audio
Normalizer will behave like a "traditional" normalization filter. On the
contrary, the more you decrease this value, the more the Dynamic Audio
Normalizer will behave like a dynamic range compressor.

@item p
Set the target peak value. This specifies the highest permissible magnitude
level for the normalized audio input. This filter will try to approach the
target peak magnitude as closely as possible, but at the same time it also
makes sure that the normalized signal will never exceed the peak magnitude.
A frame's maximum local gain factor is imposed directly by the target peak
magnitude. The default value is 0.95 and thus leaves a headroom of 5%.
It is not recommended to go above this value.

@item m
Set the maximum gain factor. In range from 1.0 to 100.0. Default is 10.0.
The Dynamic Audio Normalizer determines the maximum possible (local) gain
factor for each input frame, i.e. the maximum gain factor that does not
result in clipping or distortion. The maximum gain factor is determined by
the frame's highest magnitude sample. However, the Dynamic Audio Normalizer
additionally bounds the frame's maximum gain factor by a predetermined
(global) maximum gain factor. This is done in order to avoid excessive gain
factors in "silent" or almost silent frames. By default, the maximum gain
factor is 10.0. For most inputs the default value should be sufficient and
it usually is not recommended to increase this value. Though, for input
with an extremely low overall volume level, it may be necessary to allow even
higher gain factors. Note, however, that the Dynamic Audio Normalizer does
not simply apply a "hard" threshold (i.e. cut off values above the threshold).
Instead, a "sigmoid" threshold function will be applied. This way, the
gain factors will smoothly approach the threshold value, but never exceed that
value.

@item r
Set the target RMS. In range from 0.0 to 1.0. Default is 0.0 - disabled.
By default, the Dynamic Audio Normalizer performs "peak" normalization.
This means that the maximum local gain factor for each frame is defined
(only) by the frame's highest magnitude sample. This way, the samples can
be amplified as much as possible without exceeding the maximum signal
level, i.e. without clipping. Optionally, however, the Dynamic Audio
Normalizer can also take into account the frame's root mean square,
abbreviated RMS. In electrical engineering, the RMS is commonly used to
determine the power of a time-varying signal. The RMS is therefore
considered a better approximation of the "perceived loudness" than
the signal's peak magnitude alone. Consequently, by adjusting all
frames to a constant RMS value, a uniform "perceived loudness" can be
established. If a target RMS value has been specified, a frame's local gain
factor is defined as the factor that would result in exactly that RMS value.
Note, however, that the maximum local gain factor is still restricted by the
frame's highest magnitude sample, in order to prevent clipping.

@item n
Enable channel coupling. Enabled by default.
By default, the Dynamic Audio Normalizer will amplify all channels by the same
amount. This means the same gain factor will be applied to all channels, i.e.
the maximum possible gain factor is determined by the "loudest" channel.
However, in some recordings, it may happen that the volume of the different
channels is uneven, e.g. one channel may be "quieter" than the other one(s).
In this case, this option can be used to disable the channel coupling. This way,
the gain factor will be determined independently for each channel, depending
only on the individual channel's highest magnitude sample. This allows for
harmonizing the volume of the different channels.

@item c
Enable DC bias correction. Disabled by default.
An audio signal (in the time domain) is a sequence of sample values.
In the Dynamic Audio Normalizer these sample values are represented in the
-1.0 to 1.0 range, regardless of the original input format. Normally, the
audio signal, or "waveform", should be centered around the zero point.
That means if we calculate the mean value of all samples in a file, or in a
single frame, then the result should be 0.0 or at least very close to that
value. If, however, there is a significant deviation of the mean value from
0.0, in either positive or negative direction, this is referred to as a
DC bias or DC offset. Since a DC bias is clearly undesirable, the Dynamic
Audio Normalizer provides optional DC bias correction.
With DC bias correction enabled, the Dynamic Audio Normalizer will determine
the mean value, or "DC correction" offset, of each input frame and subtract
that value from all of the frame's sample values, which ensures those samples
are centered around 0.0 again. Also, in order to avoid "gaps" at the frame
boundaries, the DC correction offset values will be interpolated smoothly
between neighbouring frames.

@item b
Enable alternative boundary mode. Disabled by default.
The Dynamic Audio Normalizer takes into account a certain neighbourhood
around each frame. This includes the preceding frames as well as the
subsequent frames. However, for the "boundary" frames, located at the very
beginning and at the very end of the audio file, not all neighbouring
frames are available. In particular, for the first few frames in the audio
file, the preceding frames are not known. And, similarly, for the last few
frames in the audio file, the subsequent frames are not known. Thus, the
question arises which gain factors should be assumed for the missing frames
in the "boundary" region. The Dynamic Audio Normalizer implements two modes
to deal with this situation. The default boundary mode assumes a gain factor
of exactly 1.0 for the missing frames, resulting in a smooth "fade in" and
"fade out" at the beginning and at the end of the input, respectively.

@item s
Set the compress factor. In range from 0.0 to 30.0. Default is 0.0.
By default, the Dynamic Audio Normalizer does not apply "traditional"
compression. This means that signal peaks will not be pruned and thus the
full dynamic range will be retained within each local neighbourhood. However,
in some cases it may be desirable to combine the Dynamic Audio Normalizer's
normalization algorithm with a more "traditional" compression.
For this purpose, the Dynamic Audio Normalizer provides an optional compression
(thresholding) function. If (and only if) the compression feature is enabled,
all input frames will be processed by a soft knee thresholding function prior
to the actual normalization process. Put simply, the thresholding function is
going to prune all samples whose magnitude exceeds a certain threshold value.
However, the Dynamic Audio Normalizer does not simply apply a fixed threshold
value. Instead, the threshold value will be adjusted for each individual
frame.
In general, smaller values result in stronger compression, and vice versa.
Values below 3.0 are not recommended, because audible distortion may appear.
@end table
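
@subsection Examples

The following filter strings are illustrative sketches only. They assume
that the filter is invoked as @code{dynaudnorm} and that the target peak
option described above is named @option{p}. The numeric values are arbitrary
starting points, not recommendations.

@itemize
@item
Allow a higher maximum gain factor for input with an extremely low overall
volume level, while leaving slightly more headroom than the default:
@example
dynaudnorm=p=0.9:m=20
@end example

@item
Take the frame's RMS into account by setting a target RMS value:
@example
dynaudnorm=r=0.2
@end example

@item
Disable channel coupling and enable DC bias correction:
@example
dynaudnorm=n=0:c=1
@end example
@end itemize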

@section earwax

Make audio easier to listen to on headphones.

This filter adds `cues' to 44.1 kHz stereo (i.e. audio CD format) audio
so that when listened to on headphones the stereo image is moved from
inside the listener's head (standard for headphones) to outside and in
front of the listener (standard for speakers).

Ported from SoX.
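
As a minimal sketch, the filter can be applied from the command line as
shown below. The explicit @code{aformat} step is an assumption added here
for illustration: it requests the 44.1 kHz stereo format named above before
the audio reaches @code{earwax}. @code{INPUT} and @code{OUTPUT} are
placeholders.

@example
ffmpeg -i INPUT -af "aformat=sample_rates=44100:channel_layouts=stereo,earwax" OUTPUT
@end example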

@section equalizer

Apply a two-pole peaking equalisation (EQ) filter. With this
filter, the signal-level at and around a selected frequency can
be increased or decreased, whilst (unlike bandpass and bandreject
filters) that at all other frequencies is unchanged.

In order to produce complex equalisation curves, this filter can
be given several times, each with a different central frequency.

The filter accepts the following options:

@table @option
@item frequency, f
Set the filter's central frequency in Hz.

@item width_type
Set the method used to specify the bandwidth of the filter.
@table @option
@item h
Hz
@item q
Q-Factor
@item o
octave
@item s
slope
@end table

@item width, w
Specify the bandwidth of the filter in width_type units.

@item gain, g
Set the required gain or attenuation in dB.
Beware of clipping when using a positive gain.
@end table

@subsection Examples
@itemize
@item
Attenuate 10 dB at 1000 Hz, with a bandwidth of 200 Hz:
@example
equalizer=f=1000:width_type=h:width=200:g=-10
@end example

@item
Apply 2 dB gain at 1000 Hz with Q 1 and attenuate 5 dB at 100 Hz with Q 2:
@example
equalizer=f=1000:width_type=q:width=1:g=2,equalizer=f=100:width_type=q:width=2:g=-5
@end example
@end itemize
