GStreamer Simple Guide
This is an unofficial community guide. Please help us make it rock :)
This is more of a hands-on guide; if you want the official docs, jump to the GStreamer documentation instead.
0. Basic knowledge
- A GStreamer pipeline is composed of a series of elements. Each element performs a specific processing step on your data. Connecting them makes them work together, just like plugging your DVD player into your TV transforms the data inside the disc into a full-blown image on screen.
- The pipeline is the assembly of your elements linked together, and it's also a GStreamer Bin. A Bin is special because it can contain other elements, and it's useful because you can hide/abstract all of those elements inside a single black box. [ note: you can build your own Bins ]
- Each GStreamer element can have source pads and sink pads. A source pad is the element's output and a sink pad is its input. Source elements only have source pads, so they can only produce data, whilst sink elements only have sink pads, so they can only consume data.
- Imagine pads (src/sink) as little plugs. They support some formats but not others. You can't plug a video output into a microphone input, for instance. That's why each pad has certain capabilities ("caps" for short).
- You can express pipelines with a very simple string notation and run them with the gst-launch program:
$ gst-launch filesrc location=my_song.mp3 ! decodebin ! gconfaudiosink
This one will decode and play the my_song.mp3 file using your default audio output.
Obs: "location" is an attribute of the element filesrc. "decodebin" is a Bin element that decodes any file you throw at it.
- Some elements perform "caps negotiation", which means they will do their best to hand data to the next element in the format it requires. The cool thing is: converter elements (such as videoscale) often rely on the negotiated caps to decide how to transform data. For example, in the pipeline below the caps after videoscale are what tell it which output size to produce:
$ gst-launch videotestsrc ! videoscale ! video/x-raw-yuv,width=480,height=360 ! xvimagesink
- MIME-type expressions (e.g. "video/x-raw-yuv,width=480,height=360") are not actual elements in the pipeline, but rather syntactic sugar for the "capsfilter" element (think capsfilter.caps = mime_expression).
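Which means these two pipelines should be exactly equivalent:
$ gst-launch videotestsrc ! video/x-raw-yuv,width=480,height=360 ! xvimagesink
$ gst-launch videotestsrc ! capsfilter caps="video/x-raw-yuv,width=480,height=360" ! xvimagesink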
1. Video encoding
1.1 Framerate adjustment
See videorate. e.g.: videorate ! video/x-raw-yuv,framerate=25/2
Note: as of version 0.10 there's still no frame interpolation; videorate only duplicates/drops frames as needed.
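A complete command you can try; videotestsrc produces 30 fps by default, so videorate will drop frames to reach 25/2 (12.5 fps):
$ gst-launch videotestsrc ! videorate ! video/x-raw-yuv,framerate=25/2 ! xvimagesink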
1.2 Scaling
See videoscale. e.g.: videoscale ! video/x-raw-yuv,width=480,height=360
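A self-contained example: the first caps pin the test source at 320x240, the second make videoscale blow it up to 480x360:
$ gst-launch videotestsrc ! video/x-raw-yuv,width=320,height=240 ! videoscale ! video/x-raw-yuv,width=480,height=360 ! xvimagesink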
1.3 RGB <-> YUV conversion
See ffmpegcolorspace. It converts between RGB and YUV (and other colorspaces) as required by the surrounding caps.
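e.g., forcing the test source to produce YUV and converting it for the RGB-based ximagesink:
$ gst-launch videotestsrc ! video/x-raw-yuv ! ffmpegcolorspace ! video/x-raw-rgb ! ximagesink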
1.4 Stream Encoding
You first need to encode the video stream, and then wrap it inside a container file (the wrapping is done by a "muxer" element):
- AVI file: ffenc_mpeg4 ! avimux
- Ogg file: theoraenc ! oggmux
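Two complete examples that write a few seconds of test video to disk (num-buffers makes videotestsrc send EOS after 250 frames, so the muxer can finish the file properly):
$ gst-launch videotestsrc num-buffers=250 ! ffenc_mpeg4 ! avimux ! filesink location=test.avi
$ gst-launch videotestsrc num-buffers=250 ! theoraenc ! oggmux ! filesink location=test.ogg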
99. Technical stuff
- Elements send GStreamer Buffers to each other as a means of carrying data. Those buffers can be timestamped so that time-based streams can be easily organized. You need that when assembling audio/video from other pieces of data, such as pictures and sound samples; otherwise GStreamer can't tell when and for how long the data should appear inside the stream. In Python, you can do something like this:
import pygst
pygst.require("0.10")
import gst

buffer = gst.Buffer(data)           # 'data' holds your raw bytes
buffer.timestamp = gst.SECOND * 10  # start at the 10th second
buffer.duration = gst.SECOND * 3    # three seconds duration
- Pads (src/sink) can have all kinds of internal properties. They have a name, caps,... You can even code your own pad and extend it with any properties you want. Check out the videomixer element.
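For instance, a rough sketch in Python (assuming a 0.10 videomixer whose sink pads expose xpos/ypos properties, as recent releases do):

import pygst
pygst.require("0.10")
import gst

mixer = gst.element_factory_make("videomixer", "mix")
pad = mixer.get_request_pad("sink_%d")  # ask the mixer for a fresh sink pad

# the pad itself carries properties: where its stream should sit
# inside the mixed picture
pad.set_property("xpos", 100)
pad.set_property("ypos", 50)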