We have already worked with the playbin2 element, which is capable of building a complete playback pipeline without much work on our side. This tutorial shows how to further customize playbin2 in case its default values do not suit our particular needs.
We will learn:
How to find out how many streams a file contains, and how to switch among them.
How to gather information regarding each stream.
As a side note, even though its name is playbin2, you can pronounce it “playbin”, since the original playbin element is deprecated and nobody should be using it.
More often than not, multiple audio, video and subtitle streams can be found embedded in a single file. The most common case is a regular movie, which contains one video and one audio stream (stereo or 5.1 audio tracks are considered a single stream). It is also increasingly common to find movies with one video and multiple audio streams, to account for different languages. In this case, the user selects one audio stream, and the application plays only that one, ignoring the others.
To be able to select the appropriate stream, the user needs to know certain information about them, for example, their language. This information is embedded in the streams in the form of “metadata” (annexed data), and this tutorial shows how to retrieve it.
Subtitles can also be embedded in a file, along with audio and video, but they are dealt with in more detail in Playback tutorial 2: Subtitle management. Finally, multiple video streams can also be found in a single file, for example, on DVDs with multiple angles of the same scene, but they are somewhat rare.
Embedding multiple streams inside a single file is called “multiplexing” or “muxing”, and such a file is then known as a “container”. Common container formats are Matroska (.mkv), QuickTime (.qt, .mov, .mp4), Ogg (.ogg) and WebM (.webm).
Retrieving the individual streams from within the container is called “demultiplexing” or “demuxing”.
The following code retrieves the number of streams in the file and their associated metadata, and allows switching the audio stream while the media is playing.
The multilingual player
Copy this code into a text file named playback-tutorial-1.c (or find it in the SDK installation).
This tutorial opens a window and displays a movie, with accompanying audio. The media is fetched from the Internet, so the window might take a few seconds to appear, depending on your connection speed. The number of audio streams is shown in the terminal, and the user can switch from one to another by entering a number and pressing enter. A small delay is to be expected.
Bear in mind that there is no latency management (buffering), so on slow connections, the movie might stop after a few seconds. See how Tutorial 12: Live streaming solves this issue.
We start, as usual, by putting all our variables in a structure, so we can pass it around to functions. For this tutorial, we need the number of streams of each type, and the currently playing one. Also, we are going to use a different mechanism to wait for messages that allows interactivity, so we need a GLib main loop object.
Later we are going to set some of playbin2's flags. We would like to have a handy enum that allows manipulating these flags easily, but since playbin2 is a plug-in and not a part of the GStreamer core, this enum is not available to us. The “trick” is simply to declare this enum in our code, as it appears in the
GstPlayFlags. GObject allows introspection, so the possible values for these flags can be retrieved at runtime without using this trick, but in a far more cumbersome way.
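As an illustration of the enum trick, here is a minimal sketch of a local declaration together with a read-modify-write helper. The three bit values are the ones listed for these flags in the playbin2 documentation as far as we know, so check them against your installed version. The helper name `demo_flags` is hypothetical and is not part of the tutorial's code.

```c
/* Local copy of three playbin2 flag bits (values assumed from the
 * playbin2 documentation; verify them for your version). */
typedef enum {
  GST_PLAY_FLAG_VIDEO = (1 << 0), /* video output */
  GST_PLAY_FLAG_AUDIO = (1 << 1), /* audio output */
  GST_PLAY_FLAG_TEXT  = (1 << 2)  /* subtitle output */
} GstPlayFlags;

/* Hypothetical helper: enable audio and video, disable subtitles,
 * and leave every other bit exactly as it was read. */
unsigned int
demo_flags (unsigned int current)
{
  current |= GST_PLAY_FLAG_VIDEO | GST_PLAY_FLAG_AUDIO;
  current &= ~(unsigned int) GST_PLAY_FLAG_TEXT;
  return current;
}
```

In an application, the input to such a helper would be the value read from the element's flags property, and the result would be written back to the same property, so that bits we do not know about keep their defaults.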
Forward declarations for the two callbacks we will be using: handle_message for the GStreamer messages, as we have already seen, and handle_keyboard for keystrokes, since this tutorial introduces a limited amount of interactivity.
We skip over the creation of the pipeline, the instantiation of
playbin2 and pointing it to our test media through the
playbin2 is in itself a pipeline, and in this case it is the only element in the pipeline, so we skip completely the creation of the pipeline, and use directly the
We focus on some of the other properties of
|Enable video rendering. If this flag is not set, there will be no video output.|
|Enable audio rendering. If this flag is not set, there will be no audio output.|
|Enable subtitle rendering. If this flag is not set, subtitles will not be shown in the video output.|
|Enable rendering of visualisations when there is no video stream. Playback tutorial 6: Audio visualization goes into more details.|
|See Basic tutorial 12: Streaming and Playback tutorial 4: Progressive streaming.|
|See Basic tutorial 12: Streaming and Playback tutorial 4: Progressive streaming.|
If the video content was interlaced, this flag instructs
In our case, for demonstration purposes, we are enabling audio and video and disabling subtitles, leaving the rest of the flags at their default values (this is why we read the current value of the flags with
g_object_get() before overwriting it with
This property is not really useful in this example.
playbin2 of the maximum speed of our network connection, so, in case multiple versions of the requested media are available in the server,
playbin2 chooses the most appropriate. This is mostly used in combination with streaming protocols like
We have set all these properties one by one, but we could have set all of them with a single call to
This is why g_object_set() requires a NULL as the last parameter.
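The need for the terminating NULL comes from how variadic functions work in C: the callee cannot know how many arguments it received, so it needs a sentinel to stop at. The toy function below sketches the idea. It is not GObject code, `count_properties` is a hypothetical name, and it assumes every value is an `int`, whereas g_object_set() determines each value's type from the property it names.

```c
#include <stdarg.h>
#include <stddef.h>

/* Toy stand-in for a NULL-terminated property setter (NOT GObject code).
 * It walks (name, int value) pairs until it meets the NULL sentinel and
 * returns how many pairs it saw. */
int
count_properties (const char *first_name, ...)
{
  va_list args;
  const char *name = first_name;
  int count = 0;

  va_start (args, first_name);
  while (name != NULL) {
    (void) va_arg (args, int);           /* consume the value for this name */
    count++;
    name = va_arg (args, const char *);  /* next name, or the NULL sentinel */
  }
  va_end (args);
  return count;
}
```

Without the final NULL, the loop would keep reading past the real arguments, which is undefined behavior. The same hazard applies to any NULL-terminated variadic call.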
These lines connect a callback function to the standard input (the keyboard). The mechanism shown here is specific to GLib, and not really related to GStreamer, so there is no point in going into much depth. Applications normally have their own way of handling user input, and GStreamer has little to do with it besides the Navigation interface discussed briefly in Tutorial 17: DVD playback.
To allow interactivity, we will no longer poll the GStreamer bus manually. Instead, we create a GMainLoop (GLib main loop) and set it running with g_main_loop_run(). This function blocks and will not return until g_main_loop_quit() is issued. In the meantime, it will call the callbacks we have registered at the appropriate times: handle_message when a message appears on the bus, and handle_keyboard when the user presses any key.
There is nothing new in handle_message, except that when the pipeline moves to the PLAYING state, it will call the
As the comment says, this function just gathers information from the media and prints it on the screen. The number of video, audio and subtitle streams is directly available through the
Now, for each stream, we want to retrieve its metadata. Metadata is stored as tags in a GstTagList structure, which is a list of data pieces identified by a name. The GstTagList associated with a stream can be recovered with g_signal_emit_by_name(), and then individual tags are extracted with the gst_tag_list_get_* functions, such as gst_tag_list_get_string().
This rather unintuitive way of retrieving the tag list is called an Action Signal. Action signals are emitted by the application to a specific element, which then performs an action and returns a result. They behave like a dynamic function call, in which methods of a class are identified by their name (the signal's name) instead of their memory address. These signals are listed in the documentation along with the regular signals, and are tagged “Action”. See
playbin2 defines 3 action signals to retrieve metadata:
get-text-tags. The names of the tags are standardized, and the list can be found in the GstTagList documentation. In this example we are interested in the GST_TAG_LANGUAGE_CODE of the streams and their GST_TAG_*_CODEC (audio, video or text).
Once we have extracted all the metadata we want, we get the streams that are currently selected through 3 more properties of
It is advisable to always check the currently selected streams and never make any assumptions. Multiple internal conditions can make playbin2 behave differently in different executions. Also, the order in which the streams are listed can change from one run to another, so checking the metadata to identify one particular stream becomes crucial.
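Since stream order is not guaranteed, an application would typically look a stream up by its metadata rather than by a fixed index. The following is a minimal sketch of that lookup. It assumes the language codes have already been extracted from each stream's tags into an array, and `find_stream_by_language` is a hypothetical helper, not part of GStreamer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: langs[i] holds the language code read from the tags
 * of stream i, or NULL when the stream carries no language tag.
 * Returns the index of the first stream matching `wanted`, or -1. */
int
find_stream_by_language (const char *const langs[], int n_streams,
    const char *wanted)
{
  int i;

  for (i = 0; i < n_streams; i++) {
    if (langs[i] != NULL && strcmp (langs[i], wanted) == 0)
      return i;
  }
  return -1;
}
```

A return value of -1 lets the caller fall back to whatever stream is currently selected instead of guessing an index.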
Finally, we allow the user to switch the running audio stream. This very basic function just reads a string from the standard input (the keyboard), interprets it as a number, and tries to set the current-audio property of playbin2 (which previously we have only read).
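The "interprets it as a number" step deserves some care, because the user can type anything. Below is a sketch of one way to validate the input before touching the property. The function name `parse_stream_index` is hypothetical, and this is a stricter alternative written for illustration rather than the tutorial's own parsing code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a line typed by the user into a stream index.
 * Returns the index if it lies in [0, n_streams), otherwise -1. */
int
parse_stream_index (const char *line, int n_streams)
{
  char *end = NULL;
  long value;

  errno = 0;
  value = strtol (line, &end, 10);
  if (end == line || errno == ERANGE)
    return -1;                 /* no digits at all, or out of long range */
  if (*end != '\0' && *end != '\n' && *end != '\r')
    return -1;                 /* trailing garbage such as "1x" */
  if (value < 0 || value >= n_streams)
    return -1;                 /* not a valid stream number */
  return (int) value;
}
```

Only when the result is non-negative would the application write it to the current-audio property. Any other input can simply be reported as out of bounds and ignored.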
Bear in mind that the switch is not immediate. Some of the previously decoded audio will still be flowing through the pipeline, while the new stream becomes active and is decoded. The delay depends on the particular multiplexing of the streams in the container, and the length playbin2 has selected for its internal queues (which depends on the network conditions).
If you execute the tutorial, you will be able to switch from one language to another while the movie is running by pressing 0, 1 or 2 (and ENTER). This concludes this tutorial.
This tutorial has shown:
A few more of
How to retrieve the list of tags associated with a stream with
How to switch the current audio simply by writing to the
The next playback tutorial shows how to handle subtitles, either embedded in the container or in an external file.
Remember that attached to this page you should find the complete source code of the tutorial and any accessory files needed to build it.
It has been a pleasure having you here, and see you soon!