Images

ffpyplayer.pic

FFmpeg based image storage and conversion tools

FFmpeg based classes to store and convert images from / to many different pixel formats. See Image and SWScale for details.

Create an image in rgb24 format:

>>> w, h = 500, 100
>>> size = w * h * 3
>>> buf = bytearray([int(x * 255 / size) for x in range(size)])
>>> img = Image(plane_buffers=[buf], pix_fmt='rgb24', size=(w, h))

Convert the image to a different size:

>>> sws = SWScale(w, h, img.get_pixel_format(), ow=w // 2, oh=h // 3)
>>> img2 = sws.scale(img)
>>> img2.get_size()
(250, 33)

Convert the image to YUV420P and get the resulting plane buffers as bytearrays:

>>> sws = SWScale(w, h, img.get_pixel_format(), ofmt='yuv420p')
>>> img2 = sws.scale(img)
>>> img2.get_pixel_format()
'yuv420p'
>>> planes = img2.to_bytearray()
>>> list(map(len, planes))
[50000, 12500, 12500, 0]

Create an Image using default FFmpeg buffers:

>>> img = Image(pix_fmt='rgb24', size=(w, h))

Copy the image:

>>> import copy
>>> # copy reference without actually copying the buffers
>>> img2 = copy.copy(img)
>>> # do deep copy
>>> img2 = copy.deepcopy(img)
class ffpyplayer.pic.Image

Bases: object

Stores an image using a specified pixel format.

An Image can be represented by many different pixel formats, which determine how the buffers representing it are stored. The buffers are stored as one to four arrays of bytes, one for each of the image’s one to four planes. For example, RGB24 stores all the data in the first plane in the form RGBRGB…, while YUV420P uses the first three planes.
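To make the packed layout concrete, the following minimal sketch builds an rgb24 buffer by hand and reads one pixel back. It is plain Python and does not use ffpyplayer; the helper name rgb24_pixel is hypothetical, and the buffer is assumed to have no padding at the end of each line.

```python
def rgb24_pixel(buf, width, x, y):
    """Return the (R, G, B) tuple of pixel (x, y) in an unpadded rgb24 buffer.

    Hypothetical helper for illustration; not part of ffpyplayer.
    """
    # Each pixel occupies 3 consecutive bytes and rows follow one another.
    offset = (y * width + x) * 3
    return tuple(buf[offset:offset + 3])


w, h = 4, 2
# A single plane holding R, G, B, R, G, B, ... for every pixel, row by row.
buf = bytearray(w * h * 3)

# Write an orange pixel at x=2, y=1.
start = (1 * w + 2) * 3
buf[start:start + 3] = bytearray([255, 128, 0])

pixel = rgb24_pixel(buf, w, 2, 1)
```

A planar format such as YUV420P would instead need one such buffer per plane, each indexed with its own width.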

The Image can be initialized with a list of plane buffers, or internal buffers are created when none are provided. Depending on how it’s initialized, one or more parameters need to be specified.

Parameters
plane_buffers: list

A list of bytes or bytearray type objects representing the 1-4 planes. The number of planes is determined by pix_fmt (e.g. 1 for RGB24, 3 for YUV420P). The length of the bytes object in each plane is a function of size and, if provided, also of linesize. See linesize for details. The buffers are used directly without making any copies; therefore, the bytes objects are kept alive internally for as long as this instance is alive.

If empty, internal buffers will be created for the image.

pix_fmt: str

The pixel format of the image. Can be one of ffpyplayer.tools.pix_fmts. Must be provided when using plane_buffers.

size: 2-tuple of ints

The size of the frame in the form of (width, height). Must be provided when using plane_buffers.

linesize: list of ints

The linesize of each provided plane. In addition to the width of the frame, a linesize can be provided. The linesize represents the actual number of bytes in each line, which may be padded at the end to satisfy an alignment requirement. For example, an RGB24 frame of size (100, 10) has 3 * 100 = 300 bytes in each horizontal line and is 3000 bytes large. But when 32-byte alignment is required, each line has to be padded at the end so that it is 320 bytes long, making the total buffer length 3200 bytes. If linesize is provided, it must be provided for every valid plane. If it’s not provided, an alignment of 1 (i.e. no alignment) is assumed. See get_buffer_size() for more details.

no_create: bool

An optional argument which, if True, just creates the instance without initializing anything. All other parameters are ignored when True. This is useful when initializing the instance later from Cython with the cython_init method.

Copying

FFmpeg has an internal reference counting system: a buffer it allocated is freed only once no reference to it remains, which allows multiple images to share the same buffer without making copies. When the Image class allocates the image buffers itself, e.g. when plane_buffers is empty, such referenced buffers are created. As a consequence, copying the Image object does not require copying the buffers.

Using the Python copy module you can make a shallow or a deep copy of the object. A shallow copy creates new buffers only if the original buffers were not created and referenced by FFmpeg, e.g. if they were provided using plane_buffers. This ensures the buffers are not freed while in use.

After the copy, the buffers are “referenced”, and additional copies create more references without copying the buffers. A deep copy, however, always creates a new referenced buffer. The function is_ref() indicates whether the image buffer is such an FFmpeg referenced buffer.

For example:

>>> w, h = 640, 480
>>> size = w * h * 3
>>> buf = bytearray([int(x * 255 / size) for x in range(size)])
>>> img = Image(plane_buffers=[buf], pix_fmt='rgb24', size=(w, h))
>>> img2 = Image(pix_fmt='rgb24', size=(w, h))
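
The copy rules described under Copying can be mimicked with a small toy model. This is only a sketch in plain Python: ToyImage and its referenced attribute are hypothetical stand-ins for Image and is_ref(), and nothing here reflects how ffpyplayer or FFmpeg implement reference counting internally.

```python
import copy


class ToyImage(object):
    """Toy model of the copy rules; hypothetical, not ffpyplayer's code."""

    def __init__(self, buf, referenced=False):
        self.buf = buf                # the plane data
        self.referenced = referenced  # stands in for is_ref()

    def __copy__(self):
        if self.referenced:
            # Referenced buffer: share it, nothing is copied.
            return ToyImage(self.buf, referenced=True)
        # User-provided buffer: copy it once into a new "referenced" buffer.
        return ToyImage(bytearray(self.buf), referenced=True)

    def __deepcopy__(self, memo):
        # A deep copy always gets its own new referenced buffer.
        return ToyImage(bytearray(self.buf), referenced=True)


user = ToyImage(bytearray(b'abc'))   # like passing plane_buffers
first = copy.copy(user)              # the buffer is copied
second = copy.copy(first)            # the buffer is shared
deep = copy.deepcopy(first)          # the buffer is copied again
```

In this model only the first shallow copy of a user-provided buffer costs a copy; every later shallow copy shares the same data.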
get_buffer_size(self, keep_align=False)

Returns the size of the buffers of each plane.

Parameters
keep_align: bool

If True, the linesize alignments of the actual image will be used to calculate the buffer size for each plane. If False, an alignment of 1 (i.e. no alignment) will be used, returning the minimal buffer size required to store the image. Defaults to False.

Returns

4-tuple of ints:

A tuple of buffer sizes for each plane of this pixel format.

An (unaligned) yuv420p image has 3 planes:

>>> w, h = 100, 10
>>> img = Image(pix_fmt='yuv420p', size=(w, h))
>>> img.get_linesizes(keep_align=True)
(100, 50, 50, 0)
>>> img.get_buffer_size()
(1000, 250, 250, 0)

>>> # align each linesize to a multiple of 32 bytes
>>> import math
>>> linesize = img.get_linesizes(keep_align=True)
>>> align = lambda x: int(math.ceil(x / 32.) * 32)
>>> linesize = list(map(align, linesize))
>>> linesize
[128, 64, 64, 0]
>>> img = Image(pix_fmt='yuv420p', size=(w, h), linesize=linesize)
>>> img.get_linesizes(keep_align=True)
(128, 64, 64, 0)
>>> img.get_buffer_size(keep_align=True)
(1280, 320, 320, 0)
>>> img.get_buffer_size()
(1000, 250, 250, 0)
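
The numbers in the example above can be reproduced by hand: each plane’s buffer size is its linesize multiplied by the number of lines in that plane. The sketch below is plain Python; plane_buffer_sizes is a hypothetical helper, and the plane heights for yuv420p are written out explicitly rather than queried from FFmpeg.

```python
def plane_buffer_sizes(linesizes, plane_heights):
    """Buffer size of each plane: bytes per line times number of lines.

    Hypothetical helper for illustration; not part of ffpyplayer.
    """
    return tuple(ls * rows for ls, rows in zip(linesizes, plane_heights))


w, h = 100, 10
# yuv420p: the two chroma planes have half the width and half the height.
heights = (h, h // 2, h // 2, 0)

unaligned = plane_buffer_sizes((100, 50, 50, 0), heights)
aligned = plane_buffer_sizes((128, 64, 64, 0), heights)
```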
get_linesizes(self, keep_align=False)

Returns the linesize of each plane.

The linesize is the actual number of bytes in each horizontal line of a given plane, which may be padded at the end to satisfy an alignment requirement. For example, an RGB24 frame of size (100, 10) has 3 * 100 = 300 bytes in each line and is 3000 bytes large. But when 32-byte alignment is required, each line has to be padded at the end so that it is 320 bytes long, making the total buffer length 3200 bytes.
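
The padding arithmetic can be sketched in plain Python. The helper aligned_linesize is hypothetical and not part of ffpyplayer; its alignment argument is given in bytes.

```python
def aligned_linesize(row_bytes, alignment):
    """Round row_bytes up to the next multiple of alignment (in bytes).

    Hypothetical helper for illustration; not part of ffpyplayer.
    """
    return (row_bytes + alignment - 1) // alignment * alignment


w, h = 100, 10
row_bytes = 3 * w                              # rgb24 uses 3 bytes per pixel
unpadded_total = row_bytes * h                 # minimal buffer size
padded_line = aligned_linesize(row_bytes, 32)  # line length after padding
padded_total = padded_line * h                 # buffer size with padding
```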

Parameters
keep_align: bool

If True, the original linesize alignments of the image will be returned for every plane. If False, linesizes with an alignment of 1 (i.e. no alignment) will be used, returning the minimal linesize required for the image. Defaults to False.

Returns

4-tuple of ints:

A 4-tuple with the linesize of each plane. If a plane isn’t used, its linesize is 0.

By default there’s no alignment:

>>> w, h = 100, 10
>>> img = Image(plane_buffers=[b' ' * (w * h * 3)],
... pix_fmt='rgb24', size=(w, h))
>>> img.get_linesizes(keep_align=True)
(300, 0, 0, 0)

You can force an alignment, e.g. a 32-byte alignment:

>>> import math
>>> linesize = [int(math.ceil(w * 3 / 32.) * 32)]
>>> linesize
[320]
>>> img = Image(plane_buffers=[b' ' * (h * linesize[0])],
... pix_fmt='rgb24', size=(w, h), linesize=linesize)
>>> img.get_linesizes(keep_align=True)
(320, 0, 0, 0)
>>> img.get_size()
(100, 10)

The linesizes of an unaligned and a 32-byte aligned yuv420p image:

>>> img = Image(pix_fmt='yuv420p', size=(w, h))
>>> img.get_linesizes(keep_align=True)
(100, 50, 50, 0)
>>> img.get_size()
(100, 10)

>>> # now align each linesize to a multiple of 32 bytes
>>> linesize = img.get_linesizes(keep_align=True)
>>> align = lambda x: int(math.ceil(x / 32.) * 32)
>>> linesize = list(map(align, linesize))
>>> linesize
[128, 64, 64, 0]
>>> img = Image(pix_fmt='yuv420p', size=(w, h), linesize=linesize)
>>> img.get_linesizes(keep_align=True)
(128, 64, 64, 0)
>>> img.get_linesizes()
(100, 50, 50, 0)
>>> img.get_size()
(100, 10)
get_pixel_format(self)

Returns the pixel format of the image. Can be one of ffpyplayer.tools.pix_fmts.

Returns

str: The pixel format of the image.

>>> img.get_pixel_format()
'rgb24'
get_required_buffers(self)

Returns a 4-tuple of booleans indicating which of the 4 planes are required (i.e. even if get_buffer_size is non-zero for a plane, that plane may still be optional).

get_size(self)

Returns the size of the frame.

Returns

2-tuple of ints: The size of the frame as (width, height).

>>> img.get_size()
(640, 480)
is_key_frame(self)

Returns whether the image is a key frame.

Returns

bool: True if the image was a key frame.

is_ref(self)

Returns whether the image buffer is FFmpeg referenced. This can only be True when the buffers were allocated internally or by FFmpeg, but not when plane_buffers is provided. See Image for details. After a copy, it always returns True.

Returns

bool: True if the buffer is FFmpeg referenced.

For example:

>>> w, h = 640, 480
>>> img = Image(plane_buffers=[b' ' * (w * h * 3)], pix_fmt='rgb24', size=(w, h))
>>> img.is_ref()
False
>>> import copy
>>> img2 = copy.copy(img)
>>> img2.is_ref()
True

Or if directly allocated internally:

>>> img = Image(pix_fmt='rgb24', size=(w, h))
>>> img.is_ref()
True
to_bytearray(self, keep_align=False)

Returns a copy of the plane buffers as bytearrays.

Parameters
keep_align: bool

If True, the buffer for each plane will be padded after each horizontal line to match the linesize of its plane in this image. If False, an alignment of 1 (i.e. no alignment) will be used, returning the maximally packed buffer of this plane. Defaults to False.

Returns

4-element list: A list of bytearray buffers for each plane of this pixel format. An empty bytearray is returned for unused planes.

Get the buffer of an RGB image:

>>> w, h = 100, 10
>>> img = Image(pix_fmt='rgb24', size=(w, h))
>>> img.get_linesizes(keep_align=True)
(300, 0, 0, 0)
>>> list(map(len, img.to_bytearray()))
[3000, 0, 0, 0]

Get the buffers of a YUV420P image:

>>> img = Image(pix_fmt='yuv420p', size=(w, h))
>>> linesize = img.get_linesizes(keep_align=True)
>>> linesize
(100, 50, 50, 0)
>>> import math
>>> align = lambda x: int(math.ceil(x / 32.) * 32)
>>> linesize = list(map(align, linesize))
>>> linesize
[128, 64, 64, 0]

>>> img = Image(pix_fmt='yuv420p', size=(w, h), linesize=linesize)
>>> list(map(len, img.to_bytearray()))
[1000, 250, 250, 0]
>>> list(map(len, img.to_bytearray(keep_align=True)))
[1280, 320, 320, 0]

>>> # now initialize a new Image with it
>>> img2 = Image(plane_buffers=img.to_bytearray(),
... pix_fmt=img.get_pixel_format(), size=img.get_size())
>>> img2.get_buffer_size(keep_align=True)
(1000, 250, 250, 0)

>>> # keep alignment
>>> img2 = Image(plane_buffers=img.to_bytearray(keep_align=True),
... pix_fmt=img.get_pixel_format(), size=img.get_size(),
... linesize=img.get_linesizes(keep_align=True))
>>> img2.get_buffer_size(keep_align=True)
(1280, 320, 320, 0)
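
The difference between the padded and the packed buffers can be illustrated without the library. The following sketch removes the end-of-line padding from a single plane; strip_padding is a hypothetical helper, not a ffpyplayer function, and it only shows what kind of transformation keep_align=False implies for an aligned image.

```python
def strip_padding(buf, linesize, row_bytes, rows):
    """Drop the padding at the end of every line of a padded plane.

    Hypothetical helper for illustration; not part of ffpyplayer.
    """
    packed = bytearray()
    for row in range(rows):
        start = row * linesize
        # Keep only the bytes that carry image data on this line.
        packed += buf[start:start + row_bytes]
    return packed


# A 2-line plane with 3 data bytes per line, padded to a linesize of 4.
padded = bytearray(b'abc\x00def\x00')
packed = strip_padding(padded, linesize=4, row_bytes=3, rows=2)
```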
to_memoryview(self, keep_align=False)

Returns memoryviews of the buffers of the image.

Parameters
keep_align: bool

If True, the buffers of the original image are returned without making any additional copies. If False and the image alignment is already 1, the original buffers are returned as well; otherwise, new buffers with an alignment of 1 are created, the image data is copied into them, and they are returned. See to_bytearray().

Returns
4-element list:

A list of cython arrays for each plane of this image’s pixel format. If the data didn’t have to be copied, the arrays point directly to the original image data. The arrays can be used where memoryviews are accepted, since cython arrays implement the memoryview interface.

Unused planes are set to None.

Warning

If the data points to the original image data, you must ensure that this Image instance is not freed while the returned memoryviews of the arrays are in use. Otherwise, once the Image is freed, the original data becomes invalid and using the returned memoryviews will crash Python.

Get the buffer of an RGB image:

>>> w, h = 100, 10
>>> img = Image(pix_fmt='rgb24', size=(w, h))
>>> img.get_linesizes(keep_align=True)
(300, 0, 0, 0)
>>> img.to_memoryview()
[<ffpyplayer.pic.array object at 0x055DCE58>, None, None, None]
>>> arr = img.to_memoryview()[0]
>>> # memview is the only attribute of cython arrays
>>> arr.memview
<MemoryView of 'array' at 0x55d1468>
>>> arr.memview.size
3000
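
The distinction between a view that points at the original data and an independent copy can be shown with the standard library alone. This is only an analogy, written for Python 3: the built-in memoryview keeps its owner alive, unlike the situation the warning describes, so the sketch illustrates aliasing versus copying and not the crash itself.

```python
data = bytearray(b'\x01\x02\x03')   # stands in for an image plane
view = memoryview(data)             # shares memory with data, no copy is made
snapshot = bytes(view)              # independent copy of the current content

data[0] = 9                         # modify the underlying buffer

first_via_view = view[0]            # the view sees the change
first_in_snapshot = snapshot[0]     # the copy does not
```

Taking a copy, as to_bytearray() does, is the safer choice when the data must outlive the Image.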
class ffpyplayer.pic.ImageLoader

Bases: object

Class that reads one or more images from a file and returns them.

Parameters
filename: string type

The full path to the image file. The string will first be encoded using utf8 before passing to FFmpeg.

For example, reading a simple png using the iterator syntax:

>>> img = ImageLoader('file.png')
>>> images = [m for m in img]
>>> images
[(<ffpyplayer.pic.Image object at 0x02B5F5D0>, 0.0)]

Or reading it directly:

>>> img = ImageLoader('file.png')
>>> img.next_frame()
(<ffpyplayer.pic.Image object at 0x02B74850>, 0.0)
>>> img.next_frame()
(None, 0)
>>> img.next_frame()
(None, 0)

Or reading a gif using the iterator syntax:

>>> img = ImageLoader('sapo11.gif')
>>> images = [m for m in img]
>>> images
[(<ffpyplayer.pic.Image object at 0x02B749B8>, 0.0),
(<ffpyplayer.pic.Image object at 0x02B74918>, 0.08),
(<ffpyplayer.pic.Image object at 0x02B74990>, 0.22),
(<ffpyplayer.pic.Image object at 0x02B749E0>, 0.36),
(<ffpyplayer.pic.Image object at 0x02B74A08>, 0.41000000000000003),
(<ffpyplayer.pic.Image object at 0x02B74A30>, 0.46),
(<ffpyplayer.pic.Image object at 0x02B74A58>, 0.51)]

Or reading it directly:

>>> img = ImageLoader('sapo11.gif')
>>> img.next_frame()
(<ffpyplayer.pic.Image object at 0x02B74B70>, 0.0)
>>> img.next_frame()
(<ffpyplayer.pic.Image object at 0x02B74C60>, 0.08)
...
>>> img.next_frame()
(<ffpyplayer.pic.Image object at 0x02B74B70>, 0.51)
>>> img.next_frame()
(None, 0)
>>> img.next_frame()
(None, 0)
next_frame(self)

Returns the next available frame, or (None, 0) if there are no more frames available.

Returns

A 2-tuple of (Image, pts): the first element is the next image to be displayed, and pts is the time, relative to the first frame, at which to display it, e.g. in the case of a gif.

If the end of the file has been reached and there are no more frames to return, it returns (None, 0).

Warning

Both next_frame() and the iterator syntax read the frames identically. Consequently, calling one will also advance the frame for the other.
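
A loop that drains the remaining frames through next_frame() might look like the sketch below. read_all_frames and FakeLoader are hypothetical names: the fake loader stands in for ImageLoader so that the example runs without an image file, and it only imitates the documented (None, 0) end-of-file behavior.

```python
def read_all_frames(loader):
    """Collect (image, pts) pairs until next_frame() returns (None, 0).

    Hypothetical helper; works with any object that has a next_frame() method.
    """
    frames = []
    while True:
        img, pts = loader.next_frame()
        if img is None:
            break
        frames.append((img, pts))
    return frames


class FakeLoader(object):
    """Stand-in for ImageLoader so the sketch runs without a file or FFmpeg."""

    def __init__(self, frames):
        self._frames = list(frames)

    def next_frame(self):
        return self._frames.pop(0) if self._frames else (None, 0)


loader = FakeLoader([('frame0', 0.0), ('frame1', 0.08)])
loader.next_frame()                    # consume the first frame directly
remaining = read_all_frames(loader)    # continues from the second frame
```

Because the reading position is shared, the helper only returns the frames that have not been consumed yet.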

class ffpyplayer.pic.SWScale

Bases: object

Converts Images from one format and size to another format and size.

The class accepts an Image of a given pixel format and size and converts it to another Image with a different pixel format and size. Each SWScale instance converts only images with parameters specified when creating the instance.

Parameters
iw, ih: int

The width and height of the source image.

ifmt: str

The pixel format of the source image. Can be one of ffpyplayer.tools.pix_fmts.

ow, oh: int

The width and height of the output image after converting from the source image. A value of 0 sets that parameter to the corresponding source width or height. A value of -1 for one of the parameters results in a value for that parameter that maintains the original aspect ratio. Defaults to -1.

ofmt: str

The pixel format of the output image. Can be one of ffpyplayer.tools.pix_fmts. If empty, the source pixel format will be used. Defaults to empty string.

For example:

>>> w, h = 500, 100
>>> size = w * h * 3
>>> buf = bytearray([int(x * 255 / size) for x in range(size)])
>>> img = Image(plane_buffers=[buf], pix_fmt='rgb24', size=(w, h))

>>> # specify output w,h
>>> sws = SWScale(w, h, img.get_pixel_format(), ow=w // 2, oh=h // 3)
>>> img2 = sws.scale(img)
>>> img2.get_size()
(250, 33)

>>> # use input height
>>> sws = SWScale(w, h, img.get_pixel_format(), ow=w // 2, oh=0)
>>> img2 = sws.scale(img)
>>> img2.get_size()
(250, 100)

>>> # keep aspect ratio
>>> sws = SWScale(w, h, img.get_pixel_format(), ow=w // 2)
>>> img2 = sws.scale(img)
>>> img2.get_size()
(250, 50)

>>> # convert rgb24 to yuv420p
>>> sws = SWScale(w, h, img.get_pixel_format(), ofmt='yuv420p')
>>> img2 = sws.scale(img)
>>> img2.get_pixel_format()
'yuv420p'

>>> # convert into a previously allocated and aligned image
>>> import math
>>> align = lambda x: int(math.ceil(x / 32.) * 32)
>>> img2 = Image(pix_fmt=img.get_pixel_format(), size=(w // 2, h // 2))
>>> img2.get_linesizes(keep_align=True)
(750, 0, 0, 0)
>>> linesize = list(map(align, img2.get_linesizes()))
>>> linesize
[768, 0, 0, 0]
>>> img2 = Image(pix_fmt=img2.get_pixel_format(), size=img2.get_size(), linesize=linesize)
>>> img2.get_linesizes(keep_align=True)
(768, 0, 0, 0)
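>>> # the scaler must match the size and format of both src and dst
>>> sws = SWScale(w, h, img.get_pixel_format(), ow=w // 2, oh=h // 2)
>>> sws.scale(img, dst=img2)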
<ffpyplayer.pic.Image object at 0x02B44440>
>>> img2
<ffpyplayer.pic.Image object at 0x02B44440>
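
The ow and oh rules described in the parameters can be summarized in a few lines of plain Python. resolve_output_size is a hypothetical helper, not ffpyplayer code, and it rests on two assumptions that the text does not state explicitly: leaving both values at -1 keeps the source size, and scaled values are truncated to integers.

```python
def resolve_output_size(iw, ih, ow=-1, oh=-1):
    """Apply the documented 0 / -1 rules to get the output (width, height).

    Hypothetical helper, not ffpyplayer code. Assumptions: both values left
    at -1 keep the source size, and scaled values are truncated to ints.
    """
    if ow == 0:
        ow = iw          # 0 means "use the source width"
    if oh == 0:
        oh = ih          # 0 means "use the source height"
    if ow == -1 and oh == -1:
        return iw, ih
    if ow == -1:
        ow = int(float(oh) / ih * iw)   # keep the aspect ratio
    elif oh == -1:
        oh = int(float(ow) / iw * ih)   # keep the aspect ratio
    return ow, oh


explicit = resolve_output_size(500, 100, ow=250, oh=33)
source_height = resolve_output_size(500, 100, ow=250, oh=0)
keep_ratio = resolve_output_size(500, 100, ow=250)
unchanged = resolve_output_size(500, 100)
```

The first three results correspond to the sizes shown in the SWScale examples above.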
scale(self, Image src, Image dst=None, int _flip=False)

Scales an image into another image format and/or size as specified by the instance initialization parameters.

Parameters
src: Image

An image instance with values matching the source image specification of this instance. An exception is raised if the Image doesn’t match. It will be used as the source image.

dst: Image or None

An image instance with values matching the output image specification of this instance. An exception is raised if the Image doesn’t match. If specified, the output image will be converted directly into this Image. If not specified, a new Image will be created and returned.

_flip: bool, defaults to False

Whether the image will be flipped before scaling. This only works for pixel formats whose color planes are the same size (e.g. rgb), so use with caution.

Returns

Image:

The output image. If dst was not None, dst will be returned; otherwise, a new image containing the converted image will be returned.

ffpyplayer.pic.get_image_size(pix_fmt, width, height)

Returns the size in bytes of the buffers of each plane of an image with a given pixel format, width, and height.

Parameters
pix_fmt: str

The pixel format in which the image is represented. Can be one of pix_fmts.

width: int

The width of the image.

height: int

The height of the image.

Returns

4-tuple of ints:

A tuple of buffer sizes in bytes for each plane of this pixel format required to store the image. Unused planes are zero.

For example:

>>> print(get_image_size('rgb24', 100, 100))
(30000, 0, 0, 0)
>>> print(get_image_size('yuv420p', 100, 100))
(10000, 2500, 2500, 0)
>>> print(get_image_size('gray', 100, 100))
(10000, 1024, 0, 0)
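
The rgb24 and yuv420p results follow directly from the layout of those formats, as the plain-Python sketch below shows. unaligned_plane_sizes is a hypothetical helper that covers only these two formats; it is not get_image_size and does not attempt to explain the second plane reported for 'gray'.

```python
def unaligned_plane_sizes(pix_fmt, width, height):
    """Minimal per-plane buffer sizes for two common pixel formats.

    Hypothetical helper covering only 'rgb24' and 'yuv420p'; it is not
    ffpyplayer's get_image_size and ignores every other format.
    """
    if pix_fmt == 'rgb24':
        # One packed plane with 3 bytes per pixel.
        return (width * height * 3, 0, 0, 0)
    if pix_fmt == 'yuv420p':
        # Full-resolution luma plane plus two chroma planes subsampled by 2
        # in both directions (rounded up for odd dimensions).
        cw, ch = (width + 1) // 2, (height + 1) // 2
        return (width * height, cw * ch, cw * ch, 0)
    raise ValueError('unsupported pixel format: %r' % (pix_fmt,))


rgb_sizes = unaligned_plane_sizes('rgb24', 100, 100)
yuv_sizes = unaligned_plane_sizes('yuv420p', 100, 100)
```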