Video space/time transposition
From Create52 Cookbook
The transformation
Video space/time transposition is a process which makes an 'input' video into another video where the time dimension and one space dimension are swapped (transposed).
The most interesting space dimension to swap is the width (or 'X') dimension, because most things you can video in everyday life move in the 'X' direction, e.g. cars and people.
This transformation shows some interesting effects. To give an idea, here is a still from an input video and a still from the corresponding output video:
Input video still:
Output video still:
As you can see, the output has people recognisable in it, although they are somewhat misshapen.
To see actual input and output videos, please visit my flickr page here: http://www.flickr.com/photos/redlex/sets/72157606249489942/
What is happening
As a specific example of what is going on, suppose we have a video that is 3 pixels wide, 4 pixels high, and which has 6 frames in it.
The video can be thought of as a three-dimensional array of pixels:
When this video is played, you are effectively seeing a 2D slice move through this 3D array, front to back.
The transformation can be thought of as leaving the pixel data unchanged, but changing the meaning of the various axes:
So to play the transformed video, we have a plane at right angles to the original picture plane, and it moves through the data left to right. The width of the transformed video is equal to the number of frames in the original video, and the number of frames in the transformed video is equal to the width of the original video. More succinctly:
- transformed width = original number of frames
- transformed number of frames = original width
Note that the height dimension remains the same.
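The axis swap described above can be sketched with NumPy. This is only an illustration: the array layout `video[frame, row, column]` and the variable names are assumptions made for the example, and the toy dimensions are the 3 x 4 x 6 video from the text.

```python
import numpy as np

# Hypothetical toy video: 6 frames, 4 pixels high, 3 pixels wide,
# indexed as video[frame, row, column].
num_frames, height, width = 6, 4, 3
video = np.arange(num_frames * height * width).reshape(num_frames, height, width)

# Swap the time axis (0) with the width axis (2); the height axis stays put.
transposed = video.transpose(2, 1, 0)

print(video.shape)       # (6, 4, 3)
print(transposed.shape)  # (3, 4, 6)

# The pixel data is unchanged; only the meaning of the indices differs:
# column x of input frame t becomes column t of output frame x.
assert transposed[2, 1, 5] == video[5, 1, 2]
```

The output has 3 frames, each 6 pixels wide and 4 pixels high, matching the rule that width and frame count trade places while height is untouched.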
Working out the behaviour of transformed videos
To work out how objects in the videos behave, it's useful to recognise that since time and the x dimension are transposed, and the height remains the same, we only really need to consider two dimensions, the x and t axes:
So by sketching lines/shapes on squares (known as 'x/t squares') we can work out the behaviour of objects in transformed videos.
As an example, here is the x/t square which represents an object travelling left to right across the original video, at high speed, about halfway through the duration of the video:
(Note that for most of the video, the object isn't even in the picture!)
To understand what the transformed video will look like, just imagine a vertical line travelling across the square left to right (representing a 1D picture travelling through time), and consider what that video would depict.
In the above example, it can be seen that the transformed video shows the object in the frame for the entire duration of the video, and it doesn't move very fast. The general principle is that speed becomes reciprocated, i.e:
- something travelling at $v$ pixels per frame in the input video travels at $1/v$ pixels per frame in the transformed video
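As a rough check of the reciprocal-speed rule, the sketch below treats an object as a list of `(frame, x)` samples and swaps the two coordinates. The helper names `transpose_track` and `speed` are made up for this illustration and are not part of the script further down.

```python
def transpose_track(track):
    """Swap the roles of frame number and x position for (frame, x) samples."""
    return sorted((x, frame) for frame, x in track)


def speed(track):
    """Average speed in pixels per frame between the first and last samples."""
    (t0, x0), (t1, x1) = track[0], track[-1]
    return (x1 - x0) / (t1 - t0)


# Hypothetical object moving at 4 pixels per frame for 5 frames.
input_track = [(t, 4 * t) for t in range(5)]
output_track = transpose_track(input_track)

print(speed(input_track))   # 4.0
print(speed(output_track))  # 0.25
```

An object that covers 4 pixels per frame in the input covers a quarter of a pixel per frame in the output, i.e. it only advances one pixel every 4 output frames.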
The following diagram represents something that crosses the picture over exactly the duration of the video:
You can see from this diagram that in the transformed video, the object also crosses the picture for exactly the duration of the video.
The following diagram represents something which is in the video for the duration, and which doesn't travel very far in the video:
You can see that in the transformed video, the object isn't in the video for most of it, and when it does appear, it travels fast.
Here is a video which represents two objects A and B moving, where B is moving faster than A:
In the transformed video, A now moves faster than B. (This fits in with the reciprocal speed concept mentioned earlier.)
Till now we've treated objects as very thin, and hence as lines in these diagrams. If we consider objects as having thickness, we can see something interesting. Here is a diagram representing an object crossing the video, right to left, with the left and right hand sides of the object marked:
You can see that in the transformed video, the object will travel from right to left still. Note also that the LHS of the object remains on the left hand side, and likewise for the RHS.
Here is a diagram representing an object crossing the video, left to right this time, with the left and right hand sides of the object marked:
In the transformed video, the object will travel from left to right still, but interestingly, the original LHS of the object now appears on the RHS, and vice versa. In other words, the image of the moving object is mirrored.
The left-to-right and right-to-left results above mean that in output videos, things travel in their original directions, but they all face left! Hence, some things actually appear to be moving backwards (whether they be cars, people, or whatever).
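The mirroring can be checked on a tiny x/t square. The sketch below is an illustration under simplifying assumptions: the video is a single pixel row, the object is two pixels wide (marked `L` and `R`), and it moves at exactly 1 pixel per frame. The helper names are hypothetical.

```python
def transpose_xt(video):
    """Swap the t and x axes of a video stored as video[t][x] (one pixel row)."""
    return [list(column) for column in zip(*video)]


def make_video(size, lhs_position):
    """Build a size-by-size x/t square; lhs_position(t) is the x of the object's LHS."""
    video = [["." for _ in range(size)] for _ in range(size)]
    for t in range(size):
        lhs = lhs_position(t)
        if 0 <= lhs and lhs + 1 < size:
            video[t][lhs] = "L"      # left-hand side of the object
            video[t][lhs + 1] = "R"  # right-hand side of the object
    return video


size = 6
moving_right = make_video(size, lambda t: t)            # left to right
moving_left = make_video(size, lambda t: size - 2 - t)  # right to left

out_right = ["".join(frame) for frame in transpose_xt(moving_right)]
out_left = ["".join(frame) for frame in transpose_xt(moving_left)]

print(out_right[2], out_right[3])  # .RL... ..RL..
print(out_left[2], out_left[3])    # ..LR.. .LR...
```

In the output, the left-to-right object still advances to the right but reads `RL` (mirrored), while the right-to-left object still advances to the left and keeps its `LR` order.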
Here's a video that represents some periodic motion, such as someone waving their hand back and forth repeatedly:
The output video will show something that is periodic in space - in other words, repeated similar images across the width of the transformed video. These objects will appear out of nowhere, split into two, then merge again (but not with the original object they split with - with the next one along).
Reflective surfaces in part of the background of the input video (e.g. a shop window, not moving) are interesting. Here's the diagram representing two reflective surfaces in the background:
Examining the above, you can see that the transformed video will not show the mirrors at all for most of the video, but for some frames, will show a mirror that spans the width of the video. And due to the reciprocal speed effect mentioned earlier, reflected objects in the (transformed) video actually travel faster than the real objects (which is the opposite of real life).
(You can also see from the above diagram that any object at all that is not moving gets stretched out horizontally, losing all its detail. And the longer the object stays still, the wider it gets in the transformed video.)
So how can we work out the width of a transformed object in the transformed video? The following diagram demonstrates the calculation:
So we know that in the input video, the object has a width of $w$ pixels, and it moves $\Delta x$ pixels every $\Delta t$ frames of animation (in other words, its speed is $\Delta x / \Delta t$ pixels per frame). So from similar triangles in the above diagram, we can see the transformed width will be $w \, \Delta t / \Delta x$.
So we can see that in order for something to maintain its width, it must travel at 1 pixel per frame in the original video. Any slower, and it is stretched out; any faster, and it gets compressed.
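The width rule can be written as a small helper: the transformed width is the original width divided by the speed in pixels per frame. This is a sketch only; the function name `transformed_width` and its parameters (`width` in pixels, `dx` pixels moved every `dt` frames) are chosen for the example.

```python
from fractions import Fraction


def transformed_width(width, dx, dt):
    """Output-video width of an object `width` pixels wide moving dx pixels every dt frames."""
    if dx == 0:
        # A stationary object has no finite width under this formula;
        # it is smeared across every output column for as long as it stays still.
        raise ValueError("object must be moving (dx != 0)")
    speed = Fraction(abs(dx), dt)  # pixels per frame
    return width / speed


print(transformed_width(10, 1, 1))  # 10  (1 pixel per frame: width preserved)
print(transformed_width(10, 2, 1))  # 5   (faster: compressed)
print(transformed_width(10, 1, 2))  # 20  (slower: stretched)
```

Using `Fraction` keeps the result exact for speeds such as 3 pixels every 2 frames.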
Python script to perform video space/time transposition
# Video space/time transposition script.
# A create52.com thang: http://create52.com
#
# This script takes as input a number of image files (e.g. PNGs) which
# should be frames from a video. (I personally extract the frames
# from a .mov movie using quicktime.)
# The script outputs a collection of image files, which can then be
# composed into a video. I used virtualdub for this (http://virtualdub.org/).
#
# The action of this script is to transpose (swap) the time and width dimensions
# of the input video represented by the input images.
#
# Alex Hunsley, July 2008
#
# Use and adapt this script freely, but please keep this notice.
#
#
#
# This script uses the Python Imaging Library (originally from
# http://www.pythonware.com/products/pil/); its maintained fork, Pillow,
# provides the same module via "from PIL import Image".
#
from PIL import Image

# with below definition, image names would be a001.png, a002.png, etc.
baseName = "a"
startFrame = 1
endFrame = 200

# if you start a run, then have to stop it halfway through, change this
# variable to be the frame you stopped at and re-start the script to
# resume at the correct point. Otherwise, don't worry about this and leave it at 0.
skipFrames = 0

numInputFrames = endFrame - startFrame + 1

# find out dimension of image files
fname = "%s%03d.png" % (baseName, startFrame)
im = Image.open(fname)
(width, height) = im.size

# the output is as wide as the input is long, and vice versa
newImageWidth = numInputFrames
newImageHeight = height
numOutputFrames = width

outputImage = Image.new('RGB', (newImageWidth, newImageHeight))

for outputImgIdx in range(skipFrames, numOutputFrames):
    # output images are called out000.png, out001.png, etc.
    outputFileName = "out%03d.png" % outputImgIdx
    print("generating output image", outputFileName)
    currX = 0
    for i in range(0, numInputFrames):
        fname = "%s%03d.png" % (baseName, i + startFrame)
        im = Image.open(fname)
        # take the one-pixel-wide column at x = outputImgIdx from input frame i...
        vertSlice = im.crop((outputImgIdx, 0, outputImgIdx + 1, height))
        # ...and place it at column currX of the current output frame
        outputImage.paste(vertSlice, (currX, 0, currX + 1, height))
        currX += 1
    outputImage.save(outputFileName, "PNG")
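The script above re-opens every input frame once per output frame. As an alternative sketch (not the original author's method), the frames could be loaded once into a NumPy array and the axes swapped in memory. This assumes NumPy and Pillow are installed and that the whole video fits in RAM; the function name `transpose_frames` is hypothetical.

```python
import numpy as np
from PIL import Image


def transpose_frames(frames):
    """Swap the time and width dimensions of a list of equally sized PIL images."""
    # Stack into an array of shape (num_frames, height, width, 3).
    stack = np.stack([np.asarray(frame.convert("RGB")) for frame in frames])
    # Swap axis 0 (time) with axis 2 (width): (width, height, num_frames, 3).
    swapped = stack.transpose(2, 1, 0, 3)
    return [Image.fromarray(np.ascontiguousarray(frame)) for frame in swapped]


# Tiny in-memory demo: 6 frames, each 3 pixels wide and 4 pixels high,
# with the red channel encoding the frame number.
frames = [Image.new("RGB", (3, 4), (t * 40, 0, 0)) for t in range(6)]
out = transpose_frames(frames)

print(len(out), out[0].size)  # 3 (6, 4)
```

With real footage, the `frames` list would be built with `Image.open` on the input files and each returned image saved with `.save(...)`, using the same file naming as the script above.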















