Utilizing Inline Motion Vector Estimates from an H.264 Encoder to Implement Motion Detection

or Motion Detection Made Easy


A Brief Introduction to Video Encoding

  • Video is just lots of images
  • Simple encoders just encode lots of images (MJPEG):

    JPEG JPEG JPEG JPEG JPEG JPEG JPEG JPEG
  • Complex encoders exploit similarity of images (MPEG):

    I-Frame P-Frame B-Frame B-Frame P-Frame B-Frame

MPEG I-Frames

Simplest frame type: it's just a complete picture. Also known as a key frame.

MPEG P-Frames

Stores only what changed since the previous frame. Useful!

MPEG B-Frames

Like P-Frames, but with time travel: they can reference both earlier and later frames.

Motion Estimation

Every frame, the GPU calculates which macro-blocks moved where.

Limitations

  • Very crude estimate
  • Doesn't try to identify features
  • Based solely on similarity of blocks
  • 2D vector only, doesn't attempt to infer rotation, Z-motion, etc.

Interpretation

What on earth is this?!

The Data!

For each frame, the Pi's camera produces a stream of inline motion data. It consists of an X, Y, and SAD (Sum of Absolute Differences) value for each 16×16-pixel macro-block:

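As a sketch (assuming the documented layout and a 640×480 mode), the per-block record maps neatly onto a NumPy dtype:

```python
import numpy as np

# One record per 16x16 macro-block: signed X/Y vector components
# (in pixels) plus an unsigned 16-bit SAD (Sum of Absolute
# Differences) value indicating how well the block matched.
motion_dtype = np.dtype([
    ('x',   np.int8),
    ('y',   np.int8),
    ('sad', np.uint16),
])

# At 640x480 that's 30 rows of blocks, and one extra column
# beyond the 40 you'd expect, so 41 columns per row.
width, height = 640, 480
cols = (width // 16) + 1
rows = height // 16
print(motion_dtype.itemsize, rows, cols)  # 4 30 41
```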

The Code!

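One way to handle the stream by hand, sketched with NumPy (a zeroed buffer stands in here for a real capture, e.g. one written by raspivid -x motion.data alongside the video):

```python
import numpy as np

# The "tedious" way: parse the raw motion stream yourself.
# One 4-byte record per macro-block, rows * cols records per frame.
motion_dtype = np.dtype([('x', np.int8), ('y', np.int8), ('sad', np.uint16)])
width, height = 640, 480
cols = (width // 16) + 1
rows = height // 16

# Two frames of zeroed records stand in for the real file's bytes
raw = np.zeros(2 * rows * cols, dtype=motion_dtype).tobytes()

# One (rows, cols) record array per frame
frames = np.frombuffer(raw, dtype=motion_dtype).reshape((-1, rows, cols))
print(frames.shape)  # (2, 30, 41)
```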

The Code!

The picamera library does most of the tedious stuff (since 1.5)…

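A minimal sketch of the picamera route (needs a Pi with a camera module attached; the filenames are arbitrary):

```python
import picamera

with picamera.PiCamera() as camera:
    camera.resolution = (640, 480)
    camera.framerate = 30
    # Since picamera 1.5, motion_output writes the inline motion
    # vectors to a second output while the H.264 video goes to the first
    camera.start_recording('video.h264', motion_output='motion.data')
    camera.wait_recording(10)
    camera.stop_recording()
```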

How to detect motion

Crude first steps...

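A possible shape for those first steps, sketched in plain NumPy so it runs anywhere (the 60-pixel and 10-block thresholds are arbitrary guesses to tune per scene):

```python
import numpy as np

# Crude rule of thumb: call it "motion" when more than `count`
# macro-blocks have a vector magnitude above `threshold` pixels
def detect_motion(a, threshold=60, count=10):
    mag = np.sqrt(a['x'].astype(float) ** 2 + a['y'].astype(float) ** 2)
    return bool((mag > threshold).sum() > count)

# Synthetic frames standing in for real motion data (30x41 blocks
# for 640x480): one still, one with a patch of blocks shifted right
motion_dtype = np.dtype([('x', np.int8), ('y', np.int8), ('sad', np.uint16)])
still = np.zeros((30, 41), dtype=motion_dtype)
moving = still.copy()
moving['x'][:12, :12] = 100
print(detect_motion(still), detect_motion(moving))  # False True
```

On a Pi, the same check could live in the analyse() method of a picamera.array.PiMotionAnalysis subclass, which hands you each frame's vectors as exactly this kind of record array.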

Refinements

How about a histogram of magnitudes?

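A sketch of that idea, again with plain NumPy (the bin count of 8 is arbitrary; the upper range follows from signed-byte components, sqrt(2) × 128 ≈ 181):

```python
import numpy as np

# Bucket every block's vector magnitude; a "still" frame piles up
# in the first bin, real motion puts mass in the higher bins
def magnitude_histogram(a, bins=8):
    mag = np.sqrt(a['x'].astype(float) ** 2 + a['y'].astype(float) ** 2)
    hist, _ = np.histogram(mag, bins=bins, range=(0, 181))
    return hist

motion_dtype = np.dtype([('x', np.int8), ('y', np.int8), ('sad', np.uint16)])
a = np.zeros((30, 41), dtype=motion_dtype)
a['y'][:5, :5] = 50          # 25 blocks moving down by 50 pixels
hist = magnitude_histogram(a)
print(hist)                  # 25 blocks land in the ~45-68 bin
```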

Future Work

Left as an exercise for the reader...

  • Use SAD values to filter good blocks
  • Define features in motion (convex hull? feature labelling?)
    • to ignore lighting effects
    • ... and cats
  • Combine with image data (e.g. ignore certain colours?)
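As a hedged starting point for the first bullet: a low SAD means the encoder found a good match for a block, so its vector is more trustworthy. A sketch that zeroes out badly-matched vectors (the median cut-off is a placeholder, not a recommendation):

```python
import numpy as np

# Keep a block's vector magnitude only if its SAD is at or below
# the frame's median SAD; treat everything else as noise
def filter_by_sad(a):
    mag = np.sqrt(a['x'].astype(float) ** 2 + a['y'].astype(float) ** 2)
    return np.where(a['sad'] <= np.median(a['sad']), mag, 0.0)

motion_dtype = np.dtype([('x', np.int8), ('y', np.int8), ('sad', np.uint16)])
a = np.zeros((30, 41), dtype=motion_dtype)
a['x'][:] = 10               # every block apparently moving right
a['sad'][0, 0] = 5000        # ...but one block matched very badly
mag = filter_by_sad(a)
print(mag[0, 0], mag[1, 1])  # 0.0 10.0
```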

Thank You!

The picamera library: picamera.readthedocs.org

The author: dave@waveform.org.uk

The Twitter feed: @waveform80

Obligatory Cat Video

Because Fiona insists!

Now go play with this and make something interesting!