Thursday, October 22, 2009

CUDA

A major issue with Network Video Recorders (NVRs) is high CPU utilization.  Unlike DVRs, where encoding and decoding are done in hardware, an NVR must receive and record multiple video streams while encoding and decoding them for live view and remote view entirely in software, which demands a lot of processing power.  Imagine 32 incoming video streams at D1 resolution (720x480): without a dedicated hardware encoder/decoder chip, that workload will heavily tax even the latest PC CPUs available in the market today.  In fact, no present NVR solution can display more than 4 simultaneous D1 streams while recording them at the same time.


CUDA to the rescue

Short of installing another encoder/decoder card, one can use the spare processing power of the PC's GPU (graphics processing unit) for encoding and decoding, as well as to add more capabilities to the system.   Last year, NVIDIA released a developer kit for CUDA, enabling applications to take advantage of its powerful GPUs.

What is CUDA?

CUDA (an acronym for Compute Unified Device Architecture) is a parallel computing architecture developed by NVIDIA. CUDA is the computing engine in NVIDIA graphics processing units (GPUs) that is accessible to software developers through industry-standard programming languages. Programmers use 'C for CUDA' (C with NVIDIA extensions), compiled through a PathScale Open64 C compiler, to code algorithms for execution on the GPU. The CUDA architecture supports a range of computational interfaces, including OpenCL and DirectCompute. Third-party wrappers are also available for Python, Fortran, Java and MATLAB.

The latest drivers all contain the necessary CUDA components. CUDA works with all NVIDIA GPUs from the G8X series onwards, including the GeForce, Quadro and Tesla lines. NVIDIA states that programs developed for the GeForce 8 series will also work without modification on all future NVIDIA video cards, due to binary compatibility. CUDA gives developers access to the native instruction set and memory of the parallel computational elements in CUDA GPUs. Using CUDA, the latest NVIDIA GPUs effectively become open architectures like CPUs. Unlike CPUs, however, GPUs have a parallel "many-core" architecture capable of running thousands of threads simultaneously across its cores - if an application is suited to this kind of architecture, the GPU can offer large performance benefits.
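To make the many-core model concrete, here is a minimal illustrative kernel (a sketch, not taken from any actual NVR product) in which each GPU thread handles one pixel - the kind of per-pixel work that video processing involves:

```cuda
// Sketch: one GPU thread per pixel, so a 720x480 D1 frame is
// covered by ~345,600 threads spread across the GPU's cores.
__global__ void brighten(unsigned char *pixels, int n, int delta)
{
    // Each thread computes its global index from block and thread IDs.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int v = pixels[i] + delta;
        pixels[i] = v > 255 ? 255 : v;   // clamp to the 8-bit range
    }
}
```

A host program would launch it as `brighten<<<(n + 255) / 256, 256>>>(d_pixels, n, 16);`, so the grid of thread blocks covers every pixel of the frame in parallel.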

In the computer gaming industry, in addition to graphics rendering, graphics cards are used in game physics calculations (physical effects like debris, smoke, fire, fluids); examples include PhysX and Bullet. CUDA has also been used to accelerate non-graphical applications in computational biology, cryptography and other fields by an order of magnitude or more. An example of this is the BOINC distributed computing client.


CUDA provides both a low-level API and a higher-level API. The initial CUDA SDK was made public on 15 February 2007, for Microsoft Windows and Linux. Mac OS X support was later added in version 2.0, which supersedes the beta released February 14, 2008.

GPUs are cheap, massively parallel, programmable compute devices that can be used for many general purpose (non-graphics) tasks. They are a "good fit" for many scientific applications and significant speedups (as compared to contemporary CPUs) have been reported. The CUDA language makes NVIDIA GPUs accessible to developers through a series of extensions to C (with no mention of pixels or shading!).
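To show what those C extensions look like in practice, here is a small, self-contained host-side sketch using the standard CUDA runtime API (`cudaMalloc`, `cudaMemcpy`, and the `<<<...>>>` launch syntax); the kernel and the sizes chosen are illustrative only:

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Kernel: one thread per element computes c[i] = a[i] + b[i].
__global__ void add(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main(void)
{
    const int n = 1024;
    size_t bytes = n * sizeof(float);
    float h_a[1024], h_b[1024], h_c[1024];
    for (int i = 0; i < n; i++) { h_a[i] = i; h_b[i] = 2.0f * i; }

    // Allocate device memory and copy the inputs to the GPU.
    float *d_a, *d_b, *d_c;
    cudaMalloc((void **)&d_a, bytes);
    cudaMalloc((void **)&d_b, bytes);
    cudaMalloc((void **)&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch 4 blocks of 256 threads, then copy the result back.
    add<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);
    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);

    printf("c[10] = %f\n", h_c[10]);  // 10 + 20 = 30

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    return 0;
}
```

Note that there is indeed no mention of pixels or shading anywhere - to the programmer it is ordinary C plus a kernel qualifier, a launch syntax, and a few memory-management calls.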

By harnessing CUDA, it may be possible to record in high quality while simultaneously serving multiple high-quality video streams for playback and remote view.  It also offloads tasks from the CPU, which can help the application run faster and more smoothly.  Expect to see a lot of CUDA-based NVRs in the future.

Some links:

http://codingplayground.blogspot.com/2009/02/web-search-ranking-and-gpp-many-core.html
http://www.bikal.co.uk/network-video-recorder/nvr-pro.html
