In this first session, you will be given a broad overview of the course, and an introduction to the topics covered.
Duration: 3 mins.
The Mandelbrot Set is a beautiful object that arises out of a simple mathematical formula. Throughout the course, an interactive Mandelbrot Set browser application will be used to demonstrate each of the technologies covered, and to evaluate their performance characteristics. In this session, you are introduced to the Mandelbrot Set, and the browser application, which is called 'MangleBrot'.
Duration: 8 mins.
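The "simple mathematical formula" behind the Mandelbrot Set is the iteration z → z² + c, started from z = 0: a point c belongs to the set if the sequence stays bounded. The following is a minimal Python sketch of the usual escape-time test, not the MangleBrot application's code; the function names, the iteration limit of 100, and the choice of Python are assumptions made purely for illustration.

```python
def escape_time(c: complex, max_iter: int = 100) -> int:
    """Count iterations of z -> z*z + c until |z| exceeds 2.

    Returns max_iter if the sequence has not escaped by then, in which
    case c is treated as belonging to the Mandelbrot Set.
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:       # once |z| > 2 the sequence is guaranteed to diverge
            return n
        z = z * z + c          # the Mandelbrot iteration
    return max_iter


def in_mandelbrot(c: complex, max_iter: int = 100) -> bool:
    """Approximate membership test: True if c did not escape within max_iter steps."""
    return escape_time(c, max_iter) == max_iter
```

Each point is computed independently of every other point, which is what makes this calculation a convenient benchmark for concurrency technologies.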
Concurrency, or undertaking multiple tasks simultaneously, is often misunderstood in software development circles. Far from being just another term for 'multithreading', concurrency can be achieved in many different ways, and with many different technologies. In this session, you learn what concurrency is, when you should leverage it, and why it has become so important in the age of multi-core processors.
Duration: 27 mins.
Concurrency is often associated with multithreading, but it is much broader than that. In high-performance computing (HPC), for example, it is common to utilize distributed-memory concurrency, in which processes on different computers work together over a network to achieve a single task. In this session, you get a brief introduction to a broad range of technologies that ship on every Mac OS X system, from Cocoa classes like NSTask, to the Message Passing Interface (MPI, as implemented by Open MPI), and multithreading with NSThread and OpenMP.
Duration: 25 mins.
Multithreading is traditionally the most popular approach to concurrency on shared-memory machines such as the Mac, but it is a difficult skill to master, and can very easily result in hard-to-trace bugs. In this session, you will learn why multithreading is such a challenge to software developers, and be introduced to the elements that together form the performance equation: granularity, load balancing, and synchronization.
Duration: 30 mins.
Snow Leopard introduces technologies that make leveraging modern computer architectures considerably simpler. These technologies relieve the developer of having to deal with threads directly, and instead split tasks into easy-to-understand 'packets of computation'. In this session, you'll learn about the components that make up this new approach: packets and queues.
Duration: 9 mins.
Introduced in Mac OS X 10.5 Leopard, NSOperation and NSOperationQueue are Cocoa classes which help realize Apple's vision of packet-based concurrency. In this session, you learn how to create your own NSOperation subclasses, set dependencies between operations, and submit them to a queue for concurrent execution.
Duration: 32 mins.
Blocks, or closures, have existed in programming languages like Ruby and Smalltalk for many years, but with Snow Leopard, they find their way into C. Blocks play a very important role in the low-level, packet-based approach to concurrency of Grand Central Dispatch (GCD). In this session, you learn the C block syntax, how blocks differ from standard functions, and what it is about blocks that makes them so well suited to concurrent programming.
Duration: 28 mins.
Blocks are a C construct, but they are equally useful in Objective-C/Cocoa applications. In fact, Apple have done a great job integrating blocks with Objective-C, making them behave virtually the same as Cocoa objects. This session introduces you to how blocks are integrated into Objective-C, and some of the new methods available in the Cocoa frameworks that work directly with blocks.
Duration: 25 mins.
Grand Central Dispatch is a low-level C framework that Apple introduced in Snow Leopard to support packet-based concurrency. In this session, you will be introduced to this framework, and get a first glimpse of how it is used.
Duration: 9 mins.
Together with C blocks, queues are one of the most important components of Grand Central Dispatch. Queues come in different forms, including concurrent queues and serial queues. This session addresses how you retrieve and create queues, when to use each variety of queue, and how blocks should be dispatched to a queue. Groups, which allow you to monitor the status of a set of blocks, are also covered.
Duration: 19 mins.
With the system taking over thread creation, you need a new way to control access to limited resources in GCD. With dispatch semaphores, you can limit the number of blocks that can access a particular resource at any given time. In this session, you learn how to create dispatch semaphores, and utilize them to protect limited resources. You will also see how they can be used to serialize a section of code in an efficient way, and how serial queues can serve the same purpose while avoiding synchronization.
Duration: 21 mins.
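The idea of using a counting semaphore to cap how many tasks touch a limited resource at once can be sketched outside of GCD. The example below is an analogy only: it uses Python's `threading.Semaphore` rather than the dispatch semaphore C API (`dispatch_semaphore_create`, `dispatch_semaphore_wait`, `dispatch_semaphore_signal`), and the function name `run_with_limit` and the sleep that stands in for real work are hypothetical.

```python
import threading
import time


def run_with_limit(num_tasks: int, limit: int) -> int:
    """Run num_tasks threads, letting at most `limit` of them use the
    guarded resource at once. Returns the peak concurrency observed."""
    slots = threading.Semaphore(limit)   # counting semaphore: `limit` slots
    counter_lock = threading.Lock()      # protects the bookkeeping counters
    active = 0
    peak = 0

    def task() -> None:
        nonlocal active, peak
        with slots:                      # wait for a free slot, release on exit
            with counter_lock:
                active += 1
                peak = max(peak, active)
            time.sleep(0.01)             # stand-in for work on the resource
            with counter_lock:
                active -= 1

    threads = [threading.Thread(target=task) for _ in range(num_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak
```

With a limit of 1 the semaphore serializes the guarded section, which is the same effect that a serial queue provides without explicit synchronization.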
The heart of Grand Central Dispatch is blocks and queues, but there is more. GCD can be used to build event-driven C applications without resorting to Objective-C and Cocoa. In this session, you learn how to make use of dispatch sources to develop event-driven code in pure C.
Duration: 32 mins.
OpenCL is a new open specification designed to facilitate computation on heterogeneous systems like modern-day Macs. With OpenCL, you can write code that can be compiled at run-time, and run on either the CPU or the GPU. Modern GPUs can have hundreds of compute cores, and, in some cases, utilizing the GPU can lead to marked performance gains over computing on the CPU. In this session, you will learn a little of the history of OpenCL, how it works, and why this is a watershed moment for computing on graphics chips.
Duration: 13 mins.
To get the most out of OpenCL, you have to understand its memory architecture and terminology. In this session, you are introduced to the different forms of memory in an OpenCL system, the terms used to identify units of work, and the way those units get divided into workgroups.
Duration: 15 mins.
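The relationship between a unit of work and its workgroup comes down to simple index arithmetic. The sketch below assumes a one-dimensional index space, a global offset of zero, and a global size that is a multiple of the workgroup size; the Python function names are hypothetical and merely mirror what the OpenCL built-ins `get_global_id`, `get_group_id`, `get_local_id`, and `get_local_size` report inside a kernel.

```python
def decompose_global_id(global_id: int, local_size: int):
    """Split a work-item's global ID into (group_id, local_id)."""
    group_id, local_id = divmod(global_id, local_size)
    return group_id, local_id


def compose_global_id(group_id: int, local_id: int, local_size: int) -> int:
    """Rebuild the global ID from the workgroup ID and the ID within that group."""
    return group_id * local_size + local_id
```

For example, with workgroups of 4 work-items, the work-item with global ID 10 is the third item (local ID 2) of the third workgroup (group ID 2).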
An OpenCL kernel is similar to a function in C, but it gets compiled at run time, and can be dispatched to run on any OpenCL-compliant device in your Mac, including the CPU and the GPU. This session introduces you to the OpenCL kernel language, and how you use it to write high-performance kernels in OpenCL.
Duration: 25 mins.
The OpenCL framework is a C API used to build up a pipeline in order to dispatch and execute your kernels on devices like the GPU. This session introduces you to the most important types and functions in the OpenCL framework, and how to combine them to execute your kernels.
Duration: 28 mins.
This session begins by addressing a few aspects of the framework not yet covered, such as memory management, and using events to determine when tasks have completed. It continues with a discussion of debugging OpenCL, and finishes with a demonstration of OpenCL running the Mandelbrot Set.
Duration: 18 mins.
If the GPU is such a high-performance device, why don't we just run everything on it? While it is true that a GPU is capable of amazing performance in some instances, it is much more limited than a general-purpose CPU. In this session, you will learn what those limitations are, so that you can account for them when developing your OpenCL kernels. Optimizing OpenCL kernels is largely about managing the memory hierarchy, and taking full advantage of the fastest memory on the device. In this session, you will learn how to take a basic kernel, and with just a little manipulation, achieve an order of magnitude performance boost.
Duration: 14 mins.
OpenCL is broad, and requires you to learn a new C-based language, as well as the OpenCL framework API. These exercises offer an opportunity to get your hands dirty with OpenCL.
Duration: 8 mins.
Just a few minutes of review and suggestions for going forward from here.
Duration: 5 mins.