Welcome to Fang's Blog
A Panicking Note For 2020
Books/Papers To Be Finished: Numerical Analysis: Theory and Experiments; The Concepts and Practice of Mathematical Finance; What Every…
Notes From The Concepts and Practice of Mathematical Finance -- 2. Pricing Methodologies
These notes are about how to price options and derivative products. A derivative product is a product whose value is determined by the behavior of…
Notes From The Concepts and Practice of Mathematical Finance -- 1. Risk
1. What is risk? Every transaction can be viewed as the buying and selling of risk. Risk can be regarded as a synonym for uncertainty…
Notes From What Every Programmer Should Know About Memory
This series contains my notes taken from What Every Programmer Should Know About Memory. 1. Introduction Background: mass storage and memory…
Using OCaml for Scientific Computing -- 2. Ndarray
(These are my study notes on OCaml Scientific Computing, 1st Edition.) Ndarray Types The Ndarray module is built on top of OCaml’s Bigarray module. C…
Using OCaml for Scientific Computing -- 1. Setup & Conventions
Intro Use Owl in Toplevel Load Owl in utop with the following commands. owl-top is Owl’s toplevel library, which will automatically load…
CUDA Unified Virtual Address Space & Unified Memory
Unified Virtual Address Space (UVA) From CUDA 4.0 and on, UVA has been an important feature. It puts all CUDA execution, host and GPUs, in…
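As a rough illustration of what a single address space enables (a minimal sketch of my own, not code from the post), managed memory built on top of UVA lets the same pointer be dereferenced on the host and inside a kernel:

```cpp
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float *data = nullptr;
    cudaMallocManaged(&data, n * sizeof(float));   // one pointer, visible to host and GPU
    for (int i = 0; i < n; ++i) data[i] = 1.0f;    // host writes directly
    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);
    cudaDeviceSynchronize();                       // wait before the host reads results
    printf("data[0] = %f\n", data[0]);
    cudaFree(data);
    return 0;
}
```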
Several Things About CUDA Resource Assignment
Motivated by a CUDA puzzle I tried to solve today, I’d like to talk more about resource assignment. A Puzzle problem Adding two big arrays…
Memory Alignment For CUDA
In the fifth post of the CUDA series (The CUDA Parallel Programming Model - 5. Memory Coalescing), I put up a note on the effect of memory…
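One common way to get aligned rows for 2D data is pitched allocation; the sketch below is my own example (names are hypothetical), not code from the post:

```cpp
#include <cuda_runtime.h>

// Allocate a width x height float image with padded rows so that every row
// starts on an alignment boundary; the padded row size comes back in *pitchBytes.
float *allocPitched(size_t width, size_t height, size_t *pitchBytes) {
    float *d_img = nullptr;
    cudaMallocPitch((void **)&d_img, pitchBytes, width * sizeof(float), height);
    // Element (r, c) lives at ((char *)d_img + r * *pitchBytes) + c * sizeof(float).
    return d_img;
}
```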
Recap: GPU Latency Tolerance and Zero-Overhead Thread-Scheduling
I briefly talked about how CUDA processors hide long-latency operations such as global memory accesses through their warp-scheduling…
CUDA Dynamic Parallelism
Have you wondered if it’s possible to launch nested kernels (i.e., a kernel calling another kernel) in CUDA? Well, this is where dynamic…
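A minimal sketch of the idea (my own example, not code from the post): a parent kernel launches a child grid from device code. This needs compute capability 3.5 or higher and relocatable device code, e.g. `nvcc -arch=sm_35 -rdc=true nested.cu -lcudadevrt`.

```cpp
#include <cuda_runtime.h>
#include <cstdio>

__global__ void child(int parentId) {
    printf("child thread %d launched by parent thread %d\n", threadIdx.x, parentId);
}

__global__ void parent() {
    // Each parent thread launches its own child grid from the device.
    child<<<1, 4>>>(threadIdx.x);
}

int main() {
    parent<<<1, 2>>>();
    cudaDeviceSynchronize();
    return 0;
}
```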
Some CUDA Related Questions
In this post, I talk about what happens when blocks are assigned to SMs, as well as CUDA code optimization across GPU architecture. Before…
CUDA Programming - 2. CUDA Variable Type Qualifiers
CUDA Variable Type Qualifiers lifetime == kernel? If the lifetime of a variable is within a kernel execution, it must be declared within…
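A short sketch of the scope/lifetime rules referred to above (my own example, assuming a block size of 256):

```cpp
__device__ float d_scale = 2.0f;          // device global: lives for the whole application

__global__ void kernel(float *out) {
    __shared__ float tile[256];           // shared: one copy per block, lives for the kernel launch
    float local = 1.0f;                   // automatic: one copy per thread, register/local memory
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = local * d_scale;
    __syncthreads();
    out[i] = tile[threadIdx.x];
}
```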
Packing Files With Reprozip On MacOS Via Vagrant
I recently had to pack a project with Reprozip where all the dependencies can be nicely preserved. Reprozip uses ptrace and thus only works…
The CUDA Parallel Programming Model - 9. Interleave Operations by Stream
In the last post, we saw how full concurrency can be achieved amongst streams. Here I’d like to talk about how CUDA operations from…
The CUDA Parallel Programming Model - 8. Concurrency by Stream
In the previous posts, we have sometimes assumed that only one kernel is launched at a time. But this is not all that kernels can do. They…
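As a rough sketch of the mechanism (my own example; names are hypothetical, and the host buffers would need to be pinned for copies to actually overlap), work issued into different streams is allowed to run concurrently:

```cpp
#include <cuda_runtime.h>

__global__ void work(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] += 1.0f;
}

void run(float *h_a, float *h_b, float *d_a, float *d_b, int n) {
    cudaStream_t s1, s2;
    cudaStreamCreate(&s1);
    cudaStreamCreate(&s2);

    // Copy + kernel in stream 1, copy + kernel in stream 2: the two streams may overlap.
    cudaMemcpyAsync(d_a, h_a, n * sizeof(float), cudaMemcpyHostToDevice, s1);
    work<<<(n + 255) / 256, 256, 0, s1>>>(d_a, n);

    cudaMemcpyAsync(d_b, h_b, n * sizeof(float), cudaMemcpyHostToDevice, s2);
    work<<<(n + 255) / 256, 256, 0, s2>>>(d_b, n);

    cudaStreamSynchronize(s1);
    cudaStreamSynchronize(s2);
    cudaStreamDestroy(s1);
    cudaStreamDestroy(s2);
}
```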
The CUDA Parallel Programming Model - 7.Tiling
There’s an intrinsic tradeoff in the use of device memories in CUDA: the global memory is large but slow, whereas the shared memory is small…
The CUDA Parallel Programming Model - 6. More About Memory
The compute-to-global-memory-access ratio has major implications on the performance of a CUDA kernel. Programs whose execution speed is…
The CUDA Parallel Programming Model - 5. Memory Coalescing
This post talks about a key factor in CUDA kernel performance: accessing data in global memory. CUDA applications tend to process a…
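A quick sketch of the contrast (my own example, not code from the post): when the threads of a warp touch consecutive addresses, the hardware can combine the accesses into a few transactions; strided access cannot be combined as well.

```cpp
__global__ void coalesced(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];            // thread k reads element k: coalesced
}

__global__ void strided(const float *in, float *out, int n, int stride) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = i * stride;
    if (j < n) out[i] = in[j];            // neighbouring threads are 'stride' elements apart: poorly coalesced
}
```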
The CUDA Parallel Programming Model - 4. Syncthreads Examples
This is the fourth post in a series about what I learnt in my GPU class at NYU this past fall. Here I talk about barrier synchronization…
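A minimal barrier sketch (my own, assuming a block size of 256): every thread must reach __syncthreads() before any thread in the block proceeds, so the shared tile is fully written before it is read back in reversed order.

```cpp
__global__ void reverse_block(float *data) {
    __shared__ float tile[256];
    int t = threadIdx.x;
    tile[t] = data[blockIdx.x * blockDim.x + t];
    __syncthreads();                       // barrier: the whole tile is complete for the block
    data[blockIdx.x * blockDim.x + t] = tile[blockDim.x - 1 - t];
}
```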
The CUDA Parallel Programming Model - 3. More On Thread Divergence
This is the third post in a series about what I learnt in my GPU class at NYU this past fall. Here I dive a bit deeper than the previous…
The CUDA Parallel Programming Model - 2. Warps
This is the second post in a series about what I learnt in my GPU class at NYU this past fall. This will be mostly about warps, why using…
The CUDA Parallel Programming Model - 1. Concepts
I took the Graphics Processing Units course at NYU this past fall. This is the first post in a series about what I learnt. Buckle up for…
CUDA Programming - 1. Matrix Multiplication
Notation: the two input matrices, the output matrix, a row counter, and a column counter; an output element is identified by its position in the vertical direction and its position…
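A minimal naive matrix-multiplication kernel in that spirit (a sketch of my own; the names here are not necessarily the post’s notation). Each thread computes one output element from one row of the first matrix and one column of the second.

```cpp
__global__ void matMul(const float *A, const float *B, float *C, int width) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;   // vertical position
    int col = blockIdx.x * blockDim.x + threadIdx.x;   // horizontal position
    if (row < width && col < width) {
        float sum = 0.0f;
        for (int k = 0; k < width; ++k)
            sum += A[row * width + k] * B[k * width + col];
        C[row * width + col] = sum;
    }
}
```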