
Graphs

Graphs allow capturing GPU kernels and executing them as one unit, reducing host overhead.

Simple operations can be captured as is:

julia
using AMDGPU

f!(o) = o .+= one(eltype(o))

z = AMDGPU.zeros(Int, 4, 4)
graph = AMDGPU.@captured f!(z)
@assert sum(z) == 16

AMDGPU.launch(graph)
@assert sum(z) == 16 * 2

However, more complex code needs additional preparation before it can be captured:

  • the code must not trigger hostcall invocation.

  • if the code contains allocations (malloc) together with their matching frees, it can be captured and relaunched as is.

  • if the code contains only allocations (without freeing), the allocations must be cached with GPUArrays.@cached beforehand (see example below).

  • other unsupported operations (e.g. RNG initialization) must be performed beforehand as well.

  • updating a graph does not update allocated pointers; in such cases only instantiation is supported.

julia
using AMDGPU, GPUArrays

function f(o)
    x = AMDGPU.rand(Float32, size(o))
    y = AMDGPU.rand(Float32, size(o))
    o .+= sin.(x) * cos.(y) .+ 1f0
    return
end

cache = GPUArrays.AllocCache()
z = AMDGPU.zeros(Float32, 256, 256)
N = 10

# Execute function normally and cache all allocations.
GPUArrays.@cached cache f(z)

# Capture graph using AllocCache to avoid capturing malloc/free calls.
graph = GPUArrays.@cached cache AMDGPU.@captured f(z)

# Allocations cache must be kept alive while executing graph.
for i in 1:N
    AMDGPU.launch(graph)
end
AMDGPU.synchronize()
AMDGPU.HIP.capture Function
julia
capture(f::Function; flags = hipStreamCaptureModeGlobal, throw_error::Bool = true)::Union{Nothing, HIPGraph}

Capture the given function f into a graph. If successful, returns the captured graph, which must be passed to instantiate to obtain an executable graph.
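
The manual path (capture, then instantiate, then launch) can be sketched as follows. This is a minimal sketch, not the library's reference usage: it assumes the functions are reachable under the `AMDGPU.HIP` prefix shown in the headings on this page, that `throw_error = false` makes a failed capture return `nothing` (as the return type suggests), and that a warm-up call is enough to keep compilation out of the captured region.

```julia
using AMDGPU

f!(o) = o .+= one(eltype(o))

z = AMDGPU.zeros(Int, 4, 4)

# Warm-up call so the kernel is compiled before capture starts.
f!(z)
AMDGPU.synchronize()

# Record f!(z) into a graph. With throw_error = false, a failed capture
# is assumed to be reported as `nothing` instead of an exception.
graph = AMDGPU.HIP.capture(; throw_error = false) do
    f!(z)
end

if graph !== nothing
    # A captured graph is not executable until it is instantiated.
    exec = AMDGPU.HIP.instantiate(graph)
    AMDGPU.HIP.launch(exec)
    AMDGPU.synchronize()
end
```

Unlike `AMDGPU.@captured`, this sketch handles a failed capture itself, which may be useful when the caller wants to fall back to plain execution.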

AMDGPU.HIP.@captured Macro
julia
graph = AMDGPU.@captured begin
    # code to capture in a graph.
end

Macro that captures the given expression in a graph and executes it. Returns the captured graph, which can be relaunched with launch or updated with update.

If capture fails (e.g. because JIT compilation was triggered), the macro attempts to recover, compile the code, and capture again.

AMDGPU.HIP.instantiate Function
julia
instantiate(graph::HIPGraph)::HIPGraphExec

Instantiate a captured graph, making it executable with launch.

AMDGPU.HIP.update Function
julia
update(exec::HIPGraphExec, graph::HIPGraph; throw_error::Bool = true)::Bool

Update the executable graph exec with the contents of graph. Returns true if the update succeeded, false otherwise.

Pass throw_error=false to have a failed update return false instead of throwing an exception.
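
One way to use the Boolean result is to try an in-place update first and fall back to a fresh instantiation when the update is rejected. The sketch below assumes the `AMDGPU.HIP`-qualified names from the headings on this page; `add_one!` and `add_two!` are hypothetical helpers introduced only for illustration, and whether this particular update succeeds is not guaranteed here.

```julia
using AMDGPU

# Hypothetical helpers used only for this illustration.
add_one!(o) = o .+= one(eltype(o))
add_two!(o) = o .+= 2 * one(eltype(o))

z = AMDGPU.zeros(Int, 4, 4)

# Compile both kernels ahead of capture.
add_one!(z)
add_two!(z)
AMDGPU.synchronize()

# Build an executable graph from the first function.
exec = AMDGPU.HIP.instantiate(AMDGPU.HIP.capture(() -> add_one!(z)))

# Capture a replacement graph.
new_graph = AMDGPU.HIP.capture(() -> add_two!(z))

# Try to update in place; if that is rejected, instantiate from scratch.
if !AMDGPU.HIP.update(exec, new_graph; throw_error = false)
    exec = AMDGPU.HIP.instantiate(new_graph)
end

AMDGPU.HIP.launch(exec)
AMDGPU.synchronize()
```

Either branch leaves `exec` holding an executable version of `new_graph`, so the launch code does not need to know which path was taken.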

AMDGPU.HIP.is_capturing Function
julia
is_capturing(stream::HIPStream = AMDGPU.stream())::Bool

Check whether the given stream is currently capturing operations into a graph.
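
A small sketch of how the check behaves inside and outside a captured region. It assumes that `capture` records on the current task's stream, i.e. the same stream that `is_capturing()` inspects by default; the `inside` and `outside` names are illustrative only.

```julia
using AMDGPU

z = AMDGPU.zeros(Int, 4, 4)

# Warm-up so the broadcast kernel is compiled before capture.
z .+= 1
AMDGPU.synchronize()

inside = Ref(false)
graph = AMDGPU.HIP.capture() do
    # Queried while the stream is recording.
    inside[] = AMDGPU.HIP.is_capturing()
    z .+= 1
end

# Queried after capture has ended.
outside = AMDGPU.HIP.is_capturing()
```

A check like this could guard operations that are not supported during capture, such as the RNG initialization mentioned earlier.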

AMDGPU.HIP.launch Function
julia
launch(exec::HIPGraphExec, stream::HIPStream = AMDGPU.stream())

Launch an executable graph on the given stream.
