#rocm

#NVIDIA Bringing #CUDA To #RISCV
NVIDIA's drivers and CUDA software stack are predominantly supported on x86_64 and AArch64 systems, though in the past they were also supported on IBM POWER. This week at the RISC-V Summit China event, NVIDIA's Frans Sijstermans announced that CUDA will be coming to RISC-V.
#AMD, for their part, can already build the upstream #opensource #AMDKFD kernel compute driver on RISC-V, and the #ROCm user-space components can also be built on RISC-V.
phoronix.com/news/NVIDIA-CUDA-

New #ZLUDA 5 Preview Released For #CUDA On Non-NVIDIA #GPUs
For now this ability to run unmodified CUDA apps on non-#NVIDIA GPUs is focused on #AMD GPUs of the #Radeon RX 5000 series and newer, i.e. AMD Radeon GPUs supported by #ROCm. Besides CUDA code samples, Geekbench has been one of the early targets for testing.
phoronix.com/news/ZLUDA-5-prev

www.phoronix.com · New ZLUDA 5 Preview Released For CUDA On Non-NVIDIA GPUs: ZLUDA version 5-preview.43 was released today as this open-source CUDA implementation for use on non-NVIDIA GPUs, with a current focus on enabling CUDA on AMD Radeon GPUs with ROCm.

It's convenient that I can use #LLMs to help me learn how to use LLMs because I'm pretty sure I wouldn't be able to figure it out any other way.

I want to use my local #Ollama models with #Copilot in #VSCode, but I have an #AMD #GPU so apparently I need to install something called the #ROCm (Radeon Open Compute Platform) via the Windows 11 HIP SDK?

And maybe all this doesn't work in #WSL, so I'll have to reinstall it in #Ubuntu there if I want to use it in one of those workspaces?
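
One quick sanity check while untangling all that: Ollama's REST API is documented and easy to poke directly, regardless of what VS Code is doing. A minimal sketch in C++ with libcurl, assuming the default endpoint on localhost:11434; "llama3" is just a placeholder model name, use whatever you actually pulled:

```cpp
// Sanity check: can we reach a local Ollama server and get a completion?
// Assumes libcurl is installed; build with: g++ ollama_check.cpp -lcurl
#include <curl/curl.h>
#include <iostream>
#include <string>

// libcurl write callback: append response bytes to a std::string.
static size_t collect(char* data, size_t size, size_t nmemb, void* userp) {
    static_cast<std::string*>(userp)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl) return 1;

    // /api/generate is Ollama's documented endpoint; "llama3" is a placeholder.
    const char* body =
        "{\"model\": \"llama3\", \"prompt\": \"Say hello\", \"stream\": false}";
    std::string response;
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:11434/api/generate");
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

    CURLcode rc = curl_easy_perform(curl);
    if (rc == CURLE_OK)
        std::cout << response << "\n";  // raw JSON; look for the "response" field
    else
        std::cerr << "Ollama not reachable: " << curl_easy_strerror(rc) << "\n";

    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}
```

If this works but GPU inference doesn't, the problem is on the ROCm side rather than the editor integration.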

#ZLUDA Making Progress In 2025 On Bringing #CUDA To Non-NVIDIA #GPUs
ZLUDA, the #opensource effort that started a half-decade ago as a drop-in CUDA implementation for #Intel GPUs, was then for several years funded by #AMD as a CUDA implementation for #Radeon GPUs atop #ROCm, then open-sourced but later reverted, and has been pushing along a new path since last year. The current take on ZLUDA is a multi-vendor CUDA implementation for non-NVIDIA GPUs, for #AI workloads and more.
phoronix.com/news/ZLUDA-Q2-202

Replied in thread

@eugenialoli Same with photo #raw processing: none of the #foss apps use the cameras' color profiles, there's no 16-bit raw support, and there are many other issues. I gave up on retouching because of GIMP's still-poor UX and its weak implementation of non-destructive editing.

The sad reality is that the #linuxdesktop is not ready for professional #mediaproduction, and that's such a bad thing in times like these.

#davinciresolve also barely runs on anything except NVIDIA on #linux, and it still has no #flatpak.

Even Blender is a pain with AMD #rocm.

Any #Linux #kernel, #graphics, or #GPU people out there?

I'm trying to understand the relationship between the #amdgpu driver shipped with the kernel and the "amdgpu-dkms" driver that comes with #ROCm.

Specifically, with a recent enough kernel, do we really need to install the ROCm version of the driver? Does the ROCm version contain stuff the general driver does not? Or is the ROCm stack (esp. libhsa) tightly tied to a very specific version of the driver?

#AMD splits #ROCm toolkit into two parts – ROCm #AMDGPU drivers get their own branch under Instinct #datacenter #GPU moniker
The new #datacenter Instinct driver is a renamed version of the #Linux AMDGPU driver packages that are already distributed and documented with ROCm. Previously, everything related to ROCm (including the amdgpu driver) existed as part of the ROCm software stack.
tomshardware.com/pc-components

Tom's Hardware · AMD splits ROCm toolkit into two parts – ROCm AMDGPU drivers get their own branch under Instinct datacenter GPU moniker. By Aaron Klotz
Replied in thread

Even now, Thrust as a dependency is one of the main reasons why we have a #CUDA backend, a #HIP / #ROCm backend and a pure #CPU backend in #GPUSPH, but not a #SYCL or #OneAPI backend (which would allow us to extend hardware support to #Intel GPUs). <doi.org/10.1002/cpe.8313>

This is also one of the reasons why we implemented our own #BLAS routines when we introduced the semi-implicit integrator. A side-effect of this choice is that it allowed us to develop the improved #BiCGSTAB that I've had the opportunity to mention before <doi.org/10.1016/j.jcp.2022.111>. Sometimes I do wonder if it would be appropriate to “excorporate” it into its own library for general use, since it's something that would benefit others. OTOH, this one was developed specifically for GPUSPH and it's tightly integrated with the rest of it (including its support for multi-GPU), and refactoring it into a library like cuBLAS is

a. too much effort
b. probably not worth it.

Again, following @eniko's original thread, it's really not that hard to roll your own, and probably less time-consuming than trying to wrangle your way through an API that may or may not fit your needs.
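
To make the "roll your own" point concrete: a hand-written axpy (y = a*x + y) is only a few lines of HIP C++, and the same source builds for AMD GPUs with hipcc or for NVIDIA through HIP's CUDA backend. A minimal sketch, not GPUSPH's actual code:

```cpp
// Hand-rolled BLAS-style axpy in HIP: y = a*x + y. Build with hipcc.
#include <hip/hip_runtime.h>
#include <cstdio>
#include <vector>

__global__ void axpy(float a, const float* x, float* y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);

    float *dx, *dy;
    hipMalloc(&dx, n * sizeof(float));
    hipMalloc(&dy, n * sizeof(float));
    hipMemcpy(dx, hx.data(), n * sizeof(float), hipMemcpyHostToDevice);
    hipMemcpy(dy, hy.data(), n * sizeof(float), hipMemcpyHostToDevice);

    const int block = 256;
    axpy<<<(n + block - 1) / block, block>>>(3.0f, dx, dy, n);

    hipMemcpy(hy.data(), dy, n * sizeof(float), hipMemcpyDeviceToHost);
    printf("y[0] = %f (expect 5.0)\n", hy[0]);  // 3*1 + 2

    hipFree(dx);
    hipFree(dy);
    return 0;
}
```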

6/

AI rabbit hole ... I've been playing with Ollama and some Stable Diffusion tools on my MacBook Pro M2 Max and my Linux desktop ... the desktop is way faster and only has an RX 6800 in it, so of course I'm now thinking about an RX 7900 XTX ... (I don't do Nvidia cards) ...

Anyone have experience with this upgrade? Is going from 16 GB of VRAM to 24 GB going to make a massive difference?

Using radeontop I can see it's using all 16 GB at some points, but not consistently ... and I'm not sure if that's an issue or a feature. I believe #rocm still has some issues.

Just how deep is #Nvidia's #CUDA moat really?
Not as impenetrable as you might think, but still more than Intel or AMD would like
It's not enough just to build a competitive part: you also have to have #software that can harness all those #FLOPS, something Nvidia has spent the better part of two decades building with its CUDA runtime, while competing frameworks for low-level #GPU #programming, like AMD's #ROCm or Intel's #OneAPI, are far less mature.
theregister.com/2024/12/17/nvi #developers

The Register · Just how deep is Nvidia's CUDA moat really? By Tobias Mann

I'm quite excited by this output. Darknet normally displays information on the NVIDIA CUDA-capable GPU when you run "darknet --version".

Check out these new changes I'm working on.

- rsmi_num_monitor_devices()
- rsmi_dev_name_get()
- rsmi_dev_memory_total_get()
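
Those are calls from AMD's ROCm SMI library (rocm_smi_lib). For anyone curious, here's a minimal sketch of how they fit together to enumerate AMD GPUs and report VRAM; this is my illustration of the API, not the actual Darknet patch:

```cpp
// Enumerate AMD GPUs via ROCm SMI and print name + total VRAM.
// Assumes ROCm is installed; build roughly as:
//   g++ smi_demo.cpp -I/opt/rocm/include -L/opt/rocm/lib -lrocm_smi64
#include <rocm_smi/rocm_smi.h>
#include <cstdio>

int main() {
    if (rsmi_init(0) != RSMI_STATUS_SUCCESS) return 1;

    uint32_t count = 0;
    rsmi_num_monitor_devices(&count);

    for (uint32_t i = 0; i < count; ++i) {
        char name[128] = {0};
        uint64_t vram = 0;  // reported in bytes
        rsmi_dev_name_get(i, name, sizeof(name));
        rsmi_dev_memory_total_get(i, RSMI_MEM_TYPE_VRAM, &vram);
        printf("GPU %u: %s, %.1f GiB VRAM\n",
               i, name, vram / (1024.0 * 1024.0 * 1024.0));
    }

    rsmi_shut_down();
    return 0;
}
```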

#darknet #yolo #amd
Continued thread

Even better, in the afternoon I managed to find a workaround for my #GPGPU software building fine but hanging when trying to run: it seems to be related to an issue between some versions of the #AMD software stack and many integrated GPUs, not just the #SteamDeck specifically. Exporting the HSA_ENABLE_SDMA=0 environment variable was sufficient to get my software running again. I'm dropping the information here in case others find it useful.
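
A sketch of a programmatic alternative (my addition, not from the thread): set the variable from inside the program itself, since the runtime reads it at initialization. This should work as long as it runs before the first HIP call:

```cpp
// Sketch: force-disable SDMA before the HIP runtime spins up.
// Equivalent to running the program with HSA_ENABLE_SDMA=0 exported.
#include <hip/hip_runtime.h>
#include <cstdlib>
#include <cstdio>

int main() {
    // Must happen before the first HIP/HSA call; overwrite any existing value.
    setenv("HSA_ENABLE_SDMA", "0", 1);

    int ndev = 0;
    if (hipGetDeviceCount(&ndev) != hipSuccess) {
        fprintf(stderr, "no HIP devices found\n");
        return 1;
    }
    printf("HIP devices: %d (SDMA disabled for this process)\n", ndev);
    return 0;
}
```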

#ROCm #GPU #APU #HIP

2/2

Replied in thread

One of the nice things about the refactoring that I had to do to introduce CPU support is that it also allowed me to trivially add support for #AMD #HIP / #ROCm.
That, and the fact that AMD engineers have written a drop-in replacement for the Thrust library that we depend on in a couple of places. (This is also one of the things that is holding back a full #SYCL port for #GPUSPH, BTW.)
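
For the curious: that drop-in replacement is rocThrust, and "drop-in" really does mean the Thrust-style source stays unchanged. A minimal sketch (not GPUSPH code) that builds against NVIDIA's Thrust with nvcc or against rocThrust with hipcc:

```cpp
// Thrust-style code that compiles unchanged on CUDA (nvcc) or ROCm
// (hipcc + rocThrust), e.g.: hipcc thrust_demo.cpp -o thrust_demo
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <thrust/sequence.h>
#include <cstdio>

int main() {
    thrust::device_vector<int> v(1000);
    thrust::sequence(v.begin(), v.end());          // fill with 0, 1, ..., 999
    int sum = thrust::reduce(v.begin(), v.end());  // parallel sum on the GPU
    printf("sum = %d (expect 499500)\n", sum);
    return 0;
}
```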

Had an opportunity to reboot my homelab server. Big kernel update and all that. This seems to have fixed my broken ROCm.

So I’m back to GPU accelerating my AI workloads.

The new kernel version, though, might include the new AMD AI stuff, so I need to investigate that.