#rocm

#NVIDIA Bringing #CUDA To #RISCV
NVIDIA's drivers and CUDA software stack are predominantly supported on x86_64 and AArch64 systems, though in the past they were also supported on IBM POWER. This week at the RISC-V Summit China event, NVIDIA's Frans Sijstermans announced that CUDA will be coming to RISC-V.
#AMD, for their part, can already build the upstream #opensource #AMDKFD kernel compute driver on RISC-V, and the #ROCm user-space components can also be built on RISC-V.
phoronix.com/news/NVIDIA-CUDA-

New #ZLUDA 5 Preview Released For #CUDA On Non-NVIDIA #GPUs
For now this ability to run unmodified CUDA apps on non-#NVIDIA GPUs is focused on #AMD GPUs of the #Radeon RX 5000 series and newer, i.e. AMD Radeon GPUs supported by #ROCm. Besides CUDA code samples, Geekbench has been one of the early targets for testing.
phoronix.com/news/ZLUDA-5-prev

www.phoronix.com · New ZLUDA 5 Preview Released For CUDA On Non-NVIDIA GPUs: ZLUDA version 5-preview.43 was released today as this open-source CUDA implementation for use on non-NVIDIA GPUs, with a current focus on enabling CUDA on AMD Radeon GPUs with ROCm.

It's convenient that I can use #LLMs to help me learn how to use LLMs because I'm pretty sure I wouldn't be able to figure it out any other way.

I want to use my local #Ollama models with #Copilot in #VSCode, but I have an #AMD #GPU so apparently I need to install something called the #ROCm (Radeon Open Compute Platform) via the Windows 11 HIP SDK?

And maybe all this doesn't work in #WSL, so I'll have to reinstall it in #Ubuntu there if I want to use it in one of those workspaces?
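
One quick sanity check while untangling all that: Ollama's REST API is documented and easy to poke directly, regardless of what VS Code is doing. A minimal sketch in C++ with libcurl, assuming the default endpoint on localhost:11434; "llama3" is just a placeholder model name, use whatever you actually pulled:

```cpp
// Sanity check: can we reach a local Ollama server and get a completion?
// Assumes libcurl is installed; build with: g++ ollama_check.cpp -lcurl
#include <curl/curl.h>
#include <iostream>
#include <string>

// libcurl write callback: append response bytes to a std::string.
static size_t collect(char* data, size_t size, size_t nmemb, void* userp) {
    static_cast<std::string*>(userp)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl) return 1;

    // /api/generate is Ollama's documented endpoint; "llama3" is a placeholder.
    const char* body =
        "{\"model\": \"llama3\", \"prompt\": \"Say hello\", \"stream\": false}";
    std::string response;
    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:11434/api/generate");
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

    CURLcode rc = curl_easy_perform(curl);
    if (rc == CURLE_OK)
        std::cout << response << "\n";  // raw JSON; look for the "response" field
    else
        std::cerr << "Ollama not reachable: " << curl_easy_strerror(rc) << "\n";

    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}
```

If this works but GPU inference doesn't, the problem is on the ROCm side rather than the editor integration.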

#ZLUDA Making Progress In 2025 On Bringing #CUDA To Non-NVIDIA #GPUs
ZLUDA, the #opensource effort that started a half-decade ago as a drop-in CUDA implementation for #Intel GPUs, was then for several years funded by #AMD as a CUDA implementation for #Radeon GPUs atop #ROCm, then open-sourced but later reverted, and has been pushing along a new path since last year. The current take on ZLUDA is a multi-vendor CUDA implementation for non-NVIDIA GPUs, for #AI workloads and more.
phoronix.com/news/ZLUDA-Q2-202

Replied in thread

@eugenialoli Same with photo #raw processing: none of the #foss apps use the cameras' color profiles, there's no 16-bit raw support, and there are many other issues. I gave up on retouching because of GIMP's still-poor UX and its weak implementation of non-destructive editing.

The sad reality is that the #linuxdesktop is not ready for professional #mediaproduction, and that's such a bad thing in times like these.

#davinciresolve also barely runs on anything except NVIDIA on #linux, and it still has no #flatpak.

Even Blender is a pain with AMD #rocm.

Any #Linux #kernel, #graphics, or #GPU people out there?

I'm trying to understand the relationship between the #amdgpu driver shipped with the kernel and the "amdgpu-dkms" driver that comes with #ROCm.

Specifically, with a recent enough kernel, do we really need to install the ROCm version of the driver? Does the ROCm version contain stuff the general driver does not? Or is the ROCm stack (esp. libhsa) tightly tied to a very specific version of the driver?

#AMD splits #ROCm toolkit into two parts – ROCm #AMDGPU drivers get their own branch under Instinct #datacenter #GPU moniker
The new #datacenter Instinct driver is a renamed version of the #Linux AMDGPU driver packages that are already distributed and documented with ROCm. Previously, everything related to ROCm (including the amdgpu driver) existed as part of the ROCm software stack.
tomshardware.com/pc-components

Tom's Hardware · AMD splits ROCm toolkit into two parts – ROCm AMDGPU drivers get their own branch under Instinct datacenter GPU moniker. By Aaron Klotz
Replied in thread

Even now, Thrust as a dependency is one of the main reasons why we have a #CUDA backend, a #HIP / #ROCm backend and a pure #CPU backend in #GPUSPH, but not a #SYCL or #OneAPI backend (which would allow us to extend hardware support to #Intel GPUs). <doi.org/10.1002/cpe.8313>

This is also one of the reasons why we implemented our own #BLAS routines when we introduced the semi-implicit integrator. A side-effect of this choice is that it allowed us to develop the improved #BiCGSTAB that I've had the opportunity to mention before <doi.org/10.1016/j.jcp.2022.111>. Sometimes I do wonder if it would be appropriate to “excorporate” it into its own library for general use, since it's something that would benefit others. OTOH, this one was developed specifically for GPUSPH and it's tightly integrated with the rest of it (including its support for multi-GPU), and refactoring it into a library like cuBLAS is

a. too much effort
b. probably not worth it.

Again, following @eniko's original thread, it's really not that hard to roll your own, and probably less time-consuming than trying to wrangle your way through an API that may or may not fit your needs.
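
To make the "roll your own" point concrete: a hand-written axpy (y = a*x + y) is only a few lines of HIP C++, and the same source builds for AMD GPUs with hipcc or for NVIDIA through HIP's CUDA backend. A minimal sketch, not GPUSPH's actual code:

```cpp
// Hand-rolled BLAS-style axpy in HIP: y = a*x + y. Build with hipcc.
#include <hip/hip_runtime.h>
#include <cstdio>
#include <vector>

__global__ void axpy(float a, const float* x, float* y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);

    float *dx, *dy;
    hipMalloc(&dx, n * sizeof(float));
    hipMalloc(&dy, n * sizeof(float));
    hipMemcpy(dx, hx.data(), n * sizeof(float), hipMemcpyHostToDevice);
    hipMemcpy(dy, hy.data(), n * sizeof(float), hipMemcpyHostToDevice);

    const int block = 256;
    axpy<<<(n + block - 1) / block, block>>>(3.0f, dx, dy, n);

    hipMemcpy(hy.data(), dy, n * sizeof(float), hipMemcpyDeviceToHost);
    printf("y[0] = %f (expect 5.0)\n", hy[0]);  // 3*1 + 2

    hipFree(dx);
    hipFree(dy);
    return 0;
}
```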

6/

AI rabbit hole ... I've been playing with Ollama and some Stable Diffusion tools on my MacBook Pro M2 Max and my Linux desktop ... the desktop is way faster and only has an RX 6800 in it, so of course I'm now thinking about an RX 7900 XTX ... (I don't do Nvidia cards) ...

Anyone have experience with this upgrade? Is going from 16 GB of VRAM to 24 GB going to make a massive difference?

Using radeontop I can see it's using all 16 GB at some points, but not consistently ... and I'm not sure if that's an issue or a feature. I believe #rocm still has some issues.

Just how deep is #Nvidia's #CUDA moat really?
Not as impenetrable as you might think, but still more than Intel or AMD would like
It's not enough just to build a competitive part: you also have to have #software that can harness all those #FLOPS, something Nvidia has spent the better part of two decades building with its CUDA runtime, while competing frameworks for low-level #GPU #programming, like AMD's #ROCm or Intel's #OneAPI, are far less mature.
theregister.com/2024/12/17/nvi #developers

The Register · Just how deep is Nvidia's CUDA moat really? By Tobias Mann

I'm quite excited by this output. Darknet normally displays information on the NVIDIA CUDA-capable GPU when you run "darknet --version".

Check out these new changes I'm working on.

- rsmi_num_monitor_devices()
- rsmi_dev_name_get()
- rsmi_dev_memory_total_get()
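
Those are calls from AMD's ROCm SMI library (rocm_smi_lib). For anyone curious, here's a minimal sketch of how they fit together to enumerate AMD GPUs and report VRAM; this is my illustration of the API, not the actual Darknet patch:

```cpp
// Enumerate AMD GPUs via ROCm SMI and print name + total VRAM.
// Assumes ROCm is installed; build roughly as:
//   g++ smi_demo.cpp -I/opt/rocm/include -L/opt/rocm/lib -lrocm_smi64
#include <rocm_smi/rocm_smi.h>
#include <cstdio>

int main() {
    if (rsmi_init(0) != RSMI_STATUS_SUCCESS) return 1;

    uint32_t count = 0;
    rsmi_num_monitor_devices(&count);

    for (uint32_t i = 0; i < count; ++i) {
        char name[128] = {0};
        uint64_t vram = 0;  // reported in bytes
        rsmi_dev_name_get(i, name, sizeof(name));
        rsmi_dev_memory_total_get(i, RSMI_MEM_TYPE_VRAM, &vram);
        printf("GPU %u: %s, %.1f GiB VRAM\n",
               i, name, vram / (1024.0 * 1024.0 * 1024.0));
    }

    rsmi_shut_down();
    return 0;
}
```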

#darknet #yolo #amd
Continued thread

Even better, in the afternoon I managed to find a workaround for my #GPGPU software building fine but hanging when trying to run: it seems to be related to an issue between some versions of the #AMD software stack and many integrated GPUs, not just the #SteamDeck specifically. Exporting the HSA_ENABLE_SDMA=0 environment variable was sufficient to get my software running again. I'm dropping the information here in case others find it useful.
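
A sketch of a programmatic alternative (my addition, not from the thread): set the variable from inside the program itself, since the runtime reads it at initialization. This should work as long as it runs before the first HIP call:

```cpp
// Sketch: force-disable SDMA before the HIP runtime spins up.
// Equivalent to running the program with HSA_ENABLE_SDMA=0 exported.
#include <hip/hip_runtime.h>
#include <cstdlib>
#include <cstdio>

int main() {
    // Must happen before the first HIP/HSA call; overwrite any existing value.
    setenv("HSA_ENABLE_SDMA", "0", 1);

    int ndev = 0;
    if (hipGetDeviceCount(&ndev) != hipSuccess) {
        fprintf(stderr, "no HIP devices found\n");
        return 1;
    }
    printf("HIP devices: %d (SDMA disabled for this process)\n", ndev);
    return 0;
}
```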

#ROCm #GPU #APU #HIP

2/2

Replied in thread

One of the nice things about the refactoring that I had to do to introduce CPU support is that it also allowed me to trivially add support for #AMD #HIP / #ROCm.
That, and the fact that AMD engineers have written a drop-in replacement for the Thrust library that we depend on in a couple of places. (This is also one of the things that is holding back a full #SYCL port for #GPUSPH, BTW.)
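
For the curious: that drop-in replacement is rocThrust, and "drop-in" really does mean the Thrust-style source stays unchanged. A minimal sketch (not GPUSPH code) that builds against NVIDIA's Thrust with nvcc or against rocThrust with hipcc:

```cpp
// Thrust-style code that compiles unchanged on CUDA (nvcc) or ROCm
// (hipcc + rocThrust), e.g.: hipcc thrust_demo.cpp -o thrust_demo
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <thrust/sequence.h>
#include <cstdio>

int main() {
    thrust::device_vector<int> v(1000);
    thrust::sequence(v.begin(), v.end());          // fill with 0, 1, ..., 999
    int sum = thrust::reduce(v.begin(), v.end());  // parallel sum on the GPU
    printf("sum = %d (expect 499500)\n", sum);
    return 0;
}
```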

Had an opportunity to reboot my homelab server. Big kernel update and all that. This seems to have fixed my broken ROCm.

So I’m back to GPU accelerating my AI workloads.

The new kernel version, though, might include the new AMD AI stuff, so I need to investigate that.