C2ME OpenCL Acceleration Module

By
ishlandmc


Experimental C2ME addon that provides hardware accelerated world generation through OpenCL. Requires the base C2ME mod.
It is strongly recommended to install ScalableLux, because lighting can easily become a bottleneck.

Note

This mod requires Java 25 to function correctly, even on Minecraft versions before 26.1.

World generation should have full vanilla parity in vanilla worldgen, with one exception:
Biome borders may get shifted by one or two blocks in very rare cases due to the vanilla implementation being order-dependent.

Usual worldgen non-determinism applies.

Currently only noise stage and biome stage are implemented.
The expected performance uplift is 80+% on the vanilla overworld when CPU-bound. Performance will vary depending on seeds, datapacks, etc.

Also, worldgen is now exposed to GPU driver bugs. Please back up your worlds before using this on existing worlds.
Some worldgen mods are known to fail catastrophically. See the Mod compatibility section for details.

Platform compatibility matrix

Supported: known to be fully working in general
Partial: known to be working with some big caveats
Unsupported: known to be not working at all
N/A: Not applicable because the combination doesn't exist.

Vendor	Generation	Driver	Windows	Linux
NVIDIA	Maxwell and beyond	Proprietary and Open	Supported	Partial¹
NVIDIA	Kepler	Proprietary	Unknown	Supported
NVIDIA	Older cards	Any	Unknown	Unknown
NVIDIA	nouveau supported GPUs	Rusticl on nouveau	N/A	Unknown
Intel	Gen9, Gen9.5⁵	Official²	Partial³	Partial⁴
Intel	Gen11, Gen12, Gen12.5⁶	Official²	Unsupported¹²	Unsupported¹²
Intel	Gen12.7, Xe2, Xe3 and beyond⁷	Official²	Partial³	Partial⁴
Intel	Older graphics	Official²	Unknown	Unknown
Intel	iris supported GPUs	Rusticl on iris	N/A	Unknown
AMD	RDNA1 and beyond¹⁴	Official⁸	Supported⁹	Supported
AMD	GCN	Official¹⁰	Unsupported¹³	Unsupported¹³
AMD	radeonsi supported GPUs	Rusticl on radeonsi	N/A	Partial¹¹
Qualcomm	Any	Official	Unsupported¹²	Unsupported¹²
Apple	Any	Official	Unsupported¹²	Unsupported¹²

¹ The driver is known to hang after a while. Driver branch 535 LTS seems to work fine.
² The official driver package on Windows. Gen9 and Gen9.5 need up-to-date drivers:
Gen9: https://www.intel.com/content/www/us/en/download/762755/intel-6th-gen-processor-graphics-windows.html
Gen9.5: https://www.intel.com/content/www/us/en/download/776137/intel-7th-10th-gen-processor-graphics-windows.html
For Linux: https://github.com/intel/compute-runtime
³ GPU is known to crash on pretty much all non-vanilla worldgen. Your mileage may vary.
⁴ GPU is known to crash with some complex worldgen datapacks, such as Terralith. Your mileage may vary.
⁵ Gen9 and Gen9.5 are the integrated graphics on 6th-9th gen Core processors and 10th gen non-G-series Core processors.
⁶ Gen11, Gen12 and Gen12.5 here are the integrated graphics in 10th gen G-series Core processors and 11th-14th gen Core processors, plus Arc DG1 and Arc A-series.
⁷ Gen12.7 refers to Meteor Lake and Arrow Lake integrated graphics.
In practice, this row covers the integrated graphics in Core Ultra 100 series and above, plus Battlemage dedicated graphics and above.
⁸ The official driver package on Windows. The ROCm runtime on Linux.
⁹ Driver version 26.5.1 is known to always crash. Existing installations upgraded to 26.6.1 may also crash.
If you are experiencing crashes on 26.6.1, it is recommended to run DDU (Display Driver Uninstaller) and then do a fresh installation.
¹⁰ The official driver package on Windows. The AMDGPU-Pro runtime on Linux.
¹¹ The Mesa 26.1.x branch is known to work on RDNA3/4. Anything can happen with any hardware combination, including corrupted worldgen. Your mileage may vary.
¹² Missing FP64 support
¹³ Driver crashes
¹⁴ Integrated graphics for Ryzen 7000 series and 9000 series not included. They are too slow for this task.

Any hardware not listed here has Unknown status. Feel free to test other hardware configurations that meet the minimum requirements detailed below.

Minimum hardware requirements

a working OpenCL 1.2+ driver
cl_khr_fp64 support (fp64 support)
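The two requirements above can be checked against the strings a device reports. The following is a minimal sketch, not part of the mod: `meets_minimum` is a hypothetical helper that takes the `CL_DEVICE_VERSION` and `CL_DEVICE_EXTENSIONS` strings (for example, copied from clinfo output) and tests them. It cannot tell whether the driver actually works, only whether it claims the right capabilities.

```python
import re


def meets_minimum(device_version: str, device_extensions: str) -> bool:
    """Return True if the device reports OpenCL 1.2+ and cl_khr_fp64.

    Hypothetical helper for illustration; the mod performs its own checks.
    """
    # CL_DEVICE_VERSION has the form "OpenCL <major>.<minor> <vendor info>".
    match = re.match(r"OpenCL (\d+)\.(\d+)", device_version)
    if match is None:
        return False
    version = (int(match.group(1)), int(match.group(2)))
    # CL_DEVICE_EXTENSIONS is a space-separated list of extension names.
    extensions = set(device_extensions.split())
    return version >= (1, 2) and "cl_khr_fp64" in extensions
```

A device reporting `OpenCL 3.0` without `cl_khr_fp64` fails this check, which matches the "Missing FP64 support" footnote in the compatibility matrix.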

Nice to have things

a working OpenCL 3.0 driver
cl_khr_device_uuid for stable device matching
CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE for optimal performance. Not present on AMD GPUs
cl_khr_priority_hints and cl_khr_throttle_hints for queue priority. Only known to be present on Intel GPUs
Non-uniform workgroups, not present on Nvidia GPUs and some AMD ones

Performance expectations using Chunky

For 1200+ cps targets in vanilla overworld:

CPUs: modern mid-range desktops (9700X, 9800X3D, 245K)
GPUs:

Nvidia GTX 1060 or greater
AMD Radeon RX 6500 XT or greater
Intel Arc B570 or greater (there are no weaker supported GPUs from Intel, so there's that)

For 2500+ cps targets in vanilla overworld:

CPUs: modern flagship desktops (9950X, 9950X3D, 285K, 270K+)
GPUs:

Nvidia GTX 1080 Ti or higher, RTX 4060 Ti or higher
AMD Radeon RX 7600 XT or greater
Intel Arc B570 or greater (there are still no weaker supported GPUs from Intel)
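To put these targets in perspective, here is a rough back-of-the-envelope sketch that turns a cps figure (read here as chunks per second) into a pregeneration time. It assumes a square region whose side is twice the radius; `estimated_pregen_seconds` is a hypothetical helper, and the chunk count Chunky actually generates may differ slightly.

```python
def estimated_pregen_seconds(radius_blocks: int, chunks_per_second: float) -> float:
    """Estimate how long a square pregeneration task takes.

    Assumption: the region is a square of side 2 * radius_blocks,
    and throughput stays constant for the whole task.
    """
    chunks_per_side = (2 * radius_blocks) // 16  # a chunk is 16 blocks wide
    total_chunks = chunks_per_side ** 2
    return total_chunks / chunks_per_second


# A radius of 8000 blocks is 1000 x 1000 = 1,000,000 chunks.
print(estimated_pregen_seconds(8000, 1200))  # about 833 seconds
print(estimated_pregen_seconds(8000, 2500))  # 400 seconds
```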

Usage

Windows

Install Fabric
Install the mod
Done

Linux

Since most Linux distributions do not ship OpenCL out of the box, you'll have to install it manually.

Running in Flatpak is an unsupported configuration

Flatpak currently only supports Rusticl (and probably NVIDIA) as an OpenCL runtime, and Rusticl is still very rough. See the compatibility matrix above.

For Nvidia users

Usually the NVIDIA driver package your distro provides includes the OpenCL driver, and you should be able to just use it. If it does not, see the following distro-specific setups.

Debian / Ubuntu

All vendors: install ocl-icd-opencl-dev
Nvidia: install nvidia-driver-full package
AMD: install rocm-opencl-icd package
Intel: install intel-opencl-icd for Gen12 and above or intel-opencl-icd-legacy for older iGPUs (not available on Debian 13 though, you'll have to compile them)

Arch Linux

https://wiki.archlinux.org/title/General-purpose_computing_on_graphics_processing_units#OpenCL
All vendors: install ocl-icd
Nvidia: install opencl-nvidia
AMD: install rocm-opencl-runtime
Intel: install intel-compute-runtime for Gen12 and above or intel-compute-runtime-legacy from AUR for older iGPUs

Fedora

All vendors: install ocl-icd-devel
Nvidia: https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/fedora.html
AMD: install rocm-opencl
Intel: install intel-opencl for Xe

Using Rusticl (very experimental)

This requires Mesa 26.1 or newer, with both Rusticl and fp64 enabled.

Compatibility

Datapack compatibility

This feature is guaranteed to work with datapacks that can be loaded with vanilla. For example:

Stardust Labs datapacks (Terralith, Incendium, …)
Tectonic
CliffTree
… more

Mod compatibility

Most non-worldgen mods should work.

For worldgen mods:

Mods that repackage datapacks (that is, the mod file still works as a datapack when renamed to .zip): see datapack compatibility.
Tectonic as a mod works
Mods that use custom density functions do NOT work, for now. (Such as Enderscape)
Mods that roll their own entirely new world generator do NOT work and probably never will without a lot of work. (Such as Big Globe)
Some other special cases:
- Biomes O' Plenty: causes biome placement to fail completely
- TerraBlender: also causes biome placement to fail completely

Known issues:

Shader compilation is known to take a while, depending on the datapack used.
Extra memory usage outside of heap is expected for shader compilation.
PoCL with CPU backend will almost certainly crash even if pocl is blacklisted in the config. The solution is to remove it entirely.
Datapacks referencing the minecraft:beardifier density function directly can have slight errors in terrain shape. There are no plans to fix this, as vanilla isn't affected, and fixing it would halve the GPU throughput.

Tuning recommendations, for people that just want worldgen to go fast

Credit goes to skillnoob_ on discord.

Mods:

ScalableLux (Light Engine Optimization, bottleneck in high performance chunk generation)
Lithium (General Optimization mod for various things)
FerriteCore (Memory usage Improvements)
Structure Layout Optimizer (Makes Structures generate faster)
zFastNoise (speeds up noise and surface builder in worldgen)

Java/JVM flags:

-XX:+UseCompactObjectHeaders -Dchunky.maxWorkingCount=768 (The -Dchunky.maxWorkingCount=768 argument is only relevant if you are using Chunky).
Use -XX:+UseZGC if you are allocating more than 16GB of memory, otherwise use -XX:+UseG1GC or -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational.

Then in the c2me.toml in your config folder you can change the globalExecutorParallelism = "default" option to your thread count or slightly below.
For example if you have a 16 thread CPU you'd need to change it to globalExecutorParallelism = 16.
You can also enable gcFreeChunkSerializer = true in the config, which can increase chunk gen performance.
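The JVM flag advice above boils down to a small decision on heap size. The sketch below spells it out; `suggested_jvm_flags` and its parameters are hypothetical names used only for illustration, and the flags themselves are the ones listed in this section.

```python
def suggested_jvm_flags(heap_gb: float, using_chunky: bool,
                        prefer_shenandoah: bool = False) -> list[str]:
    """Assemble the JVM flags recommended in this section.

    Hypothetical helper: heap_gb is the heap size you plan to allocate.
    """
    flags = ["-XX:+UseCompactObjectHeaders"]
    if using_chunky:
        # Only relevant when pregenerating with Chunky.
        flags.append("-Dchunky.maxWorkingCount=768")
    if heap_gb > 16:
        # More than 16GB allocated: ZGC.
        flags.append("-XX:+UseZGC")
    elif prefer_shenandoah:
        flags += [
            "-XX:+UseShenandoahGC",
            "-XX:+UnlockExperimentalVMOptions",
            "-XX:ShenandoahGCMode=generational",
        ]
    else:
        flags.append("-XX:+UseG1GC")
    return flags


print(" ".join(suggested_jvm_flags(heap_gb=24, using_chunky=True)))
```

With a 24GB heap and Chunky in use, this prints the compact object headers flag, the Chunky property and `-XX:+UseZGC`.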

Frequently Asked Questions

Does this use multiple GPUs?

By default, it does least-busy scheduling on all OpenCL devices it can find.
However, usually it is your CPU that's the bottleneck. See below.
So don't expect multi-GPU to bring any improvements unless all you have is a bunch of GT 1030s.
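For readers unfamiliar with the term, least-busy scheduling means each new task goes to whichever device currently has the fewest tasks in flight. The sketch below illustrates the general idea only; it is not the mod's implementation, and `Device`, `pick_least_busy` and `submit` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    in_flight: int = 0  # tasks submitted but not yet finished


def pick_least_busy(devices: list[Device]) -> Device:
    # min() returns the first device with the smallest in-flight count,
    # so ties go to the device listed first.
    return min(devices, key=lambda d: d.in_flight)


def submit(devices: list[Device]) -> Device:
    """Assign one task to the least busy device and return that device."""
    device = pick_least_busy(devices)
    device.in_flight += 1
    return device


devices = [Device("gpu0"), Device("gpu1")]
print([submit(devices).name for _ in range(3)])  # ['gpu0', 'gpu1', 'gpu0']
```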

My GPU is barely used and my CPU is being maxed out. What is happening?

With a reasonable pair of CPU and GPU, you will be CPU bound. This is mostly because only noise stage and biome stage are implemented.
Other stages may get implemented in the future.

How do I select GPUs?

You can specify whitelists and blacklists in the config file. Device UUID can be found in the logs.
AMD GPUs show up in the logs under gfx-prefixed names, not their marketing names.

Does voxy work with this?

Yes.
In short, to generate render distance for Voxy, install Chunky, run /voxy import current, then start a Chunky task. It is recommended to join their Discord server for more information.

Does Distant Horizons work with this?

Short answer: not recommended. Use Voxy instead.

Long answer: DH is too slow to see the benefit this brings.
If you still intend to use DH, use the Internal Server / Full - Save Chunks mode in DH for acceleration to work.
Even so, you may not see improvements because LoD generation is the slowest part in the chain already.

I'm getting `OpenCL error [-1001]`. What does it mean?

The OpenCL ICD loader is unable to locate any OpenCL drivers. Check your driver installation.
It is recommended to use the clinfo tool as a quick check.
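Error -1001 is `CL_PLATFORM_NOT_FOUND_KHR`, which the ICD loader returns when it has no vendor drivers registered. On Linux, ICD loaders conventionally discover drivers through `.icd` files in `/etc/OpenCL/vendors`, so listing that directory is a quick first check alongside clinfo. The sketch below assumes that default location (some setups override it, and Windows uses the registry instead); `list_icd_files` is a hypothetical helper.

```python
from pathlib import Path


def list_icd_files(vendors_dir: str = "/etc/OpenCL/vendors") -> list[str]:
    """Return the names of the .icd files registered in vendors_dir.

    An empty result suggests no OpenCL driver is installed where the
    ICD loader looks by default.
    """
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.icd"))


print(list_icd_files())
```

If the list is empty, install the OpenCL package for your GPU vendor as described in the Linux section.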

Does this work on dedicated servers?

It works on dedicated servers and in singleplayer as long as the drivers are in place.
Only Linux x86_64, Linux arm64 and Windows x86_64 binaries are shipped for dedicated servers.

Can I make it fall back to normal worldgen if initialization fails?

Not by default. This can be done with openclAccel.allowIncompatibilityFallback in the config file.

I heard that Vulkan is THE graphics API to go with. Why OpenCL?

I'm familiar with it
I need untyped pointers, which did not exist in Vulkan until very recently
Relying on them effectively requires Vulkan 1.4, shrinking hardware compatibility by a lot
Vulkan does not clearly specify FP64 precision beyond "at least that of FP32".
Correctly rounded FP division and sqrt are still missing from the Vulkan spec

Why not CUDA? Or ROCm? Or Level Zero? Or Metal?

No vendor locked APIs.

I'm on AMD with an RDNA GPU, and I'm seeing crashes before worldgen even starts. Why?

Driver version 26.5.1 is known to always crash. Existing installations upgraded to 26.6.1 may also crash.
If you are experiencing crashes on 26.6.1, it is recommended to run DDU (Display Driver Uninstaller) and then do a fresh driver installation.

See footnote 9 in Platform compatibility matrix.
