You may need to use the gpu_memory_limit and/or lora_on_cpu config options to avoid running out of memory. If you still run out of CUDA memory, you can try merging in system RAM instead.
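
As a rough illustration of the two options named above, the fragment below shows how they might appear in a YAML config file. This is a sketch under assumptions: the text does not name the tool or its config format, so the YAML layout, the `20GiB` value, and the `true` value are illustrative guesses, not recommended settings.

```yaml
# Hypothetical config fragment; values are placeholders to adjust for your hardware.

# Assumed to cap the memory used on each available GPU.
gpu_memory_limit: 20GiB

# Assumed to keep the LoRA adapter weights on the CPU instead of the GPU.
lora_on_cpu: true
```

Setting only one of the two may be enough; trying gpu_memory_limit first and adding lora_on_cpu only if memory still runs out keeps more of the work on the GPU.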