If you’re just getting into GPU mining, or you’re starting to have problems with your existing miner, you’re in the right spot. With over 10 years of GPU mining experience, I will explain the approach I take to troubleshooting a miner when it starts to have issues.
Problems I address (these steps will fix a lot more than what is listed here):
- Computer shuts off or restarts when miner tries to start up
- GPUs not detected at all
- GPUs in Device Manager show yellow warning signs indicating an error, either all the time or disappearing and reappearing randomly
- Miner crashes due to lost GPU, crashed GPU, missing table ID, or missing device
- Miner or computer randomly crashes
- Fake Blue Screens that indicate graphics driver errors, memory management errors, or other error codes
Troubleshooting Steps
Step 1: Inspect Cabling and Ensure Proper Cable Setup
A lot of beginner miners, and even some who have been doing it for a while, tend to ignore basic electrical laws and exceed the limits of their adapters and cables. A system set up this way can get up and running, but over time, as sustained load and heat work through the setup, cables can start to degrade and cause a variety of issues.
The problem with issues caused by cable deterioration or improper setup is that they are nearly impossible to troubleshoot from the symptoms alone. They can show up as graphics driver errors, memory errors, software errors, Windows or other operating system errors, Device Manager errors, and all sorts of other errors.
Before you ever try to address software, it is vital you eliminate cabling from being the culprit.
Check each cable for evidence of burning. Yellow cables usually turn brown or become brittle. Disconnect the peripheral, accessory, Molex, or SATA cables (usually 6-pin cables coming from the PSU) and make sure the connectors aren’t burned into the power supply. You can also do this with the PCIe cables (8-pin) if you somehow overloaded the cables made for GPUs.
If possible, never use adapters. If you must, use them sparingly and make sure they meet the electrical requirements of what they power. For example, do not run a double Molex (4-pin) to 6-pin cable into a 6-pin to dual 8-pin cable to power 2 GPUs, as talked about in the video.
Ensure any 6-pin cables from the PSU (SATA, peripheral, etc.) are only powering accessories (risers, SATA hard drives, etc.), NOT GPUs. Limit the number of devices to 2 per SATA/Molex cable, and only 1 per cable if possible.
In certain cases it is acceptable to run a double Molex or double SATA to 8-pin adapter to power a very small GPU that has only a single 8-pin requirement and uses less than 100 W of power (think 1660 Supers).
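The cable rules above can be written down as a quick checklist script. This is only a minimal sketch: the `Cable` class, the device names, and the `check_cable` function are made up for illustration, and it deliberately ignores the small-GPU adapter exception. It only encodes the two rules from this step: no GPUs on SATA/Molex cables, and no more than 2 devices per cable.

```python
from dataclasses import dataclass, field

# Device kinds this sketch treats as accessories (hypothetical labels,
# taken from the examples in the text: risers and SATA hard drives).
ACCESSORY_KINDS = {"riser", "sata_drive"}

# The limit from the text: 2 devices per SATA/Molex cable (1 is better).
MAX_DEVICES_PER_ACCESSORY_CABLE = 2


@dataclass
class Cable:
    name: str                       # your own label, e.g. "sata-1"
    cable_type: str                 # "sata", "molex", or "pcie"
    devices: list = field(default_factory=list)  # e.g. ["riser", "riser"]


def check_cable(cable):
    """Return a list of warning strings for one PSU cable."""
    warnings = []
    if cable.cable_type in ("sata", "molex"):
        if len(cable.devices) > MAX_DEVICES_PER_ACCESSORY_CABLE:
            warnings.append(
                f"{cable.name}: {len(cable.devices)} devices on one "
                f"{cable.cable_type} cable (limit is "
                f"{MAX_DEVICES_PER_ACCESSORY_CABLE})"
            )
        for kind in cable.devices:
            if kind not in ACCESSORY_KINDS:
                warnings.append(
                    f"{cable.name}: '{kind}' should not be powered "
                    f"from a {cable.cable_type} cable"
                )
    return warnings


if __name__ == "__main__":
    rig = [
        Cable("sata-1", "sata", ["riser", "sata_drive"]),
        Cable("molex-1", "molex", ["riser", "riser", "gpu"]),
    ]
    for cable in rig:
        for warning in check_cable(cable):
            print(warning)
```

Writing the rig down this way forces you to trace every cable from the PSU to the device it feeds, which is most of the value of the exercise.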
Step 2: Mining Settings
The longer your system runs, the higher the probability that the hardware and cables will start to deteriorate. It’s just something that happens with electronics. So a system that’s been running fine for 6 months with certain overclock settings may need to be adjusted if one day you get random crashes and your cables are still good.
For example, if you have an RTX 3080 running a +1200 MHz memory overclock, you could either reset it to factory settings and see if you still get the error, or reduce it in 200 MHz steps, starting the system up after each change to see if that resolves the issue.
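To make the step-down approach concrete, here is a small sketch that lists the memory offsets you would try in order. The function name `memory_offset_steps` and its parameters are my own for illustration; it does not change any settings, it just prints the plan you would then apply by hand in your overclocking tool.

```python
def memory_offset_steps(start_mhz, step_mhz=200, floor_mhz=0):
    """Return the memory offsets to try, stepping down from start_mhz.

    The last entry is always floor_mhz (0 means factory settings).
    """
    if step_mhz <= 0:
        raise ValueError("step_mhz must be positive")
    offsets = []
    current = start_mhz - step_mhz
    while current > floor_mhz:
        offsets.append(current)
        current -= step_mhz
    offsets.append(floor_mhz)
    return offsets


if __name__ == "__main__":
    # Starting from the +1200 MHz example in the text.
    for offset in memory_offset_steps(1200):
        print(f"Try +{offset} MHz, restart the miner, watch for crashes")
```

Stop at the first offset where the crashes go away and let the rig run there for a while before deciding it is stable.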
You may have to add/subtract power in conjunction with these changes. Even though some guides say a card can run at 70% of its power, I have found that each card manufacturer has specific needs. The card you have may need to run at 71-75% of its power to maintain stability.
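If it helps to see what those percentages mean in watts, the arithmetic is simple. The sketch below assumes a board power of 320 W purely as an example figure; look up the actual rated power for your specific card. The function names are hypothetical.

```python
def power_limit_watts(board_power_w, percent):
    """Convert a power limit percentage into watts."""
    return board_power_w * percent / 100


def candidate_limits(board_power_w, start_percent=70, end_percent=75):
    """List (percent, watts) pairs for each power limit in the range."""
    return [
        (p, power_limit_watts(board_power_w, p))
        for p in range(start_percent, end_percent + 1)
    ]


if __name__ == "__main__":
    # 320 W is an assumed example value, not a figure from this guide.
    for percent, watts in candidate_limits(320):
        print(f"{percent}% -> {watts:.1f} W")
```

Each extra percent is only a few watts per card, so nudging the limit up one step at a time is a cheap thing to try before blaming the hardware.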
At this point, 99% of system problems will be solved. If yours is not, work through the remaining troubleshooting steps in this order.
Step 3: Ensure PSU can properly handle the load
Sometimes people base the PSU they buy on the mining settings of a card. Say, for example, a 3080 runs at around 240 W once it’s undervolted. They multiply 240 W by 6, or however many cards they have, and then buy a PSU rated just above that total. When your overclock settings come off or reset (which happens sometimes), those cards run at full power and can cause instability or crash your computer.
You can encounter this problem if you add GPUs or change other settings, so be aware of the potential power draw of your system, not just what you have it configured to draw. If the potential draw gets close to the PSU limit or exceeds it, it can cause instability.
Step 4: BIOS Settings
Sometimes you can install 6 GPUs into a motherboard and, by a fluke, it will load and mine properly. Then later on, after an update or a power outage forces a restart, you find that the system only recognizes 3-4 cards. This could be because the BIOS settings were never properly configured, or because they were reset for any number of reasons.
Just make sure that your BIOS settings are still correct. For 6-GPU rigs these settings are specific to the motherboard. You will have to google “How to mine with 6 GPUs on a <insert your board name and model>” and search through the threads to find what people are using.
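Here is a rough sketch of that comparison between configured draw and potential draw. The 240 W tuned figure comes from the example above; the 320 W stock figure, the 150 W allowance for the rest of the system, and the 1600 W PSU are assumptions I picked for illustration, so substitute the real numbers for your hardware. The function name `psu_check` is hypothetical.

```python
def psu_check(psu_watts, gpu_count, tuned_watts_per_gpu,
              stock_watts_per_gpu, other_watts=0):
    """Compare tuned and stock (settings reset) power draw to the PSU rating."""
    tuned_total = gpu_count * tuned_watts_per_gpu + other_watts
    stock_total = gpu_count * stock_watts_per_gpu + other_watts
    return {
        "tuned_total": tuned_total,
        "stock_total": stock_total,
        "ok_when_tuned": tuned_total <= psu_watts,
        "ok_at_stock": stock_total <= psu_watts,
    }


if __name__ == "__main__":
    result = psu_check(
        psu_watts=1600,            # assumed PSU rating
        gpu_count=6,
        tuned_watts_per_gpu=240,   # undervolted figure from the text
        stock_watts_per_gpu=320,   # assumed full-power figure
        other_watts=150,           # assumed CPU, board, risers, fans
    )
    print(result)
```

With these example numbers the rig fits at mining settings (1590 W) but not when the settings reset (2070 W), which is exactly the situation that crashes a system that looked fine on paper.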
Other Problems
Hashrate Throttling: This is more apparent in the higher-powered cards. If a 3090 that is supposed to run at 120-122 MH/s on Ethereum starts out that high but slowly comes down, or throttles back and forth to a significantly lower number, then 99 out of 100 times the VRAM is overheating and you will have to replace the thermal pads.
You can quickly identify a card you suspect is causing issues by adjusting its fan speed to either 0 or 100% in software like MSI Afterburner.
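If you log per-card hashrate over time, the throttling pattern described above (starts at the expected rate, then sags) can be flagged with a few lines of code. This is a minimal sketch: the function name, the 5% tolerance, and the 5-sample window are arbitrary choices of mine, and how you collect the samples from your miner is up to you.

```python
def is_throttling(samples_mhs, expected_mhs, tolerance=0.05, window=5):
    """Return True if the card started near expected_mhs but has since dropped.

    samples_mhs: hashrate readings in MH/s, oldest first.
    tolerance:   fraction below expected that still counts as normal.
    window:      number of samples averaged at the start and at the end.
    """
    if len(samples_mhs) < 2 * window:
        return False  # not enough data to compare start against recent
    threshold = expected_mhs * (1 - tolerance)
    start_avg = sum(samples_mhs[:window]) / window
    recent_avg = sum(samples_mhs[-window:]) / window
    return start_avg >= threshold and recent_avg < threshold


if __name__ == "__main__":
    # Made-up readings for a card expected to run around 120 MH/s.
    readings = [121, 121, 121, 121, 121, 118, 110, 100, 95, 96, 94, 95]
    print(is_throttling(readings, expected_mhs=120))
```

A card that never reaches the expected rate at all is a different problem (settings or hardware), which is why the check requires a healthy start before it reports throttling.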