Maybe get in touch with their support. Since we're on the default config maybe they can do the mdadm magic as well when replacing that hdd!
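For reference, replacing a failed member of a Linux software RAID 1 usually boils down to a handful of mdadm commands. A rough sketch, assuming the array is /dev/md0 built from /dev/sda1 and /dev/sdb1 with MBR partition tables (the actual device names on our box may differ):

Code: Select all
# mark the failing member as failed and pull it out of the array
mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md0 --remove /dev/sdb1
# after the physical swap: copy the partition table from the healthy
# disk to the new one (for GPT disks sgdisk would be needed instead)
sfdisk -d /dev/sda | sfdisk /dev/sdb
# add the fresh partition back; md starts rebuilding automatically
mdadm --manage /dev/md0 --add /dev/sdb1
# watch the rebuild progress
cat /proc/mdstat

So even if support only swaps the physical drive, the software side isn't that much magic to redo ourselves.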
My experience with Hetzner support has been very good so far.
Dear Client,
We would like to check the hard drives briefly from our side. Please tell us when we may turn off the server for approx 30-45 minutes in order to perform the test.
Kind regards
xxxxxx
I am just talking out of my ass here, since the last time I got deep into drive technology was Amiga floppy disks. If I remember correctly, such a disk was divided into a bunch of tracks, which were divided into a bunch of sectors. Each sector had a checksum, so when you read data from the sector and compared it with the checksum, you knew whether the data was healthy or corrupt. I presume technology hasn't deteriorated and modern HDDs and SSDs also checksum or otherwise validate sectors, so in a RAID 1 setup you know which disk has the right data and which has the broken data?
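As far as I know, drives do keep per-sector ECC internally, so a failed read normally comes back as an error rather than as silently wrong data, and md can rebuild that sector from the other mirror. What plain md RAID 1 does not have is a checksum of its own on top, so if both copies read back fine but differ, it cannot tell which one is correct; it can only count the mismatches. A minimal sketch, assuming the array is /dev/md0:

Code: Select all
# ask md to read both mirrors and compare them
echo check > /sys/block/md0/md/sync_action
# follow the progress of the check
cat /proc/mdstat
# afterwards: how many blocks disagreed between the two copies
cat /sys/block/md0/md/mismatch_cnt

Writing repair instead of check makes md fix the mismatches, but on RAID 1 it simply copies from the first device; it has no way of judging which copy is the good one.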
Dear Client
Both hard drives are fine. We have started your server back into your installed system. But note there is currently a rebuild of one device running.
Kind regards
xxxxxxx
However, I just checked smartctl -a again, and the numbers seem significantly worse than yesterday.
Code: Select all
root@server [~]# while true; do smartctl -a /dev/sdb |grep Raw_Read_Error_Rate; sleep 300; done
1 Raw_Read_Error_Rate 0x000f 070 063 044 Pre-fail Always - 12163138
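Raw_Read_Error_Rate on its own can be hard to read, since some vendors report a composite counter there that is always huge. The reallocated and pending sector counts are usually the clearer sign of a dying disk, so it may be worth watching those in the same loop. A small variation on the command above:

Code: Select all
root@server [~]# while true; do smartctl -A /dev/sdb | egrep 'Raw_Read_Error_Rate|Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'; sleep 300; done

Non-zero and growing values for the last three attributes would be a much stronger case to hand to support than the raw read error counter alone.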
RAID is not a backup system, it's just a way to have some redundancy (or a nice way to be able to add disk space to an array).
webwit wrote: I am just talking out of my ass here, since the last time I got deep into drive technology was Amiga floppy disks. If I remember correctly, such a disk was divided into a bunch of tracks, which were divided into a bunch of sectors. Each sector had a checksum, so when you read data from the sector and compared it with the checksum, you knew whether the data was healthy or corrupt. I presume technology hasn't deteriorated and modern HDDs and SSDs also checksum or otherwise validate sectors, so in a RAID 1 setup you know which disk has the right data and which has the broken data?
Code: Select all
root@server [~]# while true; do smartctl -a /dev/sdb |grep Raw_Read_Error_Rate; sleep 300; done
1 Raw_Read_Error_Rate 0x000f 070 063 044 Pre-fail Always - 12163138
1 Raw_Read_Error_Rate 0x000f 070 063 044 Pre-fail Always - 12518172
1 Raw_Read_Error_Rate 0x000f 071 063 044 Pre-fail Always - 12762654
1 Raw_Read_Error_Rate 0x000f 071 063 044 Pre-fail Always - 13082807
1 Raw_Read_Error_Rate 0x000f 071 063 044 Pre-fail Always - 13765149
1 Raw_Read_Error_Rate 0x000f 071 063 044 Pre-fail Always - 14005397
1 Raw_Read_Error_Rate 0x000f 071 063 044 Pre-fail Always - 14182096
1 Raw_Read_Error_Rate 0x000f 071 063 044 Pre-fail Always - 14432541
1 Raw_Read_Error_Rate 0x000f 072 063 044 Pre-fail Always - 14697695
1 Raw_Read_Error_Rate 0x000f 072 063 044 Pre-fail Always - 14840703
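For what it's worth, the normalized value (the 070-072 column) is actually moving up, i.e. away from the 044 failure threshold, and on some drive families a large raw number for this attribute is normal because it also counts total read operations, so the raw counter alone may not mean much. A long SMART self-test would give a clearer verdict; a sketch, assuming the suspect disk is still /dev/sdb:

Code: Select all
# start a long (full surface) self-test; it runs on the drive in the background
smartctl -t long /dev/sdb
# check the outcome later (this can take several hours on a large disk)
smartctl -l selftest /dev/sdb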
Unless we are experiencing HDD performance bottlenecks, I would prefer a good enterprise HDD over an SSD.
webwit wrote: This one:
https://www.hetzner.de/dedicated-rootserver/ex41
When you order you can pick options such as extra SSD drive (cheapest one 250 GB 11,90 EUR), but the real question is, do we need it? In any case, that's a different discussion, priority is now to get a stable environment asap. I'm planning the move on Saturday or Sunday.
Losing your data to something like SSD failure as opposed to catching and replacing a failing HDD is kind of irrelevant IMO, because SSD failure is much less common than HDD failure, by like an order of magnitude, and I find that generally early warning measures for HDD failure aren't as reliable as one would hope. It can be just as sudden and unexpected as SSD failure.
Wodan wrote: Unless we are experiencing HDD performance bottlenecks, I would prefer a good enterprise HDD over an SSD.
webwit wrote: This one:
https://www.hetzner.de/dedicated-rootserver/ex41
When you order you can pick options such as extra SSD drive (cheapest one 250 GB 11,90 EUR), but the real question is, do we need it? In any case, that's a different discussion, priority is now to get a stable environment asap. I'm planning the move on Saturday or Sunday.
Most HDDs die slowly and give you time to react, while some SSDs just stop working and there is no way to recover your data.
Maybe get weekly SMART reports from the server for an early warning.
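Without extra tooling, a weekly cron job that mails the SMART output would already cover that; smartd can also schedule self-tests and mail on failures. A rough sketch of the cron variant, assuming working local mail on the box and a placeholder address admin@example.com:

Code: Select all
# /etc/cron.d/smart-report (hypothetical file): mail a SMART summary every Monday at 06:00
0 6 * * 1 root (smartctl -a /dev/sda; smartctl -a /dev/sdb) | mail -s "weekly SMART report" admin@example.com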