Yesterday I suddenly lost access to a connected USB HDD, probably because of a power outage. parted reports “unknown partition table,” and gdisk says the GPT is corrupted. Using testdisk, I was able to copy the files to another drive, restore the partition table, and mount the HDD again.

Is it possible that the partition table was damaged by the power outage, or does this point to a different problem? Can I safely store data on such an HDD again, or should I replace it?

  • KiwiTB@lemmy.world
    5 days ago

    You can use a tool like smartctl to see some drive info, but unfortunately Linux has very few decent drive diagnostic tools. All the good tools need DOS or Windows.

      • KiwiTB@lemmy.world
        4 days ago

        It’s good. The only downside is that it’s a SMART tool, so it only gives you SMART data rather than the drive’s actual health.

        • muhyb@programming.dev
          4 days ago

          It’s true that we don’t really need those old tools anymore, unless one has ancient hardware. On Linux we can use badblocks to test the hard drive. This is from the Arch Wiki:

          Modern HDDs and SSDs include firmware that will automatically detect, attempt to correct, and report errors. However, firmware becomes aware of a corrupted sector only upon an attempt to read or write to it. Badblocks may be used to test the entire device at once. It operates by sequentially attempting to read and optionally write to and read back every sector on a drive, and report errors. Consequently, the firmware will react to any detected failures in this process.

          So, in most cases SMART data is actually sufficient, and there is badblocks if you want to check the entire disk. However, we don’t have manufacturer tools like Windows does.

          A little warning about badblocks: don’t run a write test (the -w option) if you have important data on the disk, because it will erase it.
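
          To make the warning above concrete, here is a minimal Python sketch that only builds the badblocks command line for each test mode and refuses the destructive one unless asked explicitly. The helper name badblocks_command and the allow_erase parameter are hypothetical, made up for illustration; the flags themselves (-s, -v, -n, -w) are real badblocks options, and /dev/sdX is a placeholder device.

```python
import shlex

# Real badblocks flags per test mode:
#   read-only       -> no extra flag (the default mode)
#   non-destructive -> -n (read-write test that restores the original data)
#   destructive     -> -w (write test, erases everything on the device)
MODES = {"read-only": [], "non-destructive": ["-n"], "destructive": ["-w"]}


def badblocks_command(device, mode="read-only", allow_erase=False):
    """Return the argv list for a badblocks run (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if mode == "destructive" and not allow_erase:
        # Guard against wiping a disk by accident.
        raise ValueError("the -w test erases the disk; pass allow_erase=True")
    # -s shows progress, -v prints errors as they are found.
    return ["badblocks", "-s", "-v", *MODES[mode], device]


# /dev/sdX is a placeholder; nothing is executed here, the command is only printed.
print(shlex.join(badblocks_command("/dev/sdX")))
```

          The printed command can be run by hand as root on an unmounted disk. Keeping the destructive mode behind an explicit opt-in is just one possible design, not something badblocks itself enforces.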

          • KiwiTB@lemmy.world
            4 days ago

            As a computer technician, I have seen hundreds of cases where the drive and its SMART data didn’t actually know what was going on, and it took quality tools to make the drive aware of what was happening so that it could start dealing with it.

            • muhyb@programming.dev
              4 days ago

              If the drive’s firmware is faulty, the SMART data will be faulty too. But can you say whether the percentage is fairly high in what you’ve dealt with? A few statistics would help. What I’ve seen is only my personal experience, and it definitely wouldn’t be as accurate as yours. I’ve only seen a drive die out of nowhere a handful of times, which is not high as a percentage. Though if the drive itself is faulty, it won’t take long for it to die either.

              The best I’ve seen is a WD Caviar Black 500 GB drive from 2011 that we still use. I took a backup a couple of years ago because of its age, but it hasn’t died yet. The worst I’ve seen was my friend’s NVMe SSD, which died 3 months after he installed it. Its firmware was probably faulty too, because SMART didn’t help that time.

              • KiwiTB@lemmy.world
                3 days ago

                It has nothing to do with faulty firmware. SMART will only see 1 in 3 issues, and as such it is simply not good enough to use as an actual diagnostic.

                • muhyb@programming.dev
                  3 days ago

                  I see. So, you’re saying that occasionally checking smartctl (or running smartd continuously as a daemon), running badblocks from time to time, and maybe checking iostat is not really enough? I mean, Linux is by far the most used OS on servers and in datacenters. If these weren’t enough, someone would have written a proper tool, don’t you think?
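
                  For the smartctl part of that routine, a rough Python sketch might look like the following. It assumes a report saved with smartctl -a -j /dev/sdX (JSON output, available in smartmontools 7.0 and later) and an ATA drive, since NVMe reports use different fields. The function name summarize_smart and the choice of watched attributes are my own assumptions, not an official recommendation.

```python
import json

# Attribute IDs commonly watched on ATA drives (a choice made for this sketch):
# 5 = reallocated sectors, 197 = pending sectors, 198 = offline uncorrectable.
WATCHED_IDS = {
    5: "reallocated sectors",
    197: "pending sectors",
    198: "offline uncorrectable",
}


def summarize_smart(report):
    """Return (overall_passed, warnings) from a parsed smartctl JSON report."""
    # smart_status.passed is the drive's own overall verdict (None if absent).
    passed = report.get("smart_status", {}).get("passed")
    table = report.get("ata_smart_attributes", {}).get("table", [])
    warnings = []
    for attr in table:
        label = WATCHED_IDS.get(attr.get("id"))
        raw = attr.get("raw", {}).get("value", 0)
        # A non-zero raw value on a watched attribute is worth a closer look.
        if label and raw > 0:
            warnings.append(f"{label}: {raw}")
    return passed, warnings


def load_report(path):
    """Load a report previously saved with: smartctl -a -j /dev/sdX > report.json"""
    with open(path) as handle:
        return json.load(handle)
```

                  Matching on attribute IDs rather than names is deliberate, because the names depend on the drive database. A passing result here only reflects what the drive reports about itself, which is exactly the limitation being argued about in this thread.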

                  • KiwiTB@lemmy.world
                    3 days ago

                    Not at all. It takes a huge amount of work to do so, and the benefit of using RAID etc. is redundancy, so they can afford for things to fail. smartmontools is a great example: the software is great, but it needs its database to support a given drive’s functions to work well, and they can’t and don’t support everything.