#46147 closed defect (fixed)
smartmontools@6.20 - smartd automated scanning failing
Reported by: | thatrat@… | Owned by: | pixilla (Bradley Giesbrecht) |
---|---|---|---|
Priority: | Normal | Milestone: | |
Component: | ports | Version: | 2.3.3 |
Keywords: | | Cc: | |
Port: | smartmontools | | |
Description
smartd (from smartmontools) is no longer running its scheduled drive scans on Yosemite, even though the appropriate configuration file is in place at /opt/local/etc/smartd.conf.
Here's the relevant line in /opt/local/etc/smartd.conf to run a short scan at 5AM every morning and then a long scan at 6AM:
/dev/disk0 -H -l error -l selftest -f -s (S/../.././05|L/../.././06) -m xxxxxxx@xxxxx.xxx
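For readers unfamiliar with the -s directive: per the smartd.conf documentation, smartd builds a string of the form T/MM/DD/d/HH (test type, month, day, day of week with Monday as 1, hour) and runs a test when the schedule expression matches it. The following is only a rough Python sketch of that matching, not smartd's own code; the function name due_tests is hypothetical, whole-string matching is an assumption, and Python's re module stands in for the POSIX extended regular expressions that smartd uses.

```python
import re
from datetime import datetime

# Schedule expression copied from the -s directive quoted in the ticket.
SCHEDULE = r"(S/../.././05|L/../.././06)"


def due_tests(now, schedule=SCHEDULE):
    """Return the test-type letters whose T/MM/DD/d/HH string matches.

    Hypothetical helper for illustration only; smartd performs this
    check internally each time it polls the device.
    """
    due = []
    # L = long, S = short, C = conveyance, O = offline.
    for test_type in "LSCO":
        candidate = "{}/{:%m}/{:%d}/{}/{:%H}".format(
            test_type, now, now, now.isoweekday(), now
        )
        # Assumption: the expression has to match the whole string.
        if re.fullmatch(schedule, candidate):
            due.append(test_type)
    return due


if __name__ == "__main__":
    # Timestamps taken from the console log in the ticket.
    print(due_tests(datetime(2014, 11, 25, 5, 22)))
    print(due_tests(datetime(2014, 11, 25, 6, 22)))
```

Under these assumptions the expression selects a short test during the 05:00 hour and a long test during the 06:00 hour, which lines up with the 05:22 and 06:22 timestamps of the failures logged in the console.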
I took a look at the console, and saw the following lines when the scanning was supposed to occur:
Nov 25 05:22:24 Ubences-Intel-iMac.local smartd[152]: Authorization, server not available
Nov 25 05:22:24 Ubences-Intel-iMac.local smartd[152]: Device: /dev/disk0, execute Short Self-Test failed.
Nov 25 06:22:25 Ubences-Intel-iMac.local smartd[152]: Device: /dev/disk0, execute Long Self-Test failed.
What stands out to me is the first line "Authorization, server not available".
I've searched for this particular message, but I can't find anything that explains what causes it.
I think this might not have been working since Mavericks. It used to work perfectly a few Mac OS X releases ago.
I'm running the latest version of smartmontools that MacPorts has available [6.2] on Yosemite.
Below is output from running smartctl -a /dev/disk0. The test log shows when the automated scanning used to work properly.
Ubences-Intel-iMac:~ uquevedo$ smartctl -a /dev/disk0
smartctl 6.2 2013-07-26 r3841 [x86_64-apple-darwin14.0.0] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Black
Device Model:     WDC WD1002FAEX-00Y9A0
Serial Number:    [Hiding this on purpose]
LU WWN Device Id: 5 0014ee 2064339e8
Firmware Version: 05.01D05
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Dec 4 17:03:58 2014 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x85) Offline data collection activity
                                        was aborted by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (16860) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 174) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       13
  3 Spin_Up_Time            0x0027   171   170   021    Pre-fail  Always       -       4433
  4 Start_Stop_Count        0x0032   076   076   000    Old_age   Always       -       24328
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       5066
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   079   079   000    Old_age   Always       -       21823
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       21
193 Load_Cycle_Count        0x0032   192   192   000    Old_age   Always       -       24306
194 Temperature_Celsius     0x0022   107   081   000    Old_age   Always       -       40
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      5066         -
# 2  Extended offline    Completed without error       00%      4673         -
# 3  Extended offline    Completed without error       00%      4080         -
# 4  Short offline       Completed without error       00%      2987         -
# 5  Extended offline    Aborted by host               80%      2985         -
# 6  Short offline       Completed without error       00%      2982         -
# 7  Short offline       Completed without error       00%      2975         -
# 8  Short offline       Aborted by host               90%      2962         -
# 9  Short offline       Completed without error       00%      2960         -
#10  Short offline       Completed without error       00%      2957         -
#11  Short offline       Completed without error       00%      2956         -
#12  Short offline       Completed without error       00%      2950         -
#13  Short offline       Completed without error       00%      2948         -
#14  Short offline       Completed without error       00%      2946         -
#15  Short offline       Completed without error       00%      2944         -
#16  Short offline       Completed without error       00%      2941         -
#17  Short offline       Completed without error       00%      2939         -
#18  Short offline       Completed without error       00%      2934         -
#19  Short offline       Completed without error       00%      2933         -
#20  Short offline       Completed without error       00%      2929         -
#21  Short offline       Completed without error       00%      2927         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Attachments (1)
Change History (9)
comment:1 Changed 10 years ago by pixilla (Bradley Giesbrecht)
Does the recent update to version 6.3 solve this issue? See r129182.
comment:2 Changed 10 years ago by thatrat@…
Unfortunately, no:
12/8/14 7:33:23.362 PM smartd[2669]: Authorization, server not available
12/8/14 7:33:23.362 PM smartd[2669]: Device: /dev/disk0, execute Long Self-Test failed.
12/8/14 8:03:23.519 PM smartd[2669]: Device: /dev/disk0, execute Short Self-Test failed.
However, running the daemon manually does work:
Ubences-Intel-iMac:~ uquevedo$ sudo /opt/local/sbin/smartd -n -d -c /opt/local/etc/smartd.conf
smartd 6.3 2014-07-26 r3976 [x86_64-apple-darwin14.0.0] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

Opened configuration file /opt/local/etc/smartd.conf
Configuration file /opt/local/etc/smartd.conf parsed.
Device: /dev/disk0, opened
Device: /dev/disk0, WDC WD1002FAEX-00Y9A0, S/N:WD-WCAW32908017, WWN:5-0014ee-2064339e8, FW:05.01D05, 1.00 TB
Device: /dev/disk0, found in smartd database: Western Digital Black
Device: /dev/disk0, is SMART capable. Adding to "monitor" list.
Device: /dev/disk0, state read from /opt/local/var/lib/smartmontools/smartd.WDC_WD1002FAEX_00Y9A0-WD_WCAW32908017.ata.state
Monitoring 1 ATA and 0 SCSI devices
Device: /dev/disk0, opened ATA device
Device: /dev/disk0, state written to /opt/local/var/lib/smartmontools/smartd.WDC_WD1002FAEX_00Y9A0-WD_WCAW32908017.ata.state
Device: /dev/disk0, opened ATA device
Device: /dev/disk0, opened ATA device
Device: /dev/disk0, starting scheduled Short Self-Test.
Device: /dev/disk0, state written to /opt/local/var/lib/smartmontools/smartd.WDC_WD1002FAEX_00Y9A0-WD_WCAW32908017.ata.state
Device: /dev/disk0, opened ATA device
Device: /dev/disk0, opened ATA device
Device: /dev/disk0, starting scheduled Long Self-Test.
Device: /dev/disk0, state written to /opt/local/var/lib/smartmontools/smartd.WDC_WD1002FAEX_00Y9A0-WD_WCAW32908017.ata.state
Device: /dev/disk0, opened ATA device
Device: /dev/disk0, opened ATA device
Device: /dev/disk0, opened ATA device
Device: /dev/disk0, opened ATA device
Device: /dev/disk0, opened ATA device
Device: /dev/disk0, opened ATA device
^\smartd received signal 3: Quit: 3
Device: /dev/disk0, state written to /opt/local/var/lib/smartmontools/smartd.WDC_WD1002FAEX_00Y9A0-WD_WCAW32908017.ata.state
smartd is exiting (exit status 0)

Ubences-Intel-iMac:~ uquevedo$ smartctl -a /dev/disk0
smartctl 6.3 2014-07-26 r3976 [x86_64-apple-darwin14.0.0] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Black
Device Model:     WDC WD1002FAEX-00Y9A0
Serial Number:    WD-WCAW32908017
LU WWN Device Id: 5 0014ee 2064339e8
Firmware Version: 05.01D05
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Dec 9 05:47:27 2014 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (16860) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 174) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       13
  3 Spin_Up_Time            0x0027   172   170   021    Pre-fail  Always       -       4375
  4 Start_Stop_Count        0x0032   076   076   000    Old_age   Always       -       24368
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5124
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   079   079   000    Old_age   Always       -       21861
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       21
193 Load_Cycle_Count        0x0032   192   192   000    Old_age   Always       -       24346
194 Temperature_Celsius     0x0022   101   081   000    Old_age   Always       -       46
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      5124         -
# 2  Short offline       Completed without error       00%      5120         -
# 3  Short offline       Completed without error       00%      5093         -
# 4  Extended offline    Completed without error       00%      5082         -
# 5  Short offline       Completed without error       00%      5069         -
# 6  Short offline       Completed without error       00%      5066         -
# 7  Extended offline    Completed without error       00%      4673         -
# 8  Extended offline    Completed without error       00%      4080         -
# 9  Short offline       Completed without error       00%      2987         -
#10  Extended offline    Aborted by host               80%      2985         -
#11  Short offline       Completed without error       00%      2982         -
#12  Short offline       Completed without error       00%      2975         -
#13  Short offline       Aborted by host               90%      2962         -
#14  Short offline       Completed without error       00%      2960         -
#15  Short offline       Completed without error       00%      2957         -
#16  Short offline       Completed without error       00%      2956         -
#17  Short offline       Completed without error       00%      2950         -
#18  Short offline       Completed without error       00%      2948         -
#19  Short offline       Completed without error       00%      2946         -
#20  Short offline       Completed without error       00%      2944         -
#21  Short offline       Completed without error       00%      2941         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
comment:3 Changed 10 years ago by pixilla (Bradley Giesbrecht)
Keywords: | smartmontools smartd smartctl removed |
---|---|
Owner: | changed from macports-tickets@… to takanori@… |
Changed 10 years ago by pixilla (Bradley Giesbrecht)
Attachment: | patch-sysutils-smartmontools-no-fork.diff added |
---|
comment:4 Changed 10 years ago by pixilla (Bradley Giesbrecht)
thatrat: does the attached no-fork patch solve the issue for you? With this patch, smartmontools has been running for 24+ hours here.
comment:5 follow-up: 8 Changed 10 years ago by thatrat@…
It looks like the patch worked.
In the output below, the newest self-test log entries were recorded within a few hours of the current Power_On_Hours value, so the scheduled tests are running again:
Ubences-Intel-iMac:smartmontools uquevedo$ smartctl -a /dev/disk0
smartctl 6.3 2014-07-26 r3976 [x86_64-apple-darwin14.0.0] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Black
Device Model:     WDC WD1002FAEX-00Y9A0
Serial Number:    [hiding this]
LU WWN Device Id: 5 0014ee 2064339e8
Firmware Version: 05.01D05
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Dec 11 12:32:48 2014 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x85) Offline data collection activity
                                        was aborted by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (16860) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 174) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       13
  3 Spin_Up_Time            0x0027   174   170   021    Pre-fail  Always       -       4300
  4 Start_Stop_Count        0x0032   076   076   000    Old_age   Always       -       24413
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5149
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   079   079   000    Old_age   Always       -       21882
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       21
193 Load_Cycle_Count        0x0032   192   192   000    Old_age   Always       -       24391
194 Temperature_Celsius     0x0022   102   081   000    Old_age   Always       -       45
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      5146         -
# 2  Extended offline    Completed without error       00%      5143         -
# 3  Short offline       Completed without error       00%      5140         -
# 4  Extended offline    Completed without error       00%      5135         -
# 5  Extended offline    Completed without error       00%      5124         -
# 6  Short offline       Completed without error       00%      5120         -
# 7  Short offline       Completed without error       00%      5093         -
# 8  Extended offline    Completed without error       00%      5082         -
# 9  Short offline       Completed without error       00%      5069         -
#10  Short offline       Completed without error       00%      5066         -
#11  Extended offline    Completed without error       00%      4673         -
#12  Extended offline    Completed without error       00%      4080         -
#13  Short offline       Completed without error       00%      2987         -
#14  Extended offline    Aborted by host               80%      2985         -
#15  Short offline       Completed without error       00%      2982         -
#16  Short offline       Completed without error       00%      2975         -
#17  Short offline       Aborted by host               90%      2962         -
#18  Short offline       Completed without error       00%      2960         -
#19  Short offline       Completed without error       00%      2957         -
#20  Short offline       Completed without error       00%      2956         -
#21  Short offline       Completed without error       00%      2950         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
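The comparison described here (newest self-test LifeTime(hours) against the Power_On_Hours raw value) can be done mechanically. The sketch below is not part of smartmontools: the function name and the regular expressions are hypothetical and assume the column layout that smartctl -a normally prints for ATA drives. The sample rows are taken from the report in this comment.

```python
import re

# Assumed layout of a self-test log row in `smartctl -a` output:
#   "# 1  Extended offline    Completed without error       00%      5146         -"
SELFTEST_ROW = re.compile(r"^#\s*\d+\s+.*?\s(\d+)%\s+(\d+)\s+\S+\s*$")
# Assumed layout of attribute 9; the raw value is the last column.
POWER_ON_HOURS = re.compile(r"^\s*9\s+Power_On_Hours\s.*\s(\d+)\s*$")


def hours_since_last_selftest(report):
    """Return Power_On_Hours minus the newest self-test LifeTime(hours).

    Hypothetical helper; returns None when either value is missing.
    """
    power_on = None
    lifetimes = []
    for line in report.splitlines():
        match = POWER_ON_HOURS.match(line)
        if match:
            power_on = int(match.group(1))
            continue
        match = SELFTEST_ROW.match(line)
        if match:
            lifetimes.append(int(match.group(2)))
    if power_on is None or not lifetimes:
        return None
    return power_on - max(lifetimes)


# Rows copied from the report in this comment.
SAMPLE = """\
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5149
# 1  Extended offline    Completed without error       00%      5146         -
# 2  Extended offline    Completed without error       00%      5143         -
# 3  Short offline       Completed without error       00%      5140         -
"""

if __name__ == "__main__":
    print(hours_since_last_selftest(SAMPLE))  # prints 3
```

A small difference like this suggests that scheduled tests are being logged again; a difference of hundreds of hours would point to the schedule not firing.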
Can this patch be rolled into the main release so I can update some other systems to test?
comment:6 Changed 10 years ago by pixilla (Bradley Giesbrecht)
Owner: | changed from takanori@… to pixilla@… |
---|---|
Status: | new → assigned |
comment:7 Changed 10 years ago by pixilla (Bradley Giesbrecht)
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
See r129382
comment:8 Changed 10 years ago by pixilla (Bradley Giesbrecht)
Replying to thatrat@…:
Can this patch be rolled into the main release so I can update some other systems to test?
Done.
Does the recent update to version 6.3 solve this issue? See r129182.