SMART reports an error every 30 minutes in /var/~
From: KV


Date: 14-07-07 12:34

Jul 14 08:41:58 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors
Jul 14 09:11:57 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors
Jul 14 09:41:57 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors
Jul 14 10:11:57 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors
Jul 14 10:41:57 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors
Jul 14 11:11:57 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors
Jul 14 11:41:58 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors
Jul 14 12:11:57 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors
Jul 14 12:41:57 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors
Jul 14 13:11:57 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors

How do I stop it from logging this every half hour? It would be fine
if it reported new errors, but I don't want my log filled with
known ones.
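
As a quick check on the excerpt above, a minimal Python sketch can confirm that the entries are one and the same warning repeated at a fixed interval. The sample lines are copied from the log; the helper names `parse_line` and `summarize` and the hard-coded year are made up for this illustration. The 30-minute spacing matches what I understand to be smartd's default polling interval.

```python
from datetime import datetime

# Four of the entries from the excerpt above, with the wrapped word
# "sectors" joined back onto its line.
LOG_LINES = [
    "Jul 14 08:41:58 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors",
    "Jul 14 09:11:57 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors",
    "Jul 14 09:41:57 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors",
    "Jul 14 10:11:57 smartd[2758]: Device: /dev/hda, 1 Offline uncorrectable sectors",
]


def parse_line(line, year=2007):
    """Split a syslog line into (timestamp, message text)."""
    # The classic syslog timestamp is 15 characters and carries no year,
    # so the year is supplied by the caller (an assumption of this sketch).
    stamp, rest = line[:15], line[16:]
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    # Drop the "smartd[pid]: " prefix so identical warnings compare equal.
    message = rest.split(": ", 1)[1]
    return when, message


def summarize(lines):
    """Return (gaps between entries in minutes, set of distinct messages)."""
    parsed = [parse_line(line) for line in lines]
    gaps = [
        round((later[0] - earlier[0]).total_seconds() / 60)
        for earlier, later in zip(parsed, parsed[1:])
    ]
    return gaps, {message for _, message in parsed}


gaps, messages = summarize(LOG_LINES)
print("minutes between entries:", gaps)
print("distinct messages:", messages)
```

The same grouping idea could be used to filter a log down to first occurrences only, although it does nothing to stop smartd from writing the entries in the first place.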

Another thing: I can see that my 2.5" laptop disk has 8 errors. They all
occurred on the same day, so I am hoping they were caused by a shock and
not by the disk wearing out. The errors occurred 193 days into the
disk's life, 400 hours ago.
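
The figures in that paragraph can be checked against the smartctl output posted later in the thread. The sketch below assumes two values taken from that output: the error log's "4645 hours" and the hours part of the Power_On_Seconds raw value "5113h+05m+53s". The constant and function names are only illustrative.

```python
ERROR_AT_HOURS = 4645      # "Error 8 occurred at disk power-on lifetime: 4645 hours"
POWER_ON_NOW_HOURS = 5113  # attribute 9 raw value "5113h+05m+53s", hours part only


def days_and_hours(total_hours):
    """Convert a power-on hour count into (whole days, remaining hours)."""
    return divmod(total_hours, 24)


error_days, error_rest = days_and_hours(ERROR_AT_HOURS)
hours_since_error = POWER_ON_NOW_HOURS - ERROR_AT_HOURS

print(f"errors logged at {error_days} days + {error_rest} hours of power-on time")
print(f"power-on hours since then: {hours_since_error}")
```

Under these assumptions the errors date from 193 days + 13 hours of power-on time, and the disk has run for 468 further hours since, so "400 hours ago" is a slight underestimate.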

[root@]# smartctl -l error /dev/hda
smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 8 (device log contains only the most recent five errors)

I have run various hard disk tests but cannot find a single error on the
disk. Is that because the 8 errors have been marked as dead and will never
be used again (and are therefore not found by the tests), or is SMART being
a little too smart and perhaps reporting falsely?



 
 
Michael Rasmussen (14-07-2007)
Comment
From: Michael Rasmussen


Date: 14-07-07 13:14

On Sat, 14 Jul 2007 13:34:20 +0200, "KV" <nospam@REMOVE.gmail.com>
wrote:

>I have run various hard disk tests but cannot find a single error on the
>disk. Is that because the 8 errors have been marked as dead and will never
>be used again (and are therefore not found by the tests), or is SMART being
>a little too smart and perhaps reporting falsely?

Some information is missing. Run

smartctl -a /dev/hda

and show us the full output.

A copy of your /etc/smartd.conf file would also be very welcome.

THEN we will try to help you along...

<mlr>

KV (14-07-2007)
Comment
From: KV


Date: 14-07-07 13:33

>>I have run various hard disk tests but cannot find a single error on the
>>disk. Is that because the 8 errors have been marked as dead and will never
>>be used again (and are therefore not found by the tests), or is SMART being
>>a little too smart and perhaps reporting falsely?
>
> Some information is missing. Run
> smartctl -a /dev/hda
> and show us the full output.
> A copy of your /etc/smartd.conf file would also be very welcome.
> THEN we will try to help you along...

Sure, but it is rather hard to read here, where everything gets wrapped at
60-70 characters. Anyway, here it is:

[root@]# smartctl -a /dev/hda
smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family: Fujitsu MHTxxxxAH family
Device Model: FUJITSU MHT2040AH
Serial Number: NP0JT4B2DGKJ
Firmware Version: 846C
User Capacity: 40,007,761,920 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 6
ATA Standard is: ATA/ATAPI-6 T13 1410D revision 3a
Local Time is: Sat Jul 14 14:30:15 2007 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 293) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  40) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   046    Pre-fail  Always       -       136981
  2 Throughput_Performance  0x0005   100   100   030    Pre-fail  Offline      -       11927734
  3 Spin_Up_Time            0x0003   100   100   025    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       2512
  5 Reallocated_Sector_Ct   0x0033   100   100   024    Pre-fail  Always       -       8589934592000
  7 Seek_Error_Rate         0x000f   100   100   047    Pre-fail  Always       -       824
  8 Seek_Time_Performance   0x0005   100   100   019    Pre-fail  Offline      -       3
  9 Power_On_Seconds        0x0032   090   090   000    Old_age   Always       -       5113h+05m+53s
 10 Spin_Retry_Count        0x0013   100   100   020    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1568
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       152
193 Load_Cycle_Count        0x0032   095   095   000    Old_age   Always       -       56909
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       41 (Lifetime Min/Max 10/51)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       250
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       287047680
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000f   100   100   060    Pre-fail  Always       -       8144
203 Run_Out_Cancel          0x0002   100   100   000    Old_age   Always       -       429519470745

SMART Error Log Version: 1
ATA Error Count: 8 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 8 occurred at disk power-on lifetime: 4645 hours (193 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 59 07 b0 78 5e e0 Error: UNC 7 sectors at LBA = 0x005e78b0 = 6191280

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 d8 08 af 78 5e e0 00 00:02:59.386 READ DMA
c8 d8 08 af 78 5e e0 00 00:02:55.495 READ DMA
c8 d8 01 00 00 00 e0 00 00:02:55.495 READ DMA
c8 d8 08 b7 78 5e e0 00 00:02:55.495 READ DMA
c8 d8 01 00 00 00 e0 00 00:02:55.495 READ DMA

Error 7 occurred at disk power-on lifetime: 4645 hours (193 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 59 07 b0 78 5e e0 Error: UNC 7 sectors at LBA = 0x005e78b0 = 6191280

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 d8 08 af 78 5e e0 00 00:02:55.495 READ DMA
c8 d8 01 00 00 00 e0 00 00:02:55.495 READ DMA
c8 d8 08 b7 78 5e e0 00 00:02:55.495 READ DMA
c8 d8 01 00 00 00 e0 00 00:02:55.495 READ DMA
c8 d8 08 af 78 5e e0 00 00:02:51.593 READ DMA

Error 6 occurred at disk power-on lifetime: 4645 hours (193 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 59 07 b0 78 5e e0 Error: UNC 7 sectors at LBA = 0x005e78b0 = 6191280

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 d8 08 af 78 5e e0 00 00:02:51.593 READ DMA
c8 d8 08 af 78 5e e0 00 00:02:47.611 READ DMA
c8 d8 08 3f 00 5e e0 00 00:02:47.611 READ DMA
c8 d8 08 47 00 5e e0 00 00:02:47.611 READ DMA
c8 d8 08 37 00 5e e0 00 00:02:47.611 READ DMA

Error 5 occurred at disk power-on lifetime: 4645 hours (193 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 59 07 b0 78 5e e0 Error: UNC 7 sectors at LBA = 0x005e78b0 = 6191280

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 d8 08 af 78 5e e0 00 00:02:47.611 READ DMA
c8 d8 08 3f 00 5e e0 00 00:02:47.611 READ DMA
c8 d8 08 47 00 5e e0 00 00:02:47.611 READ DMA
c8 d8 08 37 00 5e e0 00 00:02:47.611 READ DMA
c8 d8 08 2f 00 5e e0 00 00:02:47.610 READ DMA

Error 4 occurred at disk power-on lifetime: 4645 hours (193 days + 13 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 59 07 b0 78 5e e0 Error: UNC 7 sectors at LBA = 0x005e78b0 = 6191280

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 d8 08 af 78 5e e0 00 00:02:43.666 READ DMA
c8 d8 01 00 00 00 e0 00 00:02:43.666 READ DMA
c8 d8 08 af 78 5e e0 00 00:02:39.780 READ DMA
c8 d8 01 00 00 00 e0 00 00:02:39.779 READ DMA
c8 d8 08 b7 78 5e e0 00 00:02:39.753 READ DMA

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Conveyance offline  Completed without error       00%         1         -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.




[root@]# cat /etc/smartd.conf
# *SMARTD*AUTOGENERATED* /etc/smartd.conf
# Remove the line above if you have edited the file and you do not want
# it to be overwritten on the next smartd startup.

# Sample configuration file for smartd. See man 5 smartd.conf.
# Home page is: http://smartmontools.sourceforge.net

# The file gives a list of devices to monitor using smartd, with one
# device per line. Text after a hash (#) is ignored, and you may use
# spaces and tabs for white space. You may use '\' to continue lines.

# You can usually identify which hard disks are on your system by
# looking in /proc/ide and in /proc/scsi.

# The word DEVICESCAN will cause any remaining lines in this
# configuration file to be ignored: it tells smartd to scan for all
# ATA and SCSI devices. DEVICESCAN may be followed by any of the
# Directives listed below, which will be applied to all devices that
# are found. Most users should comment out DEVICESCAN and explicitly
# list the devices that they wish to monitor.
# DEVICESCAN

# First (primary) ATA/IDE hard disk. Monitor all attributes
# /dev/hda -a

# Monitor SMART status, ATA Error Log, Self-test log, and track
# changes in all attributes except for attribute 194
# /dev/hdb -H -l error -l selftest -t -I 194

# A very silent check. Only report SMART health status if it fails
# But send an email in this case
/dev/hda -H -m root

# First two SCSI disks. This will monitor everything that smartd can
# monitor.
# /dev/sda -d scsi
# /dev/sdb -d scsi

# HERE IS A LIST OF DIRECTIVES FOR THIS CONFIGURATION FILE
# -d TYPE Set the device type to one of: ata, scsi
# -T TYPE set the tolerance to one of: normal, permissive
# -o VAL Enable/disable automatic offline tests (on/off)
# -S VAL Enable/disable attribute autosave (on/off)
# -H Monitor SMART Health Status, report if failed
# -l TYPE Monitor SMART log. Type is one of: error, selftest
# -f Monitor for failure of any 'Usage' Attributes
# -m ADD Send warning email to ADD for -H, -l error, -l selftest, and -f
# -M TYPE Modify email warning behavior (see man page)
# -p Report changes in 'Prefailure' Normalized Attributes
# -u Report changes in 'Usage' Normalized Attributes
# -t Equivalent to -p and -u Directives
# -r ID Also report Raw values of Attribute ID with -p, -u or -t
# -R ID Track changes in Attribute ID Raw value with -p, -u or -t
# -i ID Ignore Attribute ID for -f Directive
# -I ID Ignore Attribute ID for -p, -u or -t Directive
# -v N,ST Modifies labeling of Attribute N (see man page)
# -a Default: equivalent to -H -f -t -l error -l selftest
# -F TYPE Use firmware bug workaround. Type is one of: none, samsung
# -P TYPE Drive-specific presets: use, ignore, show, showall
# # Comment: text after a hash sign is ignored
# \ Line continuation character
# Attribute ID is a decimal integer 1 <= ID <= 255
# All but -d, -m and -M Directives are only implemented for ATA devices
#
# If the test string DEVICESCAN is the first uncommented text
# then smartd will scan for devices /dev/hd[a-l] and /dev/sd[a-z]
# DEVICESCAN may be followed by any desired Directives.




Michael Rasmussen (14-07-2007)
Comment
From: Michael Rasmussen


Date: 14-07-07 14:12

On Sat, 14 Jul 2007 14:33:10 +0200, "KV" <nospam@REMOVE.gmail.com>
wrote:

>Sure, but it is rather hard to read here, where everything gets wrapped at
>60-70 characters. Anyway, here it is:

You're right, it is a bit hard to read...

There is apparently a problem with a single sector at LBA
6191280.

Try running a SMART extended self-test on the disk, i.e.

smartctl -c long /dev/hda

Wait 50-60 minutes (you can keep working on the machine during the
test!) and fetch the result with

smartctl -l selftest /dev/hda


- - -

I'll get back with a few more comments on your smartctl output
and on the smartd setup sometime this evening...

<mlr>





Michael Rasmussen (14-07-2007)
Comment
From: Michael Rasmussen


Date: 14-07-07 21:36

On Sat, 14 Jul 2007 14:33:10 +0200, "KV" <nospam@REMOVE.gmail.com>
wrote:

>Error 8 occurred at disk power-on lifetime: 4645 hours (193 days + 13 hours)
> When the command that caused the error occurred, the device was active or
>idle.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 40 59 07 b0 78 5e e0 Error: UNC 7 sectors at LBA = 0x005e78b0 = 6191280

...cut

Your hard disk has a single bad sector at LBA 6191280.
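
For reference, the LBA can be recomputed from the register dump quoted above. The sketch below assumes standard 28-bit LBA addressing, where SN, CL and CH carry bits 0-23 and the low nibble of DH carries bits 24-27; the function name is made up for illustration. It also decodes the first READ DMA command from the original error entry (`c8 d8 08 af 78 5e e0`) to show where the failing sector sits inside the request.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Registers after the failed command: SN=b0 CL=78 CH=5e DH=e0
error_lba = lba28_from_registers(0xB0, 0x78, 0x5E, 0xE0)

# The READ DMA that triggered it: SC=08 SN=af CL=78 CH=5e DH=e0
start_lba = lba28_from_registers(0xAF, 0x78, 0x5E, 0xE0)
requested_sectors = 0x08

offset_in_request = error_lba - start_lba
sectors_from_error = requested_sectors - offset_in_request

print(f"error LBA: 0x{error_lba:08x} = {error_lba}")
print(f"request: {requested_sectors} sectors starting at LBA {start_lba}")
print(f"failing sector is number {offset_in_request + 1} of the request")
print(f"sectors from the failing one to the end: {sectors_from_error}")
```

The result agrees with smartctl's own "LBA = 0x005e78b0 = 6191280". The remaining count of 7 matches the SC register value 07 in the dump, which fits the reading that "UNC 7 sectors" is the number of sectors still outstanding rather than seven bad sectors, though that interpretation is an assumption here.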

The SMART system re-reads the sector a number of times, and if it manages
to read the sector correctly even once, the contents are reallocated
to a new sector and the old sector is marked BAD.

If the sector cannot be read, the SMART system will not reallocate
it: if it did, you as the user would lose the information stored in
that sector...

The smartmontools FAQ has a good description of how to force the system
to reallocate a bad sector:

http://smartmontools.sourceforge.net/badblockhowto.html

Hope that helps you on your way...

<mlr>




Thorbjørn Ravn Ander~ (14-07-2007)
Comment
From: Thorbjørn Ravn Ander~


Date: 14-07-07 21:57

Michael Rasmussen <michael@invalid> writes:

> Your hard disk has a single bad sector at LBA 6191280.

THEN it is time to get hold of a replacement.
--
Thorbjørn Ravn Andersen

Henning Præstegaard (15-07-2007)
Comment
From: Henning Præstegaard


Date: 15-07-07 05:36

"Thorbjørn Ravn Andersen" wrote:

>> Your hard disk has a single bad sector at LBA 6191280.
>
> THEN it is time to get hold of a replacement.
>
Should the disk be replaced because of that particular sector, or
"just" because SMART is reporting errors?

Regards,
Henning



Thorbjørn Ravn Ander~ (15-07-2007)
Comment
From: Thorbjørn Ravn Ander~


Date: 15-07-07 06:58

"Henning Præstegaard" <onkelhenning@not.vallid> writes:

> > THEN it is time to get hold of a replacement.
> >
> Should the disk be replaced because of that particular sector, or
> "just" because SMART is reporting errors?

You may find this interesting. Google has done some analysis of
their own disks:

http://labs.google.com/papers/disk_failures.html

"We find, for example, that after their first scan error, drives are
39 times more likely to fail within 60 days than drives with no such
errors. First errors in reallocations, offline reallocations, and
probational counts are also strongly correlated to higher failure
probabilities."
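
To see where the drive in this thread stands on the counters that kind of analysis looks at, a small sketch can compare each normalized value with its threshold. The rows are taken from the `smartctl -a` output earlier in the thread; the function names are illustrative, and the rule that an attribute counts as failing when its normalized VALUE is at or below a non-zero THRESH is how I understand smartctl to evaluate it.

```python
# A few rows from the smartctl -a attribute table earlier in the thread.
# The UPDATED column is written out as Always/Offline here, which is an
# assumption: the forum's line wrapping truncated it in the original post.
ROWS = """\
  1 Raw_Read_Error_Rate     0x000f 100 100 046 Pre-fail Always  - 136981
  5 Reallocated_Sector_Ct   0x0033 100 100 024 Pre-fail Always  - 8589934592000
193 Load_Cycle_Count        0x0032 095 095 000 Old_age  Always  - 56909
197 Current_Pending_Sector  0x0012 100 100 000 Old_age  Always  - 0
198 Offline_Uncorrectable   0x0010 100 100 000 Old_age  Offline - 1
"""


def parse_row(line):
    """Parse one attribute row; the raw value may itself contain spaces."""
    fields = line.split(None, 9)
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "type": fields[6],
        "raw": fields[9],
    }


def is_failing(attr):
    """Failing means the normalized value has reached a non-zero threshold."""
    return attr["thresh"] > 0 and attr["value"] <= attr["thresh"]


attributes = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
failing = [attr["name"] for attr in attributes if is_failing(attr)]
margins = {attr["name"]: attr["value"] - attr["thresh"] for attr in attributes}

print("failing attributes:", failing)
for name, margin in margins.items():
    print(f"{name}: {margin} above threshold")
```

None of the sampled attributes is at its threshold, which is consistent with the PASSED health result, while the raw Offline_Uncorrectable count of 1 is the figure smartd keeps reporting. Passing the threshold check says nothing about the elevated failure risk described in the quote.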

--
Thorbjørn Ravn Andersen

Adam Sjøgren (14-07-2007)
Comment
From: Adam Sjøgren


Date: 14-07-07 14:52

On Sat, 14 Jul 2007 15:12:06 +0200, Michael wrote:

> Try running a SMART extended self-test on the disk, i.e.

> smartctl -c long /dev/hda
           ^^
Shouldn't that be '-t'?


Regards,

--
 "Our hero regains consciousness at the feet of a          Adam Sjøgren
  sarcastic alien..."                                  asjo@koldfront.dk

Michael Rasmussen (14-07-2007)
Comment
From: Michael Rasmussen


Date: 14-07-07 21:27

On Sat, 14 Jul 2007 15:52:16 +0200, asjo@koldfront.dk (Adam Sjøgren)
wrote:

>> Try running a SMART extended self-test on the disk, i.e.
>
>> smartctl -c long /dev/hda
>            ^^
> Shouldn't that be '-t'?

Oops, a typo. It should of course be

smartctl -t long /dev/hda
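
For anyone scripting this, a minimal sketch that builds the corrected command and the follow-up log query as argument lists. The helper name `selftest_commands` and the short list of accepted test types are choices made for this example; `-t` and `-l selftest` are the options used in this thread.

```python
# Test types accepted by this sketch; smartctl's -t option knows more.
SELFTEST_TYPES = ("offline", "short", "long", "conveyance")


def selftest_commands(device, test_type="long"):
    """Return (start command, fetch-results command) as argument lists."""
    if test_type not in SELFTEST_TYPES:
        raise ValueError(f"unsupported self-test type: {test_type!r}")
    start = ["smartctl", "-t", test_type, device]
    fetch = ["smartctl", "-l", "selftest", device]
    return start, fetch


start_cmd, fetch_cmd = selftest_commands("/dev/hda")
print(" ".join(start_cmd))
print(" ".join(fetch_cmd))
```

The lists could be handed to `subprocess.run` as root. The sketch deliberately stops short of running anything, since a self-test needs a real device.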

<mlr>
