Crash Dump Analysis with SCAT on Oracle Solaris


In this post I'll explain how to analyze a crash dump file on Oracle Solaris, which helps you find the root cause of a panic or a hang. A properly installed system should have a separate raw device set aside as its dump device. The size this device needs depends on physical memory: it is important to assign a dedicated dump device that is no smaller than physical memory. Otherwise, when the system panics, the memory dump cannot be written because the dump device does not have enough free space, and you lose the last memory snapshot you would need to dig into the root cause of the problem.
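The sizing rule above comes down to a single comparison. Here is a minimal sketch in Python; the function name and the example sizes are made up for illustration and are not taken from any real system.

```python
GIB = 1024 ** 3

def dump_device_shortfall(device_bytes, memory_bytes):
    """Return how many bytes the dump device falls short of physical memory.

    A result of 0 means the device satisfies the rule from this post
    (dump device no smaller than physical memory).
    """
    return max(0, memory_bytes - device_bytes)

# Hypothetical server: 64 GiB of RAM but only a 32 GiB dump device.
shortfall = dump_device_shortfall(device_bytes=32 * GIB, memory_bytes=64 * GIB)
print(shortfall // GIB)  # 32
```

A non-zero result means the dump device should be enlarged before you rely on it.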

First of all, we need to check whether the system dump device is configured properly.
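On Solaris the dump configuration is normally shown by the dumpadm command. As a rough sketch, the Python below parses text of that shape and checks whether the dump device is marked as dedicated. The sample output is illustrative only: the device path and directory are invented, and the exact labels may differ between Solaris releases.

```python
# Illustrative sample only; the device path and hostname are made up.
SAMPLE_DUMPADM_OUTPUT = """\
      Dump content: kernel pages
       Dump device: /dev/dsk/c0t0d0s1 (dedicated)
Savecore directory: /var/crash/myhost
  Savecore enabled: yes
"""

def parse_dumpadm(text):
    """Turn 'label: value' lines into a dict keyed by the lowercased label."""
    config = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            config[key.strip().lower()] = value.strip()
    return config

def dump_device_is_dedicated(config):
    """True when the dump device value ends with the '(dedicated)' tag."""
    return config.get("dump device", "").endswith("(dedicated)")

config = parse_dumpadm(SAMPLE_DUMPADM_OUTPUT)
print(config["dump device"])
print(dump_device_is_dedicated(config))
```

If the check returns False (for example when the device is shared with swap), that is a hint to review the configuration before the next panic.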

After checking the dump device configuration, you can check whether the server saved a dump file when it panicked.
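One way to do this check is to look in the savecore directory for vmcore.N files. The sketch below assumes the dumps follow that naming and lists them in sequence order; the helper name is hypothetical, and the demo runs against a throwaway directory rather than a real /var/crash path.

```python
import re
import tempfile
from pathlib import Path

# Matches names such as vmcore.0, vmcore.1, ... (N is the dump sequence number).
DUMP_NAME = re.compile(r"^vmcore\.(\d+)$")

def find_vmcore_files(savecore_dir):
    """Return the vmcore.N files in savecore_dir, ordered by N."""
    found = []
    for entry in Path(savecore_dir).iterdir():
        match = DUMP_NAME.match(entry.name)
        if match and entry.is_file():
            found.append((int(match.group(1)), entry))
    return [path for _, path in sorted(found)]

# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("vmcore.0", "unix.0", "vmcore.10", "vmcore.2", "bounds"):
        (Path(tmp) / name).touch()
    demo_names = [p.name for p in find_vmcore_files(tmp)]

print(demo_names)  # ['vmcore.0', 'vmcore.2', 'vmcore.10']
```

Sorting by the numeric suffix rather than by name keeps vmcore.10 after vmcore.2, so the last entry is the most recent dump.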

If you are working on a test server and need a crash dump file, you can deliberately panic the system; the panic takes a snapshot of memory and writes it to the dump device.

Next you need a server with the same architecture as the one that crashed (SPARC or x86) and with the SCAT package installed. For example, if the dump file was generated on a SPARC machine, you need a SPARC machine with SCAT installed.

Log in to My Oracle Support (MOS) and download “Oracle Solaris Crash Analysis Tool”. The patch ID is “p10364262_520_SOLARIS64”. Then install SCAT.

While installing the package you must accept the license. At the prompt “Please enter your response to the license:”, enter ACCEPT.

After the SCAT installation finishes, copy the vmcore.x file to the machine where you installed SCAT, then start the analysis with the scat command.

After SCAT finishes loading the dump, run “kstat  xck”.

Check the output, and also run “panic” to see why the server panicked.

The strings you need to check are:

  • Panic Strings:
    • panic string: BAD TRAP: type=31 rp=2a107ee31e0 addr=70 mmu_fsr=0 occurred in module "pxfs" due to a NULL pointer dereference
  • Trap:
    • -- trap data type: 0x31 (data access MMU miss) rp: 0x2a107ee31e0 --
  • Trap  Strings:
    • <trap>pxfs:void fobj_ii::lock_deleted(lock_descriptor*)+0x88()
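If you collect panic strings from several dumps, it can help to pull out the fields you will search for. The following is a minimal sketch, assuming the panic string has the same BAD TRAP layout as the example above; the regular expression and the function name are my own and are not part of SCAT.

```python
import re

PANIC = ('BAD TRAP: type=31 rp=2a107ee31e0 addr=70 mmu_fsr=0 '
         'occurred in module "pxfs" due to a NULL pointer dereference')

# Accepts straight or typographic quotes around the module name.
PATTERN = re.compile(
    r'BAD TRAP: type=(?P<type>[0-9a-f]+) rp=(?P<rp>[0-9a-f]+) '
    r'addr=(?P<addr>[0-9a-f]+).*?occurred in module ["“](?P<module>[^"”]+)["”]'
    r'(?: due to (?P<reason>.+))?'
)

def parse_bad_trap(panic_string):
    """Extract trap type, address, module and reason, or None if no match."""
    match = PATTERN.search(panic_string)
    if match is None:
        return None
    fields = match.groupdict()
    # type and addr are printed in hex without a 0x prefix.
    fields["type"] = int(fields["type"], 16)
    fields["addr"] = int(fields["addr"], 16)
    return fields

info = parse_bad_trap(PANIC)
print(hex(info["type"]), info["module"], info["reason"])
```

The module name together with the top function from the trap string (here pxfs and fobj_ii::lock_deleted) are usually the most useful search terms.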

Once you have all the data about the panic, you can search Google and MOS for more information and known bugs.

Run the help command to get the full list of commands available in SCAT.

I have added some of the checks, such as:

  • Memory usage
  • Process information
  • Memory errors

You can check the help output for more details.


Abdurrahim

I'm a System Engineer with extensive experience and administration skills, working for the Interbank Card Center of Turkey. I provide hardware and software support for the following Unix/Linux and Windows platforms: Oracle Solaris, HP-UX, Linux, IBM AIX, and Windows Servers.