informix >> CKPT REQ - Blocked:CKPT

by Rajib Sarkar » Tue, 09 Sep 2003 06:47:44 GMT






No .. don't do that ... I would suggest bouncing the SECONDARY and seeing
if both servers sync up or not ... otherwise you'll have to do the physical
restore to re-establish HDR again from scratch ...

Thanx much,

Rajib Sarkar
Advisory Software Engineer (RAS)
IBM Data Management Group
Ph : (602)-217-2100
Fax: (602)-217-2100
T/L : 667-2100

As long as you derive inner help and comfort from anything, keep it --
Mahatma Gandhi



From: antzz <member37779@dbforums.com>
Sent by: owner-informix-li XXXX@XXXXX.COM
Date: 09/08/2003 11:42 AM
To: XXXX@XXXXX.COM
cc:
Subject: Re: Re: CKPT REQ - Blocked:CKPT
Please respond to antzz

Rajib,

I have the same problem when I restored my latest backup to the Primary.
Can I just kill this user session that has the flag -X?

Thanks for your help.

P.S. I can ping my secondary so my network connection seems to be good.



Originally posted by Rajib Sarkar

> This means that the instance is waiting on a CHECKPOINT to complete. From
> the info, it looks like an HDR pair where the Primary is waiting for the
> checkpoint to complete.
>
> Check the onstat -u output and look for flags -X- on any of the sessions,
> which means this session is in a critical section of the code and the
> engine is waiting on that session to finish its work and then complete the
> checkpoint ...
>
> Since it's HDR, another scenario could be that the primary is waiting on the
> secondary to finish its checkpoint, as with checkpoints the
> primary-secondary pair sync up. In this situation, I think you had better
> check your network status as well ...
>
> HTH
>
> Thanx much,
> Rajib Sarkar
> Advisory Support Engineer (RAS)
> IBM Data Management Solutions
> Ph: 602-217-2100
>
> As long as you derive inner help and comfort from anything, keep it --
> Mahatma Gandhi
>
> Monday, June 23, 2003 10:44 AM
> To: XXXX@XXXXX.COM
> cc:
> Subject: CKPT REQ - Blocked:CKPT
>
> hello,
> When i run (onstat -) the return i get looks like this:
> Informix Dynamic Server 2000 Version 9.21.HC5 -- On-Line (Prim)
> (CKPT REQ) -- Up 3 days 00:31:02 -- 1102976 Kbytes
> Blocked:CKPT
>
> could someone enlighten me what this means when i see (CKPT REQ) and
> Blocked:CKPT ?
>
> this gets returned when the log says a checkpoint is not even
> happening....
>
> Thanks in advance!
> Tom
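
For anyone following the advice quoted above to scan onstat -u for sessions flagged with an X, here is a minimal sketch of a filter. It assumes the usual onstat -u layout (address, flags, sessid, user, ...) with a 7-character flags field in the second column. The function name sessions_with_x_flag is made up for illustration, and because the exact flag position is not spelled out in the thread, it matches an X anywhere in the flags field rather than at a fixed position.

```shell
# Hypothetical helper: reads "onstat -u" output on stdin and prints the
# sessions whose flags field contains an X.
# Assumed column layout: address flags sessid user ...
sessions_with_x_flag() {
  awk '
    # Userthread rows carry a 7-character flags field in column 2.
    $1 != "address" && NF >= 4 && length($2) == 7 &&
    $2 ~ /^[-A-Z*]+$/ && index($2, "X") > 0 {
      printf("sessid=%s user=%s flags=%s\n", $3, $4, $2)
    }'
}

# Typical use on a live instance:
#   onstat -u | sessions_with_x_flag
```

If this prints nothing while the banner still shows Blocked:CKPT, the HDR/secondary scenario described in the same reply is the other thing to look at.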


Similar Threads

1. Alter fragment stalled with CKPT REQ

2. CKPT REQ stops ontape -r recovery

Greetings, Family.

I am trying to "clone" a server from the ontape backup set of another.
It was going great for about 11 hours and then seems to have stopped
all activity.  From the online.log:

01:20:38  Maximum server connections 0
01:32:39  Checkpoint Completed:  duration was 0 seconds.
01:32:39  Checkpoint loguniq 494406, logpos 0xe0d018, timestamp: -2141117962

01:32:39  Maximum server connections 0
01:44:52  Checkpoint Completed:  duration was 0 seconds.
01:44:52  Checkpoint loguniq 494406, logpos 0xe0d018, timestamp: -2141117565

01:44:52  Maximum server connections 0
---------------
That's the last activity at 1:44 AM - it stops checkpointing after
that.

I have a script to monitor the total output by dbspace by filtering the
output of onstat -g iof.  It shows me that there is no chunk I/O taking
place.  For the first few hours of the recovery process I was running
this script every few minutes and watching the I/O count rise at a nice
clip.

Here's the last couple of lines of my output:

----  DBSpace           totalops dskread dskwrite io/s
...
DBSP  thddb_tscon041dbs   589820       0   589820   4.80
DBSP  thddb_tscon034dbs   589820       0   589820   4.80

Total IO-Stats:         33318739     122 33318617 275.30

That total I/O count, 33,318,739 has not changed since I noticed the
CKPT REQ about an hour ago.

I found an old thread from 2003 that sounded similar to this but that
one ended when the poster realized there was I/O activity on the
chunks.  No such luck here.

I have tried "onmode -c unblock", to no avail.  I did look into the
undocumented "onmode -O" but it gave a dire warning about marking
chunks/dbspaces down and requiring a recovery.  That would be a useless
exercise for me so I answered N and exited.

Any other ideas?  You have my rapt attention!  ;-(

Thanks much.

-- J.S.

PS.

Here's my script.  I will see about posting it to the IIUG library when
I get a round tuitt.

#!/usr/bin/ksh
# monitorDBspaceIO.sh
#
# Summarize the "onstat -g iof" chunk I/O counters per dbspace.
# Skip the first 5 lines of output, then sum every row that shows I/O.
onstat -g iof | tail -n +6 |
gawk '
BEGIN {
  total_io = 0
  dskread  = 0
  dskwrite = 0
  iopersec = 0.0
}
# Only consider rows whose totalops column is non-zero.
$3 != 0 {
  total_io += $3
  dskread  += $4
  dskwrite += $5
  iopersec += $6
  # The dbspace name is taken from the second dot-separated part of
  # column 2 (this relies on the chunk naming convention in use here).
  split($2, q_chunk, ".")
  cur_dbspace = q_chunk[2]
  dbspace_total_io[cur_dbspace] += $3
  dbspace_dskread[cur_dbspace]  += $4
  dbspace_dskwrite[cur_dbspace] += $5
  dbspace_iopersec[cur_dbspace] += $6
}
END {
  printf("---- DBSpace totalops dskread dskwrite io/s\n")
  for (dbsname in dbspace_total_io)
  {
    # Skip rows where no dbspace name could be extracted.
    if (length(dbsname) == 0)
      continue
    printf("DBSP %s %d %d %d %7.2f\n", dbsname,
           dbspace_total_io[dbsname], dbspace_dskread[dbsname],
           dbspace_dskwrite[dbsname], dbspace_iopersec[dbsname])
  }
  printf("\nTotal IO-Stats: %d %d %d %7.2f\n", total_io, dskread,
         dskwrite, iopersec)
}
'
BTW, the above output had been piped through "beautify-unl.sh -db",
which is available at IIUG.
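
The stall check described in this post (the total I/O count not moving between runs) can be automated with a small wrapper. This is only a sketch: the helper names total_io_ops and io_stalled are made up for illustration, and it assumes the summary line keeps the "Total IO-Stats:" label printed by the script above.

```shell
# Hypothetical helper: pull the totalops figure out of the
# "Total IO-Stats:" summary line produced by monitorDBspaceIO.sh.
total_io_ops() {
  awk '$1 == "Total" && $2 == "IO-Stats:" { print $3 }'
}

# Hypothetical helper: succeeds (exit status 0) when two samples of the
# total are equal, i.e. no chunk I/O happened between them.
io_stalled() {
  [ "$1" -eq "$2" ]
}

# Possible use, sampling one minute apart:
#   before=$(./monitorDBspaceIO.sh | total_io_ops)
#   sleep 60
#   after=$(./monitorDBspaceIO.sh | total_io_ops)
#   io_stalled "$before" "$after" && echo "no chunk I/O in the last minute"
```

The one-minute interval is arbitrary; a restore can legitimately pause for short periods, so a longer window may give fewer false alarms.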

3. ontape -r and CKPT REQ

4. Fast Recovery (CKPT REQ)?

Hi,

I have a database that is in Fast Recovery (CKPT REQ)
Blocked:CKPT

and it will not come up!

Have tried onmode -c unblock without any success.

How can I get around this Checkpoint state and get the database in
Online state?

Thankful for any help on this topic!!

Regards
Christian
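
Several of these posts quote the onstat banner with a Blocked: line. As a minimal sketch for scripting around that, the function below lists the block reasons one per line. The name blocked_reasons is made up for illustration, and it assumes the reasons follow "Blocked:" at the start of a line, separated by spaces, as in the banners quoted in these threads.

```shell
# Hypothetical helper: reads "onstat -" output on stdin and prints each
# reason listed on the "Blocked:" banner line, one per line.
blocked_reasons() {
  awk '/^Blocked:/ {
    sub(/^Blocked:/, "")
    n = split($0, reasons, " ")
    for (i = 1; i <= n; i++)
      print reasons[i]
  }'
}

# Typical use on a live instance:
#   onstat - | blocked_reasons
```

An empty result means the banner had no Blocked: line at the moment it was sampled.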

5. Read-Only (SDS) (CKPT REQ)

6. Blocked:CKPT DYNAMIC_LOG


Hi All,
 
IBM Informix Dynamic Server Version 11.10.UB4TL -- Fast Recovery (CKPT REQ) -- Up 00:33:28 -- 19552 Kbytes
Blocked:CKPT DYNAMIC_LOG

This is the downloaded trial version. The story so far, I ran out of logical
log space and the message in the online log is as follows :

************

Message Log File: /opt/IBM/cheetah/demo/server/online.log
09:53:11   Action: For better performance, increase the physical log buffer size to 128.
09:53:11  The current size of the logical log buffer is smaller than recommended.
09:53:12  IBM Informix Dynamic Server Initialized -- Shared Memory Initialized.
 
09:53:12  Physical Recovery Started at Page (1:576).
09:53:12  Physical Recovery Complete: 0 Pages Examined, 0 Pages Restored.
09:53:12  Logical Recovery Started.
09:53:12  10 recovery worker threads will be started.
09:53:12  WARNING! Physical Log size 2000 is too small.
          Physical Log overflows may occur during peak activity.
          Recommended minimum Physical Log size is 160 times maximum
          concurrent user threads.
 
09:53:15  Logical Recovery has reached the transaction cleanup phase.
09:53:15  ALERT: Because the oldest logical log (0) contains records
          from an open transaction (0x(nil)), the server is
          attempting to dynamically add a log file.  But there is no
          space available.  Please add a DBspace or chunk.  Then
          complete the transaction as soon as possible.
09:53:15  Waiting for Next Logical Log File to be Freed

************

so I followed the advice and tried to add a chunk to rootdbs, this hung so I
ended up having to kill the engine. onspaces won't work without the engine
running and the engine gets to this point and hangs

************

Checking group membership to determine server run mode...succeeded
Reading configuration file '/opt/IBM/cheetah/etc/onconfig.cheetah'...succeeded
Creating /INFORMIXTMP/.infxdirs...succeeded
Creating infos file "/opt/IBM/cheetah/etc/.infos.cheetah"...succeeded
Linking conf file "/opt/IBM/cheetah/etc/.conf.cheetah"...succeeded
Writing to infos file...succeeded
Checking config parameters...Invalid value of DUMPDIR '/usr/informix/tmp' in onconfig file. Setting it to default value '/opt/IBM/cheetah/tmp'...succeeded
Allocating and attaching to shared memory...succeeded
Creating resident pool 890 kbytes...succeeded
Allocating 2016 kbytes for buffer pool of 2K page size...succeeded
Initializing rhead structure...succeeded
Initializing ASF...succeeded
Initializing Dictionary Cache and SPL Routine Cache...succeeded
Bringing up ADM VP...succeeded
Creating VP classes...succeeded
Onlining 0 additional cpu vps...succeeded
Onlining 2 IO vps...succeeded
Initialization of Encryption...succeeded
Forking main_loop thread...succeeded
Initializing DR structures...succeeded
Forking 1 'soctcp' listener threads...succeeded
Starting tracing...succeeded
Initializing 1 flushers...succeeded
Initializing log/checkpoint information...succeeded
Opening primary chunks...succeeded
Opening mirror chunks...succeeded
Initializing dbspaces...succeeded
Validating chunks...succeeded
Initialize Async Log Flusher...succeeded
Initializing DBSPACETEMP list...succeeded

************

At this point however, the onspaces will at least start running but stays at
Verifying disk space message. If I control-C out of the initialisation, I
can see the chunk there but nothing moves on. If the engine is killed, the
chunk disappears. 

Creating a new dbspace brought forth the same symptoms and results.

I think the engine won't start due to lack of space but won't allow me to
add the space due to the engine not starting completely. Sounds very
logical.

************

The onstat -l shows :


IBM Informix Dynamic Server Version 11.10.UB4TL -- Fast Recovery (CKPT REQ) -- Up 00:37:55 -- 19552 Kbytes
Blocked:CKPT DYNAMIC_LOG

Physical Logging
Buffer bufused  bufsize  numpages   numwrits   pages/io
  P-1  0        16       0          0          0.00
      phybegin         physize    phypos     phyused    %used
      1:263            1000       313        0          0.00

Logical Logging
Buffer bufused  bufsize  numrecs    numpages   numwrits   recs/pages pages/io
  L-1  1        16       0          0          0          0.0        0.0
        Subsystem    numrecs    Log Space used

Buffer Waiting
Buffer  ioproc   flags
  L-1   444cc018 0x1      0

address  number   flags    uniqid   begin                size     used    %used
4450a4c8 1        U-B----  85       1:1263               1000     1000   100.00
4450a510 2        U---C-L  86       1:2263               1000     1000   100.00
4450a558 3        U-B----  81       1:3263               1000     1000   100.00
4450a5a0 4        U-B----  82       1:4263               1000     1000   100.00
4450a5e8 5        U-B----  83       1:5263               1000     1000   100.00
4450a630 6        U-B----  84       1:6263               1000     1000   100.00
 6 active, 6 total
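
A quick way to read a log listing like the one above is to tally the flags. The sketch below is an illustration only: the function name summarize_logical_logs is made up, and it assumes the row layout shown here (address, number, flags, uniqid, begin, size, used) with B in flag position 3 meaning backed up and C in position 5 meaning the current log.

```shell
# Hypothetical helper: reads "onstat -l" output on stdin and summarizes
# the logical-log rows. Assumed row layout:
#   address number flags uniqid begin size used [%used]
summarize_logical_logs() {
  awk '
    BEGIN { current = "none" }
    # Log rows: numeric log number in column 2, 7-character flags in column 3.
    NF >= 7 && $2 ~ /^[0-9]+$/ && length($3) == 7 && $3 ~ /^[-A-Z]+$/ {
      total++
      if ($6 == $7) full++                          # used pages equal size
      if (substr($3, 3, 1) == "B") backed_up++      # position 3: backed up
      if (substr($3, 5, 1) == "C") current = $2     # position 5: current log
    }
    END {
      printf("total=%d full=%d backed_up=%d current=%s\n",
             total, full, backed_up, current)
    }'
}

# Typical use on a live instance:
#   onstat -l | summarize_logical_logs
```

Run against the listing in this post, every log counts as full, which matches the server's complaint that it has nowhere to add another log file.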

************

I am a bit worried as to what would happen on a production server if this
should happen. I am unable to follow the advice in the online log and am
curious as to how I can get this back up and running again or if this is one
of those occasions where one wishes one had a backup and the oninit -i is
the only way to go.

Any ideas greatly appreciated.

Thanks and regards
 
---------------------------------------------------------------
********** _/     **********  David Logan 
*******   _/         *******  ITO Delivery Specialist - Database
*****    _/            *****  Hewlett-Packard Australia Ltd
****    _/_/_/  _/_/_/  ****  E-Mail:  XXXX@XXXXX.COM 
****   _/  _/  _/  _/   ****  Desk:   +61 8 8408 4273
****  _/  _/  _/_/_/    ****  Mobile: +61 417 268 665
*****        _/       ******    
******      _/      ********  Postal: 148 Frome Street,
********   _/     **********          Adelaide SA 5001
                                      Australia 
i    n    v    e    n    t                                   
---------------------------------------------------------------


7. CKPT Blocked

8. Enable 32K Block in 8K Block DB

"Frank van Bortel" < XXXX@XXXXX.COM > wrote in message
news:c4otl4$3fg$ XXXX@XXXXX.COM ...
> Bottom line (still) is - don't use buffered IO; use direct or raw
> (why do you think every benchmark of oracle still uses raw?!?)

  Really, Oracle benchmarks all use raw? As an Informix person, why did
  I used to keep hearing

    "according to Oracle raw devices are not faster, in fact filesystems are
    quicker"

   FROM MY CUSTOMERS then?
   ^^^^^^^^^^^^^^^^^^^^^^

   Used to really annoy me... so Oracle have come around to my way of
   thinking then?
   As it's faster? What a surprise...


>
> Now - how do I convince the SA to go raw on his HP machine with VX7100,
> where he found a benchmark that proves he was right to create the
> filesystem with 8k blocks?

    Dunno.

>
> -- 
>
> Regards,
> Frank van Bortel
>