Discussion:
Has Reclaim Storage become outdated?
Graap, Kenneth
2014-10-13 06:57:53 UTC
I have a Power 720 system with 2 active processors, 96GB of RAM and an 8TB iASP that is 50% utilized. Not a large system by today's standards.

For a reason that is still being determined, this system CRASHED HARD (it immediately took an MSD and IPLed) right at the beginning of a work day.

After a hard crash like this, it is strongly recommended that an RCLSTG be run.

After negotiating a 12 hour window on Sunday, I started running RCLSTG against the 8TB iASP...

After more than 9 hours, the "Reading objects from disk" step was only 51% complete, with an estimated remaining time of almost 9 more hours!

                     Reclaim Storage in Progress                          S02
                                                         10/12/14  23:01:46
RCLSTG:
  Select/Omit/ASP device or group :   *ALL   *NONE   IASP1
  Start date and time . . . . . . :   10/12/14  13:47:38
  Current step / total . . . . .  :   2   7

Reclaim Storage Step                      Percent  Time Elapsed  Time Remaining
Data base/library/directory recovery        100      00:00:11       00:00:00
Reading objects from disk                    51      09:13:54       08:52:10
Processing data base relationships            0
File ID table recovery                        0
Directory recovery                            0
Object description verification               0
Final cleanup                                 0

Total . . . . . . . . . . . . . . . . . . . . :      09:14:05
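
As an aside on the screen above: the Time Remaining figure for step 2 is consistent with a plain straight-line extrapolation from the percent complete and the elapsed time. The sketch below is only a check of that arithmetic. It assumes the estimate is elapsed * (100 - percent) / percent, which nothing in this thread confirms, and the function names are made up for illustration. Because the displayed percent is a whole number, the exact match may be partly luck.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds: int) -> str:
    """Convert a number of seconds back to HH:MM:SS."""
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def remaining(elapsed_hms: str, percent_done: int) -> str:
    """Linear extrapolation (an assumption, not a documented RCLSTG formula)."""
    elapsed = to_seconds(elapsed_hms)
    # Integer division truncates any fractional second.
    return to_hms(elapsed * (100 - percent_done) // percent_done)


# Values taken from the "Reading objects from disk" line of the screen.
print(remaining("09:13:54", 51))  # prints 08:52:10
```

On that assumption, step 2 alone was heading for roughly 18 hours, well past the 12 hour window, with five more steps still to run.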

I had to abort the Reclaim Storage process and bring up the system, knowing full well that when (or if) I restart RCLSTG it will start from the beginning again.

It seems like the RCLSTG process needs to be changed so that it can complete in a reasonable amount of time. Maybe it could be designed to multitask, or to keep track of what it has already done so that a restarted run could continue where it left off. All I know is that it isn't working for me...

Has anyone else experienced this?

Kenneth
Kenneth E. Graap
NW Natural
System Administrator for IBM Power Systems
503.226.4211 x5537
http://www.linkedin.com/in/kennethgraap
--
This is the Midrange Systems Technical Discussion (MIDRANGE-L) mailing list
To post a message email: MIDRANGE-L-Zwy7GipZuJhWk0Htik3J/***@public.gmane.org
To subscribe, unsubscribe, or change list options,
visit: http://lists.midrange.com/mailman/listinfo/midrange-l
or email: MIDRANGE-L-request-Zwy7GipZuJhWk0Htik3J/***@public.gmane.org
Before posting, please take a moment to review the archives
at http://archive.midrange.com/midrange-l.
Jim Oberholtzer
2014-10-13 11:16:40 UTC
RCLSTG speed really depends on how many objects you have on your system.
YES, it is still needed after a crash. I am a bit surprised at how
long it was taking, but that also implies that you might have quite a few
objects to fix as well. You mention an iASP; does that imply PowerHA?

--
Jim Oberholtzer
Chief Technical Architect
Agile Technology Architects


Graap, Kenneth
2014-10-14 02:25:07 UTC
Post by Jim Oberholtzer
RCLSTG speed really depends on how many objects you have on your system.
YES it is still needed after a crash. I guess I am a bit surprised at
how long it was taking but that also implies that you might have quite a
few objects to fix as well. iASP so does that imply PowerHA?
Yes, we are doing Geographic Mirroring using PowerHA.

Yes, we do have LOTS of objects in our iASP, especially in the IFS on the iASP.

Would turning on the 4 additional processors available in our POWER7 720 (we only have 2 active now) significantly speed up the Reclaim Storage process?

Reply or Forwarded mail from: Kenneth E Graap
Mark S Waterbury
2014-10-13 15:25:31 UTC
Hi, Kenneth:

A single large IASP of ~8TB, even if only 50% occupied, will take a long
time for RCLSTG to analyze.

The following approach may help ...

Instead of one large IASP of 8TB, why not have 8 IASPs of 1TB each?
That way, in the event of a system crash, you could attach and vary on
each separate IASP in a different LPAR and then run the RCLSTG process
"in parallel" ... I would think this should complete much faster than
running RCLSTG against a single large (8 TB) IASP.

Then, once that "parallel RCLSTG" is completed, you can vary off the
IASPs, then attach them and vary them all on to the primary "production"
LPAR once again.

Someone else mentioned "High Availability" -- for example, this approach
would also permit you to vary off one IASP, attach it to another LPAR
and run a full backup of that IASP on that LPAR, then vary it off and
back onto the "live" production LPAR once again. If you group libraries
into these IASPs based on which "applications" use those libraries, this
could allow you to keep your "warehouse" applications "up and running"
while the "financials" applications are being backed up, for instance.

HTH,

Mark S. Waterbury
DrFranken
2014-10-13 15:46:11 UTC
While I appreciate the CONCEPT here of allowing more parallelization of
the reclaim, I don't at all support it for a production system. Here's why:

First, in the last decade I can name two systems that really benefited from
RCLSTG, and one of those had multiple hard crashes (due to power and
storms) in a three day period. Eventually it had to have one to correct
issues.

Second, it is A LOT of work in most cases to break up workload by ASP.

Third, it could cost A LOT of money to have enough disk arms available,
preferably in separate RAID sets, to supply adequate performance for each
ASP.

Fourth, there are many parts of RCLSTG that can be run with the system up
already.

Fifth, what seems to be the most important part of RCLSTG, the *DBXREF
part, is usually a small fraction of the overall time, so I'd start with
that alone.

So the risk/reward balance here doesn't work for me.

Remember you can put 8TB EASILY in two drawers of disk today with lots
of open slots and hot spares. 8TB isn't a large system.

- Larry "DrFranken" Bolhuis

www.frankeni.com
www.iDevCloud.com
www.iInTheCloud.com
Mark S Waterbury
2014-10-13 16:23:15 UTC
Hi, Larry:

Your insights into such matters are always appreciated. Thanks.

Mark S. Waterbury
Steinmetz, Paul
2014-10-13 17:27:36 UTC
Larry,

Is it correct that RCLSTG is NOT recommended as much as it was in the S/38 and AS/400 days?
Is it also correct that some of the RCLSTG code is executed automatically on an IPL, if and where needed?
The last time I worked on an issue, IBM support recommended NOT running a full RCLSTG.
What they recommended instead, following a V6R1 to V7R1 upgrade issue, was RCLSTG SELECT(*DBXREF) OMIT(*NONE) ASPDEV(*SYSBAS).

Paul

Graap, Kenneth
2014-10-16 15:28:06 UTC
Thanks, everyone, for your responses regarding RCLSTG.

IBM SupportLine has informed me that an RCLSTG *ALL should only be run when requested by IBM Support personnel.

Recovery after a crash is much easier now since the RCLSTG command has been changed to allow pieces of it to run separately (for example *DBXREF).

This change and the use of system-managed access-path protection (SMAPP) make it possible to recover from a crash within minutes instead of days!

For example, after the recent crash of our system, none of these large access paths had to be rebuilt because they were protected by SMAPP:

                 Display Protected Access Paths

                                   Estimated
                                   Recovery
File        Library     ASP        If Not Protected
NARLCUS     P1FILES     IASP1      02:37:46
NARL01      P1FILES     IASP1      01:56:29
NARLPRM     P1FILES     IASP1      01:54:07
NARL_PI8    P1FILES     IASP1      01:50:59
NARL_PIA    P1FILES     IASP1      01:50:58
NARL_PI3    P1FILES     IASP1      01:50:11
NARLUTDT    P1FILES     IASP1      01:46:24
NARL_PI1    P1FILES     IASP1      01:45:10
NARLTR#     P1FILES     IASP1      01:42:59
NARL_PI2    P1FILES     IASP1      01:42:03
NARL_PIC    P1FILES     IASP1      01:41:09
NARL_PI7    P1FILES     IASP1      01:41:03
NARL_PIH    P1FILES     IASP1      01:40:40
NARL_PI5    P1FILES     IASP1      01:37:30
NARL_PIG    P1FILES     IASP1      01:36:00
NARL_PIE    P1FILES     IASP1      01:30:45
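
To put that list in perspective, the "If Not Protected" estimates can simply be added up. The sketch below does only that. It assumes the access paths would have been rebuilt one after another, which the thread does not say, so treat the total as a rough figure and not as the outage that would actually have occurred. The variable and function names are made up for illustration.

```python
# "Estimated Recovery If Not Protected" values copied from the list above.
ESTIMATES = [
    "02:37:46", "01:56:29", "01:54:07", "01:50:59",
    "01:50:58", "01:50:11", "01:46:24", "01:45:10",
    "01:42:59", "01:42:03", "01:41:09", "01:41:03",
    "01:40:40", "01:37:30", "01:36:00", "01:30:45",
]


def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Assumption: rebuilds run serially, so the estimates simply add up.
total = sum(to_seconds(t) for t in ESTIMATES)
hours, rem = divmod(total, 3600)
minutes, seconds = divmod(rem, 60)
summary = f"{len(ESTIMATES)} access paths, {hours}:{minutes:02d}:{seconds:02d} if rebuilt serially"
print(summary)  # prints 16 access paths, 28:44:13 if rebuilt serially
```

Under that assumption, these 16 access paths alone account for more than a day of rebuild time that did not have to be spent.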

My original premise that RCLSTG has become outdated was misguided. It has actually evolved into an even better tool for recovery of the system in the event of a crash.


Reply or Forwarded mail from: Kenneth E Graap
Justin C. Haase
2014-10-16 21:00:43 UTC
A full reclaim is not recommended as part of "normal maintenance" and
shouldn't be run unless directed by support. See the techdoc below for the
guidance if people don't believe you. Many customers still do this
monthly, quarterly or annually just because they always have. Presenting
this doc can help break that cycle and save valuable technical
staff time!

http://www-01.ibm.com/support/docview.wss?uid=nas8N1010683

r***@public.gmane.org
2014-10-17 11:07:30 UTC
Then again, people like me who still run it periodically have discovered
issues with it that IBM has had to correct. That gets it fixed for when
others need it.


Rob Berendt
--
IBM Certified System Administrator - IBM i 6.1
Group Dekko
Dept 1600
Mail to: 2505 Dekko Drive
Garrett, IN 46738
Ship to: Dock 108
6928N 400E
Kendallville, IN 46755
http://www.dekko.com





r***@public.gmane.org
2014-10-13 17:20:49 UTC
I have multiple lpars. Some lpars run RCLSTG in under an hour, while
other lpars on the same machine took over 6 hours.
Length of time between RCLSTG's doesn't seem to matter. Certain lpars
just take longer.
Another machine's lpars varied between under an hour and up to 8 hours.

IDK if it was the Power 8 upgrade or the serious intense cleanup that an
unload/reload does which drastically shortened the RCLSTG run time.
Back before STRASPBAL, unload/reload was the preferred method to rebalance
disks, so it may still have some therapeutic qualities.

If your business is going to die with such an outage what are your DR/HA
plans?

Rob Berendt
--
IBM Certified System Administrator - IBM i 6.1
Group Dekko
Dept 1600
Mail to: 2505 Dekko Drive
Garrett, IN 46738
Ship to: Dock 108
6928N 400E
Kendallville, IN 46755
http://www.dekko.com





Graap, Kenneth
2014-10-14 02:32:39 UTC
If your business is going to die with such an outage what are your DR/HA plans?
We are mirroring to a Power6 system located off site. It is too small to take on our current load. Its intent was to be used in a diminished role in the case of a loss of our entire data center, a decision that was made several years ago to "minimize" the cost of getting into a mirrored environment at all!

We are currently looking at plans to go to 2 mirrored Power8 systems at our primary site using two V840 SANs and then replicate via Global Mirroring to another V840 SAN offsite that is attached to another Power8 of equal capacity.

Reply or Forwarded mail from: Kenneth E Graap
CRPence
2014-10-14 15:24:14 UTC
Post by Graap, Kenneth
I have a Power720 system with 2 active processors, 96GB of RAM and
an 8TB iASP - 50% utilized. Not a large system by today's standards.
For some reason that is still being determined, this system CRASHED
HARD (immediately did a MSD and IPL'ed) right at the beginning of a
work day.
If there was a successful MSD, then the crash actually was on the
softer side of "hard"; given a scale of softest to hardest. While *any*
crash on a production system is "hard" from the perspective of the
system\partition owner and users, any crash that both generates and
properly stores a full Main Storage Dump (MSD) is relatively "soft"
from the perspective of the OS.
Post by Graap, Kenneth
After a hard crash like this it is strongly recommended that a
RCLSTG process be run.
Surely recommended after the hardest of crashes: a crash that amounts to
a pulled power plug, with either a failed UPS or no UPS, where no
shutdown processing is performed and not even an MSD is produced. For
other "softer" crashes [mostly software, and sometimes, though less so,
those due to limits], where the MSD was able to be written to disk,
probably not so strongly.

AFaIK, the reclaim is not really "strongly recommended" anymore,
outside of a power loss. My recollection is that IBM has long been
advocating that no Reclaim Storage (RCLSTG) be performed except when
recommended by service [as likely resolution to an identified issue for
which an alternate corrective is not possible or unreasonable] or given
error messages for which the request was identified by the system as a
preferable recovery action [though those are more likely to be about and
thus to be resolved by, a reclaim of just the *DBXREF, rather than using
a full reclaim]. That direction to discourage the request, due entirely
to the requirements to perform the reclaim being incompatible with most
service\up-time requirements, was a direct consequence of the systems
having been enabled to become and thus becoming much larger due to
having extended system limits over time. There should be some KB
articles [now called TechNote documents] that imply the avoidance of a
full reclaim is most desirable, and that other means are available for
corrections of specific issues that a full RCLSTG might otherwise be
used as resolution; e.g. Reclaim DB Cross Reference (RCLDBXREF) [or
RCLSTG SELECT(*DBXREF)] for System Database XREF issues, Change Object
Owner (CHGOBJOWN) and Grant Object Authority (GRTOBJAUT) to correct
ownership\authority issues, Reclaim Objects by Owner (RCLOBJOWN) to
correct out-of-context /QSYS.LIB objects, and Reclaim Object Links
(RCLLNK) for some file-system issues.
Post by Graap, Kenneth
After negotiating a 12 hour window on Sunday, I started running
RCLSTG against the 8TB iASP...
As an iASP reclaim, I seem to recall that given their probable and
preferred use in HA, then the backup\mirror iASP would be activated in
place of the one taken offline, and thus that offline copy could be
reclaimed _while_ the other iASP is active in its place.?
Post by Graap, Kenneth
After over 9 hours, the "Reading objects from disk" step was only
51% complete with an estimated remaining time of almost 9 more
hours!
I do not recall what is a reasonable estimate for that phase. Long
ago I recall some user experiences being documented with timings and
some specific configuration information; gathered from actual reclaim
requests, from the data stored in the QRCLSTG data area.?
Post by Graap, Kenneth
Reclaim Storage in Progress S02
10/12/14 23:01:46
Select/Omit/ASP device or group : *ALL *NONE IASP1
Start date and time . . . . . . : 10/12/14 13:47:38
Current step / total . . . . . : 2 7
Reclaim Storage Step                      Percent  Time Elapsed  Time Remaining
Data base/library/directory recovery        100      00:00:11       00:00:00
Reading objects from disk                    51      09:13:54       08:52:10
Processing data base relationships            0
File ID table recovery                        0
Directory recovery                            0
Object description verification               0
Final cleanup                                 0
Total . . . . . . . . . . . . . . . . . . . . :      09:14:05
The "almost 9 more hours" was solely for that one phase of
processing; i.e. the estimated Time Remaining includes just that
currently active step [as designated with the greater than sign on the
same line of output].?
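The per-step reading is consistent with the numbers on the display: a plain linear extrapolation from the elapsed time and percent complete of the "Reading objects from disk" step alone reproduces the Time Remaining shown. The sketch below illustrates that arithmetic; that RCLSTG actually derives its estimate this way is an assumption, and the function names are invented for the example.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_hms(total: int) -> str:
    """Convert a number of seconds back to HH:MM:SS."""
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def remaining_for_step(elapsed_hms: str, percent_done: int) -> str:
    """Linearly extrapolate the time left in one step.

    Assumption: progress within the step is uniform, so the time left is
    elapsed * (100 - percent) / percent, truncated to whole seconds.
    """
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be between 1 and 100")
    elapsed = hms_to_seconds(elapsed_hms)
    return seconds_to_hms(elapsed * (100 - percent_done) // percent_done)


if __name__ == "__main__":
    # Values from the "Reading objects from disk" line of the display.
    print(remaining_for_step("09:13:54", 51))  # 08:52:10
```

Because the result matches the displayed 08:52:10, the estimate evidently covers step 2 only; the five steps still at 0 percent are not included in it.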

FWiW: The RECLAIM instruction also implements the Retrieve Disk
Information (RTVDSKINF) request. Thus the requirements for that similar
phase of processing [listing and some minimal review\processing of the
objects in the permanent storage directory] in both, could probably be
inferred from making and timing that RTVDSKINF request.
Post by Graap, Kenneth
I had to abort the Reclaim Storage process and bring up the system,
knowing full well that when (or if) I restart RCLSTG it will start from
the beginning again.
It seems like the RCLSTG process needs to be modified in some way
that it can complete in a reasonable amount of time. Maybe it could
be designed to multitask (??)
The RECLAIM instruction already operates with multiple LIC tasks to
perform the stage shown; obtaining the list of every /object/ from the
Permanent Storage Directory. I believe the number of tasks that are
used for the request, is calculated based on the CPU [number] and
storage [size; possibly also arms] configurations; possibly also the
available memory. A review of the [number of] RCxxx tasks in the LIC
task list [IIRC: STRSST, D/A/D, ...] would show how many of those LIC
tasks are active for the RECLAIM instruction.
Post by Graap, Kenneth
or keep track of what it had done so if restarted it could continue
where it left off (??) ...
If the operation is both discouraged and has lower-impact
alternatives, then the costs to achieve that capability [by IBM] for the
perceived benefits [of the few using the feature] likely would be hard
to justify; i.e. the benefits would be experienced very seldom and by few.

IMO the request is most often performed more for the placebo effect
than for any legitimate effect. But the pain of that pill [the cost of
the reclaim] often far outweighs the actual benefits that the reclaim
could offer as relief for the pain to the system, in care of the
system-owner.
Post by Graap, Kenneth
All I know is it isn't working for me...
Has anyone else experienced this?
Consider: If a scratch-install Disaster Recovery of the disks is
faster than a RCLSTG, then performing that DR restore and applying
changes since the last save\backup, could be considered a /better/
choice than the reclaim; i.e. effectively the same result [though an
even /cleaner/ effect], achieved quicker.
--
Regards, Chuck
DrFranken
2014-10-14 16:42:21 UTC
Post by CRPence
As an iASP reclaim, I seem to recall that given their probable and
preferred use in HA, then the backup\mirror iASP would be activated in
place of the one taken offline, and thus that offline copy could be
reclaimed _while_ the other iASP is active in its place.?
Well yes, this could be done. In PowerHA this would be a CHGASPSSN with
*DETACH as the action. Then you would be able to bring the backup copy
online and run the full reclaim on that iASP there. This would
work marvelously and allow production to continue.

BUT then there is a problem. When you *REATTACH the backup site to
production, any changes done there by the RCLSTG to that copy will be
overwritten by the current copy on production. Oops!

So for that to be in any way effective one would need to do some sort of
object level compare to see what's different between production and
backup, then apply those changes to production before reestablishing the
mirroring. That seems fabulously problematic to me.


- Larry "DrFranken" Bolhuis

www.frankeni.com
www.iDevCloud.com
www.iInTheCloud.com