
[Bug] [zos_copy] Return error message when concurrent copy fails #586


Closed
rexemin opened this issue Dec 8, 2022 · 2 comments · Fixed by #794
Assignees
Labels
Bug Something isn't working as designed. In Plan Issue has been accepted put into a planned release

Comments

@rexemin
Collaborator

rexemin commented Dec 8, 2022

Bug description

When zos_copy tries to copy a source dataset into a destination dataset that is already in use by another program (with DISP=SHR), the module reports a successful task but nothing is actually copied into the destination. This bug is related to #357, so while support for other disposition modes is being explored, the module should at least return an error message when this happens.

This is also related to #559; more information about the issue can be found there.

Playbook verbosity output

No response

Contents of ansible.cfg

No response

Contents of the inventory

No response

Contents of group_vars or host_vars

No response

Ansible version

ansible [core 2.13.5]
  config file = /Users/bpanyar/github/IBMZSoftware/ims-containerization/ansible.cfg
  configured module search path = ['/Users/bpanyar/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.10/site-packages/ansible
  ansible collection location = /Users/bpanyar/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible
  python version = 3.10.6 (main, Aug 11 2022, 13:49:25) [Clang 13.1.6 (clang-1316.0.21.2.5)]
  jinja version = 3.1.2
  libyaml = True

IBM z/OS Ansible core Version

v1.4.0-beta.2

IBM ZOAU version

v1.1.1

z/OS version

No response

Ansible module

zos_copy

@rexemin rexemin added the Bug Something isn't working as designed. label Dec 8, 2022
@rexemin rexemin added this to the [Backlog] Bugs milestone Dec 8, 2022
@ddimatos ddimatos added Backlog and removed Backlog labels Mar 27, 2023
@ddimatos ddimatos added the In Plan Issue has been accepted put into a planned release label Mar 28, 2023
@ddimatos
Collaborator

ddimatos commented Mar 29, 2023

After talking to @rexemin: using mls from ZOAU won't work for all cases. When the member being copied already exists in the destination (it is not a new member), mls returns the same listing before and after the copy.
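
To make the limitation concrete, here is a minimal sketch. It assumes the member listings have already been obtained (for example from mls) and are passed in as plain lists of names; `members_added` is a hypothetical helper, not part of ZOAU or the collection.

```python
def members_added(before, after):
    """Return member names present after the copy but not before it."""
    return sorted(set(after) - set(before))


# Copying a brand-new member shows up as a difference in the listing.
print(members_added(["MEMA", "MEMB"], ["MEMA", "MEMB", "MEMC"]))

# Overwriting an existing member leaves the listing unchanged, so a silent
# failure and a successful overwrite look exactly the same.
print(members_added(["MEMA", "MEMB"], ["MEMA", "MEMB"]))
```

Because the second case yields an empty difference either way, comparing listings cannot tell whether the copy actually happened.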

Notes:
When another process or task opens a data set, one of its choices is to open it as DISP=SHR. A started task (think of a started task as a long-running process) will often open a data set as DISP=SHR because it wants to access it for a very long time. Think of a PDS/E as a directory: it can hold many files (members) that the started task needs to read at different times, so the started task is being kind and sharing the PDS/E with other tasks while it holds it open. Think of this as a file that can have many readers, hence it is opened as DISP=SHR.

In other words:

  • DISP=SHR - the data set must exist, and all others can READ it concurrently (at the same time your process is reading it); all others, including your process, must use DISP=SHR as well.

The problem is that we don't know when a data set is opened as DISP=SHR. We use dcp to copy into it and, because of how it's implemented, we saw no error (not sure why yet without a manual recreate). I would suggest creating a PDS/E with some members, then copying to it from the shell using dcp while you have the PDS/E open in ISPF (this will mimic DISP=SHR). I would try USS file --> PDSE(member) and PDSE(member) --> PDSE(member) to see whether dcp reports an error that the code is losing.

If dcp gives us no error, we can check programmatically, depending on the use case:

  • Are we copying into a SEQ? If so, check if it's open as DISP=SHR with D GRS,RES=(*,IMSTESTL.COMNUC)
  • Are we copying to a VSAM? If so, check if it's open as DISP=SHR with D GRS,RES=(*,IMSTESTL.COMNUC)
  • Are we copying to a PDS or PDSE? If so, check if it's open as DISP=SHR with D GRS,RES=(*,IMSTESTL.COMNUC)

You will need to write a routine that performs a D GRS,RES=(*,IMSTESTL.COMNUC) command and returns true or false, with some regex to extract the EXC/SHR column and look for SHARE. If the data set is not held with DISP=SHR, that column will not be present at all.

Examples:

With a lock on the data set:

00- 21.05.08           D GRS,RES=(*,IMSTESTL.COMNUC)                         
    21.05.08           ISG343I 21.05.08 GRS STATUS 830                      C
    S=SYSTEM  SYSDSN   IMSTESTL.COMNUC                                       
    SYSNAME        JOBNAME         ASID     TCBADDR   EXC/SHR    STATUS      
    EC33012A  USRT001            0057       006FEE88   SHARE      OWN      

Without a lock:

  00- 21.07.01           D GRS,RES=(*,IMSTESTL.COMNUC)                         
      21.07.01           ISG343I 21.07.01 GRS STATUS 836                      C
      NO REQUESTORS FOR RESOURCE  *        IMSTESTL.COMNUC
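
A minimal sketch of the parsing half of such a routine, run against the two sample outputs above. It only parses text that has already been captured; how the operator command is issued is left out. The function names are hypothetical, and matching EXCLUSIVE as the other possible value of the EXC/SHR column is an assumption, since only SHARE appears in the samples.

```python
import re

# Matches a requestor row of the ISG343I output, for example:
# "EC33012A  USRT001   0057   006FEE88   SHARE   OWN"
# ASSUMPTION: the EXC/SHR column holds either SHARE or EXCLUSIVE.
_REQUESTOR_ROW = re.compile(
    r"^\s*(?P<sysname>\S+)\s+(?P<jobname>\S+)\s+(?P<asid>[0-9A-F]{4})\s+"
    r"(?P<tcb>[0-9A-F]{8})\s+(?P<mode>SHARE|EXCLUSIVE)\s+(?P<status>\S+)\s*$",
    re.MULTILINE,
)

LOCKED = """\
00- 21.05.08           D GRS,RES=(*,IMSTESTL.COMNUC)
    21.05.08           ISG343I 21.05.08 GRS STATUS 830                      C
    S=SYSTEM  SYSDSN   IMSTESTL.COMNUC
    SYSNAME        JOBNAME         ASID     TCBADDR   EXC/SHR    STATUS
    EC33012A  USRT001            0057       006FEE88   SHARE      OWN
"""

UNLOCKED = """\
  00- 21.07.01           D GRS,RES=(*,IMSTESTL.COMNUC)
      21.07.01           ISG343I 21.07.01 GRS STATUS 836                      C
      NO REQUESTORS FOR RESOURCE  *        IMSTESTL.COMNUC
"""


def grs_requestors(output):
    """Return a (jobname, mode) pair for every requestor row in the output."""
    return [(m.group("jobname"), m.group("mode"))
            for m in _REQUESTOR_ROW.finditer(output)]


def is_held_shared(output):
    """True if any requestor holds the resource in SHARE mode."""
    return any(mode == "SHARE" for _, mode in grs_requestors(output))


print(is_held_shared(LOCKED))    # True
print(is_held_shared(UNLOCKED))  # False
```

The header row and the NO REQUESTORS line do not match because the third and fourth columns must be hexadecimal (ASID and TCB address), so only real requestor rows are picked up.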

I don't know of another clean way right now to check whether a data set is opened as DISP=SHR. We could try to open the data set as MODIFY and see whether we get an error or not, but I think the GRS command is better.

There is a very small window between the time we check whether the data set is opened as DISP=SHR and the time we start to copy into it, long enough for someone to possibly open it as DISP=SHR after we checked that it was not. This is a very small window, and given this is a stop-gap (temporary) solution to block copies while we work on supporting DISP=SHR, it's probably acceptable.
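
The check-then-copy flow could be sketched as below. This is not the module's implementation: `guarded_copy`, `DataSetInUseError`, and the two callables are hypothetical names standing in for the real lock check and the real copy, so the sketch stays independent of how either is done.

```python
class DataSetInUseError(Exception):
    """Raised when the destination is held by another task."""


def guarded_copy(dest, check_in_use, copy):
    """Refuse to copy when `dest` is in use; otherwise run the copy.

    `check_in_use(dest)` returns True if the data set is held (for example,
    based on D GRS output); `copy(dest)` performs the actual copy.
    """
    if check_in_use(dest):
        raise DataSetInUseError(
            f"{dest} is in use by another task (DISP=SHR); nothing was copied"
        )
    # Race window: another task could still open `dest` between the check
    # above and the copy below. Accepted here as a stop-gap.
    copy(dest)


copied = []
guarded_copy("IMSTESTL.COMNUC", lambda d: False, copied.append)
print(copied)  # ['IMSTESTL.COMNUC']

try:
    guarded_copy("IMSTESTL.COMNUC", lambda d: True, copied.append)
except DataSetInUseError as err:
    print(err)
```

Raising instead of returning a flag means the caller cannot report success by accident, which is the behavior this issue asks for.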

@ddimatos
Collaborator

ddimatos commented Apr 6, 2023

Although this can't be used for automation, it can help during development: if I split my x3270 session in two, in the top half I enter the PDSE with an E for edit, then in the bottom half I find the data set and put qds next to it.
Now DISP=SHR is held by user USRT001.


@AndreMarcel99 AndreMarcel99 moved this from 📗In plan to 👀 Reviewing in IBM Ansible z/OS Core Collection May 16, 2023
@AndreMarcel99 AndreMarcel99 moved this from 👀 Reviewing to 🏗 In progress in IBM Ansible z/OS Core Collection May 23, 2023
@AndreMarcel99 AndreMarcel99 moved this from 🏗 In progress to 🔍 Validation in IBM Ansible z/OS Core Collection Jun 5, 2023
@AndreMarcel99 AndreMarcel99 moved this from 🔍 Validation to ✅ Done in IBM Ansible z/OS Core Collection Jun 28, 2023
4 participants