We need to write a program to retry REDCap disconnected files/subjects #127

Open
tashrifbillah opened this issue Aug 15, 2024 · 9 comments

Comments

@tashrifbillah

tashrifbillah commented Aug 15, 2024

urllib3.exceptions.MaxRetryError
 CP00102_daily_activity_and_saliva_sample_collection.csv 
Traceback (most recent call last):
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/data/predict1/miniconda3/lib/python3.10/socket.py", line 955, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connection.py", line 358, in connect
    self.sock = conn = self._new_conn()
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x2b9150649cf0>: Failed to establish a new connection: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='redcap.partners.org', port=443): Max retries exceeded with url: /redcap/api/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x2b9150649cf0>: Failed to establish a new connection: [Errno -2] Name or service not known'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/predict1/utility/rpms_to_redcap.py", line 387, in <module>
    r = requests.post('https://redcap.partners.org/redcap/api/', data= fields)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/api.py", line 115, in post
    return request("post", url, data=data, json=json, **kwargs)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/adapters.py", line 565, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='redcap.partners.org', port=443): Max retries exceeded with url: /redcap/api/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x2b9150649cf0>: Failed to establish a new connection: [Errno -2] Name or service not known'))
urllib3.exceptions.ProtocolError
 CP00102_missing_data.csv 
Traceback (most recent call last):
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "/data/predict1/miniconda3/lib/python3.10/http/client.py", line 1374, in getresponse
    response.begin()
  File "/data/predict1/miniconda3/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/data/predict1/miniconda3/lib/python3.10/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/util/retry.py", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/packages/six.py", line 769, in reraise
    raise value.with_traceback(tb)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "/data/predict1/miniconda3/lib/python3.10/http/client.py", line 1374, in getresponse
    response.begin()
  File "/data/predict1/miniconda3/lib/python3.10/http/client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "/data/predict1/miniconda3/lib/python3.10/http/client.py", line 287, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/predict1/utility/rpms_to_redcap.py", line 387, in <module>
    r = requests.post('https://redcap.partners.org/redcap/api/', data= fields)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/api.py", line 115, in post
    return request("post", url, data=data, json=json, **kwargs)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/adapters.py", line 547, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
urllib3.exceptions.NewConnectionError
 CP00102_family_interview_for_genetic_studies_figs.csv.flat 
Traceback (most recent call last):
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/data/predict1/miniconda3/lib/python3.10/socket.py", line 955, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connection.py", line 358, in connect
    self.sock = conn = self._new_conn()
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x2b2fbe1dcc10>: Failed to establish a new connection: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='redcap.partners.org', port=443): Max retries exceeded with url: /redcap/api/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x2b2fbe1dcc10>: Failed to establish a new connection: [Errno -2] Name or service not known'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/predict1/utility/rpms_to_redcap.py", line 387, in <module>
    r = requests.post('https://redcap.partners.org/redcap/api/', data= fields)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/api.py", line 115, in post
    return request("post", url, data=data, json=json, **kwargs)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/data/predict1/miniconda3/lib/python3.10/site-packages/requests/adapters.py", line 565, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='redcap.partners.org', port=443): Max retries exceeded with url: /redcap/api/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x2b2fbe1dcc10>: Failed to establish a new connection: [Errno -2] Name or service not known'))

The easiest solution may be to:

  • grep the bsub/ directory for the string urllib3.exceptions.MaxRetryError
  • find the *err files that contain it and obtain the subject ID
  • retry all CSV files of that subject if grepping cannot isolate the one CSV file that failed
  • run this program only after all subjects have finished
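
The steps above can be sketched in Python roughly as follows. This is only a sketch under assumptions: the function names are hypothetical, the form name is assumed to be printed on its own line before each traceback (as in the log above), and subject IDs are assumed to be two capital letters followed by five digits (e.g. CP00102).

```python
import re
from pathlib import Path

ERROR = "urllib3.exceptions.MaxRetryError"
# Assumption: form files look like CP00102_missing_data.csv or ..._figs.csv.flat
FORM_RE = re.compile(r"([A-Z]{2}\d{5})_\w+\.csv(?:\.flat)?")


def failed_forms(log_text):
    """Return {subject: sorted form files} whose traceback ended in MaxRetryError."""
    failed = {}
    last = None  # (subject, form) most recently announced in the log
    for line in log_text.splitlines():
        m = FORM_RE.search(line)
        if m:
            last = (m.group(1), m.group(0))
        elif line.startswith(ERROR + ":") and last:
            # Only the final "ExceptionName: message" line of a traceback counts
            subject, form = last
            failed.setdefault(subject, set()).add(form)
    return {s: sorted(f) for s, f in failed.items()}


def scan_err_files(bsub_dir):
    """Collect failed forms from every *err file in bsub_dir."""
    result = {}
    for err_file in sorted(Path(bsub_dir).glob("*err")):
        text = err_file.read_text(errors="replace")
        for subject, forms in failed_forms(text).items():
            result.setdefault(subject, set()).update(forms)
    return {s: sorted(f) for s, f in result.items()}
```

If the form name cannot be recovered from a log, the keys of the returned dictionary still give the subject IDs, so all CSV files of those subjects can be retried.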
@tashrifbillah

grep -B34 urllib3.exceptions.MaxRetryError /tmp/errors.txt | grep .csv

This gives only csv file names.

@tashrifbillah

cd /data/predict1/utility/bsub/
../parse_redcap_error.py "*err" | grep -B34 urllib3.exceptions.MaxRetryError | grep .csv

This gives only csv file names.

@tashrifbillah

tashrifbillah commented Aug 16, 2024

Put this within rpms_to_redcap.sh:

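FORCE=1
# ~/failed.csv lists the failed form file names, one per line
for form in $(cat ~/failed.csv)
do
  pushd . > /dev/null
  subject=${form:0:7}  # e.g. CP00102
  site=${form:0:2}     # e.g. CP
  # skip this form if the subject's surveys directory is missing
  cd "/data/predict1/data_from_nda/Prescient/PHOENIX/PROTECTED/Prescient${site}/raw/${subject}/surveys/" || { popd > /dev/null; continue; }
  echo "$form"
  /data/predict1/utility/rpms_to_redcap.py "$form" "$redcap_dict" "$API_TOKEN" $FORCE
  popd > /dev/null
  sleep 10
done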

@tashrifbillah

Another idea is to catch this error around the request call and set the form's hash to zero.
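
A minimal sketch of that idea, not the actual rpms_to_redcap.py code. The names upload_with_retry, upload_or_reset_hash, send and hashes are hypothetical, and it is an assumption that a zero hash makes the next run treat the form as changed. requests.exceptions.ConnectionError derives from OSError, so catching OSError covers both errors in the log above.

```python
import time


def upload_with_retry(send, retries=3, delay=10, retry_on=(OSError,)):
    """Call send() until it succeeds; return its result, or None if every attempt fails.

    send is a zero-argument callable doing the upload, for example
    lambda: requests.post(url, data=fields)
    """
    for attempt in range(1, retries + 1):
        try:
            return send()
        except retry_on as exc:
            print(f"attempt {attempt}/{retries} failed: {exc}")
            if attempt < retries:
                time.sleep(delay)  # give DNS or the server time to recover
    return None


def upload_or_reset_hash(form, send, hashes, **kwargs):
    """Upload one form; on persistent failure set its stored hash to zero."""
    response = upload_with_retry(send, **kwargs)
    if response is None:
        # Assumption: a zero hash never matches, so the next run retries this form
        hashes[form] = 0
    return response
```

Passing the request as a callable keeps the retry logic independent of requests, so it can be exercised without network access.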

@tashrifbillah

Special characters (ANSI color codes) around the file name are making it hard to streamline:

^[[0;31m ME01326_speech_sampling_run_sheet.csv ^[[0m
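
One way to cope with these characters when parsing the logs is to strip the color codes before matching file names. A small sketch, assuming the codes are the usual ESC[...m color sequences; clean_form_name is a hypothetical helper, and the pattern also accepts the caret form ^[ shown above in case the log was copied from a pager.

```python
import re

# ESC[0;31m, ESC[0m, etc.; ESC may be the real byte or the two characters "^["
ANSI_COLOR = re.compile(r"(?:\x1b|\^\[)\[[0-9;]*m")


def clean_form_name(line):
    """Remove color codes and surrounding whitespace from a logged form name."""
    return ANSI_COLOR.sub("", line).strip()
```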

@tashrifbillah

As an improvement, we removed special characters around $form:

echo -e '\033[0;31m' $form '\033[0m' >&2

@tashrifbillah

We actually have to rerun the whole pipeline for those selected cases. So, two ideas:

(i)

  1. set upload=1 for those subjects in date_shift database
  2. and set upload=1 for those forms in {subject}_hashes.csv

(ii)
or, write an rpms_records.txt and rerun the whole RPMS pipeline for upload, clean, down shift


(ii) seems like a big task. So we should just follow (i) and wait for the next run.
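
Step 2 of (i) could look roughly like the sketch below. The layout of {subject}_hashes.csv is not shown here, so the column names form and upload are pure assumptions, as is the helper name flag_for_upload; adjust them to the real file.

```python
import csv


def flag_for_upload(hashes_csv, forms, form_col="form", flag_col="upload"):
    """Set upload=1 for the given forms in a hashes CSV; return how many rows changed.

    form_col and flag_col are assumed column names, not confirmed ones.
    """
    with open(hashes_csv, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = list(reader)

    changed = 0
    for row in rows:
        if row[form_col] in forms and row[flag_col] != "1":
            row[flag_col] = "1"
            changed += 1

    # Rewrite the file with the same header and column order
    with open(hashes_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return changed
```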

tashrifbillah added a commit that referenced this issue Aug 22, 2024
@tashrifbillah

If this scheme is successful, we should deploy it in import_records_all.py too.
