Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix contacts sync #208

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Fix contacts sync #208

wants to merge 2 commits into from

Conversation

luandy64
Copy link
Contributor

Description of change

This PR fixes what appears to be a bug from #192.

Here is the stacktrace

CRITICAL 'NoneType' object has no attribute 'utcoffset'
Traceback (most recent call last):
  File "/usr/local/share/virtualenvs/tap-hubspot/bin/tap-hubspot", line 33, in <module>
    sys.exit(load_entry_point('tap-hubspot', 'console_scripts', 'tap-hubspot')())
  File "/home/vagrant/git/tap-hubspot/tap_hubspot/__init__.py", line 1164, in main
    raise exc
  File "/home/vagrant/git/tap-hubspot/tap_hubspot/__init__.py", line 1161, in main
    main_impl()
  File "/home/vagrant/git/tap-hubspot/tap_hubspot/__init__.py", line 1155, in main_impl
    do_sync(STATE, args.properties)
  File "/home/vagrant/git/tap-hubspot/tap_hubspot/__init__.py", line 1040, in do_sync
    STATE = stream.sync(STATE, ctx) # pylint: disable=not-callable
  File "/home/vagrant/git/tap-hubspot/tap_hubspot/__init__.py", line 505, in sync_contacts
    bookmark_values[row['vid']] = utils.strftime(modified_time)
  File "/usr/local/share/virtualenvs/tap-hubspot/lib/python3.9/site-packages/singer/utils.py", line 41, in strftime
    if dtime.utcoffset() != datetime.timedelta(0):
AttributeError: 'NoneType' object has no attribute 'utcoffset'

Diagnosis

The sync code flow seems to be using sync_contact_vids in order to collect a page of vids from the /contacts/v1/lists/all/contacts/all endpoint.

Those get passed off to _sync_contact_vids in order to be sent to the /contacts/v1/contact/vids/batch endpoint to get more details for specific vids.

The bug seems to be that we assume every object in the first response has a versionTimestamp key. So when we initialize modified_time = None on Line 494, nothing gives it a new value before we try bookmark_values[row['vid']] = utils.strftime(modified_time), leading to our stack trace

Rationale for change

I saw this line in _sync_contact_vids which makes me think that we don't always expect bookmark_values to have the vid. So the change in this PR is to not convert the a value to a datetime when we don't have a value.

Manual QA steps

  • I wrote a unit test that reproduces the stack trace on master
  • The test mocks the response of the first endpoint to return records that have and don't have a versionTimestamp. Then it mocks the second response to be super minimal. The actual test is that records in the second response are supplemented with the versionTimestamp of the first response, if we have it.

Risks

  • Low: The first API response seems to get paginated in full, so we shouldn't miss data from it. Now if you have a versionTimestamp, it gets attached to the objects in the second response instead of killing the tap.
    • It makes you wonder if this versionTimestamp is the right choice as a replication key though since it can be null.

Rollback steps

  • revert this branch

@techieshark
Copy link

If this PR was approved 2 years ago, is there a reason it hasn't been merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants