Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't round-trip nested data #301

Open
acvelozo opened this issue Apr 11, 2019 · 3 comments
Open

Can't round-trip nested data #301

acvelozo opened this issue Apr 11, 2019 · 3 comments
Labels
bug an unexpected problem or unintended behavior

Comments

@acvelozo
Copy link

Hi,

I am trying to upload nested data using bigrquery. I have been able to reproduce succesfully the example provided in issue #256, but when I try with more complicated data (different types of nesting levels using lists and dataframes), I get an error. Here is a reproducible example that captures the type of error that I am getting:

library(bigrquery)

tableToExtract <- bq_table('bigquery-public-data','google_analytics_sample','ga_sessions_20170801')
tableNested <- bq_table_download(tableToExtract,max_results=10)

tableToUpload <- bq_table(bq_test_dataset(),'test_table')
bq_table_upload(tableToUpload,tableNested)

#> Error: Invalid field name "2". Fields must contain only letters, numbers, and underscores, start with a letter or underscore, and be at most 128 characters long. Table: test_table_ev_I5_pg9JKl26CP3VnsMhvWUkT5BB9uQY5G5ycLA9srMw_source [invalidQuery]

Please note this is not my exact use case, bq_table_copy() is what I would use if I needed to copy a table. My example serves to show the kind of error I am getting when trying to upload nested data, which is what I am trying to achieve.

I hope my question is clear, but please tell me if I should further clarify.

Thank you in advance,
Best regards,
Alexandre Velozo

@hadley
Copy link
Member

hadley commented Sep 16, 2020

Minimal reprex:

library(bigrquery)
bigrquery::bq_auth("[email protected]")

df <- bq_table_download(
  "bigquery-public-data.google_analytics_sample.ga_sessions_20170801", 
  max_results = 10
)

ds <- bq_test_dataset()
bq_table_upload(bq_table(ds, "test_table"), df)
#> Error: Job 'gargle-169921.job_7ordBvUNX7552SnUeH6-1O6vFobc.US' failed
#> x Invalid field name "1". Fields must contain only letters, numbers, and underscores, start with a letter or underscore, and be at most 128 characters long. Table: test_table_ev_7vtyP4JeicMWTmo7g0LSGqb9NWGoq2RYK07NEhxSJVI_source [invalidQuery]

Created on 2020-09-16 by the reprex package (v0.3.0.9001)

@hadley
Copy link
Member

hadley commented Sep 16, 2020

Even more minimal reprex:

library(bigrquery)
bigrquery::bq_auth("[email protected]")

df <- bq_table_download(
  "bigquery-public-data.google_analytics_sample.ga_sessions_20170801", 
  max_results = 1
)

ds <- bq_test_dataset()
bq_table_upload(bq_table(ds, "test_table"), df["trafficSource"])
#> Error: Job 'gargle-169921.job_2G2JcarfO3y4Dd7T5Z28LOc7LvUr.US' failed
#> x Invalid field name "1". Fields must contain only letters, numbers, and underscores, start with a letter or underscore, and be at most 128 characters long. Table: test_table_ev_aB94vwdaHbVUB3ImbUCfrEoif_pjD7NimlC9Hn5tBbY_source [invalidQuery]

df2 <- df["trafficSource"]
str(df2)
#> tibble [1 × 1] (S3: tbl_df/tbl/data.frame)
#>  $ trafficSource:List of 1
#>   ..$ :List of 9
#>   .. ..$ referralPath    : chr NA
#>   .. ..$ campaign        : chr "(not set)"
#>   .. ..$ source          : chr "(direct)"
#>   .. ..$ medium          : chr "(none)"
#>   .. ..$ keyword         : chr NA
#>   .. ..$ adContent       : chr NA
#>   .. ..$ adwordsClickInfo:List of 1
#>   .. .. ..$ :List of 12
#>   .. .. .. ..$ campaignId        : int NA
#>   .. .. .. ..$ adGroupId         : int NA
#>   .. .. .. ..$ creativeId        : int NA
#>   .. .. .. ..$ criteriaId        : int NA
#>   .. .. .. ..$ page              : int NA
#>   .. .. .. ..$ slot              : chr NA
#>   .. .. .. ..$ criteriaParameters: chr "not available in demo dataset"
#>   .. .. .. ..$ gclId             : chr NA
#>   .. .. .. ..$ customerId        : int NA
#>   .. .. .. ..$ adNetworkType     : chr NA
#>   .. .. .. ..$ targetingCriteria :List of 1
#>   .. .. .. .. ..$ :List of 1
#>   .. .. .. .. .. ..$ : NULL
#>   .. .. .. ..$ isVideoAd         : logi NA
#>   .. ..$ isTrueDirect    : logi NA
#>   .. ..$ campaignCode    : chr NA
df2$trafficSource[[1]]$adwordsClickInfo[[1]]$targetingCriteria <- NULL
bq_table_upload(bq_table(ds, "test_table"), df2)
#> Error: Job 'gargle-169921.job_bHASYBcMLC9rGNQS2LmGtnjTMmV4.US' failed
#> x Error while reading data, error message: JSON processing encountered too many errors, giving up. Rows: 1; errors: 1; max bad: 0; error percent: 0 [invalid]
#> x Error while reading data, error message: JSON parsing error in row starting at position 0: Only optional fields can be set to NULL. Field: referralPath; Value: NULL [invalid]

Created on 2020-09-16 by the reprex package (v0.3.0.9001)

@hadley hadley added the bug an unexpected problem or unintended behavior label Sep 16, 2020
@hadley hadley changed the title Issue uploading nested data Can't round-trip nested data Sep 16, 2020
@hadley
Copy link
Member

hadley commented Sep 16, 2020

Would need to do some exploration to figure if this is a problem with the download or upload (or both).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug an unexpected problem or unintended behavior
Projects
None yet
Development

No branches or pull requests

2 participants