Hangs on Loading resource failed with status=fail #82

Open
petermr opened this issue Jun 4, 2016 · 2 comments

@petermr
Member

petermr commented Jun 4, 2016

[debug] [phantom] Navigation requested: url=https://googleads.g.doubleclick.net/pagead/ads?client=ca-pub-1407561118813147&output=html&h=90&slotname=8095622625&adk=2177631001&w=728&lmt=1465030658&loeid=19188000&ea=0&flash=0&url=http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393&wgl=0&dt=1465034257873&bpp=4&fdt=6&idt=800&shv=r20160601&cbv=r20151006&saldr=sa&correlator=3087558664193&frm=23&ga_vid=1205287208.1465034259&ga_sid=1465034259&ga_hid=1377615604&ga_fc=0&pv=2&icsg=2&nhd=2&dssz=2&mdo=0&mso=0&u_tz=60&u_his=1&u_java=0&u_h=900&u_w=1440&u_ah=826&u_aw=1440&u_cd=32&u_nplug=0&u_nmime=0&dff=times new roman&dfs=16&adx=117&ady=2162&biw=400&bih=300&isw=728&ish=90&ifk=1834829426&eid=575144605&oid=3&rx=0&eae=6&fc=216&pc=0&brdim=0,0,0,0,1440,22,0,0,728,90&vis=0&rsz=|||&abl=CS&ppjl=u1&pfx=0&fu=1044&bc=1&ifi=1&dtd=814, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Successfully injected Casper client-side utilities
[debug] [phantom] Navigation requested: url=https://googleads.g.doubleclick.net/pagead/ads?client=ca-pub-1407561118813147&output=html&h=600&slotname=8970980385&adk=1560275289&w=160&lmt=1465030659&ea=0&flash=0&url=http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393&wgl=0&dt=1465034257881&bpp=3&fdt=1807&idt=1816&shv=r20160601&cbv=r20151006&saldr=sa&correlator=3087558664193&frm=23&ga_vid=1386634615.1465034260&ga_sid=1465034260&ga_hid=1943622266&ga_fc=0&pv=1&icsg=2&nhd=2&dssz=2&mdo=0&mso=0&u_tz=60&u_his=1&u_java=0&u_h=900&u_w=1440&u_ah=826&u_aw=1440&u_cd=32&u_nplug=0&u_nmime=0&dff=times new roman&dfs=16&adx=782&ady=467&biw=400&bih=300&isw=160&ish=600&ifk=553495520&eid=575144605&oid=3&rx=0&eae=6&fc=216&pc=0&brdim=0,0,0,0,1440,22,0,0,160,600&vis=0&rsz=||o|&abl=CS&ppjl=u1&pfx=0&fu=1044&bc=1&ifi=1&dtd=1846, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] start page is loaded
[warning] [phantom] Loading resource failed with status=fail (HTTP 200): http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393

At this stage it hangs.
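
To find which URLs in a batch produce this warning, a small filter over the captured log may help. This is a minimal sketch, not part of quickscrape: the pattern is modeled only on the warning line quoted above (so the exact log format is an assumption), and `failed_resources` is a hypothetical helper name.

```python
import re

# Pattern modeled on the single warning line in this report; the exact
# log format emitted in other situations is an assumption.
FAILED = re.compile(
    r"^\[warning\] \[phantom\] Loading resource failed with status=(\S+) "
    r"\(HTTP (\d+)\): (\S+)"
)


def failed_resources(log_lines):
    """Return (status, http_code, url) for each 'Loading resource failed' warning."""
    hits = []
    for line in log_lines:
        match = FAILED.match(line.strip())
        if match:
            status, code, url = match.groups()
            hits.append((status, int(code), url))
    return hits
```

Feeding it the lines of a saved log would list each URL that hit the warning, which narrows down where a batch run stalled.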

@blahah
Member

blahah commented Jun 4, 2016

Can you please provide the command, version, URL and scraper?

@petermr
Member Author

petermr commented Jun 4, 2016

I just reran it as a single command and it seems to work. The error occurred when running with 100 URLs in a file, but the same URL works on its own... Hmm.

localhost:2016-05-02 pm286$ quickscrape -u http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393 -d /Users/pm286/workspace/journal-scrapers/scrapers/ -o junk 
info: quickscrape 0.4.7 launched with...
info: - URL: http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393
info: - Scraperdir: /Users/pm286/workspace/journal-scrapers/scrapers
info: - Rate limit: 3 per minute
info: - Log level: info
info: urls to scrape: 1
info: processing URL: http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393
[info] [phantom] Starting...
[info] [phantom] Running suite: 3 steps
[debug] [phantom] opening url: http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393, HTTP GET
[debug] [phantom] Navigation requested: url=http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393, type=Other, willNavigate=true, isMainFrame=true
[debug] [phantom] url changed to "http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393"
[debug] [phantom] Navigation requested: url=http://imrn.oxfordjournals.org/resource/htmlfiles/advert.html?p=Top&u=imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=http://imrn.oxfordjournals.org/resource/htmlfiles/advert.html?p=Right1&u=imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=http://imrn.oxfordjournals.org/resource/htmlfiles/advert.html?p=Bottom&u=imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=https://googleads.g.doubleclick.net/pagead/html/r20160601/r20151006/zrt_lookup.html, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=about:blank, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=https://googleads.g.doubleclick.net/pagead/ads?client=ca-pub-1407561118813147&output=html&h=600&slotname=8970980385&adk=1560275289&w=160&lmt=1465069785&ea=0&flash=0&url=http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393&wgl=0&dt=1465073385090&bpp=4&fdt=6&idt=187&shv=r20160601&cbv=r20151006&saldr=sa&correlator=541281675265&frm=23&ga_vid=1066943421.1465073385&ga_sid=1465073385&ga_hid=1326277707&ga_fc=0&pv=2&icsg=2&nhd=2&dssz=2&mdo=0&mso=0&u_tz=60&u_his=1&u_java=0&u_h=900&u_w=1440&u_ah=826&u_aw=1440&u_cd=32&u_nplug=0&u_nmime=0&dff=times new roman&dfs=16&adx=782&ady=467&biw=400&bih=300&isw=160&ish=600&ifk=553495520&eid=575144605&oid=3&rx=0&eae=6&fc=216&pc=0&brdim=0,0,0,0,1440,22,0,0,160,600&vis=0&rsz=||o|&abl=CS&ppjl=u1&pfx=0&fu=1044&bc=1&ifi=1&dtd=200, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Navigation requested: url=https://googleads.g.doubleclick.net/pagead/ads?client=ca-pub-1407561118813147&output=html&h=90&slotname=8095622625&adk=2177631001&w=728&lmt=1465069785&loeid=26835105&ea=0&flash=0&url=http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393&wgl=0&dt=1465073385097&bpp=4&fdt=230&idt=242&shv=r20160601&cbv=r20151006&saldr=sa&correlator=541281675265&frm=23&ga_vid=1823730535.1465073385&ga_sid=1465073385&ga_hid=290367467&ga_fc=0&pv=1&icsg=2&nhd=2&dssz=2&mdo=0&mso=0&u_tz=60&u_his=1&u_java=0&u_h=900&u_w=1440&u_ah=826&u_aw=1440&u_cd=32&u_nplug=0&u_nmime=0&dff=times new roman&dfs=16&adx=117&ady=1544&biw=400&bih=300&isw=728&ish=90&ifk=1834829426&eid=575144605&oid=2&rx=0&eae=6&fc=216&pc=0&brdim=0,0,0,0,1440,22,0,0,728,90&vis=0&rsz=|||&abl=CS&ppjl=u1&pfx=0&fu=1044&bc=1&ifi=1&dtd=257, type=Other, willNavigate=true, isMainFrame=false
[debug] [phantom] Successfully injected Casper client-side utilities
[debug] [phantom] start page is loaded
[info] [phantom] Step anonymous 3/3 http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393 (HTTP 200)
info: [scraper]. URL rendered. http://imrn.oxfordjournals.org/content/early/2016/05/02/imrn.rnv393.
[info] [phantom] Step anonymous 3/3: done in 11911ms.
[info] [phantom] Done 3 steps in 11924ms
info: [scraper]. download started. fulltext.pdf.
info: [scraper]. download started. fulltext.html.
info: URL processed: captured 8/8 elements (0 captures failed)
info: all tasks completed
localhost:2016-05-02 pm286$ 
localhost:2016-05-02 pm286$ quickscrape --version
0.4.7
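
Since the hang only showed up in the 100-URL run, one way to isolate it might be to launch one process per URL with a timeout, so a stuck URL is recorded instead of blocking the rest. The following is a rough sketch under assumptions: `scrape_one_by_one` and `quickscrape_command` are hypothetical helpers, the default directories are placeholders, and the command uses only the `-u`, `-d` and `-o` flags from the run above.

```python
import subprocess


def scrape_one_by_one(urls, build_command, timeout_s=120):
    """Run one process per URL; return {url: 'ok' | 'failed' | 'timeout'}."""
    results = {}
    for url in urls:
        try:
            done = subprocess.run(
                build_command(url),
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            results[url] = "ok" if done.returncode == 0 else "failed"
        except subprocess.TimeoutExpired:
            # subprocess.run kills the direct child on timeout; any
            # grandchild processes it spawned may need separate cleanup.
            results[url] = "timeout"
    return results


def quickscrape_command(url, scraper_dir="scrapers/", out_dir="junk"):
    # Hypothetical defaults; only the -u, -d and -o flags from the
    # command shown above are used.
    return ["quickscrape", "-u", url, "-d", scraper_dir, "-o", out_dir]
```

Calling `scrape_one_by_one(urls, quickscrape_command)` with the URLs read from the file would then report which entries timed out. It does not reproduce whatever state a single long-running batch builds up, so a clean result here would point at the batch handling rather than at any one URL.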
