Issue with multi-process runs #9

Open
alexristich opened this issue Jul 13, 2017 · 0 comments
@alexristich

When running the crawler over 1000 sites with 4 workers, I inevitably get an exception at the end of the run, when only one worker remains. Here's the error text it displays:

Traceback (most recent call last):
  File "/home/alex/chameleon-crawler/crawler/crawler_manager.py", line 43, in __init__
    timeout * ((num_timeouts + 1) ** 2)
  File "/usr/lib/python3.5/multiprocessing/queues.py", line 105, in get
    raise Empty
queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/alex/chameleon-crawler/crawler/crawler_process.py", line 35, in __init__
    self.crawl()
  File "/home/alex/chameleon-crawler/crawler/crawler_process.py", line 68, in crawl
    self.get(url)
  File "/home/alex/chameleon-crawler/crawler/crawler_process.py", line 166, in get
    self.driver.get(url)
  File "/home/alex/chameleon-crawler/venv/local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 185, in get
    self.execute(Command.GET, {'url': url})
  File "/home/alex/chameleon-crawler/venv/local/lib/python3.5/site-packages/selenium/webdriver/remote/webdriver.py", line 173, in execute
    self.error_handler.check_response(response)
  File "/home/alex/chameleon-crawler/venv/local/lib/python3.5/site-packages/selenium/webdriver/remote/errorhandler.py", line 166, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: timeout
  (Session info: chrome=59.0.3071.109)
  (Driver info: chromedriver=2.29,platform=Linux 4.4.0-83-generic x86_64)
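
For reference, the first traceback ends in `queue.Empty` raised by a timed `get` whose timeout is `timeout * ((num_timeouts + 1) ** 2)`. The snippet below is a minimal sketch of that pattern with the exception caught, not the crawler's actual code: `get_with_backoff` and `max_timeouts` are hypothetical names, and it uses a plain `queue.Queue` for brevity (`multiprocessing.Queue.get` raises the same `queue.Empty` on timeout).

```python
import queue


def get_with_backoff(q, timeout, max_timeouts=3):
    """Return the next item from q, or None after max_timeouts empty waits.

    Hypothetical helper: the wait grows quadratically, mirroring the
    expression in the traceback, timeout * ((num_timeouts + 1) ** 2).
    """
    for num_timeouts in range(max_timeouts):
        try:
            return q.get(timeout=timeout * ((num_timeouts + 1) ** 2))
        except queue.Empty:
            # Nothing arrived in time; retry with a longer wait.
            continue
    # Every attempt timed out; the caller decides how to recover.
    return None
```

A caller that gets `None` back could, for example, stop waiting on that worker instead of letting `queue.Empty` propagate. This assumes `None` is never a legitimate queue item.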
ghostwords added the bug label Jul 13, 2017