
very high lag time: possible causes? #34

Open
gg4u opened this issue Feb 1, 2016 · 2 comments
gg4u commented Feb 1, 2016

Hello,

I'm experiencing really high lag times.
I even hit the message ('Server lag, sleeping 14 seconds').

Could you please suggest what the possible reasons might be?

I am just running this test from the console:

from wikitools import wiki, api

def search_wikipedia_random():
    site = wiki.Wiki("https://en.wikipedia.org/w/api.php")
    params = {
        'action': 'query',
        'list': 'random',
        'rnnamespace': 0,
        'rnfilterredir': 'all',
        'rnlimit': 1,
        'redirects': '',
        'format': 'json',
    }
    request = api.APIRequest(site, params)
    result = request.query()
    return result['query']['random'][0]['id']

import time

start = time.time()
search_wikipedia_random()
end = time.time()
print(end - start)

and got (in seconds):

16.8418629169
13.1237468719

I am not having problems browsing, so I don't think it's my connection (right now I'm listening to YouTube and doing other things without trouble).
I wonder if I'm being throttled for not having configured something (headers?) or if I'm missing something.
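For context, the 'Server lag, sleeping …' message relates to MediaWiki's maxlag mechanism: clients can send a maxlag=N parameter, and the server rejects the request (asking the client to wait and retry) whenever database replication lag exceeds N seconds. A minimal sketch of the same random-page request with an explicit threshold, using plain urllib instead of wikitools (the function name is mine, for illustration):

```python
from urllib.parse import urlencode

def random_page_url(maxlag=5):
    """Build the random-page query URL with an explicit maxlag threshold.

    If replication lag exceeds `maxlag` seconds, the API returns a
    'maxlag' error instead of a result, and the client is expected
    to sleep and retry.
    """
    params = {
        "action": "query",
        "list": "random",
        "rnnamespace": 0,
        "rnfilterredir": "all",
        "rnlimit": 1,
        "format": "json",
        "maxlag": maxlag,
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)
```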

@mzmcbride
Collaborator

I took your script, added the two necessary import lines, and ran it a few times. Here's what I got:

mzmcbride@gonzo:~$ ./wikitools-issues-34.py 
1.04104304314
mzmcbride@gonzo:~$ ./wikitools-issues-34.py 
0.576555013657
mzmcbride@gonzo:~$ ./wikitools-issues-34.py 
0.634619951248
mzmcbride@gonzo:~$ ./wikitools-issues-34.py 
3.55164194107
mzmcbride@gonzo:~$ ./wikitools-issues-34.py 
0.607800960541
mzmcbride@gonzo:~$ ./wikitools-issues-34.py 
2.19808292389
mzmcbride@gonzo:~$ ./wikitools-issues-34.py 
0.659627914429
mzmcbride@gonzo:~$ ./wikitools-issues-34.py 
1.16318798065
mzmcbride@gonzo:~$ ./wikitools-issues-34.py 
0.809925079346
mzmcbride@gonzo:~$ ./wikitools-issues-34.py 
0.596135139465
mzmcbride@gonzo:~$ ./wikitools-issues-34.py 
0.724714040756

You can view lag at https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=dbrepllag&sishowalldb=.

If I do time curl "https://en.wikipedia.org/w/api.php?action=query&list=random&rnnamespace=0&rnlimit=1&rnfilterredir=all&redirects=&format=json" a few times, I get about 0.3 seconds. Setting up the Wiki() object probably accounts for the additional overhead. Your code looks fine. You could set &rnlimit= to a higher value to get more random pages in a single query.

If you're getting 14 seconds of server lag... I'm not sure what's causing that. The production server admin log (https://wikitech.wikimedia.org/wiki/Server_Admin_Log) doesn't indicate that lag has been high lately.
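The dbrepllag endpoint linked above can also be checked from Python. A small sketch with the standard library (helper names are mine, not part of wikitools):

```python
import json
import urllib.request
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"

def lag_url(api=API):
    """Build the same siteinfo/dbrepllag query URL linked above."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "dbrepllag",
        "sishowalldb": "",
        "format": "json",
    }
    return api + "?" + urlencode(params)

def max_lag(api=API):
    """Return the highest replication lag (seconds) across all DB servers."""
    with urllib.request.urlopen(lag_url(api)) as resp:
        data = json.load(resp)
    return max(db["lag"] for db in data["query"]["dbrepllag"])
```

If max_lag() regularly reports low values while your script still sleeps for 14 seconds, the bottleneck is more likely on the client side or network path than on the Wikimedia servers.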

@gg4u
Author

gg4u commented Feb 1, 2016

hi @mzmcbride, thanks for the tip for viewing lag. I tried moving the Wiki() object to a global, so it's declared only once.
I'm on another network now, so I can't run the same test.

Some additional information:
I see occasional spikes above one second, going to 1, 2, even 4 s. I'm running this as a single test, and I wonder whether lag would increase or spikes become more frequent if wikitools were used in an API for public use, with more requests.

Does lag time depend on the number of connections coming from a domain?
I would like to use full-text search queries as the entry point for a site, i.e. the list=search and generator=search modules.
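The list=search query mentioned above can follow the same wikitools pattern as the random-page script earlier in the thread. A sketch, with the parameter-building split out (function names are mine):

```python
def search_params(term, limit=10):
    """Build MediaWiki list=search parameters, mirroring the random query above."""
    return {
        "action": "query",
        "list": "search",
        "srsearch": term,      # the full-text search string
        "srnamespace": 0,      # main (article) namespace only
        "srlimit": limit,      # results per request
        "format": "json",
    }

def search_wikipedia(term, limit=10):
    """Run a full-text search and return matching page titles."""
    from wikitools import wiki, api  # same library as the script above
    site = wiki.Wiki("https://en.wikipedia.org/w/api.php")
    request = api.APIRequest(site, search_params(term, limit))
    result = request.query()
    return [hit["title"] for hit in result["query"]["search"]]
```

Reusing one global Wiki object, as you describe, should avoid paying the setup overhead on every search request.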
