GitHub - familiaralien/Memrise_Crawler: Collecting available data from memrise.com course pages to create individual statistics. One of a few possible future functions: find and ignore duplicate words across different courses.

Repository files:
data
library
models
.gitattributes
.gitignore
Memrise_Crawler.7z
README.html
crawler.php
debug.txt
phrasebook.php
stats.php

<html>
<head>
<title>Memrise Statistics - Instructions</title>
<meta charset="utf-8">
</head>
<body>
<pre>
=================
One time actions:
=================

Download and install WampServer (so you can run PHP scripts locally and don't have to trust me or another third party with your password):
http://sourceforge.net/projects/wampserver/

Put the following files into the ...\wamp\www\ directory:
crawler.php
stats.php
colors.php
config.php
README.html

Start up the WAMP server (you don't have to put it online for this).
Left-click the taskbar icon, then click through PHP > PHP Extensions.
Make sure that, in addition to the defaults, the following extensions are enabled: php_curl, php_tidy

Edit the config file (config.txt).

=======================
Every time actions:
=======================
1. Start your wamp server
2. Go to "http://127.0.0.1/crawler.php" in your web browser (I coded for Firefox and didn't check other browsers)
3. Wait for the data to be loaded*.
4. Once the data load script is done, click on the link at the end of the progress report to view your data.

*Loading all the data may take a while (the script crawls through all the relevant pages),
but progress should be displayed during loading.
Whenever a "." shows up in the progress report, a page load timed out and the script is retrying.
An occasional dot is harmless, but if Memrise or your connection is slow, you may see a lot of them.
The timeout is set to 5 seconds by default. If you're seeing lots of dots and the script never finishes,
you might have to increase that value in the config file.
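
The retry behaviour described above can be sketched roughly as follows. This is
only an illustration in Python, not the code from crawler.php (which is PHP and
uses cURL). The function name fetch_with_retry, the max_retries limit and the
opener/out parameters are assumptions made for this example.

```python
import socket
import sys
import urllib.error
import urllib.request


def fetch_with_retry(url, timeout=5.0, max_retries=10,
                     opener=urllib.request.urlopen, out=sys.stdout):
    """Fetch url, writing "." to out every time a page load times out.

    timeout mirrors the 5 second default mentioned above; max_retries is
    a hypothetical safety limit so the loop cannot run forever.
    """
    for _ in range(max_retries + 1):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except urllib.error.URLError as err:
            # urlopen wraps connection timeouts in URLError; re-raise
            # anything that is not a timeout (HTTP errors, DNS failures).
            if not isinstance(err.reason, (socket.timeout, TimeoutError)):
                raise
        except (socket.timeout, TimeoutError):
            # Timeout while reading the response body.
            pass
        # Only reached after a timeout: report it and try again.
        out.write(".")
        out.flush()
    raise TimeoutError("gave up on %s after repeated timeouts" % url)
```

With a larger timeout each attempt waits longer before giving up, so fewer
dots appear, at the cost of a slower run when a page really is unreachable.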

======
Notes
======
This script may stop working if the Memrise website changes.

This script crawls the pages you have access to and as such can only analyse the data that is displayed.
It cannot access the actual database behind the webpage.
So for example, if the page says that an item will be asked again "in about an hour",
the site itself knows the exact point in time when it's going to ask it again,
but this script only has the "in about an hour" information.
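
To illustrate how coarse that information is, here is a small Python sketch
that turns such a phrase into an approximate delay. The phrase patterns, the
UNIT_SECONDS table and the function name approximate_delay are assumptions
for this example; the wording Memrise actually displays may differ, and the
real crawler may handle it differently.

```python
import re
from datetime import timedelta

# Hypothetical unit table; only the units listed here are recognised.
UNIT_SECONDS = {"minute": 60, "hour": 3600, "day": 86400, "week": 604800}


def approximate_delay(text):
    """Return a rough timedelta for phrases like "in about an hour".

    Returns None when the text does not match the assumed pattern.
    """
    pattern = r"in (?:about )?(an?|\d+) (minute|hour|day|week)s?"
    match = re.search(pattern, text.lower())
    if match is None:
        return None
    count_text, unit = match.groups()
    # "a" and "an" both mean one unit; anything else is a number.
    count = 1 if count_text in ("a", "an") else int(count_text)
    return timedelta(seconds=count * UNIT_SECONDS[unit])
```

For "in about an hour" this gives a delay of one hour, although the true
review time could be anywhere around that value.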

Plants ready for harvesting may not show up in the graph.
</pre>
<br>
<a href="crawler.php">Go to the dataload page</a>

</body></html>
