Commit 65ea210 (1 parent: 984cb73): 1 changed file with 4 additions and 1 deletion.
This is probably the most complex folder in the repository, so I will try to be as detailed as possible.
This folder is organized as follows:
- If you are looking for how we extracted documentation data from GitHub, look at the `scraper` folder. The `api_scraper.py` file is the core of this folder: it contains the code that requests custom URLs from the GitHub API. The file `main.py` presents the whole process of extracting a documentation file, `scrapy.py` shows how to issue the URL requests through the `api_scraper.py` module, and `validate.py` shows how we checked whether a documentation file was valid for qualitative analysis. If you want to know how we converted the Markdown files to spreadsheets, take a look at `export.py` (note that we use cmark-gfm to convert the Markdown content to plain text; if you want to run it, you will need to build cmark-gfm on your computer). More information about all these files is given in docstrings.
- Inside the `classifier` folder you will find how we performed all the classification steps leading to the final model. The subfolders are meant to be as intuitive as possible: the `data_preparation` folder contains the code for preparing the data for classification, the `model_selection` folder covers how we selected the best estimator for our problem, the `results_report` folder contains the scripts used to report our final model, and the `classification` folder contains the code used to perform the classification. If you want to understand the whole process, I recommend starting with the `main.py` file, where I tried to split the stages of this process into clearly named methods.
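As a rough illustration of the request flow that `api_scraper.py` handles, a minimal sketch of fetching one file through the GitHub contents API could look like the following. The function names and structure here are hypothetical, not the repository's actual code:

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_contents_url(owner, repo, path):
    """Build the GitHub contents-API URL for a file in a repository.

    Hypothetical helper; the real api_scraper.py may build URLs differently.
    """
    return f"{API_ROOT}/repos/{owner}/{repo}/contents/{path}"


def fetch_file(owner, repo, path, token=None):
    """Request a documentation file's metadata and content from the GitHub API."""
    req = urllib.request.Request(build_contents_url(owner, repo, path))
    req.add_header("Accept", "application/vnd.github.v3+json")
    if token:
        # Authenticated requests get a much higher rate limit.
        req.add_header("Authorization", f"token {token}")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The contents endpoint returns the file body base64-encoded, so a real scraper would decode it before validation.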
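For readers unfamiliar with the classification stage, here is a minimal sketch of the kind of text-classification pipeline such a setup typically involves. The toy data, labels, and choice of estimator are illustrative assumptions, not the repository's actual configuration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy corpus standing in for documentation paragraphs (hypothetical labels).
texts = [
    "run pip install to set up the package",
    "installation requires python 3.8 or newer",
    "download and install all the dependencies",
    "open an issue before sending a patch",
    "pull requests are welcome from everyone",
    "read the contributing guidelines first",
]
labels = ["install", "install", "install",
          "contribute", "contribute", "contribute"]

# Vectorize the text with TF-IDF, then fit a simple linear classifier.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression()),
])
pipeline.fit(texts, labels)

prediction = pipeline.predict(["pip install the requirements"])
print(prediction[0])
```

In practice the model-selection step would compare several estimators with cross-validation rather than fixing one up front, which is what the `model_selection` folder is for.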
Don't hesitate to contact me at [email protected] if anything is confusing: this was a one-developer job, and I know that some parts might be unclear. I did my best.