update README
yutkin committed Jul 22, 2018
1 parent e324db8 commit 153a2f5
Showing 1 changed file with 12 additions and 9 deletions.
21 changes: 12 additions & 9 deletions README.md
@@ -1,23 +1,26 @@
## Corpus of news from Lenta.Ru

* Size: 1.7 GB
* Number of news articles: 699.746
* Period: September 1999 - July 2018
* Size: 1.7 GB (288 MB archive)
* Number of news articles: ~700.000
* Period: 08.1999 -- 07.2018

+ [Script](../master/download_lenta.py) for downloading news.
+ [Script](../master/download_lenta.py) for downloading news (requires Python 3.6+).

## (Eng) Corpus of news articles from Lenta.Ru
* Size: 1.7 GB
* News articles: 699.746
* Dates: Sept. 1999 - July 2018
* Size: 1.7 GB (288 MB compressed)
* News articles: ~700.000
* Dates: 08.1999 -- 07.2018

+ [Script](../master/download_lenta.py) for downloading news.
+ [Script](../master/download_lenta.py) for downloading news (Python 3.6+ is required).


## Download
* [Kaggle](https://www.kaggle.com/yutkin/corpus-of-russian-news-articles-from-lenta/)
* [GitHub](https://github.com/yutkin/Lenta.Ru-News-Dataset/releases/download/0.1/news_lenta.csv)
* [Amazon S3](https://s3-us-west-2.amazonaws.com/lenta-news-dataset/news_lenta.csv)
* [Amazon S3](https://s3-us-west-2.amazonaws.com/lenta-news-dataset/news_lenta.csv.bz2)
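
For scripted downloads, a minimal Python sketch using only the standard library might look like the following. The URL is the compressed Amazon S3 link listed above; the function name `download_file` and the output filename are illustrative assumptions, not part of the repository.

```python
import shutil
import urllib.request
from pathlib import Path

# URL copied from the Amazon S3 link in the list above.
DATASET_URL = "https://s3-us-west-2.amazonaws.com/lenta-news-dataset/news_lenta.csv.bz2"


def download_file(url: str, dest: str) -> Path:
    """Stream `url` to `dest` without loading the whole file into memory."""
    dest_path = Path(dest)
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path


# Hypothetical usage (performs a ~288 MB network download):
# download_file(DATASET_URL, "news_lenta.csv.bz2")
```

Streaming with `shutil.copyfileobj` keeps memory use flat regardless of archive size.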

## Decompression
`bzip2 -d news_lenta.csv.bz2`
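
Where the `bzip2` command-line tool is unavailable, the same step can be sketched with Python's standard `bz2` module. The function name `decompress_bz2` is an illustrative assumption; the file names follow the command above.

```python
import bz2
import shutil


def decompress_bz2(src: str, dest: str) -> None:
    """Write the decompressed contents of `src` to `dest`, streaming in chunks."""
    with bz2.open(src, "rb") as compressed, open(dest, "wb") as out:
        shutil.copyfileobj(compressed, out)


# Usage (unlike `bzip2 -d`, this leaves the original archive in place):
# decompress_bz2("news_lenta.csv.bz2", "news_lenta.csv")
```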

## Screenshot
