A repository of words in multiple languages sorted by their frequency

nachocab/words-by-frequency

Words by Frequency of Use

Word frequencies come from this website.

Italian

# freq  word    pronunciation
521174  un      (un)
291383  sono    (ˈsono)
204090  cosa    (ˈkɔsa)
186605  come    (ˈkome)
170403  io      (ˈio)
149049  questo  (ˈkwesto)
140200  hai     (ˈai)
140019  bene    (ˈbɛne)
138657  sei     (ˈsɛi)
...

English

The original source is the Carnegie Mellon University Pronouncing Dictionary. Instead of IPA, it uses its own ARPAbet-based phoneme symbols. The table explaining what each symbol means is on its website.

#  freq  word  pronunciation
6281002  you   Y UW
5685306  i     AY
4768490  the   DH AH
3453407  to    T UW
3048287  a     AH
2879962  it    IH T
2127187  and   AH N D
2030642  that  DH AE T
1847884  of    AH V
1554103  in    IH N
...
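
The lists in this README share a simple whitespace-separated layout: a frequency count, a word, and (for some languages) a pronunciation that may itself contain spaces, as in `AH N D`. As a minimal sketch, assuming the data files follow the same layout as the excerpts shown here, rows could be parsed like this. The names `Entry` and `parse_entries` are hypothetical and not part of the repository, and the sample mixes rows from the excerpts above purely for illustration.

```python
from typing import Iterable, List, NamedTuple, Optional


class Entry(NamedTuple):
    freq: int
    word: str
    pronunciation: Optional[str]  # None for two-column lists (no pronunciation)


def parse_entries(lines: Iterable[str]) -> List[Entry]:
    """Parse 'freq word [pronunciation]' rows, skipping headers and elisions."""
    entries = []
    for line in lines:
        line = line.strip()
        # Skip blank lines, the '#' header row, and the '...' elision marker.
        if not line or line.startswith("#") or line == "...":
            continue
        # Split into at most three fields so that a multi-token
        # pronunciation such as "AH N D" stays in one piece.
        parts = line.split(None, 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # not a data row
        # Normalize internal spacing in the pronunciation, if present.
        pron = " ".join(parts[2].split()) if len(parts) == 3 else None
        entries.append(Entry(int(parts[0]), parts[1], pron))
    return entries


# Assumed sample: rows copied from the excerpts in this README.
sample = """\
#  freq  word  pronunciation
6281002  you   Y UW
2127187  and   AH N D
1622928 de
...
"""
entries = parse_entries(sample.splitlines())
```

Limiting the split to three fields is the one design choice that matters here: a plain `split()` would break the pronunciation into separate tokens.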

French

#  freq word
1622928 de
1622619 je
1348809 est
1128894 pas
1093232 le
1043411 vous
992154  la
927396  tu
909177  que
853927  un
...

Spanish

#  freq word
1109867 de
677127  la
517925  que
514187  y
498562  el
455194  en
358662  a
303229  los
232670  se
204272  las
...

Catalan

The original source is Softcatalà.

#  freq word
6010951 de
4994785 la
4657836 el
4004551 i
3896615 que
3374723 a
3284365 un
2373551 l
2140801 en
1981974 va
...

If you would like to contribute another language, feel free to send me a message or open a pull request.
