  
In natural language processing, [[http://scholar.harvard.edu/files/bstewart/files/stmvignette.pdf|Latent Dirichlet allocation]] (LDA) is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar. For example, if observations are words collected into documents, it posits that each document is a mixture of a small number of topics and that each word's creation is attributable to one of the document's topics.
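As a concrete illustration of that idea, below is a minimal scikit-learn sketch of fitting LDA to a lyrics corpus. It assumes the lyrics have already been collected into a plain-text file lyrics.txt with one song per line; the file name, the vectorizer settings, and the choice of five topics are illustrative assumptions, not details from the original analysis.

<code python>
# Minimal sketch: fit LDA to a lyrics corpus with scikit-learn.
# Assumes "lyrics.txt" holds one song per line (hypothetical file name).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

with open("lyrics.txt", encoding="utf-8") as f:
    songs = [line.strip() for line in f if line.strip()]

# Bag-of-words counts; LDA is defined over raw term counts, not tf-idf.
vectorizer = CountVectorizer(stop_words="english", max_df=0.9, min_df=2)
counts = vectorizer.fit_transform(songs)

# Five topics is an arbitrary illustrative choice.
lda = LatentDirichletAllocation(n_components=5, random_state=0)
lda.fit(counts)

# Show the ten highest-weighted words of each inferred topic.
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = weights.argsort()[::-1][:10]
    print(f"topic {k}:", ", ".join(terms[i] for i in top))
</code>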
LDA assumes the following generative process for each document <latex>w</latex> in the corpus <latex>D</latex>; a toy simulation of these steps is sketched after the list.
  - Choose <latex>N \sim \mathrm{Poisson}(\xi)</latex>
  - Choose <latex>\theta \sim \mathrm{Dir}(\alpha)</latex>
  - For each of the <latex>N</latex> words <latex>w_n</latex>: choose a topic <latex>z_n \sim \mathrm{Multinomial}(\theta)</latex>, then choose the word <latex>w_n</latex> from <latex>p(w_n \mid z_n, \beta)</latex>, a multinomial probability conditioned on the topic <latex>z_n</latex>.
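The sketch below simulates that generative process step by step. The vocabulary, the number of topics, and the hyperparameters <latex>\xi</latex>, <latex>\alpha</latex>, <latex>\beta</latex> are made up purely for illustration; they are not the values used in the analysis on this page.

<code python>
# Toy simulation of the LDA generative process with made-up hyperparameters.
import numpy as np

rng = np.random.default_rng(0)

vocab = ["light", "never", "out", "love", "home", "charming", "man", "now"]
K = 2                                   # number of topics (illustrative)
xi = 6                                  # Poisson rate for document length
alpha = np.full(K, 0.5)                 # Dirichlet prior over topic mixtures
# beta[k] is the word distribution of topic k (each row sums to 1).
beta = rng.dirichlet(np.ones(len(vocab)), size=K)

def generate_document():
    N = rng.poisson(xi)                      # 1. N ~ Poisson(xi)
    theta = rng.dirichlet(alpha)             # 2. theta ~ Dir(alpha)
    words = []
    for _ in range(N):                       # 3. for each of the N words:
        z = rng.choice(K, p=theta)           #    topic z_n ~ Multinomial(theta)
        w = rng.choice(vocab, p=beta[z])     #    word  w_n ~ p(w_n | z_n, beta)
        words.append(w)
    return words

print(generate_document())
</code>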
  
  