Initialize LDA/STM state from a (d, v, c) sparse count matrix.
Source: R/Rfunction.R
init_mod_from_count.Rd
Given a document-term matrix in triplet form (d, v, c) with 0-based indices, this function initializes the LDA state:
- samples initial topic assignments z,
- constructs document-topic counts nd,
- constructs topic-word counts nw,
- computes ndsum, nwsum, and normalized topic proportions X.
Arguments
- count
Integer matrix with 3 columns, one row per triplet (d, v, c): d is the document index, v is the vocabulary (term) index, and c is the count of term v in document d. Both d and v are 0-based.
- K
Integer, number of topics. Required if `phi` is NULL. If `phi` is provided, K is inferred from ncol(phi).
- phi
Optional numeric matrix of size V x K specifying per-word topic probabilities used only during initialization.
- seed
Optional integer random seed.
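As an illustration of the expected `count` input, the sketch below converts a small dense document-term matrix into 0-based (d, v, c) triplets using base R only. The toy matrix `dtm` and the column names are made up for this example, and the commented-out call at the end assumes the function is exported under the name suggested by the Rd file, with the arguments listed above passed by name.

```r
# Hypothetical toy document-term matrix: 3 documents x 4 vocabulary terms.
dtm <- matrix(c(2L, 0L, 1L, 0L,
                0L, 3L, 0L, 1L,
                1L, 0L, 0L, 2L),
              nrow = 3, byrow = TRUE)

# Locate the non-zero cells; which(arr.ind = TRUE) returns 1-based (row, col).
idx <- which(dtm > 0, arr.ind = TRUE)

# Build the (d, v, c) triplets, shifting both indices to 0-based as documented.
count <- cbind(d = idx[, "row"] - 1L,
               v = idx[, "col"] - 1L,
               c = dtm[idx])
storage.mode(count) <- "integer"

# Sort by document, then term, for readability only; whether the function
# needs a particular row order is not stated in this page.
count <- count[order(count[, "d"], count[, "v"]), , drop = FALSE]

# Assumed call (function name inferred from the Rd file name, not verified):
# init <- init_mod_from_count(count = count, K = 2, seed = 1)
```

Zero cells are dropped, so the number of rows in `count` corresponds to NZ in the returned list.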
Value
A list with components:
- z
Integer vector (length NZ) of sampled topics, 0-based.
- nd
DxK document-topic count matrix.
- nw
KxV topic-word count matrix.
- ndsum
Integer vector (length D) with row sums of nd.
- nwsum
Integer vector (length K) with row sums of nw.
- X
DxK matrix of normalized topic proportions nd / ndsum.
- D
Number of documents.
- V
Vocabulary size.
- K
Number of topics.
- NZ
Number of non-zero entries (rows in `count`).
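To show how the returned components relate to one another, here is a minimal base R sketch that rebuilds them from a toy `count` matrix. It is not the package's implementation. It assumes that one topic is drawn uniformly per triplet (the `phi`-weighted path is not shown), that each triplet contributes its count c to both nd and nw, and that D and V are inferred from the largest indices.

```r
set.seed(1)  # plays the role of the documented `seed` argument

# Toy (d, v, c) triplets with 0-based indices.
count <- matrix(c(0L, 0L, 2L,
                  0L, 2L, 1L,
                  1L, 1L, 3L,
                  1L, 3L, 1L,
                  2L, 0L, 1L,
                  2L, 3L, 2L),
                ncol = 3, byrow = TRUE)

K  <- 2L
D  <- max(count[, 1]) + 1L  # assumption: D inferred from the largest d
V  <- max(count[, 2]) + 1L  # assumption: V inferred from the largest v
NZ <- nrow(count)

# One topic per triplet, drawn uniformly; stored 0-based like the documented z.
z <- sample.int(K, NZ, replace = TRUE) - 1L

nd <- matrix(0L, D, K)  # document-topic counts
nw <- matrix(0L, K, V)  # topic-word counts
for (i in seq_len(NZ)) {
  d <- count[i, 1] + 1L  # shift to R's 1-based indexing
  v <- count[i, 2] + 1L
  k <- z[i] + 1L
  nd[d, k] <- nd[d, k] + count[i, 3]
  nw[k, v] <- nw[k, v] + count[i, 3]
}

ndsum <- as.integer(rowSums(nd))  # tokens per document, length D
nwsum <- as.integer(rowSums(nw))  # tokens per topic, length K
X <- nd / ndsum                   # row i of nd divided by ndsum[i]
```

Under these assumptions, `sum(ndsum)` and `sum(nwsum)` both equal the total token count `sum(count[, 3])`, and every row of `X` sums to 1.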