Supervised topic model (STM) variational inference with ELBO-based convergence.
Source:R/Rfunction.R
run_stm_vi.RdThis function performs supervised topic model (STM) using variational inference.
It initializes topic assignments from count (optionally using a
topic-word prior phi), estimates regression parameters, and repeatedly
calls the C++ routine stm_vi_parallel() until convergence.
Usage
run_stm_vi(
count,
y,
K,
alpha,
beta,
phi = NULL,
seed = NULL,
max_iter = 200L,
min_iter = 50L,
tol_elbo = 1e-04,
update_sigma = TRUE,
tau = 20L,
show_progress = TRUE,
chunk = 5000L,
verbose = TRUE,
sigma2_init = NULL
)Arguments
- count
Integer matrix with 3 columns (d, v, c) in 0-based indexing. Each row represents document index
d, word indexv, and token countc.- y
Numeric vector of length D. Must not contain NA values.
- K
Integer, number of topics. Required if
phiis NULL; ignored ifphiis provided (thenK = ncol(phi)).- alpha
Dirichlet prior parameter for document-topic distributions.
- beta
Dirichlet prior parameter for topic-word distributions.
- phi
Optional V x K topic-word probability matrix used only for initializing topic assignments.
- seed
Optional integer random seed used in the initialization step.
- max_iter
Maximum number of variational sweeps.
- min_iter
Minimum number of sweeps before checking ELBO convergence.
- tol_elbo
Numeric tolerance for relative ELBO change.
- update_sigma
Logical; if TRUE, update
sigma2each sweep.- tau
Numeric log-space cutoff used in
stm_vi_parallel().- show_progress
Logical; print low-level progress inside C++.
- chunk
Integer; number of documents per parallel block.
- verbose
Logical; print ELBO and relative change per sweep.
- sigma2_init
Optional numeric scalar specifying the initial noise variance. If NULL,
sigma2is estimated once by least squares.
Value
A list containing:
- nd
D x K document-topic count matrix.
- nw
K x V topic-word count matrix.
- ndsum
Length-D vector of document token counts.
- nwsum
Length-K vector of topic token counts.
- eta
K-dimensional regression coefficient vector.
- sigma2
Final noise variance.
- phi
V x K topic-word posterior mean.
- theta
D x K document-topic posterior mean.
- elbo
Final ELBO.
- label_loglik
Final supervised term.
- elbo_trace
ELBO values per sweep.
- label_loglik_trace
Label log-likelihood per sweep.
- n_iter
Number of iterations actually performed.
- D, V, K, NZ
Model dimensions.
Details
Convergence is assessed based on the relative changes in the evidence lower bound (ELBO) and the supervised label log-likelihood: $$ \frac{\mathrm{ELBO}_t - \mathrm{ELBO}_{t-1}}{|\mathrm{ELBO}_{t-1}|}, \qquad \frac{\ell_t - \ell_{t-1}}{|\ell_{t-1}|}. $$ After a minimum number of iterations, the algorithm is declared to have converged when both quantities are non-negative and smaller than the prescribed tolerance.
**Important:**
This function assumes that the response vector y contains **no NA**
values. The underlying C++ implementation does not skip missing responses
and requires y[d] to be finite for all documents.