Home

Last updated: 2025-06-24

Checks: 2 passed, 0 failed

Knit directory: symmetric_covariance_decomposition/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.

R Markdown file: up-to-date

Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Repository version: bad5eb5

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version bad5eb5. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rhistory

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

These are the previous versions of the repository in which changes were made to the R Markdown (analysis/index.Rmd) and HTML (docs/index.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File	Version	Author	Date	Message
Rmd	bad5eb5	Annie Xie	2025-06-24	Update home page
html	5b3d7b5	Annie Xie	2025-06-17	Build site.
Rmd	477d73d	Annie Xie	2025-06-17	Add new takeaways to website
html	996810e	Annie Xie	2025-06-13	Build site.
Rmd	f95548d	Annie Xie	2025-06-13	Add links to website
html	3b93884	Annie Xie	2025-06-06	Build site.
Rmd	4faf38a	Annie Xie	2025-06-06	Add links to website
html	11dfc4a	Annie Xie	2025-05-14	Build site.
Rmd	72a2eef	Annie Xie	2025-05-14	Add links to website
html	e4b389b	Annie Xie	2025-05-07	Build site.
Rmd	13ed7ad	Annie Xie	2025-05-07	Add links to website
html	a256140	Annie Xie	2025-04-28	Build site.
Rmd	062a8a6	Annie Xie	2025-04-28	Update website
html	5984551	Annie Xie	2025-04-28	Build site.
Rmd	590e851	Annie Xie	2025-04-28	Update website with links
html	c4604fa	Annie Xie	2025-04-09	Build site.
html	35ffb3e	Annie Xie	2025-04-08	Build site.
Rmd	6a821de	Annie Xie	2025-04-08	Update website home page
html	526f0f5	Annie Xie	2025-04-08	Build site.
Rmd	d86ba17	Annie Xie	2025-04-08	Update website home page
Rmd	0c38e1e	Annie Xie	2025-04-08	Publish initial index file
html	b957840	Annie Xie	2025-04-08	Build site.
Rmd	af8030a	Annie Xie	2025-04-08	Start workflowr project.

This site is for further exploring symmetric covariance decomposition methods.

Main takeaways thus far (updated June 24):

symEBcovMF with backfit and a point-exponential prior also does relatively well in the tree setting. However, the representation is not fully sparse (we also see this with GBCD and with symEBcovMF using the point-Laplace plus splitting initialization). I suspect this is due to model misspecification with regard to the noise. The corresponding analysis is here.
symEBcovMF backfit initialized with the point-Laplace plus splitting strategy does relatively well in the tree setting. One observation is that the representation is not fully sparse; some of the factors have small loadings in other populations alongside the main component. The corresponding analysis is here.
When applying symEBcovMF with a point-Laplace prior in the tree setting, the greedy method does not find the sparse representation. However, if you backfit for long enough, e.g. 20,000 iterations, symEBcovMF will eventually find the sparse representation. This behavior is also seen in flashier and EBCD. The corresponding analysis is here.
I think the greedy methods do not find the sparse representation because \(F\) is not exactly orthogonal (I generate the data matrix \(Y\) as \(Y = LF' + E\) where \(F_{ij} \overset{i.i.d.}{\sim} N(0,2^2)\)). It seems like the method is picking up on correlations between the factors. My investigation of this can be found here.
Flashier’s backfit is a lot faster than symEBcovMF’s backfit because of flashier’s extrapolation technique.
For symEBcovMF, refitting the lambda values after a new factor is added helps improve the fit. This is seen in the balanced, nonoverlapping analysis here. It can also help the method find a more complete representation. This is especially apparent in the tree setting analysis here.
For the sparse overlapping setting, the method performs better when the Kmax parameter is set to a larger number. This is seen in the analysis here.
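
On the orthogonality point above, a rough back-of-the-envelope calculation suggests how large the cross-factor correlations might be. This is only a sketch under assumptions not stated on this page: \(F\) is taken to be \(p \times K\), and the methods are taken to work with the scaled Gram matrix \(YY'/p\) (the symbols \(p\) and \(K\) and the \(1/p\) scaling are introduced here purely for illustration). Expanding \(Y = LF' + E\) gives

\[
\frac{1}{p} YY' = L \left( \frac{1}{p} F'F \right) L' + \frac{1}{p} \left( LF'E' + EFL' \right) + \frac{1}{p} EE'.
\]

With \(F_{ij} \overset{i.i.d.}{\sim} N(0,2^2)\), the entries of \(F'F/p\) satisfy, for \(j \neq j'\),

\[
\mathbb{E}\left[ \frac{1}{p} \sum_{i=1}^{p} F_{ij}^2 \right] = 4, \qquad
\mathbb{E}\left[ \frac{1}{p} \sum_{i=1}^{p} F_{ij} F_{ij'} \right] = 0, \qquad
\mathrm{Var}\left[ \frac{1}{p} \sum_{i=1}^{p} F_{ij} F_{ij'} \right] = \frac{16}{p}.
\]

So \(F'F/p\) equals \(4 I_K\) only in expectation. In any single simulated data set its off-diagonal entries are typically of size \(4/\sqrt{p}\), a relative deviation of about \(1/\sqrt{p}\) from exact orthogonality. Those entries add cross products between different columns of \(L\) to the Gram matrix, which is one way a greedy fit could end up with factors that are not sparse.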

symEBcovMF analyses:

Exploration of symEBcovMF in the balanced nonoverlapping setting
Exploration of symEBcovMF in the tree setting
Exploration of generalized binary vs pt-exp symEBcovMF in the tree setting
Exploration of different binary priors in tree setting
Exploration of symEBcovMF backfit in the tree setting
Exploration of symEBcovMF in the unbalanced nonoverlapping setting
Further exploration of symEBcovMF in the unbalanced nonoverlapping setting
Exploration of symEBcovMF backfit in the unbalanced nonoverlapping setting
Exploration of symEBcovMF in the sparse, overlapping setting
Exploration of symEBcovMF backfit in the sparse, overlapping setting
Exploration of symEBcovMF in the null setting

Analyses related to initialization:

Exploration of generalized binary symEBcovMF with point exponential initialization
Exploration of generalized binary symEBcovMF backfit initialized with point exponential
Exploration of point exponential symEBcovMF backfit initialized with point laplace plus splitting
Exploration of point exponential symEBcovMF backfit initialized with point laplace plus splitting part 2
Exploration of point laplace symEBcovMF backfit in tree setting
Exploration of point laplace greedy symEBcovMF in tree setting

Tree residual matrix example:

Checking convergence
Checking other binary priors
Checking orthogonality