This repository has been archived on 2024-07-04. You can view files and clone it, but cannot push or open issues or pull requests.
laydi/wiki/design.md


The basic design concepts

The design is built around two fundamental, related concepts: datasets and dimensions. A dataset is a matrix with a list of identifiers for each row and another list of identifiers for each column. The dimensions that the rows and the columns belong to are also stored.

Dimension

A dimension is simply the name of a domain that your data contains. Each element along a dimension is identified by a name that is defined to be unique across every dataset, plot and other program element that contains that dimension.

So, if we have a dimension named samples, which contains an identifier patient1, whenever this identifier is used in the samples dimension, it is assumed to refer to the same entity.

This allows the program to map between different plots and datasets: when patient1 in the samples dimension is selected in one plot, the selection can propagate to every other place in the program that displays information about samples.
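The propagation described above can be sketched as a small publish/subscribe broker keyed by dimension name. This is only an illustration under stated assumptions: `SelectionHub`, `subscribe`, `select` and `selected` are hypothetical names and are not taken from the laydi code base.

```python
from collections import defaultdict


class SelectionHub:
    """Hypothetical broker that shares selections between views, per dimension."""

    def __init__(self):
        self._listeners = defaultdict(list)  # dimension name -> callbacks
        self._selection = {}                 # dimension name -> selected identifiers

    def subscribe(self, dimension, callback):
        """Register a view that displays information along `dimension`."""
        self._listeners[dimension].append(callback)

    def select(self, dimension, identifiers):
        """Store the selection and notify every view subscribed to the dimension."""
        current = frozenset(identifiers)
        self._selection[dimension] = current
        for callback in self._listeners[dimension]:
            callback(dimension, current)

    def selected(self, dimension):
        """Return the identifiers currently selected along `dimension`."""
        return self._selection.get(dimension, frozenset())


# Two stand-in "views" that both display the samples dimension.
hub = SelectionHub()
seen = []
hub.subscribe("samples", lambda dim, ids: seen.append(("plot", dim, sorted(ids))))
hub.subscribe("samples", lambda dim, ids: seen.append(("table", dim, sorted(ids))))

# Selecting patient1 in one place notifies both views.
hub.select("samples", ["patient1"])
```

Because the selection is keyed by the dimension name and the identifier, a view never needs to know which dataset or plot the selection came from.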

Dataset

A dataset is a matrix in which both the rows and the columns are associated with a dimension. For example, a gene analysis study may have a dataset where the rows are tissue samples in the samples dimension and the columns are the measured genes in the genes dimension.
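A minimal sketch of such a dataset, assuming a plain list-of-lists matrix, might look like the following. The `Dataset` class, its attribute names, the numeric values and the gene identifiers `geneA` and `geneB` are all made up for illustration and do not reflect the actual laydi implementation.

```python
class Dataset:
    """Hypothetical dataset: a matrix plus a dimension name and identifiers per axis."""

    def __init__(self, matrix, row_dim, row_ids, col_dim, col_ids):
        if len(matrix) != len(row_ids):
            raise ValueError("one identifier is needed per row")
        if any(len(row) != len(col_ids) for row in matrix):
            raise ValueError("one identifier is needed per column")
        self.matrix = matrix
        self.row_dim, self.row_ids = row_dim, list(row_ids)
        self.col_dim, self.col_ids = col_dim, list(col_ids)
        # Lookup tables from identifier to position along each axis.
        self._row_index = {name: i for i, name in enumerate(self.row_ids)}
        self._col_index = {name: j for j, name in enumerate(self.col_ids)}

    def value(self, row_id, col_id):
        """Look up one cell by identifiers instead of by numeric position."""
        return self.matrix[self._row_index[row_id]][self._col_index[col_id]]


# Rows are tissue samples, columns are measured genes (illustrative values).
expression = Dataset(
    matrix=[[2.1, 0.4], [1.7, 0.9]],
    row_dim="samples", row_ids=["patient1", "patient2"],
    col_dim="genes", col_ids=["geneA", "geneB"],
)
```

Addressing cells by identifier rather than by index is what lets two datasets that share a dimension be lined up with each other.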

Annotations

Sometimes we want additional information associated with the identifiers along a dimension for display purposes. A gene is often represented by an identifier that is not very meaningful without being looked up in a database. If we also want extra information, such as the name of the gene, it is stored in annotations along the genes dimension.
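As a rough sketch, annotations can be pictured as a per-dimension lookup from identifier to display fields. The nested-dictionary layout, the `display_label` helper, and the identifiers and names below are assumptions made for illustration, not the format laydi actually uses.

```python
# Hypothetical annotation store: dimension name -> identifier -> fields.
annotations = {
    "genes": {
        "geneA": {"name": "Example gene A"},
        "geneB": {"name": "Example gene B"},
    },
}


def display_label(store, dimension, identifier, field="name"):
    """Return the annotated field, falling back to the raw identifier."""
    return store.get(dimension, {}).get(identifier, {}).get(field, identifier)


annotated = display_label(annotations, "genes", "geneA")
unannotated = display_label(annotations, "samples", "patient1")
```

Falling back to the bare identifier keeps views usable for dimensions or elements that have no annotations.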