# The basic design concepts

The design is based on two fundamental (and related) concepts: *datasets* and *dimensions*. A dataset is a matrix with a list of identifiers for its rows and another list of identifiers for its columns. The dimension that the rows belong to and the dimension that the columns belong to are also stored.

## Dimension

A dimension is just the name of a domain that your data contain. Each element along a dimension is identified by a name that is defined to be unique across every dataset, plot and other program element that contains that dimension.

So, if we have a dimension named `samples`, which contains an identifier `patient1`, whenever this identifier is used in the `samples` dimension, it is assumed to refer to the same entity.

This allows the program to map between different plots and datasets, so that when `patient1` in the `samples` dimension is selected in one plot, the selection can propagate to every other place in the program that displays some kind of information on samples.
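The propagation described above can be sketched as a small publish/subscribe broker keyed by dimension name. This is only an illustration of the idea, not the program's actual code: `SelectionHub`, `Plot`, and their methods are hypothetical names invented for this example.

```python
from collections import defaultdict


class SelectionHub:
    """Hypothetical broker that shares selections keyed by dimension name."""

    def __init__(self):
        # dimension name -> list of callbacks interested in that dimension
        self._listeners = defaultdict(list)

    def subscribe(self, dimension, callback):
        self._listeners[dimension].append(callback)

    def select(self, dimension, identifiers):
        # Notify every listener registered for this dimension, and only those.
        selected = frozenset(identifiers)
        for callback in self._listeners[dimension]:
            callback(selected)


class Plot:
    """Hypothetical view that remembers which identifiers are highlighted."""

    def __init__(self, name, hub, dimension):
        self.name = name
        self.highlighted = frozenset()
        hub.subscribe(dimension, self._on_selection)

    def _on_selection(self, identifiers):
        self.highlighted = identifiers


hub = SelectionHub()
scatter = Plot("scatter", hub, "samples")
table = Plot("table", hub, "samples")
gene_plot = Plot("loadings", hub, "genes")

# Selecting patient1 along `samples` reaches both sample views,
# while the view on the `genes` dimension is left untouched.
hub.select("samples", ["patient1"])
```

Because identifiers are unique within a dimension, the broker needs no knowledge of which dataset or plot a selection came from; the pair of dimension name and identifier is enough.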

## Dataset

A dataset is a matrix where both the rows and the columns are associated with a dimension. For example, a gene analysis study may have a dataset where the rows are tissue samples associated with the `samples` dimension and the columns are all the measured genes in the `genes` dimension.
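As a minimal sketch of that structure, assuming a plain list-of-lists matrix: the `Dataset` class, its field names, the gene identifiers, and the numbers below are all made up for illustration and are not taken from the program itself.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical dataset: a matrix plus labelled rows and columns."""

    values: list   # list of rows, each a list of numbers
    row_dim: str   # name of the dimension the rows belong to
    row_ids: list  # one identifier per row
    col_dim: str   # name of the dimension the columns belong to
    col_ids: list  # one identifier per column

    def __post_init__(self):
        # The identifier lists must match the shape of the matrix.
        if len(self.values) != len(self.row_ids):
            raise ValueError("one row identifier is needed per matrix row")
        if any(len(row) != len(self.col_ids) for row in self.values):
            raise ValueError("one column identifier is needed per matrix column")

    def value(self, row_id, col_id):
        """Look up a single cell by its row and column identifiers."""
        return self.values[self.row_ids.index(row_id)][self.col_ids.index(col_id)]


# Illustrative values only: two samples measured on two genes.
expression = Dataset(
    values=[[2.1, 0.4], [1.8, 0.9]],
    row_dim="samples",
    row_ids=["patient1", "patient2"],
    col_dim="genes",
    col_ids=["geneA", "geneB"],
)
```

With this layout, `expression.value("patient2", "geneA")` addresses a cell by identifiers rather than by numeric position, which is what lets two datasets that share a dimension be matched up.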

## Annotations

Sometimes we want additional information associated with the identifiers along a dimension for display purposes. A gene, for instance, is often represented by an identifier that is not very meaningful without being looked up in a database. Extra information of this kind, such as the name of the gene, is stored in *annotations* along the `genes` dimension.
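One way to picture annotations is as a lookup from dimension name to identifier to extra fields. The sketch below assumes a nested dictionary; the `annotations` layout, the `display_label` helper, and the gene names are hypothetical and only illustrate the idea.

```python
# Hypothetical annotation store: dimension name -> identifier -> extra fields.
annotations = {
    "genes": {
        "geneA": {"name": "Example gene A"},
        "geneB": {"name": "Example gene B"},
    }
}


def display_label(dimension, identifier, field="name"):
    """Return the annotated label, falling back to the raw identifier."""
    return annotations.get(dimension, {}).get(identifier, {}).get(field, identifier)
```

Here `display_label("genes", "geneA")` yields the readable name, while an identifier with no annotation is shown unchanged, so plots can always fall back to the raw identifier.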