This repository has been archived on 2024-07-04. You can view files and clone it, but cannot push or open issues or pull requests.
2023-01-25 13:36:26 +01:00
# The basic design concepts
The design is based on two fundamental, related concepts: *datasets* and *dimensions*. A dataset is a matrix with one list of identifiers for its rows and another for its columns. The dimensions that the rows and the columns belong to are stored with it.
## Dimension
A dimension is simply the name of a domain that your data contains. Each element along a dimension is identified by a name that is defined to be unique across every dataset, plot, and other program element that contains that dimension.
So, if we have a dimension named `samples`, which contains an identifier `patient1`, whenever this identifier is used in the `samples` dimension, it is assumed to refer to the same entity.
This allows the program to map between different plots and datasets: when `patient1` in the `samples` dimension is selected in one plot, the selection can propagate to every other place in the program that displays information about samples.
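
The propagation described above can be sketched as a small publish/subscribe hub keyed by dimension name. This is only an illustration of the idea, not the program's actual code: `SelectionBus`, `View`, and the view names are hypothetical.

```python
from collections import defaultdict


class SelectionBus:
    """Hypothetical hub that broadcasts selections per dimension."""

    def __init__(self):
        # Maps a dimension name to the callbacks interested in it.
        self._listeners = defaultdict(list)

    def subscribe(self, dimension, callback):
        self._listeners[dimension].append(callback)

    def select(self, dimension, identifiers):
        # Notify every listener registered on this dimension.
        selected = frozenset(identifiers)
        for callback in self._listeners[dimension]:
            callback(selected)


class View:
    """Hypothetical plot or table showing identifiers along one dimension."""

    def __init__(self, bus, dimension, identifiers):
        self.identifiers = set(identifiers)
        self.highlighted = set()
        bus.subscribe(dimension, self.on_selection)

    def on_selection(self, selected):
        # Highlight only the identifiers this view actually displays.
        self.highlighted = self.identifiers & selected


bus = SelectionBus()
scatter = View(bus, "samples", ["patient1", "patient2"])
table = View(bus, "samples", ["patient1", "patient3"])
gene_plot = View(bus, "genes", ["geneA", "geneB"])

# Selecting patient1 along "samples" reaches both sample views,
# but not the view that only knows about the "genes" dimension.
bus.select("samples", ["patient1"])
```

Because identifiers are unique within a dimension, the views do not need to know about each other; sharing the dimension name is enough to keep them in sync.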
## Dataset
A dataset is a matrix in which both the rows and the columns are associated with a dimension. For example, a gene analysis study may have a dataset where the rows are tissue samples in the `samples` dimension and the columns are the measured genes in the `genes` dimension.
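
As a minimal sketch of this structure, assuming a plain list-of-rows matrix: the `Dataset` class, its field names, and the values below are made up for illustration and are not taken from the program.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical dataset: a matrix plus labelled row and column dimensions."""

    values: list   # matrix stored as a list of rows
    row_dim: str   # name of the dimension along the rows
    row_ids: list  # one identifier per row
    col_dim: str   # name of the dimension along the columns
    col_ids: list  # one identifier per column

    def __post_init__(self):
        # The identifier lists must match the shape of the matrix.
        if len(self.values) != len(self.row_ids):
            raise ValueError("one row identifier is needed per row")
        if any(len(row) != len(self.col_ids) for row in self.values):
            raise ValueError("one column identifier is needed per column")

    def value(self, row_id, col_id):
        # Look a cell up by identifiers instead of numeric indices.
        return self.values[self.row_ids.index(row_id)][self.col_ids.index(col_id)]


# Illustrative numbers only: two samples measured on two genes.
expression = Dataset(
    values=[[0.5, 1.2], [0.7, 0.9]],
    row_dim="samples",
    row_ids=["patient1", "patient2"],
    col_dim="genes",
    col_ids=["geneA", "geneB"],
)

print(expression.value("patient2", "geneA"))
```

Addressing cells by identifier rather than by position is what lets two datasets that share a dimension be matched up, even if their rows are in a different order.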
## Annotations
Sometimes we want additional information associated with the identifiers along a dimension for display purposes. A gene is often represented by an identifier that is not very meaningful without being looked up in a database. So if we also want some extra information, like the name of the gene, this is stored in *annotations* along the `genes` dimension.
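
One way to picture annotations is as a lookup table per dimension, from identifier to extra fields. The sketch below assumes a nested dictionary and a hypothetical `label` helper; the identifiers and gene names are placeholders, not real data.

```python
# Hypothetical annotation store: dimension -> identifier -> extra fields.
annotations = {
    "genes": {
        "geneA": {"name": "Example gene A"},
        "geneB": {"name": "Example gene B"},
    },
}


def label(dimension, identifier, field="name"):
    """Return the annotated field, falling back to the raw identifier."""
    return annotations.get(dimension, {}).get(identifier, {}).get(field, identifier)


print(label("genes", "geneA"))  # annotated: shows the stored name
print(label("genes", "geneZ"))  # not annotated: shows the identifier itself
```

Keeping annotations separate from the datasets means the same display names are available wherever the `genes` dimension appears.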