Glossary

Biological terms

bulk data - data from a pooled assay in which each measurement is a weighted average over a bulk sample of cells from a particular tissue (i.e., a large population of cells), obscuring cell-to-cell variation

cancer biopsy - the removal of cells or tissues from a cancer patient for examination by a pathologist. The most common types include: (1) incisional biopsy, in which only a sample of tissue is removed; (2) excisional biopsy, in which an entire lump or suspicious area is removed; and (3) needle biopsy, in which a sample of tissue or fluid is removed with a needle. (“Definition of biopsy - NCI Dictionary of Cancer Terms - National Cancer Institute” 2018)

cancer subtype - describes the smaller groups that a type of cancer can be divided into, based on certain characteristics of the cancer cells. These characteristics include how the cancer cells look under a microscope and whether there are certain substances in or on the cells or certain changes to the DNA of the cells. It is important to know the subtype of a cancer in order to plan treatment and determine prognosis. (“Definition of cancer subtype - NCI Dictionary of Cancer Terms - National Cancer Institute” 2018)

cytotoxic - toxic to cells; cell-killing. Describes any agent or process that kills cells. Chemotherapy and radiotherapy are forms of cytotoxic therapy. (“Definition of Cytotoxic” 2018)

effector molecule - a small molecule that selectively binds to a protein and regulates its biological activity. In this manner, effector molecules act as ligands that can increase or decrease enzyme activity, gene expression, or cell signaling. (Cambronne and Roy 2006)

gene enrichment analysis - a means to characterize biological attributes in a given gene set (“Analysis of DNA Chips and Gene Networks c,” n.d.)

gene expression - the generation of a functional gene product from the information encoded by a gene, through the processes of transcription and translation. Gene products are often proteins; however, non-protein-coding genes can encode functional RNA, including ribosomal RNA (rRNA), transfer RNA (tRNA), or small nuclear RNA (snRNA). (“Gene expression definition” 2018)

genome - the full genetic complement of an organism, encoded in either DNA or, in many viruses, RNA. It includes both the genes and non-coding sequences. (“Genome” 2018)

immunosuppression - suppression of the immune system and its ability to fight infection. Immunosuppression may result from certain diseases, such as AIDS or lymphoma, or from certain drugs, such as some of those used to treat cancer. Immunosuppression may also be deliberately induced with drugs, as in preparation for bone marrow or other organ transplantation, to prevent the rejection of a transplant. Also known as immunodepression. (“Definition of Immunosuppression” 2018)

inducer - a molecule, usually a substrate of a specific enzyme pathway, that combines with and deactivates an active repressor (produced by a regulator gene); this allows a previously repressed operator gene to activate the structural genes it controls, resulting in enzyme production; a homeostatic mechanism for regulating enzyme production in an inducible enzyme system. (“Inducer (biology) | definition of Inducer (biology) by Medical dictionary” 2018)

ligand - a molecule, such as an antibody, hormone, or drug, that binds to a receptor. (“Ligand | Define Ligand at Dictionary.com” 2018)

liquid tissue - a collection of similar cells that perform a particular function; liquid tissues such as blood and lymph carry food, waste products, and hormones through the body (“Tissue & Organ Flashcards” 2018)

marker gene - a detectable genetic trait or segment of DNA that can be identified and tracked. A marker gene can serve as a flag for another gene, sometimes called the target gene. A marker gene must be on the same chromosome as the target gene and near enough to it so that the two genes (the marker gene and the target gene) are genetically linked and are usually inherited together. (“Definition of Marker gene” 2018)

molecular biology - the field of biology that studies the composition, structure and interactions of cellular molecules – such as nucleic acids and proteins – that carry out the biological processes essential for the cell’s functions and maintenance (“Molecular biology definition” 2018)

omics - informally refers to a field of study in biology ending in -omics, such as genomics, proteomics or metabolomics. (“Omics definition” 2018)

phenotyping - the attribution of a phenotype, i.e., the visible or observable expression of the results of genes, combined with the environmental influence on an organism’s appearance or behavior. (“Examples of Genotype & Phenotype” 2018)

receptor - in cell biology, a structure on the surface of a cell (or inside a cell) that selectively receives and binds a specific substance. There are many receptors: for example, there is a receptor for insulin and a receptor for low-density lipoproteins (LDL). (“Definition of Receptor” 2018)

subtyping - the division of a type of cancer into smaller groups (subtypes), based on certain characteristics of the cancer cells. These characteristics include how the cancer cells look under a microscope and whether there are certain substances in or on the cells or certain changes to the DNA of the cells. It is important to know the subtype of a cancer in order to plan treatment and determine prognosis. (“Definition of cancer subtype - NCI Dictionary of Cancer Terms - National Cancer Institute” 2018)

tumor purity - the proportion of cancer cells in the admixture (Aran, Sirota, and Butte 2015)

Mathematical terms

basis matrix - in cell-type deconvolution, the matrix of characteristic expression profiles for each of the cell types to be estimated, used in regression-based deconvolution
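As an illustration of how a basis matrix enters regression-based deconvolution, the following minimal sketch mixes two cell-type profiles into a bulk profile and then recovers the proportions by ordinary least squares. The matrix values, the function names `mix` and `estimate_two_proportions`, and the gene-by-cell-type orientation (rows are genes, columns are cell types) are assumptions made for this example only; published deconvolution methods typically add constraints such as non-negativity.

```python
def mix(basis, proportions):
    """Bulk expression per gene: proportion-weighted average of cell-type profiles."""
    return [sum(b * p for b, p in zip(row, proportions)) for row in basis]


def estimate_two_proportions(basis, bulk):
    """Unconstrained least-squares proportions for exactly two cell types.

    Solves the 2x2 normal equations (B^T B) x = B^T y in closed form.
    """
    a = sum(r[0] * r[0] for r in basis)
    b = sum(r[0] * r[1] for r in basis)
    d = sum(r[1] * r[1] for r in basis)
    u = sum(r[0] * y for r, y in zip(basis, bulk))
    v = sum(r[1] * y for r, y in zip(basis, bulk))
    det = a * d - b * b  # zero when the two profiles are collinear
    if det == 0:
        raise ValueError("cell-type profiles are collinear")
    return [(d * u - b * v) / det, (a * v - b * u) / det]


# Hypothetical basis matrix: 3 genes (rows) x 2 cell types (columns).
basis = [
    [10.0, 0.0],  # gene expressed only in cell type 1
    [0.0, 8.0],   # gene expressed only in cell type 2
    [4.0, 4.0],   # gene expressed equally in both
]
true_proportions = [0.75, 0.25]

bulk = mix(basis, true_proportions)
estimated = estimate_two_proportions(basis, bulk)
print(bulk, estimated)
```

With noise-free data the estimate matches the true proportions; with real data the quality of the estimate depends on how distinct the cell-type profiles are.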

Bayes’ theorem - a formula that describes how to update the probabilities of hypotheses when given evidence (“Bayes’ Theorem and Conditional Probability | Brilliant Math & Science Wiki” 2018)
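A small worked example may help: Bayes’ theorem states P(H | E) = P(E | H) P(H) / P(E). The sketch below applies it to a two-hypothesis case; the function name `posterior` and the numbers (a 1% prior, a 90% true-positive rate, a 5% false-positive rate) are invented for illustration.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """P(H | E) for a hypothesis H and its complement, via Bayes' theorem."""
    # Total probability of the evidence, summed over both hypotheses.
    p_evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
    return p_evidence_given_h * prior / p_evidence


# Hypothetical numbers: H is rare (1%), the evidence is fairly specific.
updated = posterior(prior=0.01, p_evidence_given_h=0.9, p_evidence_given_not_h=0.05)
print(round(updated, 4))
```

Even with fairly reliable evidence, the updated probability stays well below one half here because the prior is so small.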

condition number - the condition number of a function with respect to an argument measures how much the output value of the function can change for a small change in the input argument. A problem with a low condition number is said to be well-conditioned, while a problem with a high condition number is said to be ill-conditioned. In linear regression, the condition number can be used as a diagnostic for multicollinearity (Belsley, Kuh, and Welsch 1980).
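To make the regression use concrete, here is a minimal sketch that computes the 2-norm condition number of a two-column design matrix as the ratio of its largest to smallest singular value, using the closed-form eigenvalues of the 2x2 matrix X^T X. The function name and the example matrices are assumptions for illustration, and the sketch does not apply the column scaling that some regression diagnostics use.

```python
import math


def condition_number_two_columns(X):
    """2-norm condition number of a matrix with exactly two columns.

    The singular values of X are the square roots of the eigenvalues of
    the symmetric 2x2 matrix X^T X = [[a, b], [b, d]].
    """
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X)
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b * b)
    lam_max, lam_min = mean + radius, mean - radius
    if lam_min <= 0:
        return math.inf  # columns are exactly collinear
    return math.sqrt(lam_max / lam_min)


# Orthogonal columns: as well-conditioned as possible.
orthogonal = [[1.0, 0.0], [0.0, 1.0]]
# Nearly identical columns: a hypothetical case of strong multicollinearity.
nearly_collinear = [[1.0, 1.0], [1.0, 1.01], [1.0, 0.99]]

cond_orthogonal = condition_number_two_columns(orthogonal)
cond_collinear = condition_number_two_columns(nearly_collinear)
print(cond_orthogonal, cond_collinear)
```

The nearly collinear matrix gives a condition number in the hundreds, while the orthogonal one gives 1.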

correlation - a quantity measuring the extent of the interdependence of variable quantities; a mutual relationship or connection between two or more things (“correlation - Google Search” 2018)

covariance - the covariance of two variables x and y in a data set measures how the two are linearly related. A positive covariance would indicate a positive linear relationship between the variables, and a negative covariance would indicate the opposite. (“Covariance | R Tutorial” 2018)
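The relationship between covariance and correlation can be sketched in a few lines: the sample covariance averages the products of deviations from the means, and the Pearson correlation rescales it by the two standard deviations so that it lies between -1 and 1. The function names and data below are illustrative only.

```python
import math


def sample_covariance(x, y):
    """Sample covariance with the usual n - 1 denominator."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def pearson_correlation(x, y):
    """Covariance rescaled by the product of the standard deviations."""
    sd_x = math.sqrt(sample_covariance(x, x))
    sd_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (sd_x * sd_y)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]  # y = 2x, a perfect positive linear relationship

cov_xy = sample_covariance(x, y)
corr_xy = pearson_correlation(x, y)
corr_reversed = pearson_correlation(x, y[::-1])
print(cov_xy, corr_xy, corr_reversed)
```

Unlike covariance, the correlation does not change when either variable is rescaled, which makes it easier to compare across variable pairs.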

deconvolution - Math: the resolution of a convolution function into the functions from which it was formed in order to separate their effects; Common: a process of resolving something into its constituent elements or removing complication (“deconvolution | Definition of deconvolution in English by Oxford Dictionaries” 2018)

diagonal matrix - a matrix having non-zero elements only in the diagonal running from the upper left to the lower right. (“diagonal matrix - Google Search” 2018)

dimension reduction - the process of reducing the number of random variables under consideration, by obtaining a set of principal variables. It can be divided into feature selection and feature extraction. (“Introduction to Dimensionality Reduction - GeeksforGeeks” 2018)

eigenvalue - one of a special set of scalars associated with a linear system of equations (i.e., a matrix equation), sometimes also known as characteristic roots, characteristic values, proper values, or latent roots. (Weisstein, n.d.)

eigenvector - one of a special set of vectors associated with a linear system of equations (i.e., a matrix equation), sometimes also known as characteristic vectors, proper vectors, or latent vectors. Each eigenvector is paired with a corresponding eigenvalue. (Weisstein, n.d.)
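The pairing of eigenvalues and eigenvectors can be written compactly. For a square matrix A, a non-zero vector v is an eigenvector with eigenvalue λ when multiplying by A only rescales v; the symbols A, v, and λ are notation chosen for this note. A diagonal matrix gives the simplest example, since its diagonal entries are its eigenvalues.

```latex
A\mathbf{v} = \lambda\mathbf{v}, \qquad \mathbf{v} \neq \mathbf{0}

% Example: a diagonal matrix
\begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \end{pmatrix}
= 2 \begin{pmatrix} 1 \\ 0 \end{pmatrix},
\qquad
\begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}
\begin{pmatrix} 0 \\ 1 \end{pmatrix}
= 3 \begin{pmatrix} 0 \\ 1 \end{pmatrix}
```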

matrix - a rectangular array of quantities or expressions in rows and columns that is treated as a single entity and manipulated according to particular rules. (“matrix | Definition of matrix in English by Oxford Dictionaries” 2018)

matrix factorisation - the decomposition of a matrix into two (or more) matrices whose product gives back the original matrix. (“Matrix Factorization: A Simple Tutorial and Implementation in Python | Albert Au Yeung” 2018)
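The defining property can be checked directly: multiplying the factors should reproduce the original matrix. The sketch below verifies this for a hand-picked rank-one example; it does not show how to find the factors, and the names `matmul`, `W`, and `H` are chosen for illustration.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


# Hypothetical 3 x 2 matrix and a factorisation into a 3 x 1 and a 1 x 2 matrix.
original = [[2, 4], [3, 6], [1, 2]]
W = [[2], [3], [1]]
H = [[1, 2]]

reconstructed = matmul(W, H)
print(reconstructed == original)
```

Here the six entries of the original matrix are described by only five numbers in the factors, which is why factorisation is also used for dimension reduction.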

metagene - an aggregate pattern of gene expression (“Metagene dictionary definition | metagene defined” 2018)

metasample - an aggregation of values for individual samples

Monte Carlo simulation - a simulation that takes into account the variability of the inputs; a mathematical technique that generates random variables for modelling the risk or uncertainty of a system. (“Definition of Monte Carlo Simulation | What is Monte Carlo Simulation ? Monte Carlo Simulation Meaning - The Economic Times” 2018)
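A minimal sketch of the idea: draw the uncertain inputs many times from assumed distributions, compute the output for each draw, and summarise the spread of the results. The two normally distributed inputs, their parameters, and the function name `simulate_total` are hypothetical choices for this example.

```python
import random


def simulate_total(n_draws, seed=0):
    """Monte Carlo estimate of the mean and spread of a sum of two noisy inputs."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    totals = [
        rng.gauss(10.0, 1.0) + rng.gauss(5.0, 2.0)  # assumed input distributions
        for _ in range(n_draws)
    ]
    mean = sum(totals) / n_draws
    variance = sum((t - mean) ** 2 for t in totals) / (n_draws - 1)
    return mean, variance ** 0.5


mean_total, sd_total = simulate_total(20000)
print(mean_total, sd_total)
```

For this simple sum the exact answer is known (mean 15, standard deviation √5 ≈ 2.24), so the simulation can be checked against it; the technique is most useful when no such closed form exists.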

multicollinearity - a state of very high intercorrelations or inter-associations among the independent variables. It is therefore a type of disturbance in the data, and if present in the data the statistical inferences made about the data may not be reliable. (“Multicollinearity - Statistics Solutions” 2018)

p-value - the probability of obtaining an observation as extreme as, or more extreme than, the one actually observed, if the null hypothesis is true. So, we assume the null hypothesis is true and then determine how different the tested sample really is. If it is not that different (a large p-value), then the null hypothesis is not rejected. The smaller the p-value, the stronger the evidence against the null hypothesis; when it falls below a chosen significance level, the null hypothesis is rejected. (“What is a p-value? - MathBootCamps” 2018)
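As a worked example, consider testing whether a coin is fair (the null hypothesis) after observing 9 heads in 10 flips. The one-sided p-value is the probability, under a fair coin, of seeing 9 or more heads. The function name and the scenario are invented for illustration.

```python
from math import comb


def one_sided_binomial_p_value(heads, flips):
    """P(at least `heads` heads in `flips` fair-coin flips)."""
    # Count the outcomes at least as extreme as the observation,
    # then divide by the total number of equally likely outcomes.
    return sum(comb(flips, k) for k in range(heads, flips + 1)) / 2 ** flips


p_extreme = one_sided_binomial_p_value(9, 10)
p_typical = one_sided_binomial_p_value(5, 10)
print(p_extreme, p_typical)
```

Nine heads gives a p-value of about 0.011, which would lead to rejecting the null hypothesis at the 0.05 level, whereas five heads gives a large p-value and no grounds for rejection.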

References

“Definition of biopsy - NCI Dictionary of Cancer Terms - National Cancer Institute.” 2018. Accessed July 23. https://www.cancer.gov/publications/dictionaries/cancer-terms/def/biopsy.

“Definition of cancer subtype - NCI Dictionary of Cancer Terms - National Cancer Institute.” 2018. Accessed July 23. https://www.cancer.gov/publications/dictionaries/cancer-terms/def/cancer-subtype.

“Definition of Cytotoxic.” 2018. Accessed July 23. https://www.medicinenet.com/script/main/art.asp?articlekey=19883.

Cambronne, Eric D, and Craig R Roy. 2006. “Recognition and delivery of effector proteins into eukaryotic cells by bacterial secretion systems.” Traffic 7 (8): 929–39. doi:10.1111/j.1600-0854.2006.00446.x.

“Gene expression definition.” 2018. Accessed July 23. https://www.nature.com/subjects/gene-expression.

“Definition of Immunosuppression.” 2018. Accessed July 23. https://www.medicinenet.com/script/main/art.asp?articlekey=3942.

“Inducer (biology) | definition of Inducer (biology) by Medical dictionary.” 2018. Accessed July 23. https://medical-dictionary.thefreedictionary.com/Inducer+(biology).

“Ligand | Define Ligand at Dictionary.com.” 2018. Accessed July 23. http://www.dictionary.com/browse/ligand.

“Tissue & Organ Flashcards.” 2018. Accessed July 23. https://www.flashcardmachine.com/tissue-organ.html.

“Definition of Marker gene.” 2018. Accessed July 23. https://www.medicinenet.com/script/main/art.asp?articlekey=6731.

“Molecular biology definition.” 2018. Accessed July 23. https://www.nature.com/subjects/molecular-biology.

“Omics definition.” 2018. Accessed July 23. https://en.wikipedia.org/wiki/Omics.

“Examples of Genotype & Phenotype.” 2018. Accessed July 23. http://examples.yourdictionary.com/examples-of-genotype-phenotype.html.

“Definition of Receptor.” 2018. Accessed July 23. https://www.medicinenet.com/script/main/art.asp?articlekey=5236.

Aran, Dvir, Marina Sirota, and Atul J Butte. 2015. “Systematic pan-cancer analysis of tumour purity.” Nat. Commun. 6. Nature Publishing Group: 8971. doi:10.1038/ncomms9971.

“Bayes’ Theorem and Conditional Probability | Brilliant Math & Science Wiki.” 2018. Accessed July 23. https://brilliant.org/wiki/bayes-theorem/.

Belsley, David A., Edwin Kuh, and Roy E. Welsch. 1980. Regression diagnostics: identifying influential data and sources of collinearity. Wiley Series in Probability and Statistics. Wiley. http://doi.wiley.com/10.1002/0471725153.

“Covariance | R Tutorial.” 2018. Accessed July 23. http://www.r-tutor.com/elementary-statistics/numerical-measures/covariance.

“deconvolution | Definition of deconvolution in English by Oxford Dictionaries.” 2018. Accessed July 3. https://en.oxforddictionaries.com/definition/deconvolution.

“Introduction to Dimensionality Reduction - GeeksforGeeks.” 2018. Accessed July 23. https://www.geeksforgeeks.org/dimensionality-reduction/.

“matrix | Definition of matrix in English by Oxford Dictionaries.” 2018. Accessed July 23. https://en.oxforddictionaries.com/definition/matrix.

“Matrix Factorization: A Simple Tutorial and Implementation in Python | Albert Au Yeung.” 2018. Accessed July 23. http://www.albertauyeung.com/post/python-matrix-factorization/.

“Metagene dictionary definition | metagene defined.” 2018. Accessed July 23. http://www.yourdictionary.com/metagene.

“Definition of Monte Carlo Simulation | What is Monte Carlo Simulation ? Monte Carlo Simulation Meaning - The Economic Times.” 2018. Accessed July 23. https://economictimes.indiatimes.com/definition/monte-carlo-simulation.

“Multicollinearity - Statistics Solutions.” 2018. Accessed July 23. http://www.statisticssolutions.com/multicollinearity/.

“What is a p-value? - MathBootCamps.” 2018. Accessed July 23. https://www.mathbootcamps.com/what-is-a-p-value/.