| Title: | Protein Quantification in Mass Spectrometry-Based Proteomics |
|---|---|
| Description: | An implementation of the MaxLFQ algorithm by Cox et al. (2014) <doi:10.1074/mcp.M113.031591> in a comprehensive pipeline for processing proteomics data in data-independent acquisition mode (Pham et al. 2020 <doi:10.1093/bioinformatics/btz961>; Pham et al. 2026 <doi:10.1021/acs.jproteome.5c01038>). It offers additional options for protein quantification using the N most intense fragment ions, using all fragment ions, the median polish algorithm by Tukey (1977, ISBN:0201076160), and a robust linear model. In general, the tool can be used to integrate multiple proportional observations into a single quantitative value. |
| Authors: | Thang Pham [aut, cre, cph, ctb]
|
| Maintainer: | Thang Pham <[email protected]> |
| License: | BSD_3_clause + file LICENSE |
| Version: | 2.0.1 |
| Built: | 2026-05-20 09:43:58 UTC |
| Source: | https://github.com/tvpham/iq |
Two samples are directly connected if they share at least one quantified fragment ion. Two samples in a connected component are either directly connected or indirectly via other samples.
connected_component(X)connected_component(X)
X |
A matrix of intensities in the log2 space. Columns are samples and rows are fragment ions. |
A vector with length equal to the number of columns of the input containing membership of the connected components.
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
For each protein, a numerical matrix is formed where the columns are samples and rows are fragment ions.
create_protein_list(preprocessed_data)create_protein_list(preprocessed_data)
preprocessed_data |
A data frame of four components as output of the |
A list where each element contains the quantitative data of a protein. The column names are sample names and the row names fragment ions.
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
data("spikeins") head(spikeins) # This example set of spike-in proteins has been 'median-normalized'. norm_data <- iq::preprocess(spikeins, median_normalization = FALSE, pdf_out = NULL) protein_list <- iq::create_protein_list(norm_data)data("spikeins") head(spikeins) # This example set of spike-in proteins has been 'median-normalized'. norm_data <- iq::preprocess(spikeins, median_normalization = FALSE, pdf_out = NULL) protein_list <- iq::create_protein_list(norm_data)
Travels through the input list and quantifies all proteins one by one.
create_protein_table(protein_list, method = "maxLFQ", ...)create_protein_table(protein_list, method = "maxLFQ", ...)
protein_list |
The input protein list |
method |
Possible values are "maxLFQ", "median_polish", "topN", and "meanInt". |
... |
Additional parameters for individual quantitation methods. |
A list of two components is returned
estimate |
A table of protein abundances for all samples in log2 space. |
annotation |
A vector of annotations, one for each protein. |
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
create_protein_list, maxLFQ, median_polish, topN, meanInt
data("spikeins") # This example set of spike-in proteins has been 'median-normalized'. norm_data <- iq::preprocess(spikeins, median_normalization = FALSE, pdf_out = NULL) protein_list <- iq::create_protein_list(norm_data) result <- iq::create_protein_table(protein_list) head(result)data("spikeins") # This example set of spike-in proteins has been 'median-normalized'. norm_data <- iq::preprocess(spikeins, median_normalization = FALSE, pdf_out = NULL) protein_list <- iq::create_protein_list(norm_data) result <- iq::create_protein_table(protein_list) head(result)
Extracts annotation columns from a long-format input
extract_annotation(protein_ids, quant_table, primary_id = "PG.ProteinGroups", annotation_columns = NULL)extract_annotation(protein_ids, quant_table, primary_id = "PG.ProteinGroups", annotation_columns = NULL)
protein_ids |
A vector of protein ids. |
quant_table |
A long-format input table. The input is typically the same as input to the |
primary_id |
The column containing protein ids. |
annotation_columns |
A vector of columns for annotation. |
A table of proteins and associated annotation extracted from the input.
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
data("spikeins") extra_names <- iq::extract_annotation(levels(spikeins$PG.ProteinGroups), spikeins, annotation_columns = c("PG.Genes", "PG.ProteinNames"))data("spikeins") extra_names <- iq::extract_annotation(levels(spikeins$PG.ProteinGroups), spikeins, annotation_columns = c("PG.Genes", "PG.ProteinNames"))
A fast implementation of the MaxLFQ algorithm.
fast_MaxLFQ(norm_data, row_names = NULL, col_names = NULL)fast_MaxLFQ(norm_data, row_names = NULL, col_names = NULL)
norm_data |
A list of four vectors with equal length |
row_names |
A vector of character strings for row names. If |
col_names |
A vector of character strings for column names. If |
A list is returned with two components
estimate |
A quantification result table in log2 space. |
annotation |
A vector of strings indicating membership in case of multiple connected components for each row of |
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
Filters out low intensities and performs median normalization.
fast_preprocess(quant_table, median_normalization = TRUE, log2_intensity_cutoff = 0, pdf_out = "qc-plots-fast.pdf", pdf_width = 12, pdf_height = 8, show_boxplot = TRUE)fast_preprocess(quant_table, median_normalization = TRUE, log2_intensity_cutoff = 0, pdf_out = "qc-plots-fast.pdf", pdf_width = 12, pdf_height = 8, show_boxplot = TRUE)
quant_table |
The |
median_normalization |
A logical value. The default |
log2_intensity_cutoff |
Entries lower than this value in log2 space are ignored. Plot a histogram of all intensities to set this parameter. |
pdf_out |
A character string specifying the name of the PDF output. A |
pdf_width |
Width of the pdf output in inches. |
pdf_height |
Height of the pdf output in inches. |
show_boxplot |
A logical value. The default |
A list is returned with the same components as input data in which low intensities are filtered out and median normalization is performed if requested.
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
A highly efficient reading of a tab-separated text file for iq processing.
fast_read(filename, sample_id = "R.Condition", primary_id = "PG.ProteinGroups", secondary_id = c("EG.ModifiedSequence", "FG.Charge", "F.FrgIon", "F.Charge"), intensity_col = "F.PeakArea", annotation_col = c("PG.Genes", "PG.ProteinNames"), filter_string_equal = c("F.ExcludedFromQuantification" = "False"), filter_string_not_equal = NULL, filter_double_less = c("PG.Qvalue" = "0.01", "EG.Qvalue" = "0.01"), filter_double_greater = NULL, intensity_col_sep = NULL, intensity_col_id = NULL, na_string = "0")fast_read(filename, sample_id = "R.Condition", primary_id = "PG.ProteinGroups", secondary_id = c("EG.ModifiedSequence", "FG.Charge", "F.FrgIon", "F.Charge"), intensity_col = "F.PeakArea", annotation_col = c("PG.Genes", "PG.ProteinNames"), filter_string_equal = c("F.ExcludedFromQuantification" = "False"), filter_string_not_equal = NULL, filter_double_less = c("PG.Qvalue" = "0.01", "EG.Qvalue" = "0.01"), filter_double_greater = NULL, intensity_col_sep = NULL, intensity_col_id = NULL, na_string = "0")
filename |
A long-format tab-separated text file with a primary column of protein identification, secondary columns of fragment ions, a column of sample names, a column for quantitative intensities, and extra columns for annotation. |
primary_id |
Unique values in this column form the list of proteins to be quantified. |
secondary_id |
A concatenation of these columns determines the fragment ions used for quantification. |
sample_id |
Unique values in this column form the list of samples. |
intensity_col |
The column for intensities. |
annotation_col |
Annotation columns |
filter_string_equal |
A named vector of strings. Only rows satisfying the condition are kept. |
filter_string_not_equal |
A named vector of strings. Only rows satisfying the condition are kept. |
filter_double_less |
A named vector of strings. Only rows satisfying the condition are kept. Default PG.Qvalue < 0.01 and EG.Qvalue < 0.01. |
filter_double_greater |
A named vector of strings. Only rows satisfying the condition are kept. |
intensity_col_sep |
A separator character when entries in the intensity column contain multiple values. |
intensity_col_id |
The column for identities of multiple quantitative values. |
na_string |
The value considered as NA. |
When entries in the intensity column contain multiple values, this function will replicate entries in other column and the secondary_id will be appended with corresponding entries in intensity_col_id when it is provided. Otherwise, integer values 1, 2, 3, etc... will be used.
A list is returned with following components
protein |
A table of proteins in the first column followed by annotation columns. |
sample |
A vector of samples. |
ion |
A vector of fragment ions to be used for quantification. |
quant_table |
A list of four components: protein_list (index pointing to |
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
This function reads long-format input data, filters it, and writes the results to the iq format.
long_format_to_iq_format(input_data, output_data, sample_id = "Run", primary_id = "Protein.Group", secondary_id = c("Precursor.Id"), intensity_col = "Intensities", annotation_col = NULL, filter_string_equal = NULL, filter_string_not_equal = NULL, filter_double_less = NULL, filter_double_greater = NULL, intensity_col_sep = ";", intensity_col_id = NULL, na_string = "0", normalization = "median", log2_intensity_cutoff = 0, pdf_out = "qc-plots-iq.pdf", pdf_width = 12, pdf_height = 8, show_boxplot = FALSE)long_format_to_iq_format(input_data, output_data, sample_id = "Run", primary_id = "Protein.Group", secondary_id = c("Precursor.Id"), intensity_col = "Intensities", annotation_col = NULL, filter_string_equal = NULL, filter_string_not_equal = NULL, filter_double_less = NULL, filter_double_greater = NULL, intensity_col_sep = ";", intensity_col_id = NULL, na_string = "0", normalization = "median", log2_intensity_cutoff = 0, pdf_out = "qc-plots-iq.pdf", pdf_width = 12, pdf_height = 8, show_boxplot = FALSE)
input_data |
See |
output_data |
Output data. |
sample_id |
See |
primary_id |
See |
secondary_id |
See |
intensity_col |
See |
annotation_col |
See |
filter_string_equal |
See |
filter_string_not_equal |
See |
filter_double_less |
See |
filter_double_greater |
See |
intensity_col_sep |
See |
intensity_col_id |
See |
na_string |
See |
normalization |
Normalization type. Possible values are |
log2_intensity_cutoff |
See |
pdf_out |
See |
pdf_width |
See |
pdf_height |
See |
show_boxplot |
See |
The output is written to a new directory named output_data.
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
Estimates protein abundances by aiming to maintain the fragment intensity ratios between samples.
maxLFQ(X)maxLFQ(X)
X |
A matrix of ion intensities in log2 space. Columns are samples and rows are fragment ions. |
A list of two components is returned
estimate |
A vector with length equal to the number of columns of the input containing the protein abundances in log2 space. |
annotation |
An empty string if all quantified samples are connected. Otherwise, a string of membership of the connected components is returned. |
Thang V. Pham
Cox J, Hein MY, Luber CA, et al. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol Cell Proteomics. 2014;13(9):2513–2526.
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
Estimates protein abundances by averaging all associated ion intensities
meanInt(X, aggregation_in_log_space = TRUE)meanInt(X, aggregation_in_log_space = TRUE)
X |
A matrix of ion intensities in log2 space. Columns are samples and rows are fragment ions. |
aggregation_in_log_space |
A logical value. If |
A list of two components is returned
estimate |
A vector with length equal to the number of columns of the input containing the protein abundances in log2 space. |
annotation |
Reserved, currently an empty string. |
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
Estimates protein abundances using the Tukey median polish algorithm.
median_polish(X)median_polish(X)
X |
A matrix of ion intensities in log2 space. Columns are samples and rows are fragment ions. |
A list of two components is returned
estimate |
A vector with length equal to the number of columns of the input containing the protein abundances in log2 space. |
annotation |
Reserved, currently an empty string |
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
Tukey JW. Exploratory Data Analysis, Reading Massachusetts: Addison-Wesley, 1977.
Displays the underlying data for a protein.
plot_protein(X, main = "", col = NULL, split = 0.6, ...)plot_protein(X, main = "", col = NULL, split = 0.6, ...)
X |
Protein data matrix. |
main |
Title of the plot. |
col |
Colors of the rows of the data matrix. |
split |
Fraction of the plotting area for the main figure. The remaining one is for legend. Set this parameter to |
... |
Additional parameters for plotting. |
A NULL value is returned.
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
data("spikeins") head(spikeins) # This example set of spike-in proteins has been 'median-normalized'. norm_data <- iq::preprocess(spikeins, median_normalization = FALSE, pdf_out = NULL) protein_list <- iq::create_protein_list(norm_data) iq::plot_protein(protein_list$P00366, main = "Protein P00366", split = NULL)data("spikeins") head(spikeins) # This example set of spike-in proteins has been 'median-normalized'. norm_data <- iq::preprocess(spikeins, median_normalization = FALSE, pdf_out = NULL) protein_list <- iq::create_protein_list(norm_data) iq::plot_protein(protein_list$P00366, main = "Protein P00366", split = NULL)
Prepares a long-format input including removing low-intensity ions and performing median normalization.
preprocess(quant_table, primary_id = "PG.ProteinGroups", secondary_id = c("EG.ModifiedSequence", "FG.Charge", "F.FrgIon", "F.Charge"), sample_id = "R.Condition", intensity_col = "F.PeakArea", median_normalization = TRUE, log2_intensity_cutoff = 0, pdf_out = "qc-plots.pdf", pdf_width = 12, pdf_height = 8, intensity_col_sep = NULL, intensity_col_id = NULL, na_string = "0", show_boxplot = TRUE)preprocess(quant_table, primary_id = "PG.ProteinGroups", secondary_id = c("EG.ModifiedSequence", "FG.Charge", "F.FrgIon", "F.Charge"), sample_id = "R.Condition", intensity_col = "F.PeakArea", median_normalization = TRUE, log2_intensity_cutoff = 0, pdf_out = "qc-plots.pdf", pdf_width = 12, pdf_height = 8, intensity_col_sep = NULL, intensity_col_id = NULL, na_string = "0", show_boxplot = TRUE)
quant_table |
A long-format table with a primary column of protein identification, secondary columns of fragment ions, a column of sample names, and a column for quantitative intensities. |
primary_id |
Unique values in this column form the list of proteins to be quantified. |
secondary_id |
A concatenation of these columns determines the fragment ions used for quantification. |
sample_id |
Unique values in this column form the list of samples. |
intensity_col |
The column for intensities. |
median_normalization |
A logical value. The default |
log2_intensity_cutoff |
Entries lower than this value in log2 space are ignored. Plot a histogram of all intensities to set this parameter. |
pdf_out |
A character string specifying the name of the PDF output. A |
pdf_width |
Width of the pdf output in inches. |
pdf_height |
Height of the pdf output in inches. |
intensity_col_sep |
A separator character when entries in the intensity column contain multiple values. |
intensity_col_id |
The column for identities of multiple quantitative values. |
na_string |
The value considered as NA. |
show_boxplot |
A logical value. The default |
When entries in the intensity column contain multiple values, this function will replicate entries in other column and the secondary_id will be appended with corresponding entries in intensity_col_id when it is provided. Otherwise, integer values 1, 2, 3, etc... will be used.
A data frame is returned with following components
protein_list |
A vector of proteins. |
sample_list |
A vector of samples. |
id |
A vector of fragment ions to be used for quantification. |
quant |
A vector of log2 intensities. |
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
data("spikeins") head(spikeins) # This example set of spike-in proteins has been 'median-normalized'. norm_data <- iq::preprocess(spikeins, median_normalization = FALSE, pdf_out = NULL)data("spikeins") head(spikeins) # This example set of spike-in proteins has been 'median-normalized'. norm_data <- iq::preprocess(spikeins, median_normalization = FALSE, pdf_out = NULL)
Estimates protein abundance of all proteins in the iq format data.
process_iq_format(input_data, output_filename = NULL, method = "maxlfq_bit", p1 = NULL, p2 = NULL, k = 1.345, min_M = 15, n_threads = -1, rescale_method = NULL)process_iq_format(input_data, output_filename = NULL, method = "maxlfq_bit", p1 = NULL, p2 = NULL, k = 1.345, min_M = 15, n_threads = -1, rescale_method = NULL)
input_data |
Input data in the iq format. |
output_filename |
Output file name. If |
method |
See |
p1 |
See |
p2 |
See |
k |
See |
min_M |
See |
n_threads |
See |
rescale_method |
Rescale the output intensities. See |
A tab-separated table with rows representing proteins and columns representing protein annotations and samples.
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
A convenient function combining multiple steps to process a long format table using the MaxLFQ algorithm.
process_long_format(input_data, output_filename, sample_id = "File.Name", primary_id = "Protein.Group", secondary_id = "Precursor.Id", intensity_col = "Fragment.Quant.Corrected", annotation_col = NULL, filter_string_equal = NULL, filter_string_not_equal = NULL, filter_double_less = c("Q.Value" = "0.01", "PG.Q.Value" = "0.01"), filter_double_greater = NULL, intensity_col_sep = ";", intensity_col_id = NULL, na_string = "0", normalization = "median", log2_intensity_cutoff = 0, pdf_out = "qc-plots.pdf", pdf_width = 12, pdf_height = 8, show_boxplot = TRUE, peptide_extractor = NULL, rfasta = NULL)process_long_format(input_data, output_filename, sample_id = "File.Name", primary_id = "Protein.Group", secondary_id = "Precursor.Id", intensity_col = "Fragment.Quant.Corrected", annotation_col = NULL, filter_string_equal = NULL, filter_string_not_equal = NULL, filter_double_less = c("Q.Value" = "0.01", "PG.Q.Value" = "0.01"), filter_double_greater = NULL, intensity_col_sep = ";", intensity_col_id = NULL, na_string = "0", normalization = "median", log2_intensity_cutoff = 0, pdf_out = "qc-plots.pdf", pdf_width = 12, pdf_height = 8, show_boxplot = TRUE, peptide_extractor = NULL, rfasta = NULL)
input_data |
A data frame or a filename. See |
output_filename |
Output filename. |
sample_id |
See |
primary_id |
See |
secondary_id |
See |
intensity_col |
See |
annotation_col |
See |
filter_string_equal |
See |
filter_string_not_equal |
See |
filter_double_less |
See |
filter_double_greater |
See |
intensity_col_sep |
See |
intensity_col_id |
See |
na_string |
See |
normalization |
Normalization type. Possible values are |
log2_intensity_cutoff |
See |
pdf_out |
See |
pdf_width |
See |
pdf_height |
See |
show_boxplot |
See |
peptide_extractor |
A function to parse peptides |
rfasta |
If available, calculates the sequence coverage. |
Either an input data frame is processed with fast_MaxLFQ or an input file is processed with fast_read, fast_preprocess, and fast_MaxLFQ.
Subsequently, the result is written to output_filename. The quantification values are in log2 space.
A NULL value is returned. If peptide_extractor is not NULL, fragment statistics
for each protein will be calculated based on the result of the extractor function. Counting the number of peptides contributing to a protein is possible using an appropriate extractor function. An example value for peptide_extractor is function(x) gsub("[0-9].*$", "", x), which removes the charge state and fragment descriptors in an ion descriptor to obtain unique peptide sequences. One can examine the ion component returned by the fast_read function to derive a regular expression to be used in the gsub function above.
Another example is function(x) gsub("\(UniMod:\d+\)|\d|_", "", x) which removes annotations for modifications on the peptides.
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
fast_read, fast_preprocess, fast_MaxLFQ
Estimates a row vector best representing all rows in the input numeric matrix.
process_matrix(X, method, p1 = NULL, p2 = NULL, k = 1.345, min_M = 15, n_threads = -1)process_matrix(X, method, p1 = NULL, p2 = NULL, k = 1.345, min_M = 15, n_threads = -1)
X |
A matrix of intensities in the log2 space. Columns are samples and rows are fragment ions. |
method |
One of the quantification methods: |
p1 |
Method parameter. For |
p2 |
Method parameter. For |
k |
Weighting function parameter. |
min_M |
Minimum number of rows for the weighted methods. |
n_threads |
The number of threads to be used. When |
A numeric row vector.
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
Collapses rows with identical values in a particular column in a table. When the values in each row are proportional such as intensities of multiple fragments of a protein, the MaxLFQ algorithm is recommended.
process_wide_format(input_filename, output_filename, id_column, quant_columns, data_in_log_space = FALSE, annotation_columns = NULL, method = "maxLFQ")process_wide_format(input_filename, output_filename, id_column, quant_columns, data_in_log_space = FALSE, annotation_columns = NULL, method = "maxLFQ")
input_filename |
Input filename of a tab-separated value text file. |
output_filename |
Output filename. |
id_column |
The column where unique values will be kept. Rows with identical values in this column are merged. Rows with empty values here are removed. |
quant_columns |
Columns containing numerical data to be merged. |
data_in_log_space |
A logical value. If |
annotation_columns |
Columns in the input file apart from |
method |
Method for merging. Default value is |
Method "maxLFQ_R" implements the MaxLFQ algorithm pure R. It is slower than "maxLFQ".
Method "maxInt" selects row with maximum intensity (top 1).
Method "sum" sum all intensities.
Method "least_na" selects row with the least number of missing values.
The value of method can be a function such as function(x) log2(colSums(2^x, na.rm = TRUE)) for summing all intensities in the original space.
The result table is written to output_filename. A NULL value is returned.
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
Each connected component of the input values can be rescaled to match the level of the overall input intensities.
rescale(x, X, method)rescale(x, X, method)
x |
A row vector containing the quantification result. |
X |
An input matrix of log2-transformed intensities. Columns represent samples, and rows represent fragment ions. |
method |
One of the methods for rescaling: |
A row vector of rescaled values of the same size as x, with each connected component rescaled independently.
For the "median-mean" method, single-sample
components are rescaled to the median of input data, while all other components are rescaled to the mean. For other methods, each component is rescaled to the median,
mean, or sum of the input data.
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.
A subset of the Bruderer 2015 dataset containing 12 spike-in proteins. The full dataset was exported from the Spectronaut software. The complete dataset has been median-normalized.
data("spikeins")data("spikeins")
A data frame with 18189 observations on the following 9 variables.
R.ConditionSample names.
PG.ProteinGroupsProtein identifiers.
EG.ModifiedSequenceSequence of the fragment ions.
FG.ChargeFragment group charge.
F.FrgIonFragment ions.
F.ChargeFragment charges.
F.PeakAreaQuantitative values.
PG.GenesGene names.
PG.ProteinNamesProtein names.
data("spikeins") head(spikeins)data("spikeins") head(spikeins)
Estimates protein abundances using the N most intense ions.
topN(X, N = 3, aggregation_in_log_space = TRUE)topN(X, N = 3, aggregation_in_log_space = TRUE)
X |
A matrix of ion intensities in log2 space. Columns are samples and rows are fragment ions. |
N |
The number of top ions used for quantification. |
aggregation_in_log_space |
A logical value. If |
A list of two components is returned
estimate |
A vector with length equal to the number of columns of the input containing the protein abundances in log2 space. |
annotation |
Reserved, currently an empty string. |
Thang V. Pham
Pham TV, Henneman AA, Jimenez CR. iq: an R package to estimate relative protein abundances from ion quantification in DIA-MS-based proteomics. Bioinformatics 2020 Apr 15;36(8):2611-2613.