Row matching of tables

This function extends the functionality of %in% for finding which rows in the first argument exist in the second.

Usage

row_match(x, table)

Arguments

x: A matrix or data frame which rows to be matched. Typically, this will be a matrix of assigned barcodes for each read.
table: A matrix or data frame with the rows to be matched against. Typically, this will be the top portion of a frequency table.

Value

Logical vector, for each row in x, is the same row found in table?

Details

As this function is intended to be used for data frames containing more than just the barcodes, the intersection of the column names is used for matching. As opposed to base::match(), this function is implemented more efficiently by converting each row into a numeric encoding before matching.

For technical reasons, it is not permitted for the product of the number of the unique values of the columns in table to exceed \(2^{32}-1\approx 2.1\cdot 10^{9}\).

Examples

barcode_table <- data.frame(
    read = c("seq_1", "seq_2", "seq_3", "seq_4"),
    bc1 = c("A", "B", "C", "B"),
    bc2 = c("A", "C", "A", "A")
    )

freq_table <- data.frame(
    bc1 = c("B", "B", "C", "A"),
    bc2 = c("A", "C", "A", "A"),
    frequency = c(200L, 100L, 50L, 10L)
    )

freq_cutoff <- 100L

selected_freq_table <- freq_table[freq_table$frequency >= freq_cutoff, ]

selected_rows <- row_match(barcode_table, selected_freq_table)
selected_barcode_table <- barcode_table[selected_rows, ]

Usage

Arguments

Value

Details

See also

Examples