Figure 2 | BMC Research Notes

Figure 2

From: SigWin-detector: a Grid-enabled workflow for discovering enriched windows of genomic features related to DNA sequences

Computing moving medians for many window sizes. Description of our moving medians algorithm and the data structures it uses. The figure illustrates a computation with input sequence size N = 7 and window sizes S = 3, 5, 7. (A) Rank data structure: stores the input sequence. It gives access to the input sequence in both its original and its ranked order, and allows elements to be fetched by rank. (B) Marker data structure: helps navigate through the sliding windows while keeping track of the median (or any other desired order statistic). The Marker data structure is a Boolean array that records which elements are inside the current sliding window by crossing out the elements that are outside it. It also has a pointer to the ith remaining (not crossed out) element; this pointer is used to track the median. The Marker structure assumes the sequence is in ranked order. For example, if a sliding window of size 3 over a sequence of size 7 contains the elements ranked 5, 1, and 6, the corresponding Marker structure has the elements ranked 2, 3, 4, and 7 crossed out, and its median pointer points to the element ranked 5. (C) Moving median algorithm for window size S. Our algorithm computes the moving medians for window sizes S = Smin, Smin+dS, ..., Smin+n·dS, starting at S = Smin. When the last sliding window of size S is reached, the algorithm proceeds to the next window size (S+dS) in three steps: it inserts the elements that are in the first sliding window of size S+dS, crosses out the elements that were in the last sliding window of size S, and sets the new position of the median pointer, which is element mm(S+dS) = (S+dS+1)/2 of the remaining elements. The algorithm stops after computing the medians for the largest window size.
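The structures described in the caption can be illustrated with a minimal Python sketch. It is not the SigWin-detector implementation: the function names (`build_rank`, `moving_medians`, `moving_medians_many`) and the example sequence are hypothetical, ranks are 0-based, and the Marker array is rebuilt from scratch for each window size rather than updated incrementally as in panel (C). For an even window size, the sketch tracks the lower median, which is also an assumption.

```python
def build_rank(x):
    """Rank structure: order[r] is the position of the element with rank r,
    rank[i] is the rank of the element at position i (0-based ranks)."""
    order = sorted(range(len(x)), key=lambda i: x[i])  # stable, so ties get distinct ranks
    rank = [0] * len(x)
    for r, i in enumerate(order):
        rank[i] = r
    return order, rank


def moving_medians(x, S, order, rank):
    """Moving medians of x for a single window size S."""
    N = len(x)
    inside = [False] * N          # Marker array, indexed by rank; False = crossed out
    for i in range(S):            # mark the elements of the first window
        inside[rank[i]] = True

    k = (S + 1) // 2              # the median is the k-th remaining element
    ptr, seen = -1, 0
    while seen < k:               # place the median pointer
        ptr += 1
        if inside[ptr]:
            seen += 1
    medians = [x[order[ptr]]]

    for start in range(1, N - S + 1):
        r_out = rank[start - 1]        # element leaving the window
        r_in = rank[start + S - 1]     # element entering the window
        inside[r_out] = False
        inside[r_in] = True
        if r_in > ptr and r_out <= ptr:
            # one fewer element at or below the pointer: step up
            ptr += 1
            while not inside[ptr]:
                ptr += 1
        elif r_in < ptr and r_out >= ptr:
            # one more element below the pointer: step down
            ptr -= 1
            while not inside[ptr]:
                ptr -= 1
        medians.append(x[order[ptr]])
    return medians


def moving_medians_many(x, s_min, d_s, s_max):
    """Moving medians for S = s_min, s_min + d_s, ..., up to s_max."""
    order, rank = build_rank(x)   # the Rank structure is built once and reused
    return {S: moving_medians(x, S, order, rank)
            for S in range(s_min, s_max + 1, d_s)}


# Hypothetical input whose first window of size 3 holds the elements
# ranked 5, 1 and 6 (1-based), as in the caption's example.
x = [5, 1, 6, 2, 7, 3, 4]
result = moving_medians_many(x, s_min=3, d_s=2, s_max=7)
print(result)
```

With this input, the first window of size 3 leaves the elements ranked 2, 3, 4, and 7 crossed out, and the median pointer rests on the element ranked 5, matching the example in panel (B). Sorting happens only once; each window shift then updates two Marker entries and moves the pointer in one direction at most.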
