Repeated DNA Sequences - Problem
The DNA sequence is composed of a series of nucleotides abbreviated as 'A', 'C', 'G', and 'T'. For example, "ACGAATTCCG" is a DNA sequence.
When studying DNA, it is useful to identify repeated sequences within the DNA. Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
You may return the answer in any order.
Input & Output
Example 1 — Basic Repeated Sequences
Input:
s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output:
["AAAAACCCCC","CCCCCAAAAA"]
💡 Note:
AAAAACCCCC appears at positions 0 and 10. CCCCCAAAAA appears at positions 5 and 16 (the window starting at position 15 is CCCCCCAAAA, because the second run of C has six letters). Both sequences are 10 characters long and occur more than once.
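The start positions of each sequence can be checked with a short script. This is a minimal sketch, not part of the problem statement: the helper name `find_positions` is hypothetical, and it tests every start index so that overlapping occurrences are included.

```python
def find_positions(s: str, pattern: str) -> list[int]:
    """Return every start index at which pattern occurs in s (overlaps included)."""
    # str.startswith(prefix, start) checks for a match beginning at index `start`.
    return [i for i in range(len(s) - len(pattern) + 1) if s.startswith(pattern, i)]


s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
print(find_positions(s, "AAAAACCCCC"))
print(find_positions(s, "CCCCCAAAAA"))
```

Positions here are 0-based, matching the convention used in the note.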
Example 2 — Overlapping Repeats
Input:
s = "AAAAAAAAAAA"
Output:
["AAAAAAAAAA"]
💡 Note:
The string has 11 characters, so the 10-letter sequence AAAAAAAAAA appears twice, at the overlapping positions 0 and 1.
Example 3 — Short String
Input:
s = "ACGT"
Output:
[]
💡 Note:
The string is too short (4 characters) to contain any 10-letter sequence, so the result is an empty array.
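The short-string case follows from counting windows: a string of length n contains n - 10 + 1 windows of length 10 when n ≥ 10, and none otherwise. A small sketch of that arithmetic, using a hypothetical helper `window_count`:

```python
def window_count(n: int, k: int = 10) -> int:
    """Number of length-k windows in a string of length n (0 if the string is too short)."""
    return max(0, n - k + 1)


print(window_count(len("ACGT")))  # 0
print(window_count(10))           # 1
print(window_count(11))           # 2
```

A repeat needs at least two windows, so a string must have at least 11 characters before any 10-letter sequence can occur more than once.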
Constraints
- 1 ≤ s.length ≤ 10^5
- s[i] is either 'A', 'C', 'G', or 'T'
Understanding the Visualization
1. Input DNA: a long DNA string made of the nucleotides A, C, G, T
2. Extract Sequences: get all 10-character substrings using a sliding window
3. Find Repeats: return the sequences that occur more than once
Key Takeaway
🎯 Key Insight: Slide a 10-character window across the string and use a hash map to count how often each substring occurs. A single pass is enough, and any substring whose count exceeds one belongs in the answer.
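One way the sliding-window and hash-map idea might look in Python is sketched below. The function name `find_repeated_dna_sequences` and the parameter `k` are illustrative assumptions, not names given by the problem; the dictionary `counts` plays the role of the hash map.

```python
def find_repeated_dna_sequences(s: str, k: int = 10) -> list[str]:
    """Return every length-k substring of s that occurs more than once."""
    counts: dict[str, int] = {}   # hash map: substring -> occurrences seen so far
    repeated: list[str] = []
    for i in range(len(s) - k + 1):  # empty range when len(s) < k
        window = s[i:i + k]
        counts[window] = counts.get(window, 0) + 1
        if counts[window] == 2:      # record each repeat once, on its second sighting
            repeated.append(window)
    return repeated


print(find_repeated_dna_sequences("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"))
# ['AAAAACCCCC', 'CCCCCAAAAA']
print(find_repeated_dna_sequences("ACGT"))
# []
```

Each slice costs O(k) time, so this sketch runs in O(n · k) overall, which is effectively linear because k is fixed at 10. Appending only when the count reaches 2 avoids duplicates in the result without a second pass over the map.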