Biologists are studying basic patterns in DNA sequences stored in a database. Given a table Samples containing DNA sequences, you need to identify which samples contain specific genetic patterns.
Pattern Requirements:
- Start Codon: Sequences that start with `ATG` (a common start codon)
- Stop Codons: Sequences that end with either `TAA`, `TAG`, or `TGA` (stop codons)
- ATAT Motif: Sequences containing the motif `ATAT` (a simple repeated pattern)
- Triple G: Sequences that have at least 3 consecutive `G` (like `GGG` or `GGGG`)
Return a result table showing each sample with boolean flags (1/0) indicating which patterns are present, ordered by sample_id in ascending order.
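The four rules above can be read as simple string predicates. The following is a minimal sketch in Python, not the SQL solution itself; the function name `pattern_flags` is hypothetical, and the flag names are borrowed from the output columns used in the examples.

```python
def pattern_flags(dna_sequence: str) -> dict:
    """Return 1/0 flags for the four patterns (hypothetical helper)."""
    return {
        # Start codon: the sequence must begin with ATG.
        "has_start": int(dna_sequence.startswith("ATG")),
        # Stop codon: the sequence must end with TAA, TAG, or TGA.
        "has_stop": int(dna_sequence.endswith(("TAA", "TAG", "TGA"))),
        # ATAT motif: may appear anywhere in the sequence.
        "has_atat": int("ATAT" in dna_sequence),
        # Three or more consecutive G's always contain the substring GGG.
        "has_ggg": int("GGG" in dna_sequence),
    }
```

Note that "at least 3 consecutive G" reduces to a plain substring check for `GGG`, since any longer run of `G` contains it.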
Table Schema
| Column Name | Type | Description |
|---|---|---|
| sample_id (PK) | int | Unique identifier for each DNA sample |
| dna_sequence | varchar | DNA sequence represented as string of A, T, G, C characters |
| species | varchar | Species from which the DNA sample was collected |
Input & Output
| sample_id | dna_sequence | species |
|---|---|---|
| 1 | ATGCTAGCTAGCTAA | Human |
| 2 | GGGTCAATCATC | Human |
| 3 | ATATATCGTAGCTA | Human |
| 4 | ATGGGGTCATCATAA | Mouse |
| sample_id | dna_sequence | species | has_start | has_stop | has_atat | has_ggg |
|---|---|---|---|---|---|---|
| 1 | ATGCTAGCTAGCTAA | Human | 1 | 1 | 0 | 0 |
| 2 | GGGTCAATCATC | Human | 0 | 0 | 0 | 1 |
| 3 | ATATATCGTAGCTA | Human | 0 | 0 | 1 | 0 |
| 4 | ATGGGGTCATCATAA | Mouse | 1 | 1 | 0 | 1 |
Sample 1: Starts with ATG (has_start=1), ends with TAA (has_stop=1), no ATAT motif, no triple G.
Sample 2: Starts with GGG (has_ggg=1), but no other patterns match.
Sample 3: Contains ATAT at the beginning (has_atat=1), but no other patterns.
Sample 4: Has all patterns except ATAT: starts with ATG, contains GGGG, ends with TAA.
| sample_id | dna_sequence | species |
|---|---|---|
| 5 | TCAGTCAGTCAG | Mouse |
| 6 | ATATCGCGCTAG | Zebrafish |
| 7 | CGTATGCGTCGTA | Zebrafish |
| sample_id | dna_sequence | species | has_start | has_stop | has_atat | has_ggg |
|---|---|---|---|---|---|---|
| 5 | TCAGTCAGTCAG | Mouse | 0 | 0 | 0 | 0 |
| 6 | ATATCGCGCTAG | Zebrafish | 0 | 1 | 1 | 0 |
| 7 | CGTATGCGTCGTA | Zebrafish | 0 | 0 | 0 | 0 |
Sample 5: No patterns match - all flags are 0.
Sample 6: Starts with ATAT (has_atat=1) and ends with TAG (has_stop=1).
Sample 7: No genetic patterns detected - all flags are 0. ATG appears inside the sequence but not at the start, so has_start=0.
Constraints
- 1 ≤ sample_id ≤ 1000
- dna_sequence contains only characters 'A', 'T', 'G', 'C'
- 1 ≤ dna_sequence.length ≤ 1000
- species is a non-empty string
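
One possible query, sketched here with `LIKE` patterns and run against an in-memory SQLite database so it can be checked locally. The problem statement does not name a SQL dialect, so SQLite is an assumption, as are the `find_patterns` wrapper and the `QUERY` constant. SQLite's `LIKE` is case-insensitive for ASCII letters by default, which is harmless here because sequences contain only uppercase A, T, G, C.

```python
import sqlite3

# Each CASE expression turns one LIKE test into a 1/0 flag.
QUERY = """
SELECT sample_id,
       dna_sequence,
       species,
       CASE WHEN dna_sequence LIKE 'ATG%' THEN 1 ELSE 0 END AS has_start,
       CASE WHEN dna_sequence LIKE '%TAA'
              OR dna_sequence LIKE '%TAG'
              OR dna_sequence LIKE '%TGA' THEN 1 ELSE 0 END AS has_stop,
       CASE WHEN dna_sequence LIKE '%ATAT%' THEN 1 ELSE 0 END AS has_atat,
       CASE WHEN dna_sequence LIKE '%GGG%' THEN 1 ELSE 0 END AS has_ggg
FROM Samples
ORDER BY sample_id
"""


def find_patterns(rows):
    """Load (sample_id, dna_sequence, species) rows and run QUERY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Samples ("
        "sample_id INTEGER PRIMARY KEY, dna_sequence TEXT, species TEXT)"
    )
    conn.executemany("INSERT INTO Samples VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample_rows = [
        (5, "TCAGTCAGTCAG", "Mouse"),
        (6, "ATATCGCGCTAG", "Zebrafish"),
        (7, "CGTATGCGTCGTA", "Zebrafish"),
    ]
    for row in find_patterns(sample_rows):
        print(row)
```

Other dialects offer alternatives such as regular-expression matching or `LEFT`/`RIGHT` string functions; `LIKE` is used here only because it is widely portable.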