How to Write Good Prompts in Drylab
The Core Principle
Specificity beats brevity. A well-formed prompt gives the AI enough context to act without asking follow-up questions. Think of it as briefing a skilled scientist — tell them the data, the goal, and any constraints.
The Anatomy of a Good Prompt
[What you have] + [What you want] + [How you want it] + [Any constraints]
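The four slots can also be assembled programmatically, which makes it harder to forget one. The following is a minimal sketch, not a Drylab feature: `PromptParts` and `build_prompt` are hypothetical names introduced here for illustration, and the example wording is adapted from the prompts in this guide.

```python
from dataclasses import dataclass, field


@dataclass
class PromptParts:
    """The four slots of the anatomy line (hypothetical helper, not a Drylab API)."""
    have: str                                  # what you have: data, file, object
    want: str                                  # what you want: the goal
    how: str = ""                              # how you want it: method, output
    constraints: list[str] = field(default_factory=list)  # any constraints


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty slots into one prompt, one sentence per slot."""
    sentences = [parts.have, parts.want, parts.how, *parts.constraints]
    cleaned = [s.strip().rstrip(".") for s in sentences if s and s.strip()]
    return ". ".join(cleaned) + "."


example = PromptParts(
    have="I have a filtered AnnData object",
    want="Run Leiden clustering with resolution 0.5",
    how="Plot UMAP colored by cluster",
    constraints=["Save the figure as PDF"],
)
print(build_prompt(example))
```

Leaving `how` or `constraints` empty still produces a valid prompt, but an empty `have` or `want` is usually a sign that the prompt is too vague to send.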
Weak vs Strong Prompts
For Analysis Tasks
| Weak Prompt | Strong Prompt |
|---|---|
| "Analyze my data" | "Load the CSV at |
| "Do clustering" | "Run Leiden clustering on the AnnData object with resolution 0.5 and plot UMAP colored by cluster" |
| "Make a plot" | "Plot a boxplot of |
| "Run differential expression" | "Run Wilcoxon rank-sum DE between |
For Research Tasks
| Weak Prompt | Strong Prompt |
|---|---|
| "Find papers about cancer" | "Search for papers on KRAS G12D inhibitors in pancreatic cancer published after 2022" |
| "What is BRCA1?" | "Summarize the role of BRCA1 in DNA double-strand break repair and its clinical significance in breast cancer" |
| "Find drug targets" | "Query DrugBank for FDA-approved drugs targeting EGFR and return their mechanism of action and approval year" |
Prompt Templates by Task Type
Loading & Exploring Data
Load [file path or dataset name]. Show the first 5 rows, number of rows/columns, and data types of each column.
Statistical Analysis
Run [test name] comparing [group A] vs [group B] in the [column name] column. Use a significance threshold of [value] and correct for multiple testing using [method].
Visualization
Plot a [plot type] of [variable] grouped by [category]. Use a colorblind-safe palette, add axis labels, and save as PDF to the output folder.
Database Query
Query [database name] for [gene/protein/drug name]. Return [specific fields: e.g. pathways, interactions, variants].
Pipeline / Workflow
I have [data type] data at [file path]. Run [pipeline name] with [key parameters]. Save results to [output path].
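If you reuse these templates often, the bracketed slots can be filled mechanically. The sketch below is one possible approach and not part of Drylab: `fill_template` is a hypothetical helper, and the values in the example (group names, column name, threshold) are placeholders chosen for illustration. It refuses to return a prompt that still contains an unfilled `[slot]`.

```python
import re

# Matches one bracketed slot such as "[test name]" and captures the text inside.
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace every [slot] with its value; raise if any slot has no value."""
    missing: list[str] = []

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key in values:
            return values[key]
        missing.append(key)
        return match.group(0)  # leave the slot visible for the error message

    filled = PLACEHOLDER.sub(substitute, template)
    if missing:
        raise ValueError(f"unfilled placeholders: {missing}")
    return filled


STAT_TEMPLATE = (
    "Run [test name] comparing [group A] vs [group B] in the [column name] column. "
    "Use a significance threshold of [value] and correct for multiple testing "
    "using [method]."
)

print(fill_template(STAT_TEMPLATE, {
    "test name": "a Wilcoxon rank-sum test",
    "group A": "tumor",
    "group B": "normal",
    "column name": "expression",   # illustrative column name
    "value": "0.05",               # illustrative threshold
    "method": "Benjamini-Hochberg",
}))
```

Failing loudly on a missing slot is deliberate: a prompt that still says "[group B]" is exactly the kind of underspecified request the templates are meant to prevent.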
7 Rules for Better Prompts
Name your files and columns: say data/expr.csv, column "log2FC", not "my file"
State the goal, not just the action: "I want to identify differentially expressed genes between tumor and normal" gives the AI context to choose the right method
Specify thresholds: FDR cutoff, minimum cell count, resolution, number of clusters
Mention the output format: "save as PDF", "return a table", "print the top 10 results"
State the organism and genome build: human, mouse, GRCh38, mm10. This matters for databases and pipelines
Reference prior steps when continuing: "Using the filtered AnnData from the previous step, run PCA with 50 components"
One task per prompt: if you need 5 things, ask them in sequence so you can verify each result
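The last two rules work together: split the work into steps, and have each step name the result it builds on. As a rough sketch (the `chain_prompts` helper is hypothetical and the step wording is only an example), a list of tasks can be turned into a sequence of prompts that you send and verify one at a time:

```python
def chain_prompts(steps: list[tuple[str, str]]) -> list[str]:
    """Turn (task, result_name) pairs into one prompt per step.

    Each task is written as a lowercase imperative. Every prompt after the
    first refers back to the result produced by the step before it.
    """
    prompts: list[str] = []
    previous_result = None
    for task, result_name in steps:
        if previous_result is None:
            # First step: no prior result to reference, just capitalize the task.
            prompts.append(task[0].upper() + task[1:] + ".")
        else:
            prompts.append(
                f"Using the {previous_result} from the previous step, {task}."
            )
        previous_result = result_name
    return prompts


steps = [
    ("filter low-quality cells from the AnnData object", "filtered AnnData"),
    ("run PCA with 50 components", "PCA result"),
    ("run Leiden clustering with resolution 0.5 and plot UMAP colored by cluster",
     "clustering"),
]
for prompt in chain_prompts(steps):
    print(prompt)
```

Sending these one at a time, rather than as a single combined request, leaves room to check each result before the next step depends on it.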
When to Add More Context
Add context when the task involves:
Custom data — describe columns, format, units
Scientific decisions — explain the biological question
Preferences — color scheme, figure size, statistical test preference
Constraints — memory limits, time limits, specific package versions
Quick Checklist Before Sending a Prompt
[ ] Did I specify the data source (file path, database, variable name)?
[ ] Did I state the exact goal?
[ ] Did I include key parameters (thresholds, methods, group names)?
[ ] Did I specify the output format?
[ ] Is this one focused task, or should I split it into steps?
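Part of this checklist can be approximated with a quick keyword scan before you send a prompt. The sketch below is a rough heuristic under stated assumptions: the `CHECKS` patterns and the `missing_items` helper are invented for illustration, the keyword lists are far from complete, and no pattern can judge whether the goal is exact or whether the task should be split.

```python
import re

# Heuristic patterns, one per checklist item that keywords can plausibly detect.
# The keyword lists are assumptions for illustration, not an exhaustive rule set.
CHECKS = {
    "data source": re.compile(
        r"[\w./-]+\.(csv|tsv|h5ad|txt|xlsx)\b|\b(AnnData|DrugBank|database)\b",
        re.IGNORECASE,
    ),
    "key parameters": re.compile(
        r"\d|\b(FDR|resolution|threshold|vs)\b", re.IGNORECASE
    ),
    "output format": re.compile(
        r"\b(save|return|print|plot|show|summarize)\b", re.IGNORECASE
    ),
}


def missing_items(prompt: str) -> list[str]:
    """Return the checklist items for which no matching keyword was found."""
    return [name for name, pattern in CHECKS.items() if not pattern.search(prompt)]


weak = "Analyze my data"
strong = (
    "Run Leiden clustering on the AnnData object with resolution 0.5 "
    "and plot UMAP colored by cluster"
)
print(missing_items(weak))    # every item is flagged
print(missing_items(strong))  # nothing is flagged
```

A clean result only means the obvious pieces are present; the remaining checklist questions still need a human read.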


