Example SQL Queries
For users who are comfortable with SQL, you can often run queries directly through the NocoDB interface or by connecting to the database with a compatible SQL client.
Here are some common and useful queries to help you get started.
Prefer a Graphical Interface?
If you are not comfortable writing SQL, IDhub provides an interactive Swagger UI that lets you explore and query the data using a friendly web form. It is a great way to see what data is available and build queries without code.
Warning
These queries are for read-only (SELECT) operations. You do not have permission to modify data (INSERT, UPDATE, DELETE). Please exercise caution and ensure your queries are efficient to avoid impacting system performance.
1. Find a Subject's GSID Using a Local ID
This is one of the most common tasks: you have a local or consortium ID and you need to find the official Global Subject ID (GSID).
SELECT
global_subject_id,
center_id,
identifier_type,
created_at
FROM
local_subject_ids
WHERE
local_subject_id = 'A000101-130001';
What it does:
* Searches the local_subject_ids table.
* Finds the row matching the local_subject_id you provided.
* Returns the corresponding global_subject_id.
2. Get All Local IDs for a Known GSID
This is the reverse of the query above. If you have a GSID, you can find all the different local IDs associated with that person from various centers or studies.
SELECT
local_subject_id,
identifier_type,
center_id
FROM
local_subject_ids
WHERE
global_subject_id = 'GSID-1KKG7NR2Z4XB2A2J';
3. Count Data Types for a Specific Subject
This query counts how many records of different data types (genotype, sequence, etc.) are available for a given subject.
SELECT
'genotype' AS data_type,
COUNT(*) AS count
FROM
genotype
WHERE
global_subject_id = 'GSID-1KKG7NR2Z4XB2A2J'
UNION ALL
SELECT
'sequence' AS data_type,
COUNT(*) AS count
FROM
sequence
WHERE
global_subject_id = 'GSID-1KKG7NR2Z4XB2A2J'
UNION ALL
SELECT
'lcl' AS data_type,
COUNT(*) AS count
FROM
lcl
WHERE
global_subject_id = 'GSID-1KKG7NR2Z4XB2A2J';
4. Find All Sequencing Data for a Subject
This query retrieves all sequencing records associated with a specific subject.
SELECT
global_subject_id,
sample_id,
sample_type,
batch,
vcf_sample_id
FROM
sequence
WHERE
global_subject_id = 'GSID-1KKG7NR2Z4XB2A2J'
ORDER BY
created_at DESC;
5. Find Subjects Registered by a Specific Source System
This query helps you find subjects that were created by a particular data pipeline or source.
SELECT
global_subject_id,
center_id,
created_at,
created_by
FROM
subjects
WHERE
created_by = 'redcap_pipeline'
ORDER BY
created_at DESC
LIMIT 500;
What it does:
* Looks at the subjects table.
* Filters for records where the created_by field is redcap_pipeline.
* Shows the 500 most recently created subjects from that source.
Tips for Querying
- Start with
LIMIT: When exploring a table for the first time, always addLIMIT 10to the end of your query. This prevents you from accidentally trying to load millions of rows, which can be slow. - Use
WHEREto filter: Be as specific as possible with yourWHEREclauses to narrow down the data you are looking for. This is much more efficient than retrieving a large dataset and filtering it on your own machine. - Use NocoDB's Filters First: Before writing your own SQL, try to achieve your goal using NocoDB's built-in Sort and Filter features. They are powerful and safe to use.
- Check Column Names: SQL is precise. If your query fails, double-check that the column and table names are spelled correctly. You can see the exact names in NocoDB's grid view.