Query Guide

The query tool allows users to search the CDEK data by specifying filtration predicates on any table column through the use of an interactive user-interface (UI). This guide explains the various functionalities of the query tool as well as provides some examples of constructing various queries; for definitions of the table columns please see the glossary.

all UI elements in this guide can be interacted with! This may make understanding the guide a bit more intuitive.

any bold underlined text can be hovered over with the cursor to reveal more information regarding it's meaning.

Constructing a query

Using the UI, queries are constructed in a SQL-like manner by defining predicates on specific table column; these rules are then combined through boolean operators and can even be joined together into rule-groups. An empty query UI looks like:

The query construction begins by defining a query rule in a rule-group. Begin by choosing any of the table columns available in the dropdown list; the table columns are grouped by the type of object they describe. For example, all metadata for APIs are grouped under API metadata.

After choosing a table column, two input elements automatically populate to the right of the table column dropdown:

predicate operator: boolean operator that compares the table column to the predicate value
predicate value: the filtration value requested and which gets evaluated against the table column

Predicate operators

Various predicate operator are available for each of the table columns including:

equal: case-sensitive match to the entire predicate value; interpreted as the SQL = 'value' operator
begins with: case-insensitive match to the beginning of the predicate value interpreted as the SQL ILIKE 'value%' operator
contains: case-insensitive match to the any part of the predicate value; interpreted as the SQL ILIKE '%value%' operator
end with: case-insensitive match to the end of the predicate value; interpreted as the SQL ILIKE '%value' operator
is null: matches any table column equal to NULL; note that this is different than an empty string.
in: matches to any of the multiple rule-values provided; interpreted as the SQL IN ('value1', 'value2', ...)

Predicate values

Depending on the type of table column chosen, different types of predicate value inputs are available. For example, if a number type is chosen (e.g. API ID), the input box will only allow for number inputs. If a date type is chosen, a date picker will pop open when entering a value for the predicate value.

Finally, if a table column is chosen that can only contain a specific set of values (e.g. API type) the predicate value input box will auto-complete with matching values as you type; once the desired value is found in the auto-completed suggestions, it must be selected. See, for example, example #1 where the API type Small molecule was selected as the predicate value.

Adding rules and rule-groups

By setting a predicate operator and a predicate value, the query rule is fully defined. More rules can be added to a given rule-group by clicking the button; as soon as more than one rule is specified, rules in a given rule-group are evaluated by the boolean input AND OR which defaults to AND.

Additional rule-groups can be added to construct more complex queries by clicking the button. Within each rule-group multiple rules can be defined as detailed above. Just as rules are compared against one another through the boolean input, so are rule-groups. Finally, any rule or rule-group can be deleted by clicking the button.

Submitting a query

After the query has been constructed, the query is validated and submitted with the button. If any query rule is not properly validated, the offending rule will be highlighted in red with a icon; by hovering the cursor over the icon, a description of the rule violation can be read. A query will not be executed until all rules are fully validated.

Query results

An example query result can be seen below, similarly to a spreadsheet, it is comprised of rows and columns. Each table column can be sorted and some of them (e.g. MeSH descriptor will even render as a hyperlink which will take the user to more information when clicked. By default, only the first 25 results are shown; however more can be shown per page by selecting the proper number in the dropdown located at the top-left of the results table. Finally, the rendered results can be downloaded to a tab-delimited file by clicking the button and the query can be shared by clicking the button.

Depending on screen width, the results table may hide some of the resulting table columns. If that occurs, the hidden table columns can be viewed by clicking the button in the left column.

Row contents

One has to be careful interpreting the contents of a single row; every row is not a distinct API nor a distinct trial. The results table is generated through the joining of numerous relational tables. Given that an API has multiple associated names and trials, and given that a trial could have multiple associated sponsors and indications, the joining of all these data results in every combination being generated.

For example, the API 110153 has the following name & trial metadata (only a subset of data shown):

api_id	source_api_id	api_name
110153	DB06204	tapentadol
110153	CHEMBL1201776	bn-200

api_id	treatment_group	trial_id
110153	Experimental	NCT00421928
110153	Placebo Comparator	NCT00472303

When joined together (on api_id) the resulting table has 4 rows since this represents a many-to-many relationship:

api_id	source_api_id	api_name	treatment_group	trial_id
110153	DB06204	tapentadol	Experimental	NCT00421928
110153	DB06204	tapentadol	Placebo Comparator	NCT00472303
110153	CHEMBL1201776	bn-200	Placebo Comparator	NCT00472303
110153	CHEMBL1201776	bn-200	Experimental	NCT00421928

As more data are joined in (for example, the MeSH descriptors associated with trials) the number of resulting rows increases exponentially.

Some users may want a distinct API for every row, or perhaps a distinct trial ID. To do this, please choose the proper value from the "Return rows as" dropdown located at the bottom-left of the query builder. For example, if one were interested in a unique list of organizations for a given predicate, one would set the dropdown to "Org name"; see for instance example #2.

Examples

Below, you'll find various examples that walk through how to construct a query. Please note that since the UI is in fact an actual HTML object (as opposed to an image) you can modify the query as you see fit and click the button to execute the query!

Example #1

All data for small molecule APIs used in Phase 4 trials.

Explanation:

the predicate operator equal is used since we want an exact match to the predicate value.
The "return row as" dropdown is left on its default value since "all data" are requested.
The default operator AND is used to compare the two defined predicates.
The SQL equivalent is WHERE api_type = 'Small molecule' AND trial_phase = 'Phase 4'.