Query Guide

The query tool allows users to search the CDEK data by specifying filtration predicates on any table column through the use of an interactive user-interface (UI). This guide explains the various functionalities of the query tool as well as provides some examples of constructing various queries; for definitions of the table columns please see the glossary.

Constructing a query

Using the UI, queries are constructed in a SQL-like manner by defining predicates on specific table column; these rules are then combined through boolean operators and can even be joined together into rule-groups. An empty query UI looks like:

The query construction begins by defining a query rule in a rule-group. Begin by choosing any of the table columns available in the dropdown list; the table columns are grouped by the type of object they describe. For example, all metadata for APIs are grouped under API metadata.

After choosing a table column, two input elements automatically populate to the right of the table column dropdown:

  1. predicate operator: boolean operator that compares the table column to the predicate value
  2. predicate value: the filtration value requested and which gets evaluated against the table column

Predicate operators

Various predicate operator are available for each of the table columns including:

  • equal: case-sensitive match to the entire predicate value; interpreted as the SQL = 'value' operator
  • begins with: case-insensitive match to the beginning of the predicate value interpreted as the SQL ILIKE 'value%' operator
  • contains: case-insensitive match to the any part of the predicate value; interpreted as the SQL ILIKE '%value%' operator
  • end with: case-insensitive match to the end of the predicate value; interpreted as the SQL ILIKE '%value' operator
  • is null: matches any table column equal to NULL; note that this is different than an empty string.
  • in: matches to any of the multiple rule-values provided; interpreted as the SQL IN ('value1', 'value2', ...)

Predicate values

Depending on the type of table column chosen, different types of predicate value inputs are available. For example, if a number type is chosen (e.g. API ID), the input box will only allow for number inputs. If a date type is chosen, a date picker will pop open when entering a value for the predicate value.

Finally, if a table column is chosen that can only contain a specific set of values (e.g. API type) the predicate value input box will auto-complete with matching values as you type; once the desired value is found in the auto-completed suggestions, it must be selected. See, for example, example #1 where the API type Small molecule was selected as the predicate value.

Adding rules and rule-groups

By setting a predicate operator and a predicate value, the query rule is fully defined. More rules can be added to a given rule-group by clicking the button; as soon as more than one rule is specified, rules in a given rule-group are evaluated by the boolean input which defaults to AND.

Additional rule-groups can be added to construct more complex queries by clicking the button. Within each rule-group multiple rules can be defined as detailed above. Just as rules are compared against one another through the boolean input, so are rule-groups. Finally, any rule or rule-group can be deleted by clicking the button.

Submitting a query

After the query has been constructed, the query is validated and submitted with the button. If any query rule is not properly validated, the offending rule will be highlighted in red with a icon; by hovering the cursor over the icon, a description of the rule violation can be read. A query will not be executed until all rules are fully validated.

Query results

An example query result can be seen below, similarly to a spreadsheet, it is comprised of rows and columns. Each table column can be sorted and some of them (e.g. MeSH descriptor will even render as a hyperlink which will take the user to more information when clicked. By default, only the first 25 results are shown; however more can be shown per page by selecting the proper number in the dropdown located at the top-left of the results table. Finally, the rendered results can be downloaded to a tab-delimited file by clicking the button and the query can be shared by clicking the button.

Depending on screen width, the results table may hide some of the resulting table columns. If that occurs, the hidden table columns can be viewed by clicking the button in the left column.

Row contents

One has to be careful interpreting the contents of a single row; every row is not a distinct API nor a distinct trial. The results table is generated through the joining of numerous relational tables. Given that an API has multiple associated names and trials, and given that a trial could have multiple associated sponsors and indications, the joining of all these data results in every combination being generated.

For example, the API 110153 has the following name & trial metadata (only a subset of data shown):
api_id source_api_id api_name
110153 DB06204 tapentadol
110153 CHEMBL1201776 bn-200
api_id treatment_group trial_id
110153 Experimental NCT00421928
110153 Placebo Comparator NCT00472303

When joined together (on api_id) the resulting table has 4 rows since this represents a many-to-many relationship:

api_id source_api_id api_name treatment_group trial_id
110153 DB06204 tapentadol Experimental NCT00421928
110153 DB06204 tapentadol Placebo Comparator NCT00472303
110153 CHEMBL1201776 bn-200 Placebo Comparator NCT00472303
110153 CHEMBL1201776 bn-200 Experimental NCT00421928
As more data are joined in (for example, the MeSH descriptors associated with trials) the number of resulting rows increases exponentially.

Some users may want a distinct API for every row, or perhaps a distinct trial ID. To do this, please choose the proper value from the "Return rows as" dropdown located at the bottom-left of the query builder. For example, if one were interested in a unique list of organizations for a given predicate, one would set the dropdown to "Org name"; see for instance example #2.

Examples

Below, you'll find various examples that walk through how to construct a query. Please note that since the UI is in fact an actual HTML object (as opposed to an image) you can modify the query as you see fit and click the button to execute the query!

Example #1

All data for small molecule APIs used in Phase 4 trials.

Explanation:

  • the predicate operator equal is used since we want an exact match to the predicate value.
  • The "return row as" dropdown is left on its default value since "all data" are requested.
  • The default operator AND is used to compare the two defined predicates.
  • The SQL equivalent is WHERE api_type = 'Small molecule' AND trial_phase = 'Phase 4'.