Skip to main content

How to Query Relations?

Accio define three kinds of relation. ONE_TO_ONE and ONE_TO_MANY and MANY_TO_ONE (the opposite side of ONE_TO_MANY)

Query TO_ONE relation

  • Query TO_ONE relation (i.e. ONE_TO_ONE and MANY_TO_ONE). Accio use . to chaining TO_ONE relation

    Find order keys in Lineitem

    SELECT 
    orders.orderkey
    FROM
    Lineitem

    orderkey
    ----------
    6560
    23009
    134
    ...
    Use . to traverse TO_ONE relation. Orders to Customer is MANY_TO_ONE relation and customer to nation is MANY_TO_ONE relation.

    Find all nation names for customers in Orders

    SELECT 
    distinct(customer.nation.name)
    FROM
    Orders
    Directly access to_one relation column will get primary key of to_one side.
    SELECT orders from Lineitem

    orders
    --------
    6560
    23009
    134
    ...
    SQL as above provides the same functionality as SEELCT orders.orderkey FROM Lineitem

Query TO_MANY relation

  • Query TO_MANY relation (i.e. ONE_TO_MANY) TO_MANY relation represent one or more primary keys on the principal entity. Accio use array to represent TO_MANY relation. Directly access to_many relation column will get an array of primary keys of to_many side.
    SELECT lineitems FROM Orders;

    lineitems
    ----------------------------------------------------
    {373151,373152}
    {364851,364852,364853,364854,364855,364856}
    {413791,413792,413793,413794,413795,413796,413797}
    {571271}
    {428201}
    To query inner lineitems columns, use array index to get detail information. SQL as follows get linenumber and linestatus in the first lineitem element when orderkey is 1
    SELECT orderkey, lineitems[1].linenumber, lineitems[1].linestatus
    FROM Orders
    WHERE orderkey = 1

    orderkey | linenumber | linestatus
    ----------+------------+------------
    1 | 1 | O
    Accio also provides lambda/array functions to survey further insights in TO_MANY relation.

    Find the supplier with the highest income

    SELECT
    array_sum(
    transform(
    filter(Supplier.lineitems, item -> item.shipdate >= date '1995-04-01'),
    item -> item.extendedprice)) as total_extendedprice
    FROM Supplier;
    The SQL mentioned above seems a bit complicated. To make it easier to understand, let's break it down step by step. Additionally, to elaborate on each step more clearly, we'll use emojis to represent each value in the array within the lineitems. 🍎, 🥝, 🍊 are lineitem objects.
    SELECT suppkey, lineitems
    FROM Supplier

    suppkey | lineitems
    ----------+---------------------
    1 | [🍎, 🍎, 🥝, 🍊, 🍎]
    2 | [🍎, 🥝, 🍊, 🍎, 🍎, 🍊, 🥝]
    3 | [🥝, 🍎, 🍊]
    🍎 is the lineitem object that shipdate ≥ 1995-04-01, after applying filter we will filter out all emojis that aren’t 🍎.
    SELECT filter(Supplier.lineitems, item -> item.shipdate >= date '1995-04-01')
    FROM Supplier

    suppkey | lineitems
    ----------+---------------------
    1 | [🍎, 🍎, 🍎]
    2 | [🍎, 🍎, 🍎]
    3 | [🍎]
    After applying transform, 🍎 will be transformed into extendedprice per lineitem primary key in array
    SELECT
    transform(
    filter(Supplier.lineitems, item -> item.shipdate >= date '1995-04-01'),
    item -> item.extendedprice)) as total_extendedprice
    FROM Supplier;

    suppkey | lineitems
    ----------+---------------------
    1 | [💰, 💰, 💰]
    2 | [💰, 💰, 💰]
    3 | [💰]
    After applying the array_sum function, the extendedprice values in the array will be added together into a single total value
    SELECT
    array_sum(
    transform(
    filter(Supplier.lineitems, item -> item.shipdate >= date '1995-04-01'),
    item -> item.extendedprice)) as total_extendedprice
    FROM Supplier;

    suppkey | lineitems
    ----------+---------------------
    1 | 💰💰💰
    2 | 💰💰💰
    3 | 💰

Accio Functions

  • Note that all x in lambda/array functions only accept TO_MANY relation column. TO_ONE relation isn’t allowed.
  • transform(*x, lambda expression*): returns an array that is the result of applying lambda expression to each element of to-many side column:
    SELECT 
    transform(lineitems, item -> item.extendedprice * (1 - item.discount))
    FROM Supplier
  • filter(x, *lambda expression*): returns an array that is the result of filtering lambda expression to each element of to-many side column:
    SELECT 
    filter(lineitems, item -> item.linestatus = 'F')
    FROM Supplier
  • array_sort(*x, x.y, ASC/DESC*): sort element in to-many side by many side column
    • takes 3 argument, first is to-many side column name, second is column name in to-many side model to order by, third is DESC or ASC

      SELECT
      array_sort(lineitems, shipdate, DESC)
      FROM Supplier
  • first(*x, x.y, ASC/DESC*): sort element in to-many side by column in to-many side model and then get first element back
    • takes 3 argument, first is to-many side column name, second is column in to-many side model to order by, third is DESC or ASC

      SELECT
      first(lineitems, shipdate, DESC)
      FROM Supplier;
    • this function provides the same functionality as SELECT array_sort(lineitems, shipdate, DESC)[1] FROM Supplier

  • slice(*xstartlength*): get a slice of elements in to-many side column
    • takes 3 argument, first is to-many side column name, second is

      SELECT
      slice(lineitems, 2, 1)
      FROM Supplier;
  • TopN use case
    SELECT
    slice(array_sort(lineitems, shipdate, DESC), 1, 3)
    FROM Supplier
  • array aggregation functions
    • Note that all array aggregation function list as follows only works when input argument is to-many relation column or it is a accio function output. Normal array type column won’t work in these array aggregation functions.
    • array_count(*x*): returns the cardinality (size)
    • array_max(*x*): returns max value
    • array_min(*x*): returns min value
    • array_sum(*x*): returns sum value
    • array_avg(*x*): returns average value
    • array_bool_or(*x*): returns TRUE if any input value is TRUE, otherwise FALSE
    • array_every(*x*): returns TRUE if every input value is TRUE, otherwise FALSE
      SELECT
      array_sum(transform(lineitems, i -> i.extendedprice))
      FROM Supplier;

      SELECT
      array_bool_or(transform(lineitems, i -> i.linestatus = 'F'))
      FROM Supplier;

      SELECT
      array_count(lineitems)
      FROM Supplier;
  • (WIP) Currently lambda expression body can’t accept relationship chaining see https://github.com/Canner/accio/issues/275