Skip to content

TPC-DS q14 fails with timeout/OOM with UseFinalizeByKey #6041

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
yumkam opened this issue Jun 27, 2024 · 1 comment · Fixed by #6727
Closed

TPC-DS q14 fails with timeout/OOM with UseFinalizeByKey #6041

yumkam opened this issue Jun 27, 2024 · 1 comment · Fixed by #6727
Assignees
Labels
area/yql YQL query language issues

Comments

@yumkam
Copy link
Collaborator

yumkam commented Jun 27, 2024

q14 aggregate phase timeout/OOM with UseFinalizeByKey.
Fixed request without different timeout in $cross_items generation:

PRAGMA dq.EnableDqReplicate = "1";
PRAGMA DisableSimpleColumns;

PRAGMA dq.MaxTasksPerOperation='1700';
PRAGMA dq.HashJoinMode = "grace";
PRAGMA dq.HashShuffleTasksRatio="1.0";
PRAGMA dq.HashShuffleMaxTasks="200";

-- pragma dq.AggregateStatsByStage="false";
pragma dq.OptLLVM="off";

PRAGMA s3.UseBlocksSource="true";
pragma OrderedColumns="true";

pragma dq.MaxTasksPerStage="4";
pragma dq.ComputeActorType="async";
pragma dq.UseFinalizeByKey="true";
pragma AnsiOptionalAs="1";

-- NB: Subquerys
$bla1 = (select DISTINCT iss.i_brand_id brand_id
     ,iss.i_class_id class_id
     ,iss.i_category_id category_id
 from bindings.store_sales as store_sales
     cross join bindings.item iss
     cross join bindings.date_dim d1
 where ss_item_sk = iss.i_item_sk
   and ss_sold_date_sk = d1.d_date_sk
   and d1.d_year between 2000 AND 2000 + 2);
$bla2 = (select ics.i_brand_id brand_id
     ,ics.i_class_id class_id
     ,ics.i_category_id category_id
 from bindings.catalog_sales as catalog_sales
     cross join bindings.item ics
     cross join bindings.date_dim d2
 where cs_item_sk = ics.i_item_sk
   and cs_sold_date_sk = d2.d_date_sk
   and d2.d_year between 2000 AND 2000 + 2);
$bla3 = (select iws.i_brand_id brand_id
     ,iws.i_class_id class_id
     ,iws.i_category_id category_id
 from bindings.web_sales
     cross join bindings.item iws
     cross join bindings.date_dim d3
 where ws_item_sk = iws.i_item_sk
   and ws_sold_date_sk = d3.d_date_sk
   and d3.d_year between 2000 AND 2000 + 2);

$cross_items = (select i_item_sk ss_item_sk
 from bindings.item as item cross join
 (select bla1.brand_id as brand_id, bla1.class_id as class_id, bla1.category_id as category_id from
 $bla1 bla1 left semi join $bla2 bla2 on (bla1.brand_id = bla2.brand_id and bla1.class_id = bla2.class_id and bla1.category_id = bla2.category_id)
            left semi join $bla3 bla3 on (bla1.brand_id = bla3.brand_id and bla1.class_id = bla3.class_id and bla1.category_id = bla3.category_id)
 ) x
 where i_brand_id = brand_id
      and i_class_id = class_id
      and i_category_id = category_id
);

$avg_sales =
 (select avg(quantity*list_price) average_sales
  from (select ss_quantity quantity
             ,ss_list_price list_price
       from bindings.store_sales as store_sales
           cross join bindings.date_dim as date_dim
       where ss_sold_date_sk = d_date_sk
         and d_year between 2000 and 2000 + 2
       union all
       select cs_quantity quantity
             ,cs_list_price list_price
       from bindings.catalog_sales as catalog_sales
           cross join bindings.date_dim as date_dim
       where cs_sold_date_sk = d_date_sk
         and d_year between 2000 and 2000 + 2
       union all
       select ws_quantity quantity
             ,ws_list_price list_price
       from bindings.web_sales as web_sales
           cross join bindings.date_dim as date_dim
       where ws_sold_date_sk = d_date_sk
          and d_year between 2000 and 2000 + 2) x);

$week_seq_2001 = (select d_week_seq
                     from bindings.date_dim as date_dim
                     where d_year = 2000 + 1
                       and d_moy = 12
                       and d_dom = 15);

$week_seq_2000 = (select d_week_seq
                     from bindings.date_dim as date_dim
                     where d_year = 2000
                       and d_moy = 12
                       and d_dom = 15);


-- start query 1 in stream 0 using template query14.tpl and seed 1819994127

  select  channel, i_brand_id,i_class_id,i_category_id,sum(sales), sum(number_sales)
 from(
       select 'store' channel, item.i_brand_id i_brand_id,item.i_class_id i_class_id
             ,item.i_category_id i_category_id,sum(ss_quantity*ss_list_price) sales
             , count(*) number_sales
       from bindings.store_sales as store_sales
           cross join bindings.item as item
           cross join bindings.date_dim as date_dim
       where ss_item_sk in $cross_items
         and ss_item_sk = i_item_sk
         and ss_sold_date_sk = d_date_sk
         and d_year = 2000+2
         and d_moy = 11
       group by item.i_brand_id,item.i_class_id,item.i_category_id
       having sum(ss_quantity*ss_list_price) > $avg_sales
       union all
       select 'catalog' channel, item.i_brand_id i_brand_id,item.i_class_id i_class_id,item.i_category_id i_category_id, sum(cs_quantity*cs_list_price) sales, count(*) number_sales
       from bindings.catalog_sales as catalog_sales
           cross join bindings.item as item
           cross join bindings.date_dim as date_sim
       where cs_item_sk in $cross_items
         and cs_item_sk = i_item_sk
         and cs_sold_date_sk = d_date_sk
         and d_year = 2000+2
         and d_moy = 11
       group by item.i_brand_id,item.i_class_id,item.i_category_id
       having sum(cs_quantity*cs_list_price) > $avg_sales
       union all
       select 'web' channel, item.i_brand_id i_brand_id,item.i_class_id i_class_id,item.i_category_id i_category_id, sum(ws_quantity*ws_list_price) sales , count(*) number_sales
       from bindings.web_sales as web_sales
           cross join bindings.item as item
           cross join bindings.date_dim as date_dim
       where ws_item_sk in $cross_items
         and ws_item_sk = i_item_sk
         and ws_sold_date_sk = d_date_sk
         and d_year = 2000+2
         and d_moy = 11
       group by item.i_brand_id,item.i_class_id,item.i_category_id
       having sum(ws_quantity*ws_list_price) > $avg_sales
 ) y
 group by rollup (channel, i_brand_id,i_class_id,i_category_id)
 order by channel,i_brand_id,i_class_id,i_category_id
 limit 100;
-- second request removed

Reproduction scripts/etc:
https://gist.github.com/yumkam/9ae3f6793ab089f44a8233d09bb8c753

With tpc ds 1: both FinalizeByKey="true"/"false" finishes fine
With tpc ds 10: without FinalizeByKey finishes fine, consuming at most 3GB RSS and less than 3 minutes
with FinalizeByKey="true": OOM consuming over 50GB memory or TLEs

@yumkam yumkam self-assigned this Jun 27, 2024
@yumkam yumkam added the area/yql YQL query language issues label Jun 27, 2024
@aakulaga-ydb aakulaga-ydb assigned Tony-Romanov and unassigned yumkam Jul 1, 2024
@yumkam
Copy link
Collaborator Author

yumkam commented Jul 17, 2024

Not reproducible with #6727 applied (and similar performance issue with q23 seems also fixed by that PR)

@Tony-Romanov Tony-Romanov linked a pull request Jul 18, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/yql YQL query language issues
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants