[jira] Created: (ESPER-412) Out of memory while using group by


[jira] Created: (ESPER-412) Out of memory while using group by

by JIRA jira@codehaus.org

Out of memory while using group by
----------------------------------

                 Key: ESPER-412
                 URL: http://jira.codehaus.org/browse/ESPER-412
             Project: Esper
          Issue Type: Bug
    Affects Versions: 3.2
            Reporter: Ben


Hello,

I am evaluating Esper and I am getting an out-of-memory error when using 'group by'. This out-of-memory issue is stopping us from using Esper, which we would like to use. Please help me with this issue.

I have an Order event as defined below.

Order
{
  int orderId;
  String symbol;
  int totalFilledQuantity;
  double price;
}

I can get multiple order event updates for the same order. orderId is the key by which order updates are combined.

My query to calculate customer position is

select symbol,
       sum(totalFilledQuantity) as totalFilled,
       sum(price*filledQuantity) as totalValue
from Order group by symbol


Input                                                             Expected Output
orderId=1, symbol=IBM, totalFilledQuantity=0, price=null (new)    IBM  totalFilled=0
orderId=1, symbol=IBM, totalFilledQuantity=10, price=10 (upd)     IBM  totalFilled=10
orderId=1, symbol=IBM, totalFilledQuantity=20, price=10 (upd)     IBM  totalFilled=20
orderId=2, symbol=MSFT, totalFilledQuantity=0, price=null (new)   MSFT totalFilled=0
orderId=2, symbol=MSFT, totalFilledQuantity=10, price=10 (upd)    MSFT totalFilled=10
orderId=3, symbol=IBM, totalFilledQuantity=0, price=null (new)    IBM  totalFilled=20
orderId=3, symbol=IBM, totalFilledQuantity=10, price=10 (upd)     IBM  totalFilled=30


Actual Output
IBM 0
IBM 10
IBM 30 (not correct)
MSFT 0
MSFT 10
IBM 30 (not correct)
IBM 40 (not correct)
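
The actual output above is what a running sum over every arriving event produces: each update for an order is added on top of the earlier updates for that order instead of replacing them. The following is a minimal plain-Java sketch of that behaviour, without any Esper dependency. The class and method names (`NaiveGroupBySum`, `onEvent`) are hypothetical and only model the arithmetic; they are not Esper API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "sum(totalFilledQuantity) ... group by symbol"
// applied to the raw event stream. Not Esper API.
final class NaiveGroupBySum {
    // Running total per symbol over ALL events seen so far.
    private final Map<String, Integer> totalBySymbol = new HashMap<>();

    // Adds the event's quantity to its symbol's total and returns the new total.
    // The orderId plays no role here, which is why updates are double counted.
    int onEvent(String symbol, int totalFilledQuantity) {
        return totalBySymbol.merge(symbol, totalFilledQuantity, Integer::sum);
    }

    public static void main(String[] args) {
        NaiveGroupBySum agg = new NaiveGroupBySum();
        // The seven input events from the table, in order.
        String[] symbols = {"IBM", "IBM", "IBM", "MSFT", "MSFT", "IBM", "IBM"};
        int[] quantities = {0, 10, 20, 0, 10, 0, 10};
        for (int i = 0; i < symbols.length; i++) {
            System.out.println(symbols[i] + " " + agg.onEvent(symbols[i], quantities[i]));
        }
    }
}
```

Replaying the seven input events through this model gives IBM 0, 10, 30, MSFT 0, 10, then IBM 30, 40, the same sequence as the actual output.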

This is not working because I am not using Revision type. Is that correct? Or is there any other solution without using revision?

I tried using revisions


          ConfigurationRevisionEventType configRev = new ConfigurationRevisionEventType();
          configRev.addNameBaseEventType("Order");  
          configRev.setKeyPropertyNames(new String[] {"orderId"});
          configRev.setPropertyRevision(ConfigurationRevisionEventType.PropertyRevision.MERGE_EXISTS);
          config.addRevisionEventType("OrderRevisions", configRev);


          epService.getEPAdministrator().createEPL("create window AllPositions.win:keepall() as OrderRevisions");
          epService.getEPAdministrator().createEPL("insert into AllPositions select * from Order");


          String expression = "select symbol, sum(totalFilledQuantity) as totalFilled, sum(price*filledQuantity) as totalValue from AllPositions group by symbol";
          EPStatement statement = epService.getEPAdministrator().createEPL(expression);  
         
          statement.addListener(new UpdateListener()
          {
            public void update(EventBean[] newEvents, EventBean[] oldEvents)
            {
                EventBean event = newEvents[0];
                System.out.println("symbol=" + event.get("symbol") + " total=" + event.get("totalFilled"));                                
            }
          });


This gives me the expected results. But when I send 50000 order updates every two seconds across 1000000 unique orders, I run out of memory with 1G max memory. Memory appears to keep increasing because all updates are kept in memory instead of only the merged events per symbol.


win:keepall() may also be a culprit. But if I remove keepall(), the data output is not correct.

Also keepall() is needed, so that I can do a query later on asking for all symbols and their totalFilled.

Is this the correct way to use esper or am I using it incorrectly?

Can you tell us how we can get the expected results and also create a window where we can keep the latest values per symbol to query later on?

Thanks
Ben

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email



[jira] Closed: (ESPER-412) Out of memory while using group by

by JIRA jira@codehaus.org


     [ http://jira.codehaus.org/browse/ESPER-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Bernhardt closed ESPER-412.
----------------------------------

    Resolution: Cannot Reproduce





[jira] Commented: (ESPER-412) Out of memory while using group by

by JIRA jira@codehaus.org


    [ http://jira.codehaus.org/browse/ESPER-412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=196345#action_196345 ]

Thomas Bernhardt commented on ESPER-412:
----------------------------------------

Can you please provide a test class, or let us know:
- statement
- configuration
- test data fed

The right approach seems to be a simple std:unique(orderId) view.

Revision event types can also merge events and keep only the versions required for a complete picture of the properties, but they don't seem to be required here.
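
To illustrate what keeping only the latest event per orderId means for the sums, here is a minimal plain-Java sketch of the semantics, not of Esper itself. It holds one entry per order and adjusts the per-symbol total by the difference between the new and the previous quantity. The names `LatestPerOrderSum`, `onEvent`, `totalFilled` and `trackedOrders` are hypothetical, and the sketch assumes an order's symbol never changes between updates. In EPL this presumably corresponds to declaring the named window with std:unique(orderId) in place of win:keepall(), which has not been tested here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "latest event per orderId, summed per symbol". Not Esper API.
final class LatestPerOrderSum {
    // Latest totalFilledQuantity for each orderId: one entry per order, not per update.
    private final Map<Integer, Integer> latestByOrder = new HashMap<>();
    // Sum of the latest quantities, per symbol.
    private final Map<String, Integer> totalBySymbol = new HashMap<>();

    // Replaces the order's previous quantity and returns the symbol's new total.
    // Assumption: the symbol of a given orderId does not change across updates.
    int onEvent(int orderId, String symbol, int totalFilledQuantity) {
        Integer previous = latestByOrder.put(orderId, totalFilledQuantity);
        int delta = totalFilledQuantity - (previous == null ? 0 : previous);
        return totalBySymbol.merge(symbol, delta, Integer::sum);
    }

    // On-demand lookup of the current total for a symbol (0 if never seen).
    int totalFilled(String symbol) {
        return totalBySymbol.getOrDefault(symbol, 0);
    }

    // Number of orders currently held in memory.
    int trackedOrders() {
        return latestByOrder.size();
    }

    public static void main(String[] args) {
        LatestPerOrderSum agg = new LatestPerOrderSum();
        // The seven input events from the report, in order.
        int[] orderIds = {1, 1, 1, 2, 2, 3, 3};
        String[] symbols = {"IBM", "IBM", "IBM", "MSFT", "MSFT", "IBM", "IBM"};
        int[] quantities = {0, 10, 20, 0, 10, 0, 10};
        for (int i = 0; i < orderIds.length; i++) {
            System.out.println(symbols[i] + " "
                    + agg.onEvent(orderIds[i], symbols[i], quantities[i]));
        }
    }
}
```

Replaying the seven input events from the report yields IBM 0, 10, 20, MSFT 0, 10, then IBM 20, 30, which matches the expected output. State in this model grows with the number of distinct orders rather than with the number of updates, and `totalFilled` can be called at any time to read the latest value per symbol.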

