[jira] Created: (SOLR-1544) Python script to post multiple files to solr using a queue and worker threads


[jira] Created: (SOLR-1544) Python script to post multiple files to solr using a queue and worker threads

by JIRA jira@apache.org

Python script to post multiple files to solr using a queue and worker threads
-----------------------------------------------------------------------------

                 Key: SOLR-1544
                 URL: https://issues.apache.org/jira/browse/SOLR-1544
             Project: Solr
          Issue Type: Improvement
          Components: update
    Affects Versions: 1.5
         Environment: Python 2.6 and above
            Reporter: Dennis Kubes
            Priority: Minor
             Fix For: 1.5


This is a simple Python script that uses a blocking queue and multiple worker threads to post updates (files) to Solr.  It works in cases where calling post.sh fails because there are too many files, or when
you want to throttle the rate at which you update Solr.  Tested with runs as large as 30K files.
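
The queue-and-workers approach described above can be sketched roughly as follows. This is not the attached postqueue.py, only a minimal illustration written for Python 3 (where the module is named `queue` rather than `Queue`). The names `post_all` and `post_file_to_solr`, the update URL, the content type, and the `delay` throttle parameter are all assumptions made for the example.

```python
import queue
import threading
import time
import urllib.request

# Assumed endpoint; adjust to the actual Solr instance.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def post_file_to_solr(path, url=SOLR_UPDATE_URL):
    """POST one XML update file to Solr (hypothetical helper)."""
    with open(path, "rb") as f:
        data = f.read()
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "text/xml; charset=utf-8"}
    )
    with urllib.request.urlopen(request) as response:
        response.read()


def post_all(paths, post_func=post_file_to_solr, num_workers=4,
             queue_size=100, delay=0.0):
    """Post every path using worker threads fed by a bounded queue.

    Returns a list of (path, exception) pairs for posts that failed.
    """
    work = queue.Queue(maxsize=queue_size)  # bounded, so put() can block
    failures = []
    failures_lock = threading.Lock()

    def worker():
        while True:
            path = work.get()  # blocks until an item is available
            try:
                if path is None:  # sentinel: no more work for this thread
                    return
                post_func(path)
                if delay:
                    time.sleep(delay)  # crude per-worker throttle
            except Exception as exc:
                with failures_lock:
                    failures.append((path, exc))
            finally:
                work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for path in paths:
        work.put(path)  # blocks while the queue is full
    for _ in threads:
        work.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return failures
```

Something like `post_all(glob.glob("docs/*.xml"), num_workers=2, delay=0.1)` would then walk an arbitrarily long file list without hitting shell argument limits; fewer workers or a larger `delay` slows the update rate.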

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1544) Python script to post multiple files to solr using a queue and worker threads

by JIRA jira@apache.org


     [ https://issues.apache.org/jira/browse/SOLR-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dennis Kubes updated SOLR-1544:
-------------------------------

    Attachment: postqueue.py

The postqueue Python script.  Requires Python 2.6 or above to work correctly because of its use of the Queue module.




[jira] Commented: (SOLR-1544) Python script to post multiple files to solr using a queue and worker threads

by JIRA jira@apache.org


    [ https://issues.apache.org/jira/browse/SOLR-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774345#action_12774345 ]

Chris A. Mattmann commented on SOLR-1544:
-----------------------------------------

Hey Dennis -- this sounds great!! I had the same problem with post.sh when using it on 10,000s of files that I wanted to post. You can get around it with xargs, but this looks like a more complete and robust solution. +1!

Cheers,
Chris


