Inserting 'large' amounts of data

View: New views
10 Messages — Rating Filter:   Alert me  

Inserting 'large' amounts of data

by Mario Splivalo-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I have a web application which allows users to upload a lot of phone
numbers. I need to store those numbers to a database. Usualy, one would
upload around 70k-100k of records, totaling around 2 MB in size.

I'm using tomcat as an application server, and JDBC to connect to pg8.3
database.

I will have around 20-50 concurent users in peek hours, and even that is
quite overestimated.

I could create the temporary file on the filesystem where database
cluster is located and then execute COPY mytable FROM
'/tmp/upload-data/uuidofsomesort.csv' WITH CSV', but the 'problem' is
that database server and tomcat reside on different physical machines.

What would one recommend as the best way to insert those data?

        Mario

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: Inserting 'large' amounts of data

by John R Pierce :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Mario Splivalo wrote:
> I have a web application which allows users to upload a lot of phone
> numbers. I need to store those numbers to a database. Usualy, one would
> upload around 70k-100k of records, totaling around 2 MB in size.
>
> ...
> t would one recommend as the best way to insert those data?
>  


I believe you can use org.postgresql.copy.CopyIn() ...  there are
variants that use a writeToCopy() call to send the data, or a
java.io.InputStream, or a java.io.Reader ...



--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: Inserting 'large' amounts of data

by dmp-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

>
>
>I have a web application which allows users to upload a lot of phone
>numbers. I need to store those numbers to a database. Usualy, one would
>upload around 70k-100k of records, totaling around 2 MB in size.
>
>I'm using tomcat as an application server, and JDBC to connect to pg8.3
>database.
>
>I will have around 20-50 concurent users in peek hours, and even that is
>quite overestimated.
>
>I could create the temporary file on the filesystem where database
>cluster is located and then execute COPY mytable FROM
>'/tmp/upload-data/uuidofsomesort.csv' WITH CSV', but the 'problem' is
>that database server and tomcat reside on different physical machines.
>
>What would one recommend as the best way to insert those data?
>
> Mario
>
Hello Mario,
If the users already have the data in CSV format why not let them do it
via the
app. server? The connection can be made across machines if setup properly.

http://dandymadeproductions.com/projects/MyJSQLView/docs/javadocs/index.html
CSVDataImportThread.java

This class could be used as a basis, with some work. I have a bug with
data that has
semicolons, could be more robust also, but could be used for a start.

John wrote:

> I believe you can use org.postgresql.copy.CopyIn() ...  there are
> variants that use a writeToCopy() call to send the data, or a
> java.io.InputStream, or a java.io.Reader ...

This sounds a lot cleaner.

danap.

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: Inserting 'large' amounts of data

by Maciek Sakrejda :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Right. As of the 8.4 release, you have a jdbc API to the postgresql
COPY functionality. Cast your Connection to a PGConnection and call
.getCopyAPI(). You can then use the CopyManager[1] to copy data in
from your tomcat servlet.

[1] http://jdbc.postgresql.org/documentation/publicapi/org/postgresql/copy/CopyManager.html

--
Maciek Sakrejda | Software Engineer | Truviso
(650) 242-3500 Main
(650) 242-3501 F
msakrejda@...
www.truviso.com

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: Inserting 'large' amounts of data

by Mario Splivalo-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

John R Pierce wrote:

> Mario Splivalo wrote:
>> I have a web application which allows users to upload a lot of phone
>> numbers. I need to store those numbers to a database. Usualy, one would
>> upload around 70k-100k of records, totaling around 2 MB in size.
>>
>> ...
>> t would one recommend as the best way to insert those data?
>
> I believe you can use org.postgresql.copy.CopyIn() ...  there are
> variants that use a writeToCopy() call to send the data, or a
> java.io.InputStream, or a java.io.Reader ...

Yes, I gues this is the way to go. I'll have to upgrade to 8.4, but
somehow I'd like to wait for a month or so before making the switch.

Thank you all.


        Mike

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: Inserting 'large' amounts of data

by Kris Jurka :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message



On Thu, 27 Aug 2009, Mario Splivalo wrote:

> Yes, I gues this is the way to go. I'll have to upgrade to 8.4, but somehow
> I'd like to wait for a month or so before making the switch.
>

You only need the 8.4 JDBC driver, not a 8.4 server.  Copy support should
work for all 7.4 and later servers.

Kris Jurka

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: Inserting 'large' amounts of data

by John R Pierce :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Kris Jurka wrote:

>
>
> On Thu, 27 Aug 2009, Mario Splivalo wrote:
>
>> Yes, I gues this is the way to go. I'll have to upgrade to 8.4, but
>> somehow I'd like to wait for a month or so before making the switch.
>>
>
> You only need the 8.4 JDBC driver, not a 8.4 server.  Copy support
> should work for all 7.4 and later servers.

speaking of using JDBC COPY FROM STDIN via this CopyManager
interface...  how does that handle constraint violations, like duplicate
primary key?



--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: Inserting 'large' amounts of data

by Kris Jurka :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message



John R Pierce wrote:
>
> speaking of using JDBC COPY FROM STDIN via this CopyManager
> interface...  how does that handle constraint violations, like duplicate
> primary key?
>

Just like a regular copy failure via psql or constraint violation.  An
Exception is thrown and the transaction is aborted (which you must
rollback).

Kris Jurka

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: Inserting 'large' amounts of data

by Maciek Sakrejda :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

In the same manner as regular DML--you'll get a constraint violation:

cqdb=# create table foo(a int unique);
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "foo_a_key"
for table "foo"
CREATE TABLE
cqdb=# copy foo from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1
>> 1
>> 1
>> \.
ERROR:  duplicate key value violates unique constraint "foo_a_key"
CONTEXT:  COPY foo, line 2: "1"

(This is straight through psql, but essentially the same thing would
happen through jdbc).

--
Maciek Sakrejda | Software Engineer | Truviso
(650) 242-3500 Main
(650) 242-3501 F
msakrejda@...
www.truviso.com

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: Inserting 'large' amounts of data

by Mario Splivalo-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Kris Jurka wrote:

>
>
> On Thu, 27 Aug 2009, Mario Splivalo wrote:
>
>> Yes, I gues this is the way to go. I'll have to upgrade to 8.4, but
>> somehow I'd like to wait for a month or so before making the switch.
>>
>
> You only need the 8.4 JDBC driver, not a 8.4 server.  Copy support
> should work for all 7.4 and later servers.

Works like a charm! :)

        Mike

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc