Hi Erik,
DataImportHandler is designed to handle this through a Transformer.
I am assuming that your API returns the delta as "deleted/modified/added"
documents. Always run a full-import with clean=false. Depending on the values
returned by the API, your transformer can use $deleteDocById for deletes,
and so on.
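
A rough sketch of such a transformer in Java, in case it helps. The "status"
and "id" field names are only assumptions about what your API returns, and
the class name is made up. As far as I recall, DIH can call a class that has a
transformRow(Map) method by reflection, so this does not extend Transformer,
and it is declared non-public only to keep the snippet in one file. I also set
$skipDoc so that the deleted row is not itself indexed; please verify that
against your Solr version.

```java
import java.util.Map;

// Hypothetical row transformer: turns "deleted" rows into delete commands.
// The "status" and "id" field names are assumptions about the API response.
class DeltaRowTransformer {

    // Called once per row produced by the entity.
    public Object transformRow(Map<String, Object> row) {
        Object status = row.get("status");
        if ("deleted".equals(status)) {
            // Special command: delete the document with this uniqueKey value.
            row.put("$deleteDocById", row.get("id"));
            // Special command: do not index this row as a document.
            row.put("$skipDoc", "true");
        }
        // "added" and "modified" rows pass through unchanged and are indexed
        // as usual; with clean=false the rest of the index is left in place.
        return row;
    }
}
```

It would be referenced from the entity with transformer="DeltaRowTransformer"
(fully qualified if it lives in a package).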
$nextUrl and $hasMore can also be used to fetch further pages of data.
Again, these variables can be generated and put into the row by the
Transformer.
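
For the paging part, something along these lines might work. This is only a
sketch: the "nextStart" field, the base URL, and the page size are all
assumptions (your API would have to expose something equivalent), and the
class name is made up.

```java
import java.util.Map;

// Hypothetical row transformer: asks DIH to fetch the next page when the API
// says there is one. "nextStart" is an assumed field in the API response.
class PagingRowTransformer {

    // Placeholder values; replace with the real endpoint and page size.
    private static final String BASE_URL = "http://example.com/data.xml";
    private static final int ROWS = 100;

    public Object transformRow(Map<String, Object> row) {
        Object nextStart = row.get("nextStart");
        if (nextStart != null) {
            // Special variables: tell the entity processor that more data
            // exists and which URL to request next.
            row.put("$hasMore", "true");
            row.put("$nextUrl", BASE_URL + "?start=" + nextStart + "&rows=" + ROWS);
        }
        return row;
    }
}
```

A "since" parameter for the delta case could be appended to the URL in the
same way.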
We did this for one of our internal APIs for message boards using a
JavaScript transformer. You can do this with a Java transformer as
well.
On Thu, Jul 9, 2009 at 7:57 PM, Erik Hatcher <erik@...> wrote:
> I'm exploring other ways of getting data into Solr via DataImportHandler
> than through a relational database, particularly the URLDataSource.
>
> I see the special commands for deleting by id and query as well as the
> $hasMore/$nextUrl techniques, but I'm unclear on exactly how one would go
> about designing a data source over HTTP that worked cleanly for full
> importing and also for delta indexing.
>
> For sake of argument, suppose I have /data.xml[?since=<some
> timestamp>][&start=X&rows=Y] and it could return documents in Solr XML (or
> really any basic format) since the last time it was updated (or all records
> if no since parameter is provided). And the service could also return which
> records to remove since that timestamp too. Can I get there from here using
> URLDataSource?
>
> Have folks been doing this? If so, anyone care to share some basic
> tips/tricks/examples?
>
> Thanks,
> Erik
>
>
--
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL |
http://aol.com