Re: DDE, Debian Data Export

by Andreas Tille

[debian-custom list - actually targeted at the to-be-created debian-blends list
  in CC]

On Wed, 11 Feb 2009, Enrico Zini wrote:

> * Maintainer <-> Source package mapping
> * Popcon rankings
> * What is in the new queue
> * Package screenshots
> * Localisation information
> * uscan status

These items from the (slightly cropped) list are extremely interesting for
what we need in the Blends task pages.  I would also like to add:

   * DebTags of a package (ping: I was asking for a Python interface ...)
   * DDTP (if this is not yet included in "Localisation information")

> * Debian Pure Blend specific information
>
> A nightmare, uh?

Until today?  Not really a nightmare - but some of it is really hard to
obtain and thus not finished to the extent I would like to see.

> * The solution
>
> DDE is a way to make it simple to publish and download data.  The aim is
> to be able to access all sorts of Debian information without worrying
> about data formats, protocols and access control, and to make it easy to
> discover what data is available.

Sounds great.

> DDE exports data as a big virtual tree.  You can pick a node in the tree
> by its URL and download all the data that it contains, in a format of
> your choice: currently it supports JSON/JSONP, YAML, CSV and Python
> pickled objects.

I have to admit that after a (quick) look at the URLs you gave I did not
really understand how the data enter the tree and how I can pull
the information (it's a shame that I missed your talk).

> DDE is not a competitor to UDD (http://wiki.debian.org/UDD): UDD is
> about creating a central location where all the data can be accessed,
> while DDE is about giving people a simple way to access data or subsets
> of data.

So is DDE actually using UDD as input?

> In a way, DDE and UDD complete each other: the more data enters UDD, the
> more data is available for DDE.  In turn, DDE gives a simple interface
> to the most popular and useful UDD queries.

Sounds like a 'yes' to my question above.  A clever UDD interface would
really rock!

> * The dream
>
> Here are some hints at what can be done with this:
>
> * Autocompletion in HTML fields
> * Export data to feed external sites like debtags.debian.net or
>   screenshots.debian.net
> * Have a way for package managers to easily access all sorts of data
> * Have a way to implement fancy tools that can query massive data sets
>   without needing to download them locally

Sounds really good!

> * A call for action
>
> You can add data to the DDE tree by just putting a data file in yaml,
> json or pickle format under `~/.dde`: I've written a specific guide[1]
> to this on the Debian wiki, see: http://wiki.debian.org/DDE/HomeFiles

This is the part I'm curious about.  Please explain in more detail.
I've written some code which harvests data for Blends - if I can
provide some input I'd be happy to do so.

> For more complicated cases (like accessing a remote database), it is
> possible to extend DDE via python plugins[1]. You can get in touch with
> me if you need to go that way.

TOUCH. ;-)

Many thanks for your work on this.

        Andreas.

--
http://fam-tille.de



Re: DDE, Debian Data Export

by Enrico Zini

On Wed, Feb 11, 2009 at 05:08:52PM +0100, Andreas Tille wrote:

> On Wed, 11 Feb 2009, Enrico Zini wrote:
>
>> DDE exports data as a big virtual tree.  You can pick a node in the tree
>> by its URL and download all the data that it contains, in a format of
>> your choice: currently it supports JSON/JSONP, YAML, CSV and Python
>> pickled objects.
> I have to admit that after a (quick) look at the URLs you gave I did not
> really understand how the data enter the tree and how I can pull
> the information (it's a shame that I missed your talk).

The data enter the tree by means of plugins that map branches of the
tree to queries against the places where the information lives.  For
example, if you go to
http://dde.debian.net/dde/q/udd/packages/prio-debian-lenny/debtags
then DDE will query UDD for you and give you the results.

You can pull the information by simply appending ?t=FORMAT to the URL.
For example:
http://dde.debian.net/dde/q/udd/packages/prio-debian-lenny/debtags?t=yaml

The formats supported at the moment are json, yaml, csv and pickle.
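
As a rough illustration of the ?t=FORMAT convention, here is a minimal
Python sketch.  The helper names dde_url and dde_fetch are made up for
this example and are not part of DDE, and the sketch assumes that the
server answers with UTF-8 encoded JSON when asked for t=json.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the example URLs in this message.
DDE_BASE = "http://dde.debian.net/dde/q"

# Format names as listed in this message.
SUPPORTED_FORMATS = ("json", "yaml", "csv", "pickle")


def dde_url(path, fmt="json"):
    """Build the URL of a DDE node, asking for the given output format.

    Hypothetical helper: it only appends ?t=FORMAT to the node URL.
    """
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError("unsupported format: %r" % (fmt,))
    return "%s/%s?%s" % (DDE_BASE, path.strip("/"), urlencode({"t": fmt}))


def dde_fetch(path):
    """Download a node as JSON and decode it into Python objects.

    Assumption: the response body is UTF-8 encoded JSON.
    """
    with urlopen(dde_url(path, "json")) as response:
        return json.loads(response.read().decode("utf-8"))
```

For instance, dde_url("udd/packages/prio-debian-lenny/debtags", "yaml")
gives back the YAML URL shown above, and dde_fetch() with the same path
would return the decoded data if the server is reachable.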


>> DDE is not a competitor to UDD (http://wiki.debian.org/UDD): UDD is
>> about creating a central location where all the data can be accessed,
>> while DDE is about giving people a simple way to access data or subsets
>> of data.
> So is DDE actually using UDD as input?

UDD, apt-xapian-index, a Xapian index of apt-file information, static
files in DDs' home directories, and anything else one may want to write
a plugin for.


>> * A call for action
>>
>> You can add data to the DDE tree by just putting a data file in yaml,
>> json or pickle format under `~/.dde`: I've written a specific guide[1]
>> to this on the Debian wiki, see: http://wiki.debian.org/DDE/HomeFiles
> This is the part I'm curious about.  Please explain in more detail.
> I've written some code which is harvesting data for Blends - if I can
> provide some input I'd be happy to do so.

NOTE: following some thinking on IRC today, the way to publish static
data has changed.  Also, the HomeFiles wiki page has been renamed to
http://wiki.debian.org/DDE/StaticData

The good news is that the new way is even simpler.

To get started with publishing data in DDE, try this:

 1. Create some data structure in Python or any other language with your
    data (you can mix dicts and lists freely, and if in doubt, use dicts).
 2. Save that data structure to disk in YAML, JSON or pickle format, as
    you prefer.
 3. Log in to merkel and put the resulting file in:
    /srv/dde.debian.net/public/sandbox/filename.[yaml|json|pickle]
    The extension should match the data format that you chose.
 4. Visit http://dde.debian.net/dde/q/sandbox/filename: you will see
    your data.
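
Steps 1 and 2 could be sketched in Python roughly as follows.  The
function name save_for_dde, the file name and the sample data are all
invented for illustration, and JSON is picked only because it needs
nothing beyond the standard library.

```python
import json
import os

# Step 1: a data structure mixing dicts and lists.
# The names and values below are placeholders, not real Blends data.
example = {
    "blend": "example-blend",
    "tasks": [
        {"name": "task-a", "packages": ["package-1", "package-2"]},
        {"name": "task-b", "packages": ["package-3"]},
    ],
}


def save_for_dde(data, name, directory="."):
    """Step 2: save the data as <name>.json and return the file path.

    Hypothetical helper; the .json extension matches the chosen format,
    as step 3 requires.
    """
    path = os.path.join(directory, name + ".json")
    with open(path, "w", encoding="utf-8") as fd:
        json.dump(data, fd, indent=2, sort_keys=True)
    return path
```

Calling save_for_dde(example, "blends-demo") writes blends-demo.json to
the current directory; copying that file to the sandbox directory from
step 3 is what should make it show up under the URL from step 4.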

Note that http://wiki.debian.org/DDE/StaticData also contains examples
that you can run on merkel, as well as links to documentation about the
various supported data formats.


Ciao,

Enrico

--
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico@...>

