Hello,
It's become clear to me that the API for 2070 [1] (which is currently
a mess :) has never really been discussed by the developers, and I've
been thinking about ways of organizing it such that one can add
features to file uploading without modifying the core.
Here's an overview of what I had in mind.
(Any comments/suggestions/angry cries are greatly appreciated.)
Extremely Visible Changes
================
request.set_upload_handler(<upload_handler>)
--------------------------------------------------------------
This new function will allow one to register a new FileUploadHandler
object to deal with the incoming data.
The API for the FileUploadHandler (and the default child
TemporaryFileUploadHandler) is discussed below.
request.FILES
-----------------
This is no longer a MultiValueDict of raw content, but a
MultiValueDict of UploadedFile objects.
This will probably hurt the most, since existing applications may
assume that they can take the raw content straight from this dict.
The API of the UploadedFile (and the default child
TemporaryUploadedFile) is discussed below.
APIs for New Objects
=============
FileUploadHandler
-----------------------
Objects of this type must define the following methods:
1. new_file(filename, content_length) -- A signal that a new file
   has been reached by the parser.
2. receive_data_chunk(raw_data, start, stop) -- Some data has been
   received by the parser. A nonzero return value indicates that
   the file upload should cease.
3. file_complete(file_size) -- A signal that the current file has
   completed (no more than one file is ever being handled at a time).
   The expected return value is an UploadedFile object.
4. upload_complete() -- A signal that all uploads have finished
   successfully.
5. get_chunk_size() -- Returns a number that represents the maximum
   number of bytes read into memory for each chunk.
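To make the five methods above concrete, here is a rough sketch of
what a handler could look like. The base class body, the 64 KB default
chunk size, and the InMemoryUploadHandler name are my own assumptions
for illustration, not part of the proposal; file_complete() returns a
plain tuple here as a stand-in for an UploadedFile.

```python
class FileUploadHandler:
    """Hypothetical base class exposing the five proposed methods."""

    chunk_size = 64 * 1024  # assumed default; the proposal fixes no value

    def new_file(self, filename, content_length):
        raise NotImplementedError

    def receive_data_chunk(self, raw_data, start, stop):
        raise NotImplementedError

    def file_complete(self, file_size):
        raise NotImplementedError

    def upload_complete(self):
        pass

    def get_chunk_size(self):
        return self.chunk_size


class InMemoryUploadHandler(FileUploadHandler):
    """Illustrative handler: buffers one file in memory, aborts past a limit."""

    def __init__(self, max_bytes=1024):
        self.max_bytes = max_bytes
        self.filename = None
        self.buffer = bytearray()

    def new_file(self, filename, content_length):
        self.filename = filename
        self.buffer = bytearray()

    def receive_data_chunk(self, raw_data, start, stop):
        self.buffer.extend(raw_data)
        # A nonzero (truthy) return value asks the parser to stop the upload.
        return len(self.buffer) > self.max_bytes

    def file_complete(self, file_size):
        # Stand-in for an UploadedFile object.
        return (self.filename, bytes(self.buffer))
```

The parser would call new_file() once per file, then
receive_data_chunk() repeatedly, then file_complete().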
By adding a set_upload_handler() method to request, anyone can
override the default upload handler. However, this must be done before
the POST data is accessed, and we should probably raise an error if
someone tries to set a new upload handler after the FILES
MultiValueDict has been populated.
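The ordering constraint could be enforced roughly like this. The
Request class below is a hypothetical stand-in, not Django's actual
request object, and the choice of RuntimeError is only an assumption.

```python
class Request:
    """Hypothetical stand-in for the request object."""

    def __init__(self):
        self._upload_handler = None
        self._files = None  # populated lazily on first access

    def set_upload_handler(self, handler):
        # Refuse to swap handlers once the uploads have been parsed.
        if self._files is not None:
            raise RuntimeError(
                "cannot set an upload handler after FILES has been populated"
            )
        self._upload_handler = handler

    @property
    def FILES(self):
        if self._files is None:
            # A real implementation would parse the request body here,
            # feeding the data through self._upload_handler.
            self._files = {}
        return self._files
```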
The default handler (TemporaryFileUploadHandler) will stream the data
to a temporary file for each part of the upload, and return a
corresponding TemporaryUploadedFile object that represents that data.
In addition, the TemporaryFileUploadHandler can choose to stream the
data to memory if the content_length is small enough (and switch to
streaming it to disk if the header turns out to have lied, beyond some
grace margin).
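One way to get the memory-then-disk behaviour is Python's
tempfile.SpooledTemporaryFile, which keeps data in memory until it
grows past max_size and then rolls over to a real temporary file. Note
that this sketch switches on the bytes actually received rather than on
the content_length header, so it is a variation on the idea above, not
the proposed implementation. The class name and the 2 MB limit are
assumptions.

```python
import tempfile


class SpoolingUploadHandler:
    """Hypothetical handler: in memory while small, on disk once large."""

    def __init__(self, memory_limit=2 * 1024 * 1024):
        self.memory_limit = memory_limit
        self.spool = None

    def new_file(self, filename, content_length):
        # content_length is ignored; the spool decides from the real size.
        self.spool = tempfile.SpooledTemporaryFile(max_size=self.memory_limit)

    def receive_data_chunk(self, raw_data, start, stop):
        self.spool.write(raw_data)
        return 0  # zero means "keep going"

    def file_complete(self, file_size):
        # Rewind so the caller can read the data from the beginning.
        self.spool.seek(0)
        return self.spool
```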
UploadedFile
----------------
Objects of this class represent files that were just uploaded and
handled by some handler above. I'm assuming that this will be the most
difficult class to nail, since it has to balance a few different
aspects:
1. Encapsulating what it means to be an uploaded file.
2. Making it easy for a developer to send it to a FileBackend and
have it Do The Right Thing.
3. Making it easy for a developer to get the content and do
something with it.
Methods of the UploadedFile:
1. open() -- Open the file for reading.
2. read([num_bytes]) -- Read a number of bytes from the file.
3. close() -- Round out the standard read file operations.
4. chunk([chunk_size]) -- Generates chunks from the file (if the
content is in memory already, it should ignore chunk_size).
5. filename() -- The filename from the content-disposition header.
6. multiple_chunks([chunk_size]) -- Whether or not you can expect
multiple chunks from chunk() (Boolean). Useful if you want to do
something in particular when you know you can put it all into
memory at once.
7. file_size() -- The size in bytes of the uploaded file.
Added method of a TemporaryUploadedFile:
8. temporary_file_path() -- The whole path (directory and basename)
of the temporary file on disk. A FileBackend may decide to move
a file (using the OS move operations) if it has a non-empty
temporary_file_path.
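A minimal sketch of the UploadedFile methods listed above, wrapping any
seekable binary stream. The constructor signature and the 64 KB default
chunk size are assumptions made for the example.

```python
import io


class UploadedFile:
    """Hypothetical UploadedFile backed by a seekable binary stream."""

    DEFAULT_CHUNK_SIZE = 64 * 1024  # assumed default

    def __init__(self, name, stream, size):
        self._name = name
        self._stream = stream
        self._size = size

    def filename(self):
        return self._name

    def file_size(self):
        return self._size

    def open(self):
        # "Opening" an already-received upload just rewinds the stream.
        self._stream.seek(0)

    def read(self, num_bytes=None):
        if num_bytes is None:
            return self._stream.read()
        return self._stream.read(num_bytes)

    def close(self):
        self._stream.close()

    def multiple_chunks(self, chunk_size=None):
        return self._size > (chunk_size or self.DEFAULT_CHUNK_SIZE)

    def chunk(self, chunk_size=None):
        size = chunk_size or self.DEFAULT_CHUNK_SIZE
        self.open()
        while True:
            data = self.read(size)
            if not data:
                break
            yield data
```

For example, UploadedFile("x.bin", io.BytesIO(b"abcdefgh"), 8).chunk(3)
yields the file in pieces of at most three bytes.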
Notes: After talking with Marty online, it seems that this will work
with the FileBackends, provided that they know how to deal with it.
For instance, if the upload is going to a temporary file, and the
FileBackend is a standard file, then the FileBackend for standard
files should know from temporary_file_path() that it can move, and
perform an OS move operation. Otherwise, the FileBackends should use
the chunk() iterator and do whatever it needs in chunks.
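The decision a standard-file FileBackend would make might look
something like the following. save_upload is a hypothetical helper
name, and the "moved"/"copied" return values exist only to make the
two paths visible.

```python
import shutil


def save_upload(uploaded_file, destination):
    """Move the temporary file if there is one, else copy in chunks."""
    path_getter = getattr(uploaded_file, "temporary_file_path", None)
    temp_path = path_getter() if path_getter else None
    if temp_path:
        # The upload is already on disk: an OS-level move avoids a copy.
        shutil.move(temp_path, destination)
        return "moved"
    # Otherwise fall back to streaming the content chunk by chunk.
    with open(destination, "wb") as out:
        for piece in uploaded_file.chunk():
            out.write(piece)
    return "copied"
```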
It's interesting to note that with this framework a lot of interesting
possibilities open up. I will not write any of the code to do anything
but the basic temporary disk storage, but here are a few interesting
examples of what can happen:
- Gzipping data on the fly [GZipFileUploadHandler + GZipFileBackend].
- Saving file to another FTP Server on the fly
  [FTPFileUploadHandler + NoOpFileBackend].
- Having Cool Ajax-y file uploads
  [AjaxProgressUploadHandler + Any Backend].
- Having user-based quotas [QuotaUploadHandler + Any Backend].
...
Anyway, the list immediately above isn't what's important right now,
but I'd like to get the API close enough so we can break things once.
Cheers,
Michael Axiak