Thoughts on GNUSpeech and possible accessibility applications
Hello,
I've been monitoring the archives for a while, but I thought it was time to subscribe to the list. I would be interested in contributing beta testing to the project, as well as any ideas or experience that might be of benefit.

As a computer user who happens to be blind, I have been using speech synthesis since the early 1980s. (These days, I rely more on braille devices, but speech still plays a significant role.) Although there are excellent free (as in freedom) screen readers and speech-based user interfaces available for GNU/Linux, such as Emacspeak (http://emacspeak.sourceforge.net/), SpeakUp (http://www.linux-speakup.org/) and Orca (http://www.gnome.org/projects/orca/), the quality of free text to speech systems is, in my judgment at least, somewhat inadequate. To be specific about this, I haven't heard any free software that even comes close to competing with the DECTalk synthesizer which is on my desk here. Moreover, the proprietary speech synthesis systems (available as software only rather than hardware) for GNU/Linux all incur licensing fees, and owing to the lack of access to source code, bugs can't be fixed by the developers of screen readers and speech-based user interfaces, or by users with programming skills.

A high-quality, free synthesizer could also be integrated by default into GNU/Linux distributions, and made available in devices that employ free and open-source software, for example mobile telephones (http://eyes-free.googlecode.com/ exemplifies the latter, and currently uses ESpeak as its synthesizer).

Is there interest among participants in the GNUSpeech project in its potential to support such applications? If so, the porting of the text to speech server to GNU/Linux would be a necessary prerequisite, but the development environment would also need to be available to enable the implementation of additional languages.
I am also interested in whether the possible accessibility applications of the project might help to attract development resources. I don't know any possible sources of funding or developers at present, but I would gladly participate in any such discussions.

Since the text to speech system doesn't run under GNU/Linux yet, I haven't been able to test it. However, the paper and sample files at
http://pages.cpsc.ucalgary.ca/~hill/papers/avios95/menu.htm
were very useful. My initial impression is that I find GNUSpeech difficult to understand, partly due to the mixture of British phonetics and North American pronunciation that leads, for example, to pronounced "r" and "l" sounds where they occur in American, but not British English. However, I like the rhythm and intonation, which I know from having read the papers are the subject of substantial research. I don't understand speech synthesis sufficiently to know whether the quality of the speech could be easily improved by fine-tuning the dictionaries and databases, making use of the part of speech information, etc.

For the accessibility applications mentioned above, there are other requirements that would need to be satisfied, and again I would be pleased to contribute to such discussions in the event that they become relevant to the project. I am also aware that I am by no means alone in desiring a high-quality text to speech system suitable for such applications in a Linux environment, available as free software.

_______________________________________________
gnuspeech-contact mailing list
gnuspeech-contact@...
http://lists.gnu.org/mailman/listinfo/gnuspeech-contact
Re: Thoughts on GNUSpeech and possible accessibility applications
Hi Jason,
On Mon, Apr 6, 2009 at 7:03 AM, Jason White <jason@...> wrote:
> Since the text to speech system doesn't run under GNU/Linux yet, I haven't
> been able to test it. However, the paper and sample files at
> http://pages.cpsc.ucalgary.ca/~hill/papers/avios95/menu.htm
> were very useful.

GnuSpeech runs on Linux, but you need to install GNUstep. The source is available at:
svn://svn.sv.gnu.org/gnuspeech/gnustep/trunk

I may create a source package and send it to you. Which distribution do you use?

Regards,
Marcelo
Re: Thoughts on GNUSpeech and possible accessibility applications
Marcelo Yassunori Matuda <marcelo.matuda@...> wrote:
> GnuSpeech runs on Linux, but you need to install GNUstep. The source
> is available at:
> svn://svn.sv.gnu.org/gnuspeech/gnustep/trunk

Thanks. I'll check that out.

Has the text to speech daemon been ported yet? The difficulty is that I wouldn't be able to operate the interactive (graphical) environment using either braille or speech, so what I really need is access from the shell.

> I may create a source package and send it to you.
> Which distribution do you use?

Debian Sid, with reasonably regular upgrades.

By way of background, I know C from having read K&R some years ago, but not Objective-C. I don't have any background in complex analysis or digital signal processing, but would like to study more mathematics at some point in the future just for fun. Currently I am writing a Ph.D. thesis in contemporary philosophy of language (i.e., foundational issues in analytic semantics). I studied some phonetics as an undergraduate back in the early 90s, which may help in these discussions, depending on how much I can remember.
Re: Thoughts on GNUSpeech and possible accessibility applications
Hi Jason,
On Mon, Apr 6, 2009 at 8:58 PM, Jason White <jason@...> wrote:
> Has the text to speech daemon been ported yet? The difficulty is that I
> wouldn't be able to operate the interactive (graphical) environment using
> either braille or speech, so what I really need is access from the shell.

No, but there is a command line tool that synthesizes speech.

I have tested in Debian Lenny the following procedure:

(I recommend that you create another user, e.g. gnuspeech, and follow the procedure inside /home/gnuspeech.)

Run as root:

    apt-get install gnustep gnustep-devel
    apt-get install portaudio19-dev
    apt-get install libgdbm-dev

The remaining steps may be run as the user gnuspeech. The files will be installed in ~/GNUstep and ~/Library.

    mkdir temp
    cd temp
    svn co svn://svn.sv.gnu.org/gnuspeech/gnustep/trunk
    cd trunk

Edit the file Frameworks/GnuSpeech/TextProcessing/GSDBMPronunciationDictionary.h and replace:

    #include <ndbm.h>

with:

    #include <gdbm-ndbm.h>

Then load the GNUstep environment and build:

    . /usr/share/GNUstep/Makefiles/GNUstep.sh
    ./install.sh

If you receive an "Ok", the installation has been successful.

    cd Tools/GnuSpeechCLI/
    ./gnuspeechcli.sh hello world

Monet, the GUI editor, is not working in this Debian release, but the command line utility is ok.

p.s. Why are you running Sid and not "testing" or "stable"?

Regards,
Marcelo
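For anyone who prefers not to open an editor, the header edit in the procedure above could be scripted. The following is only a minimal sketch, assuming GNU sed (for the -i option) and assuming it is run from the trunk directory; the function name patch_ndbm_include is made up for illustration and is not part of GnuSpeech.

```shell
#!/bin/sh
# Sketch only: swap the ndbm include for the gdbm compatibility header.
# patch_ndbm_include is a hypothetical helper name, not a GnuSpeech tool.
# Assumes GNU sed, whose -i option edits the file in place.

patch_ndbm_include() {
    # $1: path to the header file to patch
    sed -i 's|#include <ndbm\.h>|#include <gdbm-ndbm.h>|' "$1"
}

# Path taken from the procedure above, relative to the trunk directory.
HEADER=Frameworks/GnuSpeech/TextProcessing/GSDBMPronunciationDictionary.h

# Patch only if the file is present, so the script is harmless elsewhere.
if [ -f "$HEADER" ]; then
    patch_ndbm_include "$HEADER"
fi
```

Running it a second time changes nothing, because the pattern no longer matches once the include has been replaced.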
Re: Re: Thoughts on GNUSpeech and possible accessibility applications
Dalmazio Brisinda <dbrisinda@...> wrote:
> The text to speech daemon has been partially ported to OS X. By
> "partially" I mean given text it generates synthesized output, but there
> is currently no intonation (something I intend to remedy shortly) or
> other more sophisticated features. It's also accessible via OS X
> services.

The ability to silence speech very quickly and start speaking a new message is a particularly important additional feature that would be required for accessibility applications. Also desirable is the capacity to track which word is currently being spoken in text already submitted by the application. Such matters as the handling of punctuation characters (whether to announce them or simply process them as influences upon the pausing and intonation) would need to be controllable via the API, as would speech rate and any other tunable parameters.

I mention these factors not as an attempt to influence development priorities (which are of course entirely at the discretion of whoever is doing the work), but as a synopsis of the kinds of API features that would be needed.

> I'm not sure about the port to Linux though, as the text to speech
> daemon uses the OS X distributed objects architecture, and I don't know
> to what degree this same architecture is supported on the GNUstep/Linux
> platform. Needless to say this elegant IPC/RPC architecture makes writing
> servers/daemons quite straightforward on OS X.

http://www.gnustep.org/resources/documentation/Developer/Base/ProgrammingManual/manual_7.html

This appears promising.
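To make the first requirement above concrete (silencing speech quickly and starting a new message), here is a rough shell sketch of the behaviour as seen from a client. Everything in it is an assumption for illustration: SPEAK_CMD, say and stop_speech are hypothetical names, and the sketch simply treats the synthesizer as a command that takes its text as arguments. Whether killing a wrapper script such as gnuspeechcli.sh actually stops audio promptly has not been verified; a real daemon would need an interruption call in its own API.

```shell
#!/bin/sh
# Sketch only: "interrupt the current utterance, then speak the new one".
# SPEAK_CMD, say and stop_speech are hypothetical names for illustration.
# SPEAK_CMD is assumed to be a command that speaks its arguments and exits.
SPEAK_CMD=${SPEAK_CMD:-./gnuspeechcli.sh}
SPEAK_PID=""

stop_speech() {
    # Terminate the utterance still in progress, if there is one.
    if [ -n "$SPEAK_PID" ] && kill -0 "$SPEAK_PID" 2>/dev/null; then
        kill "$SPEAK_PID" 2>/dev/null || true
        # Reap the process so that it does not linger as a zombie.
        wait "$SPEAK_PID" 2>/dev/null || true
    fi
    SPEAK_PID=""
}

say() {
    # A new message always pre-empts the old one.
    stop_speech
    # SPEAK_CMD is left unquoted on purpose so it may carry its own options.
    $SPEAK_CMD "$@" &
    SPEAK_PID=$!
}
```

A screen reader would call say each time the focus moves, so that stale speech never delays the new message. Word tracking, punctuation handling and rate control cannot be sketched this way, since they need cooperation from the synthesizer itself.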
Re: Re: Thoughts on GNUSpeech and possible accessibility applications
Hi Jason,
On Apr 7, 2009, at 5:59 PM, Jason White wrote:
You may be interested to check out the paper that describes the "Touch 'n Talk" system that Dalmazio mentioned in an earlier email. The direct link is:

http://pages.cpsc.ucalgary.ca/~hill/papers/ieee-touch-n-talk-1988.pdf

it is an item on my university web site to which you can also navigate.

I wish it were all up and working right now. It was very disappointing to lose it, along with all the other stuff we had developed, when NeXT went belly up, but getting it up again is one of the goals towards which we are headed. It would meet (and hopefully exceed) the requirements you have started to outline.

[snip]

All good wishes.

david

David Hill
--------
The only function of economic forecasting is to make astrology look respectable. (J.K. Galbraith)
--------
Re: Re: Thoughts on GNUSpeech and possible accessibility applications
David Hill <drh@...> wrote:
> You may be interested to check out the paper that describes the "Touch 'n
> Talk" system that Dalmazio mentioned in an earlier email. The direct
> link is:
>
> http://pages.cpsc.ucalgary.ca/~hill/papers/ieee-touch-n-talk-1988.pdf
>
> it is an item on my university web site to which you can also navigate.

Thank you, David, for the reference. This is an interesting paper. It also reminds me of a related solution, developed at approximately the same time, by Jim Thatcher for IBM Screen Reader, in which a separate key pad was used for reading and navigational functions. Although I never had an opportunity to use it, I recall that one of the principal difficulties in the early versions was said to be that the system wouldn't automatically read new text presented on screen, or read text in response to cursor movement - the users had to switch frequently between the qwerty keyboard and the screen reader's key pad while interacting with the application software.

The research in which I, personally, find the most insight is that by T.V. Raman, first in his AsTeR software (Audio System for Technical Readings: http://www.cs.cornell.edu/home/raman/) and then in Emacspeak (http://emacspeak.sourceforge.net/ and for the latest source code, http://emacspeak.googlecode.com/).

Emacspeak works best with synthesizers that allow changes to be made dynamically to voice characteristics, for example the DECTalk, and it would be interesting to know whether GNUSpeech might eventually support such audio formatting techniques.

Further, in his latest work at Google on the accessibility of mobile telephones, Raman has devised a means of making touch screen input achievable in an "eyes-free" context.

I trust that this digression into speech interface research is not unwelcome on the list; to ensure that it remains on topic, I have sought to connect it to functional requirements of a text to speech system.
I think there is a need for a free (as in freedom) tts system capable of supporting the products of past and current speech interface research, while, just as importantly, providing opportunities for future research and free software development efforts.

I also agree with David's observation that many of the most important requirements are already treated in his 1988 paper, although advances such as Raman's "audio formatting" techniques create additional, desirable features.
Re: Thoughts on GNUSpeech and possible accessibility applications
Marcelo Yassunori Matuda <marcelo.matuda@...> wrote:
> I have tested in Debian Lenny the following procedure:
[snip]

Thank you for documenting the compilation and installation procedure, which I was able to follow successfully.

> p.s. Why are you running Sid and not "testing" or "stable"?

I am trying to keep up to date with Gnome and related packages so as to use, and test, Orca. I have found Sid to be generally reliable, with occasional, short-lived problems, but perhaps I should really be using Testing instead. I usually just downgrade the occasional troublesome package. If Sid starts living up to its name more often, I'll switch to Testing.
Re: Re: Thoughts on GNUSpeech and possible accessibility applications
Hi Jason,
On Apr 7, 2009, at 9:06 PM, Jason White wrote:
> David Hill <drh@...> wrote:
>
>> You may be interested to check out the paper that describes the "Touch 'n
>> Talk" system that Dalmazio mentioned in an earlier email. The direct
>> link is:
>>
>> http://pages.cpsc.ucalgary.ca/~hill/papers/ieee-touch-n-talk-1988.pdf
>>
>> it is an item on my university web site to which you can also navigate.
>
> Thank you, David, for the reference. This is an interesting paper. It also
> reminds me of a related solution, developed at approximately the same time, by
> Jim Thatcher for IBM Screen Reader, in which a separate key pad was used for
> reading and navigational functions. Although I never had an opportunity to use
> it, I recall that one of the principal difficulties in the early versions was
> said to be that the system wouldn't automatically read new text presented on
> screen, or read text in response to cursor movement - the users had to switch
> frequently between the qwerty keyboard and the screen reader's key pad while
> interacting with the application software.

This was not a limitation with Touch-'n-Talk, which was designed to integrate control and access within a single haptic-auditory interface in as natural a way as possible. We made a direct comparison between our system and a conventional key-operated talking terminal. Our target population of blind users preferred the "Touch-'n-Talk" system, and the results for all users (there were five blind subjects and twelve sighted but blindfolded subjects) were comparable, which was a useful result since it implies that tests using blindfolded subjects can be used, at least in exploratory evaluations.

> The research in which I, personally, find the most insight is that by T.V.
> Raman, first in his AsTeR software (Audio System for Technical Readings:
> http://www.cs.cornell.edu/home/raman/) and then in Emacspeak
> (http://emacspeak.sourceforge.net/ and for the latest source code,
> http://emacspeak.googlecode.com/).

I checked out some of Dr. Raman's examples -- apparently using DECTalk. Audio formatting has the advantage that no special equipment is required. But imagine if you could access those mathematical formulae by using your finger, and were able to feel the spatial relationships between the components with your hand and fingers, checking individual characters and manipulating a mark as well as a cursor. "Touch-'n-Talk" was deliberately designed to make spatial cues available without the need for sight, and to allow bookmarks, words, paragraphs, pages, spelling, searching and so on to be handled easily.

> Emacspeak works best with synthesizers that allow changes to be made
> dynamically to voice characteristics, for example the DECTalk, and it would be
> interesting to know whether GNUSpeech might eventually support such audio
> formatting techniques.

Having an articulatory synthesiser means that many different voices can be created dynamically, from child voices through female, to male. Having said that, not all voice characteristics are well understood, and not just for excellent female voices.

> Further, in his latest work at Google on the accessibility of mobile
> telephones, Raman has devised a means of making touch screen input achievable
> in an "eyes-free" context.

Being not just "eyes-free" but providing equivalent facilities using touch and sound were basic design criteria for "Touch-'n-Talk", as you obviously realise from reading the paper.

> I trust that this digression into speech interface research is not unwelcome
> on the list; to ensure that it remains on topic, I have sought to connect it
> to functional requirements of a text to speech system.
>
> I think there is a need for a free (as in freedom) tts system capable of
> supporting the products of past and current speech interface research, while,
> just as importantly, providing opportunities for future research and free
> software development efforts.
>
> I also agree with David's observation that many of the most important
> requirements are already treated in his 1988 paper, although advances such as
> Raman's "audio formatting" techniques create additional, desirable features.

I am not sure what was missing compared to Raman's approach. If you have spatial references through touch, the changing pitch is more of a distraction than a help, IMHO.

Warm regards.

david