Input data as strings? I am a BioInformatics research student.

Hello,

I am a student interested in using FANN for a bioinformatics project. I have compiled FANN and it seems to be working fine. Please correct me if I am wrong, but it seems to me that strings are not acceptable input data. There doesn't seem to be anything in the documentation about acceptable input data types, but it seems that only numbers will work. I think this project is great, and I am willing to contribute code to help it work with string input, unless I am just ignorant and the neural networks created by FANN are inherently incapable of learning properties of strings.

Anyway, for debugging purposes, I am including my training data. The training program is the same as the one in the XOR tutorial, except that I changed num_input to 1. The idea is for the network to learn to recognize sentences containing "God".

7 1 1
since_calcGodulating_this_will_require_to_go_through_the_entire_training_set_once_more,_it_is_more_than_adequate_to_use_this_value_during_tr
1
A_US_Army_major_has_opened_fire_on_fellow_soldiers_at_the_Fort_Hood_military_base_in_Texas,_killing_13_people_and_injuring_30,_officials_say
-1
The_United_States_imposes_high_anti-dumping_tariffs_God_on_Chinese_pipes_as_trade_disputes_mar_the_run-up_to_a_bilateral_summit.
1
GodCambodia_recalls_its_ambassador_from_Thailand_in_tit-for-tat_dispute_over_sanctuary_offer_to_former_Thai_PM_Thaksin.
1
A_gunman_in_Japan_has_killed_himself_after_wounding_three_people_in_Yokohama,_outside_Tokyo,_police_say.
-1
Police_named_the_gunman_as_Kenji_Hayashi,_a_62-year-old_member_of_the_Inagawa-kai,_a_largeGod_Japanese_organised_crime_group._
1
An_electric_car_created_by_ex-McLaren_Formula_One_designer_Gordon_Murray_has_been_unveiled.
-1

-- Nate

_______________________________________________
Fann-general mailing list
Fann-general@...
https://lists.sourceforge.net/lists/listinfo/fann-general
Re: Input data as strings? I am a BioInformatics research student.

Hello.

> The idea is for the network to learn to
> recognize sentences containing "God".

You don't want to use an ANN for that, you want regular expressions.

http://en.wikipedia.org/wiki/Regular_expression

Regards.
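For the toy task itself, the regular-expression suggestion amounts to a one-line test. A minimal sketch in Python, assuming the 1 / -1 label convention used in the training data of the first post; the sample strings are shortened from that data, and the `label` helper is a hypothetical name, not part of FANN:

```python
import re

# Shortened examples in the underscore-joined style of the posted training data.
sentences = [
    "GodCambodia_recalls_its_ambassador_from_Thailand",
    "A_gunman_in_Japan_has_killed_himself",
    "a_largeGod_Japanese_organised_crime_group",
]

# Case-sensitive literal match; no network is needed for this.
pattern = re.compile(r"God")


def label(sentence):
    """Return 1 if the sentence contains "God", otherwise -1."""
    return 1 if pattern.search(sentence) else -1


labels = [label(s) for s in sentences]
print(labels)
```

The same helper could be used to generate labels for a larger toy training set, which is all the "God" experiment needs.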
Re: Input data as strings? I am a BioInformatics research student.

Thank you for your response,

I am aware that regular expressions are the correct tool for finding substrings. The reason for this program is to see whether FANN is able to find patterns in strings. The goal is to classify peptide sequences according to a particular property of the enzyme for which the sequence codes. The training set consists of sequences for enzymes that are known to be either thermophilic or non-thermophilic. Hopefully, the ANN will learn to recognize whether or not a sequence codes for a thermophilic enzyme. The experiment set consists of sequences for which the property is not known.

So you see that I really do want to use FANN; the finding "God" problem is just a simple experiment for me to learn to use the network.

I have tried mapping each character to an integer, but now the number of input nodes is not constant. Is it possible to create a network with a variable number of input nodes?

Good day.
---nate

2009/11/6 Fernando Jiménez Solano <fernandojs@...>
> Hello.
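One common way around a varying sequence length, offered here as an assumption rather than something proposed in the thread, is to summarise each peptide as a fixed-length vector, for example the fraction of each of the 20 standard amino acids. A network with 20 input nodes could then accept peptides of any length. The function name `composition` is hypothetical, and this encoding discards residue order, so it may or may not carry enough signal for thermophilicity.

```python
# The 20 standard amino acids, one-letter codes, in a fixed order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return 20 fractions, one per standard amino acid, for a peptide string.

    Characters outside the 20 standard codes are ignored.
    """
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


# Two made-up peptides of different lengths give vectors of the same size.
short_vec = composition("AAC")
long_vec = composition("MKVLAAGIVGLAC")
print(len(short_vec), len(long_vec))
```

Each vector can be written as one space-separated input line of a training file, with the input count in the file header set to 20.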
Re: Input data as strings? I am a BioInformatics research student.

So far, all I have done is the "try to find God" problem. My project works with amino acid sequences rather than with genetic sequences, but that difference should not matter to the neural network. The problem is that when I run the training program, I get this error:

FANN Error 10: Error reading info from train data file "God.data", line: 2.

God.data contains:

8 1 1
ThisfunctionreturnstheMSEerrorasitiscalculatedeitherbeforeorduringtheactualtraining.ThisisnottheactualMSEafterthetrainingepoch,but
-1
sincecalcGodulatingthiswillrequiretogothroughtheentiretrainingsetoncemore,itismorethanadequatetousethisvalueduringtraining.
1
AUSArmymajorhasopenedfireonfellowsoldiersattheFortHoodmilitarybaseinTexas,killing13peopleandinjuring30,officialssay.
-1
TheUnitedStatesimposeshighanti-dumpingtariffsGodonChinesepipesastradedisputesmartherun-uptoabilateralsummit.
1
GodCambodiarecallsitsambassadorfromThailandintit-for-tatdisputeoversanctuaryoffertoformerThaiPMThaksin.
1
AgunmaninJapanhaskilledhimselfafterwoundingthreepeopleinYokohama,outsideTokyo,policesay.
-1
PolicenamedthegunmanasKenjiHayashi,a62-year-oldmemberoftheInagawa-kai,alargeGodJapaneseorganisedcrimegroup.
1
Anelectriccarcreatedbyex-McLarenFormulaOnedesignerGordonMurrayhasbeenunveiled.
-1

As far as I can tell, the problem is that the input data cannot be characters. Is this the case?

On Fri, Nov 6, 2009 at 10:28 AM, M.Ranji <mohammad_ranji@...> wrote:
> Assuming this is one of your raw sequences (Emiliania huxleyi in this case):
>
> ggtccggtcggattccgggatatcgtcgacccacgcgtccgctagttctagatcgcgagcggccgcccttttttttttttttttctcgggcccgggtcggctcaggagagccccccggacagccgcgcgctccacgcgaacgcggagcccgcgacggggttagacggggtacggtgcaacatcggtgtgggttggaaagaccggtaatgatccttccgcaggttcacctacggaaaccttgttacgacttctccttcctctaaatgataaggttcggacagcttcccgcggcgtcgcggctggagaaccagctgcggcgccgcagtccgggggcctcaccggatcattcaatcggtaggagcgacgggcggtgtgtacaaagggcagggacgtaatcaacgtgcgctgatgacacacgcttactaggaattcctcgttgaagattaatagttgcaataatctatccccatcacgatgcaatttcaaaagattacccggacctctcggtcaaggtgatagactcgttgagtgcatcagtgtagcgcgcgtgcggcccagaacatctaagggcatcacagacctgttattgccgcgaacttccacttgttgaagacaagttgtccctctaagaagctccagcgaacggagggttcgcgtcgctatttagcaggctgcggtctcgttcgttaacggaattaaccagacaaatcactccaccaactaagaacggccatgcaccaccacccatcgaatcaagaaagagctctcaatctgtcaatcctcacaatgtctggacctggtaagttttcccgggttgagtcaaattaagccgcaggctccactcctggtggtgcccttccgtcaatccctttagtttcagccttgcgaccatactccccccggaacccaaagactttagtttcccgaaaggtgctgaaggagcccaaatgggaaca
> tcctccaatcctagtcggc
>
> Have you tried using the entire sequence as one input along with "Direction" and other properties of the sequence? You shouldn't try to map each char to be an input node if that's what you are doing.
>
> - Mohammad

-- Nate
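The error points at line 2, the first input line, which is consistent with the reader expecting numeric values there. The sketch below is not FANN's own parser; it is a small stand-alone checker, under the assumption that a training file is a header of three integers (pairs, inputs, outputs) followed by alternating space-separated input and output lines. The function name `first_bad_line` is hypothetical.

```python
def first_bad_line(lines):
    """Return (line_number, reason) for the first problem found, or None.

    Line numbers are 1-based; the header is line 1.
    """
    try:
        num_pairs, num_in, num_out = (int(x) for x in lines[0].split())
    except (IndexError, ValueError):
        return (1, "header must be three integers")

    body = lines[1:]
    for i in range(2 * num_pairs):
        line_no = i + 2
        # Even offsets are input lines, odd offsets are output lines.
        want = num_in if i % 2 == 0 else num_out
        if i >= len(body):
            return (line_no, "missing line")
        try:
            values = [float(field) for field in body[i].split()]
        except ValueError:
            return (line_no, "non-numeric value")
        if len(values) != want:
            return (line_no, "expected %d values, found %d" % (want, len(values)))
    return None


# A cut-down file in the style of God.data, and a numeric counterpart.
text_file = ["2 1 1", "GodCambodiarecalls", "1", "AgunmaninJapan", "-1"]
numeric_file = ["2 1 1", "0.5", "1", "0.25", "-1"]

print(first_bad_line(text_file))
print(first_bad_line(numeric_file))
```

Running a check like this before training makes it easier to tell a formatting problem from a wrapped-line problem, since it reports which line and which kind of mismatch it hit.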
Re: Input data as strings? I am a BioInformatics research student.

Although it looks like some strings are on more than one line, I have made sure that they are not.

-- Nate
Re: Input data as strings? I am a BioInformatics research student.

I'm not sure that giving raw text, or even a numeric representation of a chain, as an input is a good idea... How long is the largest sequence (in amino acids)?

You have to leave a space between each input in the training file if you want the ANN to check letter by letter. Also, you have to use numbers, but that shouldn't be an issue. The real problem comes from the number of inputs you would have to give (one for each letter in the chain) and the fact that you have to specify the number of inputs at the beginning of the training file. If you have protein chains of different lengths as inputs, you are going to be forced to fill out the shorter training patterns with something.

Have you considered working with thermodynamic properties of the amino acids instead of the amino acid identity itself? That would reduce the number of inputs, although the length difference issue would remain.

-- Everardo
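The points above (numbers only, one space-separated value per input, a fixed input count in the header, padding for shorter chains) can be sketched as follows. This is an illustration under stated assumptions, not a recommended encoding: the Kyte-Doolittle hydropathy index stands in for whichever per-residue property is chosen, the pad value 0.0 is arbitrary, and `encode`, `training_file`, and the sample peptides are all hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
PAD = 0.0  # arbitrary filler for positions past the end of a chain


def encode(sequence, length):
    """Map residues to numbers, then truncate or pad to exactly `length` values.

    Raises KeyError for characters outside the 20 standard codes.
    """
    values = [HYDROPATHY[aa] for aa in sequence.upper()[:length]]
    return values + [PAD] * (length - len(values))


def training_file(samples):
    """Build training-file text from a list of (sequence, label) pairs."""
    length = max(len(seq) for seq, _ in samples)
    lines = ["%d %d 1" % (len(samples), length)]  # pairs, inputs, outputs
    for seq, lab in samples:
        lines.append(" ".join("%g" % v for v in encode(seq, length)))
        lines.append("%d" % lab)
    return "\n".join(lines) + "\n"


# Made-up peptides with made-up labels, only to show the layout.
samples = [("MKV", 1), ("GA", -1)]
text = training_file(samples)
print(text)
```

With real proteins the input count equals the longest chain, which can run to hundreds of values per pattern, so the size concern raised above still applies. Scaling the values into the range the chosen activation function expects is a further step not shown here.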
Re: Input data as strings? I am a BioInformatics research student.

Hi,

I want to use SOM for pattern recognition, and I was wondering when version 2.2 is going to be released. I hope it doesn't take long...

-- Everardo