Comments on the problems I found while writing this:

- Festival is pretty cool. You can even try different voices for different singers, though I couldn't figure out how to do this yet.
- You need the latest version of Festival for this. The command to run is:
    text2wave -mode singing -o file.wav song.xml
- I haven't found any way to make silences, so for now I had to write a lot of XMLs.
- If your phrase starts with a vowel, Festival screws up! Try adding a word in front that doesn't start with a vowel, then cut it out of the resulting .wav.
- Sometimes Festival just can't pronounce a word at all. It is best to either split the word into smaller syllables or find an alternative word that sounds better and is pronounced the way you want.
- Technically, for each word you write the duration and note for each syllable, but sometimes Festival thinks a word has more syllables than it does. Just split the duration in two and repeat the note; it will sound fine.
- The output is often clicky and noisy, but you can get good results if you use a multiband EQ to filter out the unwanted frequencies.
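
Two of the workarounds above (repeating a note when Festival hears an extra syllable, and writing many XMLs to get silences) can be scripted. What follows is only a rough Python sketch, not the way this song was actually made. The helper names `split_note`, `build_song_xml` and `join_with_silence` are made up for illustration. The XML layout is assumed from the `doremi.xml` example that ships with Festival, and the comma-separated per-syllable values are an assumption to check against your Festival version. The silence part assumes text2wave writes 16-bit PCM, where zero bytes are silent.

```python
import wave
from xml.sax.saxutils import escape


def split_note(note, beats, syllables=2):
    """Repeat one note over extra syllables, dividing its duration evenly."""
    return [note] * syllables, [beats / syllables] * syllables


def build_song_xml(words, bpm=60):
    """Build a singing-mode XML string.

    words is a list of (text, notes, beats) with one note and one beat
    count per syllable. Element names follow Festival's doremi.xml example
    (assumption); adjust if your version expects something else.
    """
    lines = [
        '<?xml version="1.0"?>',
        '<!DOCTYPE SINGING PUBLIC "-//SINGING//DTD SINGING mark up//EN" '
        '"Singing.v0_1.dtd" []>',
        f'<SINGING BPM="{bpm}">',
    ]
    for text, notes, beats in words:
        if len(notes) != len(beats):
            raise ValueError(f"notes and beats differ in length for {text!r}")
        note_attr = ",".join(notes)
        beat_attr = ",".join(f"{b:g}" for b in beats)
        lines.append(
            f'<PITCH NOTE="{note_attr}"><DURATION BEATS="{beat_attr}">'
            f"{escape(text)}</DURATION></PITCH>"
        )
    lines.append("</SINGING>")
    return "\n".join(lines) + "\n"


def join_with_silence(paths, out_path, gap_seconds=0.5):
    """Concatenate WAV files with gap_seconds of silence between them."""
    with wave.open(paths[0], "rb") as first:
        params = first.getparams()
    fmt = (params.nchannels, params.sampwidth, params.framerate)
    # Zero bytes are silence for 16-bit signed PCM (assumed output format).
    gap_frames = int(gap_seconds * params.framerate)
    gap = b"\x00" * (gap_frames * params.sampwidth * params.nchannels)
    with wave.open(out_path, "wb") as out:
        out.setparams(params)
        for index, path in enumerate(paths):
            with wave.open(path, "rb") as part:
                part_fmt = (part.getnchannels(), part.getsampwidth(),
                            part.getframerate())
                if part_fmt != fmt:
                    raise ValueError(f"{path} has a different audio format")
                if index > 0:
                    out.writeframes(gap)
                out.writeframes(part.readframes(part.getnframes()))


if __name__ == "__main__":
    # Festival hears two syllables here, so one half-beat C4 becomes two
    # quarter-beat C4s.
    notes, beats = split_note("C4", 0.5)
    print(build_song_xml([("fire", notes, beats), ("la", ["E4"], [1.0])]))
```

Each phrase would still be rendered on its own with the text2wave command above, and `join_with_silence(["part1.wav", "part2.wav"], "song.wav")` would then glue the results together with a pause in between (the file names are placeholders).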