Thanks to the qbiblefree app (yes, I still use my N900 - and I'm not changing it till something better - read: more hackable - comes along), I found out about the Zefania XML format. So now I kinda know what format the Bible would be in if I were ever to consider starting the project. The next question - where to get actual Yoruba Bible data from - was kinda solved with a quick Google search. The good people at http://www.africanportal.net/ABO/BibeliAtoka/ have uploaded PDF files containing the books of the Bible. Yes, there's still the problem of breaking them down into chapters and perhaps verses. Sejda has been a little helpful there - the books can be split into pages. It's not a one-to-one mapping with the chapters, but it's a start. I figure I can use the pages as they are for now, with a little metadata stored in a NoSQL database or just flat files.
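To make the flat-file idea a bit more concrete, here is a minimal Python sketch of what the page metadata could look like. Everything in it is an assumption for illustration: the field names, the file names and the chapter ranges are placeholders, not real data from the PDFs.

```python
import json

# Placeholder records: file names and chapter ranges are made up for illustration.
# A page can hold parts of more than one chapter, so "chapters" is a list.
PAGES = [
    {"book": "Genesis", "page": 1, "file": "genesis-p001.pdf", "chapters": [1, 2]},
    {"book": "Genesis", "page": 2, "file": "genesis-p002.pdf", "chapters": [2, 3]},
]


def pages_for_chapter(pages, book, chapter):
    """Return the page files that contain any part of the given chapter."""
    return [
        p["file"]
        for p in pages
        if p["book"] == book and chapter in p["chapters"]
    ]


def save_metadata(pages, path):
    """Write the metadata to a flat JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(pages, fh, ensure_ascii=False, indent=2)


def load_metadata(path):
    """Read the metadata back from the flat JSON file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Since pages and chapters don't line up one-to-one, a lookup for one chapter may return several page files.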
Then there's the really crazy part: how to type all the PDF text through a web interface. My JavaScript is presentable at best. Or put another way, I program better in Scala than in JavaScript - and you won't want to gamble your life on the quality of my Scala code. Anyway, today something cool happened. I found this site: http://www.jawish.org/blog/archives/314-Javascript-Thaana-Keyboard-version-3.0.html. It's basically the same problem I'm trying to solve, but for a different language. So I've spent the last two hours adding little modifications and I think it works... almost.
In fact, I realize now that there's very little excuse for me not to go ahead with this project - that's the scary part. I even found an example of how the keyboard interface can be designed: http://www.branah.com/dhivehi. But maybe it's time I started it anyway. I know it's time to sleep - lightbulb!
I'll upload a downloadable script with tutorials and all that. For now - use the non-Yoruba letters (q, z, c, x, and v) on the keyboard for the non-English letters (Ọ, Ṣ, Ẹ) in the Yoruba alphabet. The x and v are for the acute and grave marks (ami) - type them before typing the Yoruba letters that require ami - hope it works :)
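The key scheme above can be sketched roughly as follows. This is written in Python purely to show the logic; it is not the JavaScript keyboard script itself. The exact assignments are assumptions: the post does not say which of q, z, c produces which letter, or which of x and v is the acute and which the grave, so the sketch guesses q = Ọ, z = Ṣ, c = Ẹ, x = acute, v = grave.

```python
import unicodedata

# Assumed layout (not confirmed by the post): q -> ọ, z -> ṣ, c -> ẹ.
LETTER_KEYS = {"q": "\u1ecd", "z": "\u1e63", "c": "\u1eb9"}

# Assumed: x = acute (U+0301), v = grave (U+0300), as combining marks.
TONE_KEYS = {"x": "\u0301", "v": "\u0300"}


def transliterate(typed: str) -> str:
    """Convert plain keystrokes to Yoruba text using the assumed layout."""
    out = []
    pending = ""  # tone mark typed before the letter it belongs to
    for ch in typed:
        lower = ch.lower()
        if lower in TONE_KEYS:
            # Remember the mark; it is attached to the next letter.
            pending = TONE_KEYS[lower]
            continue
        if lower in LETTER_KEYS:
            mapped = LETTER_KEYS[lower]
            # Preserve the case of the key that was pressed.
            ch = mapped.upper() if ch.isupper() else mapped
        if pending and ch.isalpha():
            ch += pending
            pending = ""
        out.append(ch)
    # NFC merges letter + mark into one code point where Unicode has one.
    return unicodedata.normalize("NFC", "".join(out))


print(transliterate("vckxq"))  # with the assumed layout: ẹ̀kọ́
```

The tone keys are held in `pending` rather than emitted right away because the mark is typed before its letter, while a Unicode combining mark has to come after its base character.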