Syllable Segmentation with Regular Expressions

I have written a Python program to segment Cornish words to syllables. I have uploaded it to my Github repository. There are no doubt a few bugs to iron out. My main goal with it was as a preliminary to a program that will transliterate from Kernewek Kemmyn to SWF and back, but I had it suggested to me by Peter Jenkin it could be developed into a tool to analyse length of syllables and prosody in verse.

I have now added indication of stress, and syllable length (taking account of some abnormally stressed words), considering a short vowel to be 1 unit, a half-long vowel 2 units, a long vowel 3 units, a single consonant cluster 1 unit and a gemminated double consonant in a stressed syllable 2 units.

Example usage:
$ python sylabelenn_ranna_kw.py testenn.txt
An ger yw: delennow
Niver a syllabelennow yw: 3
Hag yns i:
['del', 'enn', 'ow']
S1: del, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: ENN, VC, hirder = [1, 2], hirder kowal = 3
S3: ow, V, hirder = [1], hirder kowal = 1
Hirder ger kowal = 7

An ger yw: roes
Niver a syllabelennow yw: 1
Hag yns i:
['roes']
S1: ROES, CVC, hirder = [1, 3, 1], hirder kowal = 5
Hirder ger kowal = 5

An ger yw: roesweyth
Niver a syllabelennow yw: 2
Hag yns i:
['roes', 'weyth']
S1: ROES, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: weyth, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 7

An ger yw: kesroesweyth
Niver a syllabelennow yw: 3
Hag yns i:
['kes', 'roes', 'weyth']
S1: kes, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: ROES, CVC, hirder = [1, 2, 1], hirder kowal = 4
S3: weyth, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 10

The text of Gwreans an Bys segmented by syllable: in a text file. The procedure used was to start at the end of a word and work backwards. This sometimes assigns a consonant to a final syllable where it may better be understood as part of the penultimate syllable. An alternative segmenation starting at the beginning of a word and working forwards is here.

GUI

Choose long mode, short mode, or line mode with the radio buttons on the left, and either forwards or backwards segmentation. In line mode, as shown, the total number of syllables in each line is counted.
In long mode, the syllable details are reported, including the actual division of the words into syllables, and the lengths of each syllable part.