My blog, imported from Blogger and converted using Jekyll.

Pennseythun Gernewek 2017

Jun 20, 2017

Back in April I attended the Pennseythun Gernewek, the annual Cornish language weekend run by Kowethas an Yeth Kernewek, where Cornish speakers from around Cornwall and beyond gather to speak the language and take part in lessons and activities. There is something for every level of previous knowledge of the language, from Dallethoryon Lan (complete beginners) to Bagas Freth (the fluent speakers' group). I definitely recommend it.
Two of the things that Bagas Freth did this year were a talk by Ken George on translating song lyrics into Cornish, and a workshop on composing part of the lost section of Bewnans Ke (events taking place during the part of the manuscript that is missing, before the portion that was preserved).

Bewnans Ke an Blydhynnyow Kellys

We divided into groups and composed several stanzas covering Ke in his time in Brittany before his coming to Cornwall. I partly composed this:

7 My a'th pys Duw pyth yw res   A
4 dhymm gul ragos               B
7 A wre'ta ow dannvonn mes      A
4 A'n benneglos                 B
4 Ny wrav govynn                C
7 bywnans fethus yn palas       D
7 Argheskop gans boes a vlas    D
7 gwell yw genev rewlys tynn    C

7 Vyajya a wre'ta'n skon        A
4 Bys Rosewa                    B
7 War an mor y glywydh son      A
4 Klogh y'n le na               B
4 Vydh ow seni                  C
7 Grasek ov bos yn ayr fresk    D
7 Y'n le skrifa orth ow desk    D
7 Dhe Gernow mos a wren ni      C

The numbers at the left give the number of syllables in each line, and the letters at the right mark the rhymes.
There are two stanza forms used in Bewnans Ke that we followed in composing these stanzas. The stanzas above have a structure of 7, 4, 7, 4, 4, 7, 7, 7 syllables per line. The other possible structure is 7, 7, 7, 7, 4, 7, 7, 7. With either structure, either one person speaks all 8 lines, or one person speaks the first 5 lines and a second person the last 3. In the first of the stanzas above Ke speaks throughout; in the second, God replies for 5 lines and Ke speaks the final 3.
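
The two stanza forms and the two ways of sharing the lines between speakers can be written down as a small check. This is only a sketch: the names STANZA_FORMS, SPEAKER_SPLITS and is_valid_stanza are made up for illustration, and the syllable counts are supplied by hand rather than computed.

```python
# The two stanza forms described above, as syllable counts per line
STANZA_FORMS = {
    (7, 4, 7, 4, 4, 7, 7, 7),
    (7, 7, 7, 7, 4, 7, 7, 7),
}
# Allowed ways of sharing the 8 lines between speakers
SPEAKER_SPLITS = {(8,), (5, 3)}


def is_valid_stanza(syllables, speaker_lines):
    """Check hand-counted syllables and the speaker split against the allowed forms."""
    return (tuple(syllables) in STANZA_FORMS
            and tuple(speaker_lines) in SPEAKER_SPLITS)


# The second stanza above: God speaks 5 lines, then Ke speaks 3
second_stanza_ok = is_valid_stanza([7, 4, 7, 4, 4, 7, 7, 7], [5, 3])
```
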
A broad translation is as follows:

I pray God what must I do for thee. Will you send me away from the cathedral?
I do not ask for a luxurious life in an archbishop's palace with fine food. I prefer a strict rule.
You will voyage soon, to Rosewa (placename, meaning the Roseland?) On the sea you will hear a sound of a bell ringing there.
I am grateful to be in fresh air rather than writing at my desk. To Cornwall we will go.

Output of taklow-kernewek syllable segmentation tool. I changed to forward segmentation due to an issue with the word 'vyajya' (see below).

Detailed output for 'Vyajya' - forward segmentation. The regular expression starts at the beginning of the word and works forwards.

Detailed output for 'Vyajya' - backward segmentation. The regular expression starts at the end of the word and works backwards. Here we see the '-ya' at the end is matched first, then 'yaj' is matched with a semi-vocalic y, which leaves the 'V' unmatched.
The issue with the word vyajya shows that more work is probably needed on the syllable segmentation program: it could warn the user if not all of the input is consumed by the segmentation, and it could return multiple segmentations where they exist.
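
The idea of warning when some of the input is left unconsumed could look something like the sketch below. The regular expression here is a toy (optional consonants followed by a run of vowels, with y always treated as a vowel). It is not the pattern used in taklow-kernewek and does not give correct Cornish syllables. The function names are also made up for illustration.

```python
import re

# Toy pattern, NOT the taklow-kernewek one: optional consonants, then vowels
TOY_SYLLABLE = re.compile(r"[^aeiouy]*[aeiouy]+")


def segment_forwards(word, pattern=TOY_SYLLABLE):
    """Match from the start of the word; return (syllables, unconsumed remainder)."""
    syllables = []
    pos = 0
    while pos < len(word):
        m = pattern.match(word, pos)
        if m is None or m.end() == pos:
            break  # nothing more can be matched from here
        syllables.append(m.group())
        pos = m.end()
    return syllables, word[pos:]


def segment_with_warning(word):
    """Segment a word and warn if part of it was not consumed."""
    syllables, leftover = segment_forwards(word.lower())
    if leftover:
        print("Warning: {!r} left unsegmented in {!r}".format(leftover, word))
    return syllables
```

Returning the remainder alongside the syllables keeps the check separate from the matching, so the same warning could be used for backward segmentation as well.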

Song lyrics

In another of the sessions, Ken George covered some aspects of translating a song, including observing its metre and rhyme, and the different styles of rhyme in songs. After the talk, we divided into smaller groups and tried translating a song each.
One group got a song in French, Charles Trenet's "La Mer", which I think was quite a difficult challenge for them. I don't have a copy here of what they came up with.
I was part of a group working on Moon River (Louis Armstrong).
With a bit of artistic licence and adaptation:


Moon river, wider than a mile
I'm crossing you in style some day
Oh, dream maker, you heart breaker
Wherever you're goin', I'm goin' your way

Two drifters, off to see the world
There's such a lot of world to see
We're after the same rainbow's end, waitin' 'round the bend
My huckleberry friend, moon river, and me

Noting rhymes

mile, style

day, way

maker, breaker

world, world

see, me

end, bend, friend


Dowr Tamar, moy es mildir bras
Y treusyav dha dhowr glas neb jydh
Ty a wra hunros, mestres fell dell os
Pub le a wre'ta mos, ena my a vydh
Dew wandror mos a-dro dhe'n bys
Meur a'n norvys a welyn ni

Ni a helgh penn kammneves, ni a wort a-dro dhe'n kamm
Ha my a wra ri dhis amm
Dowr Tamar, ha my.

General Election 2017 - and trying out the Python pandas library

Jun 10, 2017

I am happy to report that my computer is working again. I reformatted it and installed Windows 7, then there was a long delay while the Ubuntu partition resizer stalled. I fixed that with a GParted ISO and then reinstalled Ubuntu 17.04, and I am now part way through restoring data from backups.
QGIS is working again, at version 2.18.

Although I have used the Python csv, matplotlib and numpy libraries to read data from files and plot it, I hadn't used the pandas library for anything much, so I thought I'd try it. In previous code I have often built up a list manually by setting data = [] and then using append, which can be slow for large datasets.

First I need some data, and I will use the general election results for the 6 parliamentary constituencies for the House of Commons in Cornwall:

Camborne and Redruth,EUSTICE,Charles George,Conservative Party,23001,70.96
Camborne and Redruth,WINTER,Graham Robert,Labour Party,21424,70.96
Camborne and Redruth,WILLIAMS,Geoffrey,Liberal Democrats,2979,70.96
Camborne and Redruth,GARBETT,Geoffrey George,Green Party,1052,70.96
North Cornwall,MANN,Scott Leslie,Conservative Party,25835,74.2
North Cornwall,ROGERSON,Daniel John,Liberal Democrats,18635,74.2
North Cornwall,BASSETT,Joy,Labour Party,6151,74.2
North Cornwall,ALLMAN,John William,Christian Peoples Alliance,185,74.2
North Cornwall,HAWKINS,Robert James,Socialist Labour Party,138,74.2
South East Cornwall,MURRAY,Sheryll,Conservative Party,29493,74.2
South East Cornwall,DERRICK,Gareth Gwyn James,Labour Party,12050,74.2
South East Cornwall,HUTTY,Philip Andrew,Liberal Democrats,10346,74.2
South East Cornwall,CORNEY,Martin Charles Stewart,Green Party,1335,74.2
St Austell and Newquay,DOUBLE,Stephen Daniel,Conservative Party,26856,69.3
St Austell and Newquay,NEIL,Kevin Michael,Labour Party,15714,69.3
St Austell and Newquay,GILBERT,Stephen David John,Liberal Democrats,11642,69.3
St Ives,THOMAS,Derek,Conservative Party,22120,76.1
St Ives,GEORGE,Andrew Henry,Liberal Democrats,21808,76.1
St Ives,DREW,Christopher John,Labour Party,7298,76.1
Truro and Falmouth,NEWTON,Sarah Louise,Conservative Party,25123,75.9
Truro and Falmouth,KIRKHAM,Jayne Susannah,Labour Party,21331,75.9
Truro and Falmouth,NOLAN,Robert Anthony,Liberal Democrat,8465,75.9
Truro and Falmouth,ODGERS,Duncan Charles,UK Independence Party,897,75.9
Truro and Falmouth,PENNINGTON,Amanda Alice,Green Party,831,75.9

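Here is the Python code. It expects the above data in a file called electionresults2017.csv, with a header row naming the columns that the code refers to (Constituency, Surname, Forenames, Description and Votes). It reads the file using csv.DictReader, which produces an iterator that I convert to a list and use to create a pandas data frame object.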
The code is also available in the dataviz-sandbox repository at my Bitbucket account.

import csv

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def chooseColour(desc):
    """Return a plot colour for a party description."""
    if desc == "Labour Party":
        return "red"
    elif "Liberal" in desc:
        return "yellow"
    elif "Green" in desc:
        return "green"
    elif "Conservative" in desc:
        return "blue"
    return "magenta"


# csv.DictReader yields one dict per row; every value is read as a string
with open('electionresults2017.csv', 'r') as spamreader:
    dframe = pd.DataFrame(list(csv.DictReader(spamreader)))

# sorted so that both figures lay the constituencies out in the same order
consts = sorted(set(dframe['Constituency']))

fig = plt.figure()
plt.suptitle("Distribution of Votes in Cornwall\nGeneral Election 8th June 2017")
for p, c in enumerate(consts):
    rows = dframe.loc[dframe.Constituency == c]
    descs = list(rows['Description'])
    plotcolours = [chooseColour(d) for d in descs]
    # label each candidate by first forename and surname
    names = [f.split()[0] + " " + s.strip()
             for f, s in zip(rows['Forenames'], rows['Surname'])]
    # the votes were read as strings, so convert before doing arithmetic
    votes = [int(v) for v in rows['Votes']]
    namedescsvotes = [n + "\n" + d + "\n" + str(v)
                      for n, d, v in zip(names, descs, votes)]

    totalvotes = np.sum(votes)
    ax = fig.add_subplot(2, 3, p+1)
    ax.set_title("{C}: {t} votes cast".format(C=c, t=totalvotes))
    # the radius grows with the square root of the votes cast,
    # so the area of each pie is proportional to the total
    ax.pie(votes, radius=np.sqrt(totalvotes/50000.0), labels=namedescsvotes,
           colors=plotcolours, autopct='%1.1f%%')

fig2 = plt.figure()
plt.suptitle("Representation of Cornwall in the House of Commons")
for p, c in enumerate(consts):
    # relies on the first row for each constituency being the winner
    descs = list(dframe.loc[dframe.Constituency == c, 'Description'])[:1]
    plotcolours = [chooseColour(d) for d in descs]
    ax = fig2.add_subplot(2, 3, p+1)
    ax.pie([1], labels=descs, colors=plotcolours)

plt.show()
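
As an aside, pandas can also read the CSV directly and do the grouping itself, which avoids filtering the frame once per constituency. The sketch below is not the code I ran for the plots. The column names are assumptions: the first five are the ones the script above refers to, and "Turnout" is my guess at a name for the sixth column. A few rows from the data above are inlined so that it runs on its own.

```python
import io

import pandas as pd

# ASSUMPTION: column names; "Turnout" in particular is a guess
COLUMNS = ["Constituency", "Surname", "Forenames", "Description", "Votes", "Turnout"]

SAMPLE = """\
Camborne and Redruth,EUSTICE,Charles George,Conservative Party,23001,70.96
Camborne and Redruth,WINTER,Graham Robert,Labour Party,21424,70.96
Camborne and Redruth,WILLIAMS,Geoffrey,Liberal Democrats,2979,70.96
Camborne and Redruth,GARBETT,Geoffrey George,Green Party,1052,70.96
St Ives,THOMAS,Derek,Conservative Party,22120,76.1
St Ives,GEORGE,Andrew Henry,Liberal Democrats,21808,76.1
St Ives,DREW,Christopher John,Labour Party,7298,76.1
"""

# header=None because the rows shown above have no header line;
# read_csv infers that Votes is an integer column
dframe = pd.read_csv(io.StringIO(SAMPLE), header=None, names=COLUMNS)

# total votes cast per constituency
totals = dframe.groupby("Constituency")["Votes"].sum()

# the row with the most votes in each constituency
winners = dframe.loc[dframe.groupby("Constituency")["Votes"].idxmax(),
                     ["Constituency", "Surname", "Description"]]
```

Unlike the second figure above, picking the winner with idxmax does not depend on the rows being sorted by votes.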

And the results:

The votes cast for the various candidates and parties in each of the 6 constituencies covering Cornwall and the Isles of Scilly. George Eustice MP uses his middle name rather than his first name Charles.
In comparison to votes cast, here are the parties represented in the House of Commons for constituencies in Cornwall.

I have also tried out matplotlib_venn. Its venn2 function takes a keyword argument, subsets, which expects the sizes of three regions: A and (not B), (not A) and B, and (A and B). Below, voteothers is the number who voted for non-elected candidates, the second region is by definition empty (people who did not vote but cast a vote for the winner), and votewinner is the number who voted for the winner.

Since Cornwall is a one-party state, the colours can be hard-coded.

import matplotlib.pyplot as plt
import matplotlib_venn as venn
# subsets = (Ab, aB, AB)
v = venn.venn2(subsets=(voteothers, 0, votewinner), set_colors =('lightgray', 'navy'), set_labels=('Other candidates', winnername))

The function venn3 expects a 7 element tuple as below. In this case, A is the electorate, B is those who voted for the winner, and C is those who voted for other candidates.

import matplotlib.pyplot as plt
import matplotlib_venn as venn
# subsets=(Abc, aBc, ABc, abC, AbC, aBC, ABC)
v = venn.venn3(subsets=(novote, 0, votewinner, 0, voteothers, 0, 0), set_colors =('lightgray', 'blue', 'red'), set_labels=('electoral register', winnername,'Other candidates'))
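
The snippets above leave out where votewinner, voteothers, novote and winnername come from. One way they might be derived for a single constituency is sketched below, using only the csv module. It assumes that the sixth column of the data is the percentage turnout, so the electorate (and hence novote) is only an estimate. The function name venn_counts is made up for illustration.

```python
import csv
import io

# Rows for one constituency, copied from the results above
ST_IVES = """\
St Ives,THOMAS,Derek,Conservative Party,22120,76.1
St Ives,GEORGE,Andrew Henry,Liberal Democrats,21808,76.1
St Ives,DREW,Christopher John,Labour Party,7298,76.1
"""


def venn_counts(rows):
    """Return (winnername, votewinner, voteothers, novote) for one constituency."""
    votes = [int(r[4]) for r in rows]
    votewinner = max(votes)
    voteothers = sum(votes) - votewinner
    winner = rows[votes.index(votewinner)]
    winnername = winner[2].split()[0] + " " + winner[1].strip()
    # ASSUMPTION: the sixth column is percentage turnout, so the electorate
    # is estimated as votes cast divided by turnout
    turnout = float(rows[0][5]) / 100.0
    electorate = round(sum(votes) / turnout)
    novote = electorate - sum(votes)
    return winnername, votewinner, voteothers, novote


rows = list(csv.reader(io.StringIO(ST_IVES)))
winnername, votewinner, voteothers, novote = venn_counts(rows)
```
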

It would be nice to make the zero sets disappear; maybe there is a way to do this in the documentation somewhere.

Computer troubles and further updates to TaklowKernewek tools

Jun 4, 2017

My desktop computer is currently incapacitated: for the past day and a half it has been resizing an NTFS partition. After breaking my QGIS installation while attempting to upgrade it to 2.18, I decided to go back to a dual-boot system rather than a Windows virtual machine running in Ubuntu. Meanwhile I had also broken my netbook, which crashed while trying to upgrade to Ubuntu 17.04. I fixed this by reinstalling from a bootable USB stick using the UbuntuMATE ISO, formatting only the root partition and leaving the data in /home intact.

This shouldn't deter anyone from using QGIS or Ubuntu; it is just that I was doing odd things with GDAL and kealib. I had set things up so that I could use multiband .kea files in QGIS running with the system environment variables, while using conda to run RSGISLib in Python 3, and also conda to run the European Space Agency's SNAP toolbox to process Sentinel 2 images.
There was a resulting mismatch between different versions of GDAL. This caused problems when I tried to update to QGIS 2.18, which failed due to conflicting dependencies. Trying to revert to 2.14 also failed, so QGIS no longer worked at all.
With my netbook, I had installed so much on it that the root partition was close to full and I think that was what started causing problems.

Fortunately all data is backed up on external drives.

It is not so easy to do much mapping work on my netbook, although I do now have QGIS 2.18 installed on it. I have, however, done a bit more on TaklowKernewek, including making the netbook mode work for more of the apps, and developing the mathematics quiz app.

The netbook modes were modified to allow the corpus statistics app and a couple of others to fit the screen of my netbook (EeePC 1005HA), by adjusting font sizes as well as some input/output box heights.
Inflecting the verb 'covhe' (in SWF Tradycyonal) - Cornish for 'remember'

Basic translation memory using Skeul an Yeth 1 example sentences. The font size of three of the four buttons has been shrunk a little so that labels do not grow bigger than the box itself.

more of the output

some further output of the translation memory

Word frequency table for words of 5 or more letters in 'Origo Mundi'. The font sizes are small, which makes them harder to read, but this ensures that the boxes do not fall off the bottom of the screen.
The word frequency bar chart has been tweaked so as not to use white as a colour, since matplotlib may not always outline the bars.

counting syllables

Transliterating from Kernewek Kemmyn to Standard Written Form. It is possible that goelann should instead go to goolan in SWF, it certainly did in 2008 SWF, and may still do so despite being a multisyllable word.

Mathematics Quiz app

A new difficulty level allows input numbers up to 100, and the GUI has been adjusted to work better on smaller screens (though further work on this may be needed).
The options radio buttons have been consolidated into one column in the GUI.

The program now reports back the answer you gave to the previous question if you were correct as well as if you were wrong. This still looks a little confusing, since the question now in the upper box is the second question whereas the answer is for the first. If you get a question right, you get 1 point. The bonus for speed has been reduced compared to earlier versions: to get any speed bonus at all you need to answer within 10 seconds.

'Pur gales' allows the computer to choose numbers up to 100. In this version the addition and subtraction are shown as symbols, due to possible confusion with the 'ha' internal to a number such as 'dew ha dew ugens' (42). 'tri ha dew ugens marnas dew ha dew ugens' might be interpreted as 43 - 2 + 40 = 81 rather than 43 - 42 = 1.

There is a need to make some further adjustments to the Tkinter GUI code since at present the widgets aren't filling the window after it is maximized.

Updating Mars 'Top Trumps' webpages

May 28, 2017

I have previously created a website for Souness Glacier Top Trumps, based on Colin Souness' work on candidate mid-latitude glaciers on Mars, and my MSc thesis on them.

One of the things covered is whether the object has coverage with the High Resolution Imaging Science Experiment (HiRISE) on Mars Reconnaissance Orbiter.

The HiRISE team are continually releasing new images as the Mars Reconnaissance Orbiter is still operating.

I have some horribly obfuscated Python code that can match the shapefile coverage of the Souness objects to coverage footprint shapefiles, after using QGIS to reproject to a common coordinate system.

I have recently updated my Top Trumps webpages to use shapefiles up to 4th May 2017.

However, the Mars Express tiles remain the same, since the data releases of the High-Resolution Stereo Camera processed to level 4 (including the derived digital terrain model) are available at the NASA Planetary Data System only for orbits up to 12th Feb 2009.

There do appear to be newer ones recently uploaded at the Freie Universität Berlin website, although they are not in the same format as the ones I used from NASA PDS and they are not in the European Space Agency Planetary Science Archive or NASA PDS yet.

The website currently says:

Archive status (highest released orbit): f836 (levels 2 & 3, PSA), 6567 (level 4, PSA), d795 (level 4 VICAR, HRSCview)

To incorporate these would require a more comprehensive reanalysis of the data, since some more Souness objects would gain digital terrain model coverage and some would gain improved resolution coverage.

An example Souness object which now has links to additional HiRISE footprints from the University of Arizona HiRISE webpage.
I also fixed a bug whereby if HiRISE covered all of the bounding box of the 'context' of a Souness object, the png overlay for the HiRISE coverage would show only transparency, due to the way in which ImageMagick was used to colourise the rasterised shapefile.

The convert command was modified from
convert {i} +level-colors black,yellow {o}
to
convert {i} +level-colors ,yellow {o}

so that in the input file, the value 255 is taken only as the white point, not as both black and white, which produced a blank image containing only transparency.

e.g. Souness 83

Souness 83, where the bounding box is wholly covered by the footprint of the HiRISE image PSP_010345_2150.

Bilingual interface to Cornish Corpus Statistics Python GUI application and switching between Kemmyn and manuscript spelling

May 6, 2017

As part of my taklow-kernewek tools, I created an application that can do some corpus statistics on Cornish texts, and is configurable at run-time to a certain extent.

I have made a few improvements to the files, which include a bilingual interface and the ability to switch between using Kernewek Kemmyn and manuscript spelling (or at least a reading of such).

To create a switchable bilingual interface, I overhauled the GUI code to make it more object orientated, and created a set of dictionaries where the keys each refer to another dictionary with 2 elements {'en': 'English text', 'kw': 'Cornish text'}.

A button in the GUI then runs a function that changes the interface language, and alters the text in all of the relevant widgets to use that in the new language.

The language can also be specified at the command line when running: the -e switch launches the program with the English interface.
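
The general shape of this is sketched below, without any real widgets so that it runs on its own. The keys, the placeholder strings and the class name Interface are made up for illustration and are not the ones in the application. In a Tkinter program the refresh step would call widget.config(text=...) on each widget instead of filling a dictionary.

```python
import argparse

# Hypothetical widget keys; the strings are placeholders, not the real labels
LABELS = {
    'title': {'en': 'English title text', 'kw': 'Cornish title text'},
    'switch': {'en': 'English button text', 'kw': 'Cornish button text'},
}


class Interface:
    def __init__(self, lang='kw'):
        self.lang = lang
        # stands in for the widgets: key -> text currently displayed
        self.shown = {}
        self.refresh()

    def refresh(self):
        """Set every label to its text in the current language."""
        for key, texts in LABELS.items():
            self.shown[key] = texts[self.lang]

    def switch_language(self):
        """What the language button would run."""
        self.lang = 'en' if self.lang == 'kw' else 'kw'
        self.refresh()


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('-e', dest='english', action='store_true',
                        help='start with the English interface')
    return parser.parse_args(argv)


args = parse_args(['-e'])
ui = Interface('en' if args.english else 'kw')
```

Keeping both languages under one key means that adding a widget only needs one new dictionary entry, and the switch function never has to know which widgets exist.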

English interface. The button at the lower left allows switching between the two.

Cornish interface
I have also fixed a bug that occurred when the list of word frequencies was generated and there were no words longer than the specified number of letters. Internally, the length of the longest word is found so that the output text can be spaced appropriately. The program now checks whether the list of (word, frequency) tuples is empty, to avoid an indexing error.
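
A minimal sketch of that kind of guard is below. It is not the code from the application: the function name frequency_table and the message returned for an empty list are made up for illustration.

```python
from collections import Counter


def frequency_table(words, min_letters):
    """Return a text table of word frequencies for words of at least min_letters."""
    counts = Counter(w for w in words if len(w) >= min_letters)
    pairs = counts.most_common()  # list of (word, frequency) tuples
    if not pairs:
        # without this check, finding the longest word below would fail
        return "No words of at least {} letters".format(min_letters)
    width = max(len(word) for word, _ in pairs)
    # pad each word to the width of the longest so the counts line up
    return "\n".join(word.ljust(width) + " " + str(n) for word, n in pairs)
```
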

The other thing I have done is fix a bug when the manuscript spelling was selected (previously only via command line -m switch, but now also in the GUI, as below). There is a different list of texts available in manuscript vs. Kemmyn, which had previously caused the program to have an index error in some cases.

Unfortunately I still have an issue with Tkinter: when switching between Kemmyn <--> manuscript an extra empty space is generated, which needs a bit of adjustment to how the widgets are packed. I got a bit confused when trying to fix it, so I am leaving it in for now.
Kemmyn (top left) and manuscript (bottom right). These windows have been launched direct from the command line, with the manuscript one launched by " -m" to choose the manuscript spelling texts rather than Kemmyn.

Annoying issue with space at the left appearing after switching within the GUI to manuscript spelling.

Update 07/05/17 - Tkinter bug fixed

After spending a while looking at my copy of Programming Python I found the pack_forget() method, which I used to remove the buttons at the lower left (language and manuscript mode switch) while the text choice list is repopulated with new radio buttons, and then the buttons are repacked afterwards.
I also show, in the heading above the list of texts, which mode the program is in.
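
The pattern is roughly as in the sketch below. It is written against anything that has pack_forget() and pack() methods, as Tkinter widgets do, so it does not need a display to run. The function name and the default packing options are assumptions for illustration, not what the application uses.

```python
def rebuild_text_choices(buttons, rebuild, pack_options=None):
    """Unpack the buttons, run rebuild(), then pack the buttons again."""
    # ASSUMPTION: placeholder packing options; the options are passed
    # again here because the widgets are being packed afresh
    pack_options = pack_options or {"side": "left"}
    for b in buttons:
        b.pack_forget()   # take the button out of the layout
    rebuild()             # e.g. repopulate the list of radio buttons
    for b in buttons:
        b.pack(**pack_options)
```

In the real program buttons would be the language and manuscript mode switches, and rebuild would be the function that recreates the radio buttons for the new list of texts.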

In Kemmyn mode, showing the most frequent words of at least 5 letters in Passyon agan Arloedh

In manuscript spelling
