Hermann Moisl
Dr. Hermann Moisl is a senior lecturer in linguistics at Newcastle University and was educated at various institutes, including Trinity College Dublin and the University of Oxford. His research interests include, amongst others, computational linguistics, natural language processing, corpus linguistics, the cultural role of literacy, and Celtic languages and history. He is an investigator of the Newcastle Electronic Corpus of Tyneside English Project with colleague Dr Karen Corrigan.
Other Expertise
Old/Middle English and Old Irish/Middle Irish language and literature.
Current Work
Dr. Moisl's current work is described on his personal page: http://www.staff.ncl.ac.uk/hermann.moisl/
Selected Publications
Moisl HL, Maguire W, Allen W. Phonetic variation in Tyneside: Exploratory multivariate analysis of the Newcastle Electronic Corpus of Tyneside English. In: Frans Hinskens, ed. Language Variation - European Perspectives: Selected Papers from the Third International Conference on Language Variation in Europe (ICLaVE 3), Amsterdam, June 2005. Amsterdam: John Benjamins Publishing Co, 2006, pp. 127-141.
Moisl HL, Maguire W. Identifying the Main Determinants of Phonetic Variation in the Newcastle Electronic Corpus of Tyneside English. Journal of Quantitative Linguistics 2008, 15, 46-69.
Moisl HL, Jones VM. Cluster Analysis of the Newcastle Electronic Corpus of Tyneside English: A Comparison of Methods. Literary and Linguistic Computing 2005, 20(Suppl 1), 125-146.
Moisl HL. Exploratory Multivariate Analysis. In: Lüdeling A, Kytö M, ed. Corpus Linguistics. An International Handbook (Series: Handbücher zur Sprache und Kommunikationswissenschaft/Handbooks of Linguistics and Communication Science). Berlin: Mouton de Gruyter (in press), 2008.
Moisl HL, Corrigan K, Beal J. The Newcastle Electronic Corpus of Tyneside English. Arts and Humanities Research Council, 2005. Available at: http://www.ncl.ac.uk/necte/.
Moisl HL. Linguistic Computation with State Space Trajectories. In: Stefan Wermter, Jim Austin, David Willshaw, ed. Emergent Neural Computational Architectures Based on Neuroscience: Towards Neuroscience-Inspired Computing. Springer, 2001, pp. 442-460.
Projects
A Linguistic Time-Capsule: The Newcastle Electronic Corpus of Tyneside English (NECTE), AHRB project code RE11776
Project Leaders: K. Corrigan, J. Beal, H. Moisl
Postgraduate Supervision
Natural language modelling and text processing. His specific areas of interest are natural language understanding systems and multivariate analysis of text corpora.
Qualifications
BA (McGill University), MPhil (Trinity College Dublin), DPhil (University of Oxford), MSc (Newcastle University)
Postgraduate teaching
SEL8000 Research Methods in Language and Linguistics
Undergraduate Teaching
SEL1007 The Nature of Language
SEL2008 Text and Hypertext
SEL2092 Computational Linguistics
Current work
Implementation of natural language understanding systems using dynamic attractor sequences
He has been developing a strictly sequential natural language understanding architecture that dispenses with two foundational principles of generative linguistics, mainstream cognitive science, and much of artificial intelligence: that natural language strings have complex syntactic structure processed by structure-sensitive algorithms, and that this structure is crucial in determining string semantics. This sequential architecture was originally stated in terms of standard automata theory as a system of cooperating finite state automata. More recently he has become interested in neuroscientific work which identifies chaotic attractor trajectory in state space as the fundamental principle of brain function at a level above that of the individual neuron, and which indicates that sensory processing, and perhaps higher cognition more generally, are implemented by cooperating attractor sequence processes. Some relevant publications are:
Moisl, H., (1992) 'Connectionist finite state natural language processing', Connection Science 4, 67 - 91.
Moisl, H., (1997) 'Recurrent neural networks and natural language processing', New Methods in Language Processing, ed. D. Jones & H. Somers, UCL Press, London, 69-82.
Moisl, H., (2000) Handbook of Natural Language Processing, Marcel Dekker (with R. Dale of Macquarie University and H. Somers of UMIST)
Moisl, H., (2001) 'Artificial neural networks and natural language processing', Encyclopedia of Library and Information Science, ed. M. Drake, Marcel Dekker (in press)
Moisl, H., (2001) 'Linguistic computation with state space trajectories', in Emergent Neural Computational Architectures based on Neuroscience, ed. Stefan Wermter, Jim Austin, David Willshaw, Springer, 2001
Natural language corpus creation
Together with Karen Corrigan of Newcastle University and Joan Beal of Sheffield University, he has recently completed the Newcastle Electronic Corpus of Tyneside English (NECTE), a corpus of dialect speech from Tyneside in North-East England. It is based on two pre-existing corpora, one collected in the late 1960s by the Tyneside Linguistic Survey (TLS) project, and the other in 1994 by the Phonological Variation and Change in Contemporary Spoken English (PVC) project. NECTE amalgamates the TLS and PVC materials into a single Text Encoding Initiative (TEI)-conformant XML-encoded corpus and makes them available in a variety of aligned formats: digitized audio, standard orthographic transcription, phonetic transcription, and part-of-speech-tagged text. The project website describes the NECTE corpus in detail and makes it available to academic researchers, educationalists, the media in non-commercial applications, and organisations such as language societies and individuals with a serious interest in historical dialect materials.
Exploratory multivariate analysis of text corpora
Since completion of the NECTE project he has been developing a methodology for sociolinguistic and dialectological study of the corpus. Its aim is to identify interesting regularities in phonetic variation among informants in the corpus, and any correlations between such variation and associated social factors. The methodology is based on the one formulated by the originators of much of the NECTE corpus, the Tyneside Linguistic Survey (TLS). It was radical at the time and remains so today. In the then-universal and still-dominant theory-driven approach, social and linguistic factors are selected by the analyst on the basis of some combination of an independently specified theoretical framework, existing case studies, and personal experience of the domain of enquiry. The TLS instead proposed a fundamentally empirical approach in which salient factors are extracted from the data itself and then serve as the basis for model construction. To implement its approach the TLS used a particular exploratory multivariate analytical technique, hierarchical cluster analysis, but its work never progressed beyond preliminary studies for a variety of theoretical and practical reasons. His development of the TLS methodology
i. uses a range of linear and nonlinear exploratory analytical methods in addition to hierarchical cluster analysis, such as multidimensional scaling and self-organizing maps, and
ii. pays particular attention to issues in data creation which are crucial to the validity of analytical results: document length normalization, dimensionality reduction, and data nonlinearity.
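As a rough illustration of why document length normalization matters for cluster analysis, the following Python sketch clusters four invented frequency profiles before and after converting raw counts to relative frequencies. The counts, the function names `normalize_by_length` and `single_linkage`, and the choice of single-linkage clustering with Euclidean distance are assumptions made for this example; they are not taken from the NECTE corpus or from the publications listed here.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def normalize_by_length(counts):
    """Convert raw counts to relative frequencies so each profile sums to 1."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def single_linkage(vectors, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    The distance between two clusters is the smallest Euclidean distance
    between any pair of their members (single linkage).
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = min(dist(vectors[i], vectors[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best  # a < b, so deleting index b leaves index a valid
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Hypothetical counts of three phonetic variants for four speakers.
# Speakers 0 and 1 have the same usage pattern, but speaker 1's
# interview is ten times longer; likewise speakers 2 and 3 are similar.
raw = [
    [10, 30, 60],
    [100, 300, 600],
    [50, 40, 10],
    [6, 4, 1],
]
relative = [normalize_by_length(row) for row in raw]

print(single_linkage(raw, 2))       # [[0, 2, 3], [1]]
print(single_linkage(relative, 2))  # [[0, 1], [2, 3]]
```

On the raw counts the longest interview is isolated purely because of its length, whereas on relative frequencies the speakers group by usage pattern.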
Relevant publications are:
Moisl, H. and Beal, J.C. (2001) ‘Corpus Analysis and Results: Visualization Using Self-Organizing Maps’, Corpus Linguistics 2001, Lancaster University, 386-391. Electronic Publication.
Moisl, H., Jones, V. (2005) Cluster analysis of the Newcastle Electronic Corpus of Tyneside English: a comparison of methods. (Centre for Telematics and Information Technology [CTIT]; TR 2005/65) {A-53328}. University of Twente.
Moisl, H., Jones, V. (2005) 'Cluster analysis of the Newcastle Electronic Corpus of Tyneside English: a comparison of methods', Literary and Linguistic Computing 20, 125-46.
Moisl, H., Maguire, W., Allen, W. (2006) 'Phonetic variation in Tyneside: exploratory multivariate analysis of the Newcastle Electronic Corpus of Tyneside English'. In: F. Hinskens, ed. Language Variation. European Perspectives. Amsterdam: Meertens Institute.
Allen, W.H., Beal, J.C., Corrigan, K.P., Maguire, W. and Moisl, H.L. (2007) ‘A Linguistic ‘Time-Capsule’: The Newcastle Electronic Corpus of Tyneside English’ in Beal, J.C., Corrigan, K.P. and Moisl, H.L. (eds.) Creating and Digitising Language Corpora, Vol. 2: Diachronic Databases. Houndmills: Palgrave Macmillan, 16-48.
Moisl, H. (2007) 'Data nonlinearity in exploratory multivariate analysis of language corpora', Computing and Historical Phonology. Proceedings of the Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology, June 28 2007, ed. J. Nerbonne, M. Ellison, G. Kondrak, Association for Computational Linguistics, 93-100.
Moisl, H. (2008) 'Exploratory Multivariate Analysis', In: Lüdeling A, Kytö M, ed. Corpus Linguistics. An International Handbook (Series: Handbücher zur Sprache und Kommunikationswissenschaft/Handbooks of Linguistics and Communication Science). Berlin: Mouton de Gruyter.
Moisl, H., Maguire, W. (2008) 'Identifying the Main Determinants of Phonetic Variation in the Newcastle Electronic Corpus of Tyneside English', Journal of Quantitative Linguistics 15, 46-69.
Moisl, H. (2008) 'Using electronic corpora to study language variation: the problem of data sparsity', currently being reviewed for Studies in Language Variation. Selected papers from ICLaVE 4, ed. P. Pavlou, University of Cyprus.
Moisl, H. (2008) 'Using electronic corpora in historical dialectology research: the problem of document length variation', currently being reviewed for Proceedings of the Second International Conference on English Historical Dialectology, University of Bergamo, ed. M. Dossena.
Moisl, H. (2008) 'Normalization for Variation in Document Length in Exploratory Multivariate Analysis of Text Corpora', Proceedings of INFOS2008: 6th International Conference on Informatics and Systems, Cairo University, 27-29 March 2008.
Wikimedia Foundation. 2010.