English abstract

 

The project to create the “Electronic Archive of Carlo Emilio Gadda’s Works” began in 1994, when the publisher Garzanti made its published texts (Gadda, 1988-93), already prepared for photocomposition, available to the Istituto di Linguistica Computazionale (ILC, Institute of Computational Linguistics) for study purposes only.


The first version of the Corpus


The Corpus was created from these texts (33), encoded in DBT format and marked up to identify various phenomena (italics, capitalized proper nouns and words capitalized by the author's choice, hyphens, dates, numbers, formulas, abbreviations, acronyms, pictures, author’s and editor’s footnotes, dialect and foreign words, dialogue, poetic language). The first version of the Corpus, running in DBT 3.0, was presented on November 14, 1997 (the 104th anniversary of Gadda’s birth) at the CNR (National Research Council) in Rome.


The first results


In 1997 we became the first users of this database, which is the only existing electronic archive containing all the works of a contemporary Italian author. While the community of Gadda scholars is beginning to ask for concordances, which cannot be distributed but are available for consultation at ILC, we have started to:
- produce standard lexicographic tools by means of an automatic processing system;
- apply innovative methodologies to sample subsystems.

 

The CEG Project


The CEG (Carlo Emilio Gadda) Project was created in 1999 with two objectives:
- to transfer the Archive into the new DBT version, thus allowing new functions to be used;
- to create a cultural laboratory containing texts by Gadda, bibliographical data, links and special lexicographical resources.

 

The web site

In 1999, having become more familiar with computer graphics and with the HTML and XML formats, we began planning a web site devoted to works by and about Gadda. Although we had received no ad hoc funding, at the beginning of 2000 we started to build the CEG web site relying exclusively on our own know-how. The web site presented the history of the project and made our first paper publications available online.

 

Linguistic resources in progress


While in the first phase of our work with the database we produced lexicographic resources mainly by applying simple system functions, the method we now mostly use consists in further processing the results obtained with different instruments.

Example I: contrastive concordances


The experience gained in comparing the two versions of Quer pasticciaccio brutto de via Merulana, the first (QPL) issued in instalments in 1946-47 in the magazine Letteratura and the second (QP) published by Garzanti in 1957, proved very useful for identifying a method to perform virtually automatic comparisons between multiple versions of a text. The two texts stored in the archive were isolated and used to create a sub-corpus. The DBT-Corpus frequency table function was used to generate a table containing a first word-by-word comparison. In the next step, the forms found in both texts were ignored and only those found in just one of the two texts, namely the second version, were considered. These forms were then fed back into the search system to extract the relevant contexts, which were matched and their differences highlighted.
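The comparison steps described for Example I can be illustrated with a minimal Python sketch. It does not reproduce DBT or its functions: the tokenizer, the function names (tokenize, frequency_table, forms_only_in_second, contexts) and the two sample sentences are assumptions introduced only for illustration, and the sentences are invented rather than quoted from QPL or QP.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-cased word forms; a deliberately simple tokenizer (assumption)."""
    return re.findall(r"\w+", text.lower())


def frequency_table(first, second):
    """Map each form to (count in first version, count in second version)."""
    a, b = Counter(tokenize(first)), Counter(tokenize(second))
    return {form: (a[form], b[form]) for form in sorted(set(a) | set(b))}


def forms_only_in_second(table):
    """Ignore shared forms; keep those attested only in the second version."""
    return [form for form, (n1, n2) in table.items() if n1 == 0 and n2 > 0]


def contexts(text, form, width=30):
    """Keyword-in-context lines for one form, with the form in brackets."""
    lines = []
    pattern = r"\b%s\b" % re.escape(form)
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        start, end = match.span()
        left = text[max(0, start - width):start]
        right = text[end:end + width]
        lines.append(f"{left}[{match.group(0)}]{right}")
    return lines


# Invented sample sentences standing in for the two versions.
first_version = "il commissario guardava la strada"
second_version = "il dottore guardava la strada vuota"

table = frequency_table(first_version, second_version)
for form in forms_only_in_second(table):
    for line in contexts(second_version, form, width=10):
        print(form, "->", line)
```

On real texts the extracted contexts would still have to be aligned with the corresponding passages of the other version; that matching step is left out of this sketch.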

 

Example II: Iterations


A study was conducted on the Gadda corpus to find all the places where the author used word repetition, a significant technique in literature and in twentieth-century literature in particular. The research was carried out using DBT functions; the data obtained were further processed and presented on the web site (Gaddian iterations). Two distinct results were obtained: in the first case we collected pairs of words repeated in immediate sequence, while in the second we collected iterations of words separated by punctuation (in most cases) or by other words.
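The two kinds of iteration described for Example II can be sketched in Python as follows. This is not the DBT procedure used in the study: the tokenizer, the function names and the max_gap parameter are hypothetical, and the sample sentence is invented for illustration.

```python
import re

# Words and single punctuation marks are kept as separate tokens (assumption).
TOKEN = re.compile(r"\w+|[^\w\s]")


def tokens(text):
    return TOKEN.findall(text.lower())


def is_word(token):
    return re.fullmatch(r"\w+", token) is not None


def adjacent_iterations(text):
    """Words repeated in immediate sequence, e.g. 'piano piano'."""
    toks = tokens(text)
    return [a for a, b in zip(toks, toks[1:]) if a == b and is_word(a)]


def separated_iterations(text, max_gap=3):
    """Words repeated after 1 to max_gap intervening tokens.

    Returns (word, intervening tokens) pairs; the intervening tokens may be
    punctuation marks or other words.
    """
    toks = tokens(text)
    hits = []
    for i, tok in enumerate(toks):
        if not is_word(tok):
            continue
        for j in range(i + 2, min(i + 2 + max_gap, len(toks))):
            if toks[j] == tok:
                hits.append((tok, toks[i + 1:j]))
                break
    return hits


sample = "Piano piano scese. Mai, mai più; mai."  # invented example
print(adjacent_iterations(sample))
print(separated_iterations(sample))
```

On a full corpus, very frequent function words (articles, prepositions) would dominate the second list, so some filtering of the candidates would probably be needed before the results are of stylistic interest.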

 

Linguistic resources available


The first group of resources listed below has been produced in both electronic and hard-copy format, while the second group exists only in electronic format.


1) Hapax Legomena Inverse Index
2) L'accentazione in Gadda [Accentuation in Gadda]
3) Un primo censimento di termini gaddiani [A first census of Gaddian terms]
4) Il latino in Gadda [The use of Latin in Gadda]
5) Annotazioni su composti in -cola [Annotations on compounds ending in -cola]


a) Concordances by form of La cognizione del dolore
b) Concordances by form of Pasticciaccio
c) Complete concordances of the question mark (contexts with right-hand arrangement by ?)
d) Complete concordances of the question mark (contexts with left-hand arrangement by ?)
e) Complete concordances of the exclamation mark (contexts with left-hand arrangement by !)
f) Co-occurrences of Giornale di guerra e prigionia
g) Index Locorum of Latin forms in Gadda
h) Latin forms in Horace and Gadda – Comparison table
i) Comparisons between the two versions of Pasticciaccio
j) Gaddian iterations
k) EJGS item in Pocket Gadda Encyclopedia [http://www.arts.ed.ac.uk/italian/gadda/Pages/resources/walks/pge/sistemacnr.html]


