English abstract
The project to create the “Electronic
Archive of Carlo Emilio Gadda’s Works” began in 1994,
when the publisher Garzanti made its published texts (Gadda, 1988-93),
already prepared for photocomposition, available for study purposes only
to the Istituto di Linguistica Computazionale (ILC, Institute of Computational
Linguistics).
The first version of the Corpus
The Corpus was built from these texts (33), encoded in DBT format and
marked up to flag various phenomena (italics; capitalized words, whether
proper nouns or the author’s own choice; hyphens, dates, numbers, formulas,
abbreviations, acronyms, pictures, author’s and editor’s footnotes, dialect
and foreign words, dialogue, poetic language). The first version of the
Corpus, running in DBT 3.0, was presented on November 14, 1997 (the 104th
anniversary of Gadda’s birth) at the CNR (National Research Council) in Rome.
The first results
In 1997 we became the first users of this database, the only existing
electronic archive that contains all the works of a contemporary
Italian author. While the community of Gadda scholars is beginning to ask
for concordances, which cannot be distributed but are available for
consultation at ILC, we have started to:
- produce standard lexicographic tools by means of an automatic processing
system;
- apply innovative methodologies to sample subsystems.
The CEG Project
The CEG (Carlo Emilio Gadda) Project was created in 1999 with two objectives:
- to transfer the Archive into the new DBT version, thus allowing new
functions to be used;
- to create a cultural laboratory containing texts by Gadda, bibliographical
data, links and special lexicographical resources.
The web site
In 1999, when we had become more familiar with computer
graphics and the HTML and XML formats, we began to plan a web
site on the works written by and about Gadda. Although we had received
no ad hoc funding, at the beginning of 2000 we started to build the CEG
web site relying exclusively on our own know-how. The web site presented the
history of the project and made our first paper publications available
online.
Linguistic resources in progress
While in the first phase of our work with the database we produced
lexicographic resources mainly by applying simple system functions, the
method we now mainly use consists in further processing the
results obtained with different tools.
Example I: contrastive concordances
Building a comparison between the
two versions of Quer pasticciaccio brutto de via Merulana - the first
(QPL) issued in instalments in 1946-47 in the magazine Letteratura,
the second (QP) published by Garzanti in 1957 - proved very
useful for identifying a method to perform virtually automatic comparisons
between multiple versions of a text. The two texts stored in the archive
were isolated and used to create a sub-corpus. The DBT-Corpus table
frequency function was used to generate a table containing a first
comparison between words. In the next step, the forms found in both
texts were ignored and only those found in just one of the two texts,
namely the second version, were considered. These forms
were then fed back into the search system to extract the relevant contexts,
which were matched and their differences highlighted.
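The steps above can be illustrated with a minimal Python sketch. It is not the DBT implementation, whose internals are not described here: the tokenizer, the function names (tokenize, forms_only_in_second, contexts) and the context width are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Split a text into lowercase word forms (an assumed, simplified tokenizer)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def forms_only_in_second(first, second):
    """Build a frequency table per text and keep the forms found only in the second."""
    freq_first = Counter(tokenize(first))
    freq_second = Counter(tokenize(second))
    return sorted(form for form in freq_second if form not in freq_first)


def contexts(text, form, width=30):
    """Return each occurrence of a form with `width` characters of context per side."""
    pattern = re.compile(r"\b" + re.escape(form) + r"\b", re.IGNORECASE)
    found = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        found.append(text[start:end])
    return found


if __name__ == "__main__":
    # Invented sample sentences, not quotations from either version of the novel.
    first_version = "Il dottore guardava la strada. La strada era vuota."
    second_version = "Il commissario guardava la via. La via era vuota."
    for form in forms_only_in_second(first_version, second_version):
        for context in contexts(second_version, form):
            print(form, "|", context)
```

Matching each extracted context against the corresponding passage of the other version, which the project carried out to highlight the differences, is left out of this sketch.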
Example II: Iterations
A study was conducted on Gadda’s corpus to
find every place where the author used word
repetition, a significant technique in literature and, in particular,
in twentieth-century literature. The research was carried out using
DBT functions; the data obtained were further processed and presented
on the web site (Gaddian iterations).
Two distinct results were obtained: in the first case, we collected
the pairs of words repeated in immediate sequence, while in the second we
collected the iterations of words separated by punctuation (in most cases)
or by other words.
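The two kinds of result can be sketched in Python as follows. This is only an illustration under stated assumptions, not the DBT functions actually used: the tokenizer, the function names and the max_gap parameter (how many tokens may separate the two occurrences) are hypothetical.

```python
import re

# Words (runs of letters) or single punctuation marks; an assumed tokenization.
TOKEN = re.compile(r"[^\W\d_]+|[^\w\s]")


def tokens(text):
    """Split a text into word and punctuation tokens."""
    return TOKEN.findall(text)


def is_word(token):
    """True for word tokens, False for punctuation tokens."""
    return token[0].isalpha()


def adjacent_repetitions(text):
    """First case: pairs of identical words in immediate sequence."""
    toks = tokens(text)
    return [(a, b) for a, b in zip(toks, toks[1:])
            if is_word(a) and a.lower() == b.lower()]


def separated_repetitions(text, max_gap=2):
    """Second case: a word repeated after 1..max_gap intervening tokens,
    which may be punctuation marks or other words."""
    toks = tokens(text)
    found = []
    for i, tok in enumerate(toks):
        if not is_word(tok):
            continue
        for gap in range(1, max_gap + 1):
            j = i + gap + 1
            if j < len(toks) and toks[j].lower() == tok.lower():
                found.append((tok, toks[i + 1:j], toks[j]))
                break
    return found


if __name__ == "__main__":
    # Invented sample sentence, not a quotation from the corpus.
    sample = "Piano piano, adagio, adagio, e poi via."
    print(adjacent_repetitions(sample))
    print(separated_repetitions(sample))
```

With a larger max_gap the second function also picks up ordinary recurrences of function words (articles, prepositions), so in practice its output would need filtering or manual review.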
Linguistic resources available
The first group of resources listed below (numbered items) has been produced
in both electronic and hard-copy format, while the second group (lettered
items) exists only in electronic format.
1) Hapax Legomena Inverse Index
2) L'accentazione in Gadda [Accentuation in Gadda]
3) Un primo censimento di termini gaddiani [A first census of Gaddian terms]
4) Il latino in Gadda [The use of Latin in Gadda]
5) Annotazioni su composti in -cola [Annotations on compounds ending in -cola]
a) Concordances by form of La cognizione del dolore
b) Concordances by form of Pasticciaccio
c) Complete concordances of the question mark (contexts with right-hand
arrangement by ?)
d) Complete concordances of the question mark (contexts with left-hand
arrangement by ?)
e) Complete concordances of the exclamation mark (contexts with left-hand
arrangement by !)
f) Co-occurrences of Giornale di guerra e prigionia
g) Index Locorum of Latin forms in Gadda
h) Latin forms in Horace and Gadda – Comparison table
i) Comparisons between the two versions of Pasticciaccio
j) Gaddian iterations
k) EJGS item in Pocket Gadda Encyclopedia [http://www.arts.ed.ac.uk/italian/gadda/Pages/resources/walks/pge/sistemacnr.html]