Digitization of older Croatian dictionaries: a possible substratum for terminological neologisms?

Considering that present-day Croatian still frequently lacks exact translational equivalents for the novel ideas developed and disseminated via metalinguistic Eurospeak, the paper adopts an unorthodox scientific method to articulate a correlation between the conceptual framework theorized (i.e., the noninvasive library digitization projects pertaining to select Croatian bi- and trilingual lexicography from the 17th to the 20th century) and the hypothetical questions addressed (i.e., their applicability to the coinage of Croatian neologisms that formationally imitate the previous paragons), with a pronounced tendency to signify a progressive replacement of the perplexingly anglicized language registers by more decipherable formality levels. Consequently, this succinct analysis results in a revalorization of the computerized conversion efforts and a lasting appraisal of the Croatian thesauri, which are neither antiquated nor obsolescent but may serve as an incentive for further similar studies in the subject matter.


Introduction
For two thousand years, the spoken and the written word have been the backbone of our lives and a witness to the historical progress of humankind. In spite of all the practices and changes, no other educational medium can yet compete with a manuscript, a protector of times. Within the European community of states (to which Croatia is affiliated as well), and with English as its principal language of communication these days, the paper seeks to signify a permanent value of the older Croatian linguistic heritage not only in a translational but also in a lexicographic sense, i.e., as an incentive to the formation and systematization of Croatian expressions in response to the necessities of the moment. Thanks to efforts like Google Books (books.google.com), this legacy is now converted, processed, and electronically transcribed for the generations to come. 1
Namely, inventoried by the Google Books service, the pages of several invaluable Croatian dictionaries dating to the 17th, 18th, 19th, and 20th centuries have been digitized via a customized curvature-correcting procedure and deposited in databases. Having implemented optical character recognition (OCR) to scan the full texts as a part of the searchable Library Project, a partner archival program supported by the major American and European academic libraries, Google Inc. has enabled the users, notably scholarly linguists, to deploy modern semantic Web technologies, i.e., distant reading or text mining, and to benefit from their complex search algorithms. Integrating these thesauri in its online repository, which already counts more than 25 million titles, the endeavor has significantly contributed to the general democratization of knowledge, additionally providing readers with analytical metadata toolboxes (e.g., authorial attribution, original pagination to assist proper exported citation, publication details, etc.) and even with a possibility to store the titles in a personalized collection for later offline retrieval. Physically and temporally, an interface designed in such a location- and zone-independent way establishes a virtual environment that permits a high-quality bibliometric navigation through its language-sensitive cultural contents.
In the same way, the data have also been clustered by the San Francisco-based nonprofit digital library known as the Internet Archive (archive.org), whose proud mission is to grant a "universal access to all knowledge"; however, for the purpose of this paper, we will limit ourselves exclusively to a very tiny segment of the approximately three million public-domain books automatically web-crawled by this activist organization, which supervises one of the globally largest projects of that type. On account of a previous prolific collaboration with the Microsoft Corporation's Live Search Books service, its thirty-odd scanning centers in five geographical areas provide the special cultural heritage collections as cropped-and-skewed images or in a portable document format (PDF). What is more, as opposed to Google Books, its Web-accessible Open Library tends to become a public repository of downloadable, readable, and full-text searchable documents, too.
Yet, even prior to these encyclopedic and lexicographic deposits, the Croatian Dictionary Heritage and Dictionary Knowledge Representation (CDH) project had been conceived and carried out by information and communication scientists.
1 The service is corporately headquartered in the so-called "Googleplex" in Mountain View, California, whose name is itself a portmanteau of the words Google and complex that deliberately references the googolplex (10^googol), i.e., a very large number.

Case study
In the paper, we have first concisely taken into account the Croatian bilingual and trilingual lexicography of the 17th and the 18th century, when the first terminological advancements are observable, and have afterwards turned our attention toward the 19th- and the 20th-century works and their digitization.
We have also noted that many of the dictionary forms, especially those from the 19th and the 20th century, when Croatian linguistics had already been fully developed, have propitiously continued to be quite applicable even if they are unjustifiably considered "archaic" or "obsolete." Owing to their brevity, they could therefore be easily featured for the more elevated Croato-English stylistic purposes, e.g., in administration, in the military, or in philology, if previously orthographically modernized and rendered in a standard Shtokavian fashion.
We have conducted a comprehensive case study which indicates that numerous older Croatian lexicographers have borrowed and revived the original and less-accustomed words from the older linguistic strata and have even adapted certain dialecticisms; the examples thereof (italicized), as well as the 20th-century neologisms devised according to this pattern (underlined), are seen as a useful theoretical prerequisite for further research and studies.
But why are these wordlists still relevant? In this era of misapprehensions, wherein everyone desires to communicate but is sporadically unprepared to patiently discuss, a perusal of the interesting matured glossaries, vocabularies, and volumes reminds us of an inspiration to the creative intellects that motivates us to ingeniously evaluate and experience even a cryptic civilizational inheritance.
It authorizes us to be actively and communally engaged and concentrated on a rediscovery and research of the lexica that someone has verbally mediated to us once in a long while. Confronted by the challenges of the 21st century, a bibliophile, nowadays aided by machine intelligence as well, may thus almost effortlessly become more educated and empowered to change individually while simultaneously transforming the surrounding milieu itself. These are the caches of multilingual advisory information, so one may nowadays frequently reach for an exemplar from a computerized "bookshelf," imaginarily dusting off its illuminated covers not only to admire the ancient bookbinders' and illustrators' artistry but also to explore its entries and acquire an anticipated stimulus to immerse in our sumptuous phrasebooks and coin innovative terminologies for the new epochs.
Therefore, the paper also aims to present a succinct practical and theoretical analysis of the Croatian and English notions of Eurospeak, i.e., a special, hardly decipherable metalanguage of civil servants, politicians, and the public administrative institutions of the European Union. Imitating the former (Vulgar) Latin that was used as a vehicular lingua franca in a better part of our common European history, this jargon actually assimilates the lexis of several parlances, mostly the English and the French idioms, to expedite an unobstructed communication of people whose mother tongues are neither English, French, nor German, the three most recurrent out of the twenty-four official Union languages.
Primarily being an argot, Eurospeak is a philological variant that utilizes specific amalgamated or calqued coinages from various disciplines, usually informatics, journalism, etc., in addition to literature. Accordingly, it represents an unexpected opposition: due to its inclusiveness, it considerably eases conversation and expert participation in a new situation in the Member States' territories, being spoken by all individuals from an expert's circle, but it is regularly excessively exclusive and hence hardly comprehensible, without further explanations, to the broader spheres of uninitiated recipients of the same matters.
Conceptually, the paper's topic further problematizes an inspiring utilization of the digitized Croatian lexicographic heritage in forming a modern Croatian terminology as an attempt to gradually replace Eurospeak; however, such a stimulating modality to "exploit" tradition and excogitate a motivating solution to the contemporary terminological necessities of the Croatian language is still insufficiently valorized by the 21st-century Croatian lexicographers, probably because of an abstract, deficient cognizance that a considerable number of the old(er) Croatian dictionaries have already been digitized. Nonetheless, Eurospeak being a metalanguage of the European Union, one should tentatively realize that certain concepts are simply untranslatable by a single Croatian word.
An academic interest is obviously evinced, but scientific expectations in this respect are yet to be met. To arrive at tenable, valid conclusions, a continued systematic elaboration is called for, desirably in collaboration with the experts in Croatian studies. Exemplified by concrete research results, a two-stage process would illustrate word formation (whereby the old models could be productively reapplied) and neologism acceptance, both in general lexis and in terminology.
Still, when it comes to the new words, the method also implies an incontestable fact that some coinages, though perfectly formed, have never been commonly adopted, as was the case in Croatian with Bogoslav Šulek's lučba ("chemistry"). 4 Thus, a delicately difficult task for the modern Croatian scientists, notably assigned to the junior and senior philologists and to those in the Croatian studies and in the English studies, would be to remotely access the locally stored digital library contents, to retrieve information from the electronic media formats, and to thoroughly investigate whether the digitized Croatian dictionaries might also serve as the possible substrata for the Croatian terminological neologisms that could progressively replace the anglicized metalinguistic Eurospeak. Still, however onerous the task might be, it hopefully also seems fruitfully performable, because the institutional repository software in academia has considerably evolved from the days of the Online Public Access Catalog (OPAC) or the librarian Education Resources Information Center (ERIC) and the early attempts that facilitated archiving, organization, and search only. Unlike the traditional ones, the state-of-the-art digitally converted catalogs are more user-friendly in terms of their authentications and interfaces, i.e., they are constantly available, multiply accessible, and have virtually no physical limits, whereby the perennially acute conservation, preservation, and storage problems may be more successfully addressed and resolved. Likewise, the absolutely intrinsic value attached to the older Croatian digitized thesauri is even surpassed by the fact that the scanned imagery may be supplied with metadata and further enhanced to remove discolorations and improve the overall textual legibility (Gert, 2000), while the disadvantages such as the access equity, also known as the so-called "digital divide," or system interoperability may be more adequately outweighed in a scholarly community.
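The retrieval step mentioned above can be sketched in a few lines, purely as an illustration. The sketch assumes that a digitized dictionary is already available as plain OCR text lines; the sample lines and the function names fold and find_entries are hypothetical and do not belong to any of the services or projects named in this paper. It folds letter case and combining diacritics so that an OCR rendering such as lucba still matches the query lučba.

```python
import unicodedata


def fold(text: str) -> str:
    """Lowercase the text and drop combining diacritics (e.g., č -> c)."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_entries(lines, query):
    """Return (line_number, line) pairs whose folded text contains the folded query."""
    key = fold(query)
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if key in fold(line)
    ]


# Hypothetical OCR lines, invented for illustration only; the last one
# imitates an OCR pass that has lost the diacritic.
sample = [
    "kolodvor, railroad station",
    "vodopad, waterfall",
    "lučba, chemistry",
    "lucba, chemistry",
]

hits = find_entries(sample, "lučba")
for number, line in hits:
    print(number, line)
```

Letters without a canonical decomposition, such as đ, are not folded by this approach and would need an explicit mapping.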
According to Perrin (2015) and other authors dealing with the digitization of primary textual sources and fragile analog collections in humanities, these representations or specific images are the preciously instrumental digital surrogates, i.e., the facsimiles that minimize the waste of time while embracing the so-called "lean philosophy," especially in cases similar to the digitization (or even to the archival, library, or museum digital preservation) of older Croatian dictionaries printed on the progressively acidifying wood-pulp paper, whose deterioration would otherwise be quite imminent.
In due course, the only superficially vague notion of a hypothetical correlation between the two processes implied in the paper's title (i.e., the present-day digitization of the older Croatian dictionaries on one side and a possibly parallel or subsequent utilization of the digital surrogates as the substrata for a coinage of the new Croatian terms on the other side), as well as a production of conceivable Croatian neologistic equivalents, which we justifiably and persistently advocate throughout the paper as an elicited ingenious response to the Eurospeak in the times in which we are inundated with Anglicisms, have, to tell the truth, already partially occurred in the Croatian lexicography, though in completely different, compelling historical circumstances and with a far less advanced technology.
4 Etymologically modeled after Medieval Latin alchymia and Arabic al-kīmiyāʾ (الكيمياء), "philosopher's stone," i.e., after Late Greek chēmeía (χημεία), "black magic," and Greek chymeia (χυμεία), "mixture," it was proposed to be used instead of the borrowing kemija.
To be exact, Horvat (2004) indicates Bartol Kašić's 1599 Slovoslovlje dalmatinsko-talijansko, i.e., his Dalmatian-Italian dictionary, to presumably be among the earliest works that have reversed Faust Vrančić's previous glossaries, notably the first Croatian printed lexicon, Dictionarium quinque nobilissimarum Europae linguarum, Latinae, Italicae, Germanicae, Dalmaticae et Ungaricae, known as the quintilingual fount of "the noblest parlances" in Latin, Italian, German, Dalmatian, and Hungarian. In other words, by virtue of its neologistic concurrences and calques (especially from Czech, German, Italian, Russian, Slovak, and Slovene in previous epochs and increasingly from the English language nowadays), the Croatian lexicography has always terminologically kept an eye on societal developments, though lacking in the obvious amenities of today's referential digitization. This propensity can be traced back to the monumental 50,000-word Latin-Illyric (i.
Namely, as we have mentioned above, upon the accession of the Republic of Croatia to the European Union, many European terms that had been enigmatic heretofore became quotidian thereafter. Yet, for some of them there is no adequate Croatian equivalent or replacement so far, although Croatia is the 28th Member State of the Union. A number of these lexemes are barely translatable by a single vocable, particularly if they are themselves the acronymic and partially inconsistent European synthetic words, and a lexicographically reflected fact that Croatia was a part of an entirely different constitutional order up to the 1990s is thereby an aggravating circumstance.
Analytically and methodologically, we will proceed in this paper by chronicling the basic traductological challenges, especially following the 2009 Treaty of Lisbon and its legislative context, which was practically unknown hitherto. Referenced hereby are, e.g., the French-derived acquis and Avis, as well as the English constructs like "comitology," "communitization," "flexicurity," "Fortress Europe," "Founding Fathers," "four freedoms," "non-paper," "rendezvous clause," "two-speed Europe," "White Papers," etc. They necessitate a select approximation commensurate to the level of the addressee's prior knowledge, for these ostensibly ordinary paronyms assume additional meanings in the arbitrary European legalese.
It has been demonstrated that the importance of an authentic, correct Croatian rendition and the consistent application of a harmonized, prescribed nomenclature stipulated by the acquis communautaire is immense. It largely pertains to the transposition of a multifarious scope of an adopted regulation, notably the one originally compiled in English. Bearing in mind that such an onomasticon denotes a set characteristic of an entrepreneurial, professional, or scientific category, it is an indispensable basis for an efficacious improvement of the so-called "knowledge sharing" and an accompaniment to an inclusive societal expansion.
As all the speakers within an area essentially operate with the same phrased appellations for the identical notions, these locutional designations actually prevent ambiguities, explicitly if accuracy is implied as a sine qua non, e.g., in the governance and in the law.
Let it be further noted that, since the principle of a "multi-speed Europe" is being linguistically promoted in the European Union with regard to the extent to which the attainments are accepted, a deepening of involvement in all the domains of a communal European existence is also advocated, except in case of an opt-out.
What Europe habitually and unequivocally demands is epitomized in English as a "hard core," i.e., as a close cooperation of the neighboring countries because of their similar past conditions, and it increasingly affects the civic awareness in our region, too. Recently, the pro-European tendencies in Croatia have also effectuated a totally novel lingo. By reason of these vicissitudes, an incessant activity in the perfection of Croatian metaphrases and paraphrases and an introduction to the multiple European institutions have become ubiquitous.
Though the real expenses of interpretation in the European Union's minority vernaculars are enormous, they substantiate a determination to respect the differences, confirmed by the Council of Europe's 2011 decision on the declaration of the European Year of Languages. The fact that it was warmly welcomed in all the 28 Member States of the Union and was later backed by the United Nations Educational, Scientific, and Cultural Organization testifies to the prominence attached to language and its acquisition at any stage by the inhabitants of the aforementioned lands, be it for a personal growth or for the evolution of a society as a whole. As the 24th official language, Croatian faces fresh opportunities in a commonwealth of more than 500 million residents, although it will still enjoy its slightly subordinate status.
To begin with, the digitized Croatian lexicographic corpora, particularly those from the 18th, 19th, and early 20th century, are frequently (and quite expectedly) characterized by the extraordinarily descriptive Croatian translational equivalents and encyclopedic synonymy in a form of the inspiring lexicon entries, as well as by a large quantity of Slavic adaptations and loanwords. Specifically, they verify an immense and occasionally inestimable authorial creativity when it comes to the derivatives (being mostly the agent nouns), but they are considerably less imaginative when it comes to the compounds. A cogent reason therefor is that a linguistically imported, pseudo-Germanic compounding, popularized in the 19th century, is not an autochthonous Slavic word-formation mode but rather a literal translation of formative constituents according to a German prototype. 5
Since these multilingual contributions to our lexis are still insufficiently and only gingerly explored (because the usage of the digitized Croatian dictionaries to conduct an original and solid research in their subject matters and word-formation patterns is still a relatively new phenomenon), a penetrating analysis could demonstrate that such neologisms have not been coined on account of an unselective purist motivation but rather for the sake of precision and an evidently exhibited validation of Croatian formational and productive capacities. Nonetheless, as both the Croatian lexicographic and grammatical (that is, word-formation) traditions have been noticeably characterized by these purist aspirations as the consequences of extralinguistic (that is, historical and political) commotions, this purism should not be contextually comprehended as dogmatic, exclusive, or automatically negative: as a deliberate cultural weltanschauung, it has actually promoted a creative, prudent utilization of Croatian formants based on a priceless and proud heritage.

Conclusion
The digitized older Croatian dictionary corpus exemplifies that both the Slavic borrowings (e.g., of a Czech, Russian, Slovak, or Slovene descent) and idiolectal neologisms are findable in lieu of loanwords, including astute insertions of a classical Chakavian or Kajkavian lexical stock. Lexicalized in the Croatian spirit while avoiding extreme purism and maintaining clarity, they might be reintroduced even if temporarily unaccustomed and might serve as excellent templates for analogous paradigms in the future.
Likewise, our detailed, intensive study arrived at a tenable conclusion that numerous older Croatian lexicographers (who were active in the periods observed) have actually also borrowed and revived the original but less accustomed words from the older Croatian linguistic strata and lexical stocks or have phonologically, morphologically, syntactically and lexically adapted certain organic dialecticisms (especially the Shtokavian ones emanating from the idiom of the Dubrovnik literati circle) 6 when coining their neologisms for the new foreign (non-Slavic) expressions.
5 Namely, having replaced the predominantly German morphemes by the Croatian ones, the "father of the Croatian scientific terminology" Bogoslav Šulek calqued the terms such as kolodvor ("railroad station," German Bahnhof), vodopad ("waterfall," German Wasserfall), etc. In his purist awareness, Šulek thus followed his lexicographic principles and primarily selected the Croatian dialectal lexemes, having relied on other Slavic languages as a second option only.
6 Sometimes, however, the exertion of influences of other dialectal traits is noticeable in the lexicographic writings produced by these Croatian polymaths as well, e.g., of the Chakavian ones, characteristic of other Dalmatian conurbations (Šibenik, Zadar), of the Kajkavian, or even of the trilingual one (Chakavian, Shtokavian, and Kajkavian), characteristic of the Ozalj literary-linguistic circle.
It is therefore worth mentioning that a series of presently commonplace Croatian words were once just the neologisms coined by Ivan Mažuranić, paving a pathway for Bogoslav Šulek and his scientific terminology. For example, without Mažuranić's compounds and syntagmata, we would not have računovodstvo, tržišno gospodarstvo, veleizdaja, and velegrad among the utterances in modern Croatian, whereas one should be grateful to the sparks of Šulek's inventive genius for the words glazba, narječje, obrazac, pojam, skladba, tvrtka, uradak, uzor, zemljovid, zdravstvo, etc. A substantive but still occasionally divisive issue is whether these neologistic polyglottal procedures could be optimally repeated and productively (re)applied in standard modern Croatian so that "odborovanje" may replace the Euro-English comitology or that "pozajedničenje" may stand for the Euro-English communitization. In our opinion, they could.
Nevertheless, in addition to the concerns about a possibly excessive reliance on archaisms or purisms, one should also always try to lexically, phonetically, and phonologically adhere to the principle of (Croatian) linguistic economy when it comes to the formation of terminological neologisms, as the aspects of linguistic dynamics and statics do play an active role in their adoption rate. This could be cited as a logical reason why some Croatian translations, especially those in the ICT sector like očvrsje and sklopovlje ("hardware"), napudb(in)a ("software"), ponudnik ("menu"), pretpostavljena vrijednost ("default"), pravopisni provjernik ("spellchecker"), etc., have been adopted less commonly, too.
Finally, in the era of realistic Amerocentric supremacy and globalizing capitalist neoliberalism, firmly supported by a considerably anglicized, formalized jargon of the European Union governance, Croatian culturohistorical identity is, inevitably, partially jeopardized and subject to these unitary tendencies as well. In such an institutionalized, politicized argot of civil servants, economists, and lawyers, terminological neologisms may be intentionally deployed, but they necessarily have to be expansively preadapted and media-propagated to reach the general public, especially when it comes to its most practical and progressive segment, i.e., to the cohorts of the digitally dependent Millennials. In Croatia, these juveniles of Generation Y, usually born in a time span extending from the 1980s to the 1990s, are especially susceptible to abbreviations, English phraseology, and verbal truncation, comprehending the notion of the aforementioned "linguistic economy" quite literally. To surmount the gap while optimally reapplying the patterns of the Croatian 18th- and 19th-century lexicography to the scientifically and technologically evolved vernacular of the new age brackets, with a possibility to compile online dictionaries, would be a pioneering but expectedly accomplishable mission.

Figure 2. Hrvatsko-njemačko-talijanski rječnik znanstvenog nazivlja (1874) by Bogoslav Šulek (the photograph elements taken from http://www.antikvarijatzz.hr/knjige/visejezicni/hrvatskonjemacko-talijanski-rjecnik-znanstvenog-nazivlja-i-ii/1973/)
The information and communication scientists grouped around Damir Boras, Ph.D., Full Professor, in his capacity as the Principal Investigator of the CDH project, had digitized the ten most important Croatian dictionaries, originally published between 1595 and 1881 (e.g., chronologically, those by Faust Vrančić, Jakov Mikalja, Juraj Habdelić, Ivan Belostenec, Ardelio Della Bella, Ivan Mažuranić and Jakov Užarević, Bartol Kašić, Josip Altman and Stevan Bukl, as well as the one by Joakim Stulli). Thematically, their irreplaceable corpora, presented at a round table on the Use of Information Technology in Lexicography, 2 are currently also browsable and cross-searchable on the Croatian Old Dictionary Portal, 3 containing the interlingual entries in Latin, Italian, German, Hungarian, Czech (Bohemian), Polish, and in other Croatian variants (i.e., Dalmatian, etc.). These major lexica, regularly completely databased, are occasionally photographed in very high resolutions and transcribed (or transliterated, according to the modern Croatian standard), whereas a digital transformation and addition of six-odd other smaller phrasebook editions, such as those by Jakov Anton Mikoč, Božo Babić, or Milan Žepić, complements the endeavor.