-------------------------------------------------------------------------------
   README
-------------------------------------------------------------------------------

Aligned titles extracted from the May 2020 Wikipedia dumps using the Wikitailor
Toolkit. The file names indicate the ISO 639-1 codes of the languages included.
The output format from Wikitailor is preserved.

Format. Multilingual entries, one per line, with fields separated by tabs. The
order of the languages corresponds to their order in the filename. For each
language, we include the pair "WikipediaID Title", also separated by a tab.
Words in a title are joined by underscores. Example:
1702653	Mar_Cinese	41265	Mar_de_la_Xina	158030	Marea_Chinei	
(from the file it.0.ca.0.ro.0.t0.union)
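
The format described above could be read with a short Python sketch like the
one below. It is not part of Wikitailor. The helper names
(languages_from_filename, parse_entry) are hypothetical, and the code assumes
that every two-letter lowercase token in the filename is a language code, that
every line carries one ID/title pair for each language, and that a trailing tab
may be present, as in the example line.

```python
from typing import Dict, List, Tuple


def languages_from_filename(filename: str) -> List[str]:
    """Return the language codes found in a file name, in order.

    Assumption: every dot-separated token made of exactly two lowercase
    letters is an ISO 639-1 code; other tokens ("0", "t0", "union") are
    skipped because their meaning is not documented here.
    """
    return [
        token
        for token in filename.split(".")
        if len(token) == 2 and token.isalpha() and token.islower()
    ]


def parse_entry(line: str, languages: List[str]) -> Dict[str, Tuple[int, str]]:
    """Map each language code to its (WikipediaID, title) pair for one line."""
    fields = line.rstrip("\n").split("\t")
    # Drop empty fields produced by a trailing tab at the end of the line.
    while fields and fields[-1] == "":
        fields.pop()
    # Assumption: each line has exactly one ID/title pair per language.
    if len(fields) != 2 * len(languages):
        raise ValueError(
            f"expected {2 * len(languages)} fields, found {len(fields)}"
        )
    entry = {}
    for index, language in enumerate(languages):
        page_id = int(fields[2 * index])
        # Words in a title are joined by underscores; restore the spaces.
        title = fields[2 * index + 1].replace("_", " ")
        entry[language] = (page_id, title)
    return entry


if __name__ == "__main__":
    codes = languages_from_filename("it.0.ca.0.ro.0.t0.union")
    example = "1702653\tMar_Cinese\t41265\tMar_de_la_Xina\t158030\tMarea_Chinei\t"
    print(parse_entry(example, codes))
```

If a file turns out to contain lines with missing languages, the length check
raises ValueError rather than silently misaligning IDs and titles, and the
parsing would need to be adapted to however such gaps are encoded.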
 
Versions. The complete version ("union") includes all the entries. The "cleaner"
version applies naïve cleaning to titles that include years, dates, parentheses,
and similar patterns.

