The ALeSKo corpus is a learner corpus of texts written by Chinese L2 learners of German. It currently comprises 43 argumentative texts with a total size of 13,587 tokens. The data is complemented by 39 German L1 texts (34,155 tokens) from the Falko Essays L1 0.5 corpus.
The goal of creating the ALeSKo corpus is to investigate coherence in learner texts. Coherence manifests itself on various levels of a text: among other things, it involves appropriate reference handling and a smooth flow from one sentence to the next. To this end, the data is annotated with syntactic, referential, and discourse information.