Fig. 2
From: We are not ready yet: limitations of state-of-the-art disease named entity recognizers

Comparison of the data sets with scattertext. Each axis shows the frequency of a term in the given documents. In Fig. 2a, the BC5CDR training set is compared to its given test set, whereas in Fig. 2b, the BC5CDR training set is compared to the NCBI training set. In Figs. 2c and 2d, the BC5CDR training set and the NCBI training set are compared against a randomly chosen PubMed corpus of similar size.