Lemma vs. Token frequency in John’s Gospel

In this post I discuss, in brief and embryonic form, the difference between lemma and token frequencies for John’s Gospel. At the bottom of this post you’ll find an unwieldy table. I haven’t quite figured out a good way to do tables.

Anyway, these days I’m doing lots of thinking about language teaching, reading, text-based approaches, and so on. And I thought it would be useful for my super-baby-coding skills to pull up a list of words in John’s Gospel sorted by frequency, both the lemmata and the actual instances. That’s what’s in the table at the bottom: it covers only the 100 most frequent lemmata, and then the 100 most frequent tokens, each placed on the row of its lemma. The left-most two columns are frequency and lemma, and all the columns to the right of that are frequency-token pairs.
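For anyone curious how a table like this can be built, here is a minimal Python sketch. It is not the script I actually used. It assumes you already have the text as a list of (token, lemma) pairs from some morphologically tagged edition, and the function name `frequency_table` and the tiny sample data are made up purely for illustration.

```python
from collections import Counter, defaultdict


def frequency_table(pairs, top_lemmas=100, top_tokens=100):
    """Build rows of (lemma_freq, lemma, [(token_freq, token), ...]).

    `pairs` is an iterable of (token, lemma) tuples (an assumed input format).
    Only the `top_tokens` most frequent tokens overall are kept, and each
    one is listed on the row of its lemma, most frequent first.
    """
    lemma_counts = Counter()
    token_counts = Counter()
    for token, lemma in pairs:
        lemma_counts[lemma] += 1
        token_counts[(lemma, token)] += 1

    # most_common() yields entries in descending order of count, so each
    # per-lemma list ends up sorted from most to least frequent.
    by_lemma = defaultdict(list)
    for (lemma, token), n in token_counts.most_common(top_tokens):
        by_lemma[lemma].append((n, token))

    return [
        (n, lemma, by_lemma.get(lemma, []))
        for lemma, n in lemma_counts.most_common(top_lemmas)
    ]


# Hypothetical sample data, not real counts from John.
sample = [
    ("λέγει", "λέγω"), ("εἶπεν", "λέγω"), ("λέγει", "λέγω"),
    ("ἦν", "εἰμί"), ("ἐστίν", "εἰμί"),
    ("καί", "καί"),
]

for freq, lemma, tokens in frequency_table(sample):
    print(freq, lemma, " ".join(f"{n} {tok}" for n, tok in tokens))
```

Each printed line has the same shape as a row of the table below: the lemma frequency, the lemma, and then frequency-token pairs. Lemmata whose forms all fall outside the token cutoff simply get an empty right-hand side, which is why rows like ποιέω show no tokens.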

About five words down, things get interesting. It’s much more useful to learn εἶπεν and εἶπον than to fill out the Pres.Act.Ind. paradigm of λέγω. Similarly, the two highest-frequency forms of εἰμί are ἐστίν and ἦν. σύ is the most interesting, because its two highest-frequency tokens are actually plurals, ὑμῖν and ὑμεῖς, which look nothing like σύ. ἔρχομαι is also interesting, because I actually thought its aorist forms would turn up with higher frequency, but it’s ἔρχεται that is the star performer in John.

There’s much more to be done, but since I was messing around with this, I thought it would make a good little post between Christmas and New Year.

 

Freq Lemma Freq Token
2159 ὁ 557 ὁ 242 τοῦ 240 τόν 145 τό 141 οἱ 140 τήν 120 ἡ 112 τῷ 107 τῶν 82 τῆς 82 τά 72 τῇ 55 τούς 37 τοῖς
813 καί 813 καί
751 αὐτός 171 αὐτόν 170 αὐτῷ 169 αὐτοῦ 99 αὐτοῖς 34 αὐτῶν
507 ἐγώ 129 ἐγώ 102 μου 99 με 40 ἐμέ 39 μοι 29 ἐμοί 26 ἐμοῦ
473 λέγω 122 λέγει 112 εἶπε(ν) 40 εἶπον 36 λέγω 34 ἔλεγον 26 εἶπαν
443 εἰμί 166 ἐστί(ν) 96 ἦν 54 εἰμί 26 εἶ
406 σύ 103 ὑμῖν 68 ὑμεῖς 60 σύ 47 ὑμῶν 37 ὑμᾶς 29 σου 23 σοι
279 οὐ 279 οὐ
270 ὅτι 270 ὅτι
239 Ἰησοῦς 193 Ἰησοῦς 26 Ἰησοῦν
237 οὗτος 61 ταῦτα 52 τοῦτο 49 οὗτος
220 ἐν 220 ἐν
201 δέ 201 δέ
197 οὖν 197 οὖν
182 εἰς 182 εἰς
165 ἐκ 165 ἐκ
159 ὅς 38 36 ὅν 30
155 ἔρχομαι 38 ἔρχεται
144 ἵνα 144 ἵνα
136 πατήρ 51 πατήρ 37 πατέρα 27 πατρός
118 μή 118 μή
110 ποιέω
102 ἀλλά 102 ἀλλά
100 πρός 100 πρός
98 πιστεύω
86 ἔχω
83 θεός 46 θεοῦ
80 οἶδα
79 τίς 47 τί
78 μαθητής 36 μαθηταί
78 κόσμος 26 κόσμου 23 κόσμον
78 ἀποκρίνομαι 57 ἀπεκρίθη
75 δίδωμι
72 ὁράω
71 Ἰουδαῖος 30 Ἰουδαῖοι 25 Ἰουδαίων
70 ἐκεῖνος 39 ἐκεῖνος
67 περί 67 περί
64 πᾶς
64 γάρ 64 γάρ
59 λαλέω
59 ἐάν 59 ἐάν
59 διά 59 διά
59 ἄνθρωπος 22 ἄνθρωπος
58 ἀκούω
56 τις 31 τὶς
56 γινώσκω
55 μετά 55 μετά
54 υἱός 26 υἱός
51 οὐδείς 26 οὐδείς
51 κύριος 32 κύριε
51 γίνομαι
50 ἀμήν 50 ἀμήν
49 εἰ 49 εἰ
45 λαμβάνω
43 πάλιν 43 πάλιν
41 πολύς
41 ἐμός
40 μένω
40 λόγος
40 ἀπό 40 ἀπό
38 εἷς
37 δύναμαι
37 ἀγαπάω
36 ζωή 23 ζωήν
34 Πέτρος 23 Πέτρος
34 παρά 34 παρά
34 ζητέω
33 μαρτυρέω
33 ἐπί 33 ἐπί
32 ὑπάγω
32 πέμπω
32 ἄλλος
31 καθώς 31 καθώς
31 ἡμέρα
30 ὡς 30 ὡς
30 ὅπου 30 ὅπου
30 κἀγώ 27 κἀγώ
30 ἑαυτοῦ
29 ἐξέρχομαι
28 νῦν 28 νῦν
28 ἀποστέλλω
28 ἀποθνῄσκω
27 ἐρωτάω
27 ἔργον
26 ὥρα
26 ἄν 26 ἄν
26 αἴρω
25 Σίμων
25 ὄνομα
25 ἀλήθεια
24 πνεῦμα
24 θεωρέω
24 ἄρτος
23 φῶς
23 Ἰωάννης
23 θέλω
23 δοξάζω
22 ἐκεῖ 22 ἐκεῖ
21 ὕδωρ
21 ὅτε

 

3 responses

  1. It’s interesting to see the observations you draw out of such an activity. I am in the process of creating some software that would allow easily generating this sort of thing. I wonder, did you make this by hand, or use a feature of Accordance/Logos?


    • Cool, SBLGNT is a helpful resource. At the moment I’m interested in not just the NT but how things compare with the LXX as well.
