The Book of Matthew
Overview
The Book of Matthew in the UPDV is a reconstructed text. The Greek Matthew that survives in all known manuscripts is not a simple translation from a Hebrew or Aramaic original — it is a Greek composition, built by a skilled Greek-speaking author who assembled multiple sources: a small collection of Aramaic sayings, the Gospel of Mark, Old Testament quotations drawn from the Greek Septuagint, and original editorial material. After this initial composition, a later editor added further layers — birth stories, prophecy fulfillment formulas, eschatological intensifications, and narrative embellishments consistent with Greco-Roman literary convention.
These modifications were made so early that every surviving manuscript contains them. The edited text became the archetype for the entire subsequent tradition, creating a transmission bottleneck before our extant manuscript lines diversified. The UPDV reconstruction identifies and removes the editorial additions, restores chronological order where the text was rearranged, draws on parallel accounts in Mark and Luke where Matthew's text is unreliable, and renumbers the resulting chapters and verses. The reconstruction retains approximately 70% of the canonical Matthew verses.
This article explains the evidence behind the reconstruction in three layers of depth: a plain summary for general readers, the historical and textual evidence, and the computational validation — including a 2026 manuscript transmission analysis that independently confirmed the composition model by detecting Aramaic translation seams only in Matthew's inherited sayings material, with zero signal in the editorial framework.
The Problem in Plain Terms
The Gospel of Matthew was not written in one sitting by one person. A Greek-speaking author — not the Apostle Matthew himself, but someone working with sources that trace back to him — assembled the gospel from at least three streams of material: sayings of Jesus preserved in Aramaic, the narrative framework of Mark's Gospel, and Old Testament quotations taken from the Greek Septuagint (not translated fresh from Hebrew). This author composed the connecting narrative, the editorial bridges, and the five-discourse structure in fluent Greek.
After this initial composition, the text underwent substantial modification. A later editorial layer added the narrative birth stories (though not the opening genealogy), inserted prophecy citations throughout, rearranged teaching material into long thematic speeches, amplified miracles and punishments, and added narrative details consistent with Greco-Roman literary convention — elements far more at home in secondary embellishment than in the inherited synoptic tradition.
These changes were made so early that every surviving Greek manuscript contains them. There is no "unedited" copy of Matthew to compare against. But the modifications left traces:
- Other Gospels don't have them. Mark and Luke record the same events without the additions. When Matthew has extra material that Mark and Luke lack, that material consistently shows patterns of literary embellishment, theological agenda, or stylistic features foreign to the rest of the book.
- Patristic reports of a shorter text. Church fathers writing in the second through fourth centuries report Hebrew or Jewish-Christian Matthean texts that, in some forms, lacked the birth narrative. Some communities preserved traditions of a substantially shorter gospel.
- The edited text creates tensions. The editorial insertions sometimes create logical tensions with surrounding material — prophecies that don't quite fit their context, doublings of characters that the parallel accounts don't support, and sayings pulled from their original settings into artificial speech collections.
The UPDV reconstruction identifies these editorial additions using multiple independent lines of evidence and removes them to recover a text closer to what the original author wrote. Where removal creates gaps, parallel accounts from Mark and Luke fill them. The resulting text covers the same events and teachings but without the later editorial overlay.
This is not a new harmony constructed from Mark and Luke, but a reconstruction of Matthew's Greek Gospel using the sources Matthew himself appears to have used. Seventy percent of the canonical text survives, including the narrative framework, the teaching material, and the genealogy. The reconstruction preserves what this author uniquely contributed while removing what a later editor imposed. For readers who want to see exactly what changed, the UPDV provides 28 chapter-by-chapter text comparisons showing the old and new text side by side, a source chart identifying which Gospel each verse was drawn from, and a verse renumbering chart mapping old verse numbers to new.
The Evidence
The Sayings Source
The church father Papias (via Eusebius, Hist. Eccl. 3.39.16) wrote that Matthew composed τὰ λόγια (ta logia, "the sayings" or "the oracles") in ἑβραΐδι διαλέκτῳ (hebraidi dialektō, "the Hebrew dialect"). For centuries this was read as claiming Matthew wrote the entire Gospel in Hebrew or Aramaic. But the word λόγια means "sayings" or "oracles" — not "gospel" or "narrative." The manuscript transmission evidence (discussed below under Computational Validation) strongly supports reading Papias as describing a sayings collection, not the gospel we have today.
The historical Apostle Matthew likely wrote down a small collection of Jesus's sayings in Aramaic. A later Greek-speaking author took that collection, combined it with the Gospel of Mark and with Old Testament quotations drawn from the Greek Septuagint, and composed the gospel that bears Matthew's name. Church fathers reference Hebrew versions that differed from the Greek — these likely reflect the original sayings collection or closely related traditions, not a Hebrew prototype of the full gospel.
It is helpful to distinguish the two streams of Aramaic sayings the Greek author imported. The first is Q, a sayings source shared with the Gospel of Luke — material like the sparrows saying (10:29) and the woe oracles (11:21). The second is a unique collection of parables and logia found only in Matthew — material like the wheat and tares (13:30) and the angels saying (18:10) — which may preserve something like what Papias referred to as τὰ λόγια. While these sources arrived through different channels, the substrate scanner detected the same Aramaic translation seams in both, confirming that both layers were originally composed in a Semitic language before being integrated into Matthew's Greek framework.
Witnesses to a Substantially Different Text
The missing opening chapters. Epiphanius of Salamis (Panarion) knew of Hebrew versions that did not contain the first two chapters. He also indicates that some versions retained the genealogy but not the birth narrative, suggesting the infancy material was a later addition to an existing text.[1]
The Jewish Christian tradition. In 1966, Shlomo Pines described a text reflecting the views and traditions of a Jewish Christian community. Though transmitted through later centuries, this tradition appears to reach back to the earliest period of Christianity. These texts imply that the "true" Hebrew Gospel did not contain an account of the birth and life of Jesus as found in the canonical Matthew.[2]
Matthew 1:16 and the virgin birth. Multiple early witnesses, including the Old Syriac (Sinaiticus), Palestinian Syriac traditions, and early patristic dialogues, show that Matthew 1:16 was modified very early to support the specific doctrine of a virgin birth. The detailed textual analysis of these variants is provided in Matthew 1:16 and The Virgin Birth.[3]
Dream narratives and Greco-Roman parallels. The dream stories in Matthew's infancy narrative share formal features with Greco-Roman literature, particularly the ancient novels. Chariton 2.9.6 offers a close parallel to Matthew 1:18b-24: just as Callirhoe resolves a crisis about her unborn child through a dream, Joseph resolves his crisis about Mary's child through a dream. The dream in Matthew 27:19 (Pilate's wife) uses the same terminology, linking the beginning and end of the editorial layer.[4]
The Synoptic Evidence
The clearest evidence comes from comparing Matthew with Mark and Luke. Scholars have long recognized that Matthew drew on two primary sources: the Gospel of Mark (providing the narrative framework) and a sayings collection shared with Luke (known as Q, from German Quelle, "source"). Material found only in Matthew, called Sondergut or "M", requires special scrutiny.
When the three Gospels are placed side by side:
- Where Matthew follows Mark, the text is generally reliable. Matthew preserves Mark's narrative order, often with minor stylistic adjustments. The UPDV retains 88% of these verses.
- Where Matthew shares sayings with Luke (Q material), the text is usually authentic but sometimes displaced. Matthew collected sayings from different occasions into large discourse blocks (the Sermon on the Mount, the Mission Discourse, the Parables Discourse, etc.), while Luke generally preserved their original contexts. The UPDV retains 76% of these verses, sometimes restoring the Lukan order.
- Where Matthew stands alone (Sondergut), the text is most suspect. Only 22% of Matthew's unique verses are retained in the UPDV. The rest consists of editorial additions: the infancy narrative, prophecy fulfillment formulas, eschatological amplifications, and narrative embellishments.
Patterns of Modification
When Matthew's unparalleled material is examined systematically, consistent patterns emerge:
- Prophecy fulfillment formulas. Fourteen times, the editor inserted a citation introduced by a phrase like "this was to fulfill what was spoken by the prophet." These formulas share a distinctive literary style that links them to the infancy narrative: the same editorial hand wrote both.[5]
- Sensationalization. Where Mark reports one demoniac, Matthew has two (8:28). Where Mark describes a straightforward healing, Matthew adds dramatic elements. The pattern is consistent: unparalleled details in Matthew tend to amplify rather than preserve.
- Thematic rearrangement. The editorial process gathered sayings from different occasions into artificial discourse blocks. Matthew chapter 13 collects parables spoken at different times into a single "Parables Discourse." Chapter 23 collects criticisms of the Pharisees into a single denunciation. Chapters 24-25 combine eschatological material from multiple settings. This rearrangement changes the meaning of individual sayings by removing them from their original contexts.
- Dream plot devices. Six times in Matthew, a dream redirects the story (1:20, 2:12, 2:13, 2:19, 2:22, 27:19). Every occurrence is in unparalleled material. This is a narrative technique consistent with Greco-Roman literary convention, not with the inherited synoptic tradition.
- Eschatological intensification. The phrase "weeping and gnashing of teeth" appears six times in Matthew, always in unparalleled material. The formula "the end of the age" (συντέλεια τοῦ αἰῶνος, synteleia tou aiōnos) appears five times, never in the other Gospels. These are editorial stamps — a consistent theological emphasis added throughout the book.
What Was Removed
The reconstruction removes approximately 30% of the canonical text. The major categories:
| Category | Examples |
|---|---|
| Infancy narrative | Birth story, Magi, flight to Egypt, massacre of innocents (1:18–2:23). Note: the genealogy (1:1-17) is retained as original |
| Fulfillment formulas | "This was to fulfill what was spoken by the prophet..." (14 instances) |
| Discourse transitions | "When Jesus had finished these sayings..." (5 instances) |
| Eschatological additions | Weeping and gnashing codas, end-of-age formulas (e.g., 13:39, 28:20) |
| Narrative embellishments | Pilate's wife's dream, blood curse, resurrected saints, guard at the tomb |
| Unattested teachings | Material with no parallel and showing editorial characteristics |
The Reconstruction Method
The UPDV reconstruction uses existing Matthew as the primary source wherever possible. Where the editorial layer demonstrably stripped sayings of their original context to build thematic discourses, the reconstruction draws on parallel accounts in Mark and Luke to recover the earlier chronological sequence. In some cases, readings from multiple Gospels are combined. Passages lacking any confirming witness — whether to the reading itself or its context — are not included. Some texts that appear genuine are included even where their original location is uncertain; footnotes indicate this. Slight narrative adjustment was occasionally required for transitions.
A Note on Method and Preservation
It is important to clarify that drawing upon Mark and Luke to repair narrative gaps does not create a Gospel harmony or a modern Diatessaron. Because the original Greek author of Matthew utilized Mark and a shared sayings source (Q) as his foundational texts, using these same texts simply restores the original source material that the later editor displaced or destroyed. Furthermore, while this chronological reconstruction removes certain authentic parables unique to Matthew (such as the Ten Virgins or the Sheep and the Goats) because the editor stripped them of their original context, these words of Jesus have not been discarded. They are fully preserved in the chapter-by-chapter text comparisons, where each parable's authenticity evidence is discussed alongside the reasoning for its exclusion from the running text — ensuring the preservation of genuine tradition without forcing an artificial narrative placement.
What About the Parables?
The most debated category is the parables found only in Matthew: the Workers in the Vineyard, the Ten Virgins, the Sheep and Goats, the Unmerciful Servant, the Two Sons, the Hidden Treasure, the Pearl, and the Dragnet. Davies and Allison identify these as a probable pre-Matthean parable collection: genuine traditions that Matthew's editor inherited, not editorial inventions.[6]
Computational analysis supports this: the parables cluster stylistically with the authentic core of the Gospel, not with the editorial layer. Their language is Semitic in character, they lack the editorial vocabulary fingerprints found in the fulfillment formulas and infancy narrative, and several use βασιλεία τοῦ θεοῦ (basileia tou theou, "kingdom of God") rather than Matthew's characteristic βασιλεία τῶν οὐρανῶν (basileia tōn ouranōn, "kingdom of heaven") — suggesting they predate the editor's theological vocabulary.
However, knowing that a parable is authentic does not tell us where it originally belonged. Matthew's editor arranged these parables thematically, not chronologically. Restoring them to the text would require choosing a placement, and every placement changes the meaning of the parable and its surrounding context. The UPDV has chosen not to include material whose original context cannot be determined, even when the material itself is likely genuine. This remains the one area where the reconstruction may be more conservative than the evidence strictly requires.
Renumbering
The chapter and verse numbering in Matthew has been changed. When it is necessary to refer to both numbering systems, the old system is indicated by "(old)" or "(old numbering)" and the new system by "(new)" or "(new numbering)." Where neither is specified, the new system should be assumed.
Full reference materials — including the verse renumbering chart, source chart, and chapter-by-chapter text comparisons — are available in the Matthew Reconstruction Reference.
Computational Validation (2026)
In 2026, a comprehensive computational analysis was performed on the Greek text of Matthew to test whether the 2005 reconstruction could be independently validated using quantitative methods. The analysis used the PROIEL Treebank (a linguistically annotated corpus of ancient Greek texts in Universal Dependencies format) as its data source, providing word-level lemmatization, part-of-speech tagging, and morphological features for every token in the Gospel.
Change Point Detection
A sliding-window analysis (600 tokens, step 100) was applied across the full text of Matthew, measuring shifts in function word frequencies and part-of-speech patterns. The algorithm, a kernel-based change point detector operating on a PCA-reduced feature space, identified eight statistically significant breaks in the text. All five of Matthew's major discourse blocks were detected blindly, without any prior information about the Gospel's structure. The strongest break corresponded to the end of the Sermon on the Mount.
A micro-windowing analysis of chapters 1-5 (250-token window) revealed that chapters 1-2 are completely isolated in the feature space from chapters 3-4, confirming that the infancy narrative is stylistically distinct from the rest of the Gospel.
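The windowing step can be illustrated with a minimal sketch. It is not the analysis code: it replaces the kernel-based detector and PCA reduction with a plain Euclidean distance between consecutive window profiles, and the token stream and the `FUNCTION_WORDS` list are invented for illustration. Only the window size (600) and step (100) come from the description above.

```python
from collections import Counter

def sliding_windows(tokens, size=600, step=100):
    """Yield (start_index, window) pairs over a token list."""
    last_start = max(len(tokens) - size, 0)
    for start in range(0, last_start + 1, step):
        yield start, tokens[start:start + size]

def function_word_profile(window, function_words):
    """Relative frequency of each tracked function word in one window."""
    counts = Counter(window)
    total = len(window) or 1
    return [counts[w] / total for w in function_words]

def profile_shift(a, b):
    """Euclidean distance between two window profiles (a simplified
    stand-in for the kernel-based change point statistic)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Hypothetical tracked lemmas and a synthetic 2,400-token stream whose
# "style" changes exactly halfway through.
FUNCTION_WORDS = ["kai", "de", "gar"]
tokens = ["kai", "x", "x", "x"] * 300 + ["de", "gar", "x", "x"] * 300

profiles = [function_word_profile(w, FUNCTION_WORDS)
            for _, w in sliding_windows(tokens)]
shifts = [profile_shift(p, q) for p, q in zip(profiles, profiles[1:])]
```

In this toy stream the shift between neighbouring windows is zero while both windows lie inside one style and becomes positive once a window straddles the boundary, which is the kind of signal a change point detector looks for.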
Translation Greek Classification
A classifier trained to distinguish Translation Greek (Greek translated from a Semitic original, using the Septuagint as training data) from native Koine composition (using Josephus and the Greek novels) was applied to Matthew chapter by chapter.
Result: All of Matthew classifies as Translation Greek; every chapter scores above the classifier's Semitic threshold. Chapters 1-2 scored 70-77% Translation; the narrative core scored even higher. This refutes the "Pindaric" hypothesis (the idea that the infancy narrative was composed in an elevated classical Greek literary style), but it also requires careful interpretation. The CBGM substrate analysis (below) found only 6 actual translation seams in 1,068 verses, meaning the classifier detected a register, not a source language. Matthew's author wrote in deliberate Septuagintal style, echoing the syntax of the Greek Old Testament throughout the gospel. The classifier correctly identified this Semitic syntax; the substrate scanner correctly showed it was imitation, not translation. The editorial conclusion for chapters 1-2 (later addition) still stands, but the entire gospel shares this biblically-inflected Greek register: the infancy narrative's editor matched the existing style.[7]
Synoptic Layer Separation
When Matthew's text is divided by source — Mark-parallel material, Q material (shared with Luke), and Sondergut (unique to Matthew) — and each layer's stylistic profile is measured independently:
- Mark-parallel material clusters with the full text of Mark (PCA distance 0.93)
- Q material clusters with Luke's Q sections (distance 0.98)
- Sondergut floats alone, matching neither
This confirms that Matthew was assembled from identifiable source layers whose syntactic fingerprints survive the compilation process. The compiler smoothed the vocabulary but failed to overwrite the underlying syntax of his sources.
The Editorial Fingerprint
The most significant finding: the 14 fulfillment formulas scattered throughout the Gospel share a micro-syntactic fingerprint with the infancy narrative (chapters 1-2). PCA distance between the fulfillment formulas and the infancy narrative is 0.93 — closer than any Gospel Core layer is to any other Gospel Core layer (internal distances 0.45-0.61). Bootstrap validation (2,000 iterations) confirmed this clustering at p < 0.001 with non-overlapping confidence intervals.
Conclusion: The fulfillment formulas and the infancy narrative were written by the same editorial hand. This hand is stylistically distinct from the compiler of the Gospel Core.[5]
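For readers unfamiliar with bootstrap validation, the following is a minimal sketch of how a percentile confidence interval for the distance between two groups of feature vectors can be obtained. The two-dimensional vectors are invented, and comparing group centroids is an assumption made for illustration; only the iteration count (2,000) is taken from the text.

```python
import random
from statistics import mean

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [mean(column) for column in zip(*vectors)]

def distance(a, b):
    """Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def bootstrap_ci(group_a, group_b, iterations=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the centroid distance:
    resample each group with replacement, recompute, take quantiles."""
    rng = random.Random(seed)
    stats = []
    for _ in range(iterations):
        sample_a = [rng.choice(group_a) for _ in group_a]
        sample_b = [rng.choice(group_b) for _ in group_b]
        stats.append(distance(centroid(sample_a), centroid(sample_b)))
    stats.sort()
    lower = stats[int((alpha / 2) * iterations)]
    upper = stats[int((1 - alpha / 2) * iterations) - 1]
    return lower, upper

# Invented feature vectors standing in for the three text layers.
formulas = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05], [0.1, 0.1]]
infancy = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15], [0.2, 0.2]]
core = [[2.0, 2.1], [2.1, 1.9], [1.9, 2.0], [2.0, 2.0]]

near = bootstrap_ci(formulas, infancy)  # formulas vs infancy narrative
far = bootstrap_ci(formulas, core)      # formulas vs Gospel Core
intervals_overlap = near[1] >= far[0]
```

"Non-overlapping confidence intervals" then means that the upper bound of one interval lies below the lower bound of the other, as it does for this toy data.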
CBGM Substrate Analysis
The most decisive computational finding came from an analysis that used no linguistic data at all — only the behavior of scribes copying manuscripts over centuries.
The Coherence-Based Genealogical Method (CBGM), developed by the Institute for New Testament Textual Research (INTF), builds genealogical relationships between manuscripts based on their patterns of agreement and disagreement. Using CBGM matrices (222 Greek manuscripts in the case of Matthew), a three-layer detection algorithm was applied to each New Testament book with sufficient CBGM coverage to find verses where the Greek text shows signs of having been translated from a Semitic source:
1. Variation unit count. Verses with seven or more variation units (places where manuscripts diverge) indicate abnormal scribal activity — scribes were independently "fixing" something.
2. Syntactic fragmentation filter. For each variation unit, the algorithm checks whether the variation is syntactic (scribes restructuring the grammar — different verb forms, word orders, pronoun choices) or orthographic (scribes misspelling the same word). Syntactic fragmentation indicates a translation seam; orthographic noise indicates an unfamiliar vocabulary item.
3. Orphan rate. For each minority reading, the algorithm checks whether the scribe's closest genealogical relative reads the same way. If the closest relative reads differently, the minority reading is "orphaned" — it has no ancestor in the manuscript tradition and was independently invented. An orphan rate above 75% means scribes were independently generating new readings because the underlying Greek construction felt wrong.
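The orphan-rate step can be sketched as follows. This is an illustration under simplifying assumptions: the witnesses, readings, and closest-relative mapping are invented, "minority" is taken to mean any reading other than the most common one, and the `closest_relative` dictionary stands in for whatever the CBGM relatives lookup returns.

```python
from collections import Counter

def orphan_rate(readings, closest_relative):
    """Fraction of minority readings not shared by the witness's closest
    genealogical relative.

    readings: witness -> reading at one variation unit
    closest_relative: witness -> its closest relative (hypothetical input)
    """
    majority, _ = Counter(readings.values()).most_common(1)[0]
    minority = [w for w, r in readings.items() if r != majority]
    if not minority:
        return 0.0
    # A witness with no recorded relative is counted as orphaned here.
    orphans = [w for w in minority
               if readings.get(closest_relative.get(w)) != readings[w]]
    return len(orphans) / len(minority)

# Invented example: D inherits "y" from E; E and F stand alone.
readings = {"A": "x", "B": "x", "C": "x", "D": "y", "E": "y", "F": "z"}
closest_relative = {"D": "E", "E": "A", "F": "B"}
rate = orphan_rate(readings, closest_relative)
```

Here two of the three minority readings are orphaned, a rate of about 67%, which would fall below the 75% threshold described above.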
The algorithm is blind: it reads no Greek, knows no grammar, has no training data, and cannot distinguish Matthew from Philemon. It simply counts variation units, computes ratios, and applies a binary genealogical check, so its output cannot be steered by theology.
While these heuristic thresholds (7 VUs, 75% orphan, 0.35 syntax) are calibrated to isolate maximum signal, they are not natural constants. It is possible for translated material to be transmitted cleanly enough to avoid detection (false negatives), or for uniquely difficult Greek to trigger the alarm for reasons unrelated to translation (false positives). However, the 8:1 disparity between John and Matthew persists across various threshold adjustments, suggesting it reflects a fundamental difference in the texts' historical origins rather than an artifact of the algorithm.
Results for Matthew: Of 1,068 verses, only 6 triggered all three filters — the fewest of any Gospel:
| Verse | VUs | Orphan Rate | Syntax Score | Content |
|---|---|---|---|---|
| 10:29 | 7 | 94.4% | 0.50 | "Two sparrows sold for a farthing" — Q saying |
| 16:3 | 7 | 90.0% | 0.50 | Weather proverb — omitted by Sinaiticus and Vaticanus |
| 18:10 | 7 | 88.9% | 0.50 | "Their angels behold the face" — saying unique to Matthew |
| 13:30 | 10 | 92.3% | 0.35 | Wheat and tares parable — 65 unique readings across 109 manuscripts |
| 11:21 | 9 | 88.0% | 0.39 | "Woe to you, Chorazin" — Q woe oracle |
| 26:60 | 10 | 93.1% | 0.35 | False witnesses at trial — passion tradition |
Every site is in inherited material — sayings from Q (shared with Luke), parables from a special source, or the pre-Matthean passion tradition. Not a single site falls in material composed directly in Greek — whether the original structural framework like the genealogy and discourses, or the later editorial additions like the birth narrative and fulfillment formulas. The author's own composition produced zero translation seams because it was composed in Greek.
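The combined three-filter test can be written as a short sketch using the six rows of the table above. The comparison operators are an interpretation of the text ("seven or more" variation units, orphan rate "above 75%", and a syntax score of at least 0.35, since two flagged verses score exactly 0.35); the three failing rows are hypothetical and included only to show each filter rejecting something.

```python
from dataclasses import dataclass

MIN_VARIATION_UNITS = 7   # "seven or more variation units"
MIN_ORPHAN_RATE = 0.75    # orphan rate must be above this
MIN_SYNTAX_SCORE = 0.35   # assumed inclusive, see lead-in

@dataclass(frozen=True)
class VerseStats:
    ref: str
    variation_units: int
    orphan_rate: float   # as a fraction, 0.0 to 1.0
    syntax_score: float

def passes_all_filters(v):
    """True only if the verse clears all three thresholds."""
    return (v.variation_units >= MIN_VARIATION_UNITS
            and v.orphan_rate > MIN_ORPHAN_RATE
            and v.syntax_score >= MIN_SYNTAX_SCORE)

# Values copied from the results table for Matthew.
MATTHEW_SITES = [
    VerseStats("10:29", 7, 0.944, 0.50),
    VerseStats("16:3", 7, 0.900, 0.50),
    VerseStats("18:10", 7, 0.889, 0.50),
    VerseStats("13:30", 10, 0.923, 0.35),
    VerseStats("11:21", 9, 0.880, 0.39),
    VerseStats("26:60", 10, 0.931, 0.35),
]

# Hypothetical verses, each failing exactly one filter.
HYPOTHETICAL_MISSES = [
    VerseStats("hypothetical-1", 6, 0.95, 0.50),  # too few variation units
    VerseStats("hypothetical-2", 9, 0.60, 0.50),  # orphan rate too low
    VerseStats("hypothetical-3", 9, 0.90, 0.10),  # mostly orthographic noise
]

flagged = [v.ref for v in MATTHEW_SITES + HYPOTHETICAL_MISSES
           if passes_all_filters(v)]
```

Because the filters are combined with a logical AND, tightening any single threshold can only shrink the flagged set, which is why the thresholds are described as calibrated rather than as natural constants.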
Contrast with John: The same algorithm applied to John's Gospel, which has a comparable manuscript density (215 witnesses against Matthew's 222), found 47 confirmed substrate sites against Matthew's 6, a ratio of roughly 8:1. John's substrate sites are in narrative action sequences (dipping and giving at the Last Supper, arrest scenes, trial dialogues), meaning the storytelling itself breaks at translation seams. Matthew's 6 sites are all in embedded sayings: discrete units that were translated from Aramaic and inserted into a Greek framework.
This distinction provides the clearest independent evidence. John was composed closer to Aramaic source material; the narrative itself was translated. Matthew was composed in Greek by a fluent Greek author who imported pre-translated Aramaic sayings. The substrate scanner traced the boundary between the inherited sayings core and the Greek composition that surrounds it — independently confirming the two-source composition model through manuscript transmission data alone.
What about the Translation Greek classifier? The earlier finding that "all of Matthew classifies as Translation Greek" is not wrong, but it requires reinterpretation. The classifier was trained to detect Semitic syntax patterns using the Septuagint as its Translation Greek exemplar. Matthew's author wrote in deliberate Septuagintal style: Greek that echoes the syntax of the Greek Old Testament. This is not the same as translating from Aramaic. It is a fluent Greek author choosing to write in a biblically-inflected register. The classifier detected the register; the substrate scanner detected the actual translation seams. When these two signals diverge (Semitic-sounding Greek that produces no scribal fragmentation), the correct diagnosis is stylistic imitation, not translation.[8]
Seven-Axis Evaluation Framework
Every verse in Matthew (1,068 total) was scored across seven independent axes of evidence:
| Axis | What It Measures |
|---|---|
| External Attestation | Is this verse paralleled in Mark and/or Luke? |
| Synoptic Divergence | How much does Matthew's version differ from its parallels? |
| Fulfillment Formula | Does this verse contain a prophecy citation formula? |
| Contextual Displacement | Is this verse out of order relative to the parallels? |
| Computational Layer | What does the NLP layer analysis say? |
| Semitic Substrate | Does this verse show Hebrew/Aramaic syntactic patterns? |
| Redaction Profiler | Does this verse contain known editorial vocabulary fingerprints? |
Each axis scores from 0.0 (strongest evidence for authenticity) to 1.0 (strongest evidence for editorial origin). The axes are not equally weighted: external attestation carries the highest weight (3.0), followed by fulfillment formula detection (2.5), synoptic divergence and NLP composite (2.0 each), contextual displacement and the redaction profiler (1.5 each), and Semitic substrate markers (1.0). The weighted composite produces tier assignments:
| Tier | Meaning | Verses | % |
|---|---|---|---|
| A | Highest confidence authentic | 496 | 46.4% |
| B | High confidence authentic | 430 | 40.3% |
| C | Mixed signals | 93 | 8.7% |
| D | Multiple editorial indicators | 42 | 3.9% |
| E | Strongest editorial indicators | 7 | 0.7% |
The framework agrees with the 2005 reconstruction on 73% of verses. Where it disagrees, the disagreements fall into predictable categories: the framework identifies some omitted parables as Tier A/B (authentic but lacking external attestation), and some retained editorial transition verses as Tier D/E.
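The weighting scheme and the tier table can be checked with a small sketch. The axis weights and tier counts are taken from the text; normalising the composite by the total weight (13.5) is an assumption, and no tier cutoffs are shown because the text does not state them.

```python
# Axis weights as stated in the text (they sum to 13.5).
WEIGHTS = {
    "external_attestation": 3.0,
    "fulfillment_formula": 2.5,
    "synoptic_divergence": 2.0,
    "computational_layer": 2.0,
    "contextual_displacement": 1.5,
    "redaction_profiler": 1.5,
    "semitic_substrate": 1.0,
}

def weighted_composite(scores):
    """Weighted mean of per-axis scores, where 0.0 is the strongest
    evidence for authenticity and 1.0 for editorial origin.
    Dividing by the total weight is an assumption of this sketch."""
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[axis] * scores[axis] for axis in WEIGHTS) / total_weight

# Tier counts from the table, with percentages recomputed.
TIER_COUNTS = {"A": 496, "B": 430, "C": 93, "D": 42, "E": 7}
TOTAL_VERSES = sum(TIER_COUNTS.values())
tier_percent = {tier: round(100 * count / TOTAL_VERSES, 1)
                for tier, count in TIER_COUNTS.items()}

# Hypothetical verse flagged on the fulfillment-formula axis only.
formula_only = {axis: 0.0 for axis in WEIGHTS}
formula_only["fulfillment_formula"] = 1.0
composite = weighted_composite(formula_only)  # 2.5 / 13.5
```

Under this normalisation a verse flagged on a single axis, even a heavily weighted one, stays below 0.2, so a high composite requires several axes to agree.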
The Redaction Profiler
The profiler is a deterministic rule engine that scans each verse for known editorial patterns — not just vocabulary, but behavioral tells (recurring narrative devices that function as editorial shortcuts). Key detections:
- Dream plot device (ὄναρ, onar): 6 occurrences, all in editorial material. 100% hit rate. The editorial layer uses dreams as a convenient narrative shortcut — a device to abruptly redirect the plot without needing to establish character motivation.
- Fulfillment formula: πληρόω (plēroō, "fulfill") + ῥηθέν (rhēthen, "spoken") or προφήτης (prophētēs, "prophet"). The signature editorial insertion.
- Weeping and gnashing: κλαυθμός (klauthmos) + βρυγμός (brygmos). Appears 6 times, always in unparalleled material. An editorial punishment formula.
- Kingdom of Heaven substitution: The editorial layer systematically replaced βασιλεία τοῦ θεοῦ ("kingdom of God," used by Mark and Luke) with βασιλεία τῶν οὐρανῶν ("kingdom of heaven"). The few places where "kingdom of God" survives in Matthew (e.g., 21:31) are evidence of pre-editorial tradition showing through.
Of 1,068 verses, 834 (78%) show no editorial fingerprints. The remaining 234 (22%) are flagged with specific explanations of which patterns triggered.
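A deterministic rule engine of this kind can be sketched in a few lines. The representation is an assumption: each verse is reduced to a set of "markers" (lemmas or surface forms), and the fulfillment rule is read as πληρόω together with either ῥηθέν or προφήτης. The rule names and the sample verses are hypothetical, and the kingdom-of-heaven rule is omitted for brevity.

```python
# Marker strings taken from the patterns listed above.
ONAR = "ὄναρ"
PLEROO = "πληρόω"
RHETHEN = "ῥηθέν"
PROPHETES = "προφήτης"
KLAUTHMOS = "κλαυθμός"
BRYGMOS = "βρυγμός"

# Each rule maps a hypothetical pattern name to a predicate over markers.
RULES = {
    "dream_device": lambda m: ONAR in m,
    "fulfillment_formula": lambda m: PLEROO in m and (RHETHEN in m or PROPHETES in m),
    "weeping_and_gnashing": lambda m: KLAUTHMOS in m and BRYGMOS in m,
}

def profile_verse(markers):
    """Return the names of all rules triggered by a verse's markers."""
    marker_set = set(markers)
    return [name for name, rule in RULES.items() if rule(marker_set)]

# Invented sample inputs, not real verse annotations.
formula_hits = profile_verse(["ἵνα", PLEROO, RHETHEN, PROPHETES])
punishment_hits = profile_verse([KLAUTHMOS, BRYGMOS])
no_hits = profile_verse([PLEROO])  # "fulfill" alone does not trigger
```

Because every rule is an explicit predicate, each flag can be reported together with the pattern that triggered it, matching the "specific explanations" mentioned above.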
Scholarly Confirmation
The reconstruction's approach corresponds closely to the critical consensus as expressed in the standard scholarly commentary on Matthew:
"Two facts are immediately apparent. First, a great portion of the material found in neither Mark nor Luke is redactional or partly redactional." (Davies and Allison)[6]
Davies and Allison identify the same source layers the computational analysis detected: Mark as narrative backbone, Q as teaching source, and M (Sondergut) as a mixture of editorial composition and a small collection of pre-existing traditions (primarily parables). They reject the hypothesis of a unified "M document," concluding that Matthew's unique material comes from multiple sources — some genuine, most editorial.
On the displacement of Q material:
"Despite the different arrangements (which we attribute almost exclusively to the Matthean redaction), the material in Luke and Matthew reflects a common order." (Davies and Allison)[6]
This confirms that the chronological rearrangement in the UPDV — particularly the restoration of sayings to contexts closer to Luke's order — has scholarly support.
Summary of Computational Findings
| Finding | Method | Significance |
|---|---|---|
| Chapters 1-2 stylistically isolated | Change point detection | Confirms infancy narrative is a distinct layer |
| All of Matthew is Septuagintal Greek | Supervised classifier | Refutes Pindaric hypothesis; detects biblical register, not Aramaic source |
| Only 6 translation seams in 1,068 verses | CBGM substrate scanner | All 6 in inherited sayings; zero in editorial framework — demonstrates Greek composition |
| John has 47 substrate sites vs Matthew's 6 | CBGM substrate scanner | 8:1 ratio rules out manuscript density as explanation; confirms different composition models |
| Source layers detectable in syntax | PCA clustering | Confirms Two-Source Hypothesis computationally |
| Fulfillment formulas = same hand as infancy | Bootstrap validation | p < 0.001; non-overlapping confidence intervals |
| M parables cluster with Gospel Core | Stylometric clustering | Pre-Matthean tradition, not editorial invention |
| 78% of verses have no editorial fingerprints | Redaction Profiler | Most of Matthew is transmitted tradition, not editorial |
| Framework agrees with 2005 reconstruction 73% | 7-axis scoring | Independent validation of ad-hoc methodology |
The computational analysis, performed two decades after the original reconstruction, independently validates the macro-level editorial decisions while providing quantitative precision the original methodology could not. The 2026 substrate analysis added a third dimension: manuscript transmission behavior confirms that Matthew is a Greek composition with Aramaic sayings imports, not a translation of a Semitic original — resolving the tension between the Translation Greek classifier (which detected Septuagintal style) and the historical evidence (which points to Greek composition). The one area where the computational evidence challenges the reconstruction — the M parables — remains unresolved due to the displacement problem: authentic material whose original context cannot be recovered.
Conclusion: The Recovered Gospel
The Gospel of Matthew in the UPDV is neither a traditional critical text nor a mere compilation of parallels. It is a forensic restoration of the original text hidden beneath layers of second-century editorial expansion. By removing the infancy narratives, the later prophecy fulfillment formulas, and the Greco-Roman narrative embellishments, the UPDV recovers a text that is chronologically coherent and stylistically consistent. The 2026 computational data confirms that while the editorial shell is natively Greek, the core of the Gospel contains a genuine Aramaic sayings tradition, one plausibly connected with the logia attributed to Matthew. What remains in these pages is approximately 70% of the canonical text: a Gospel that preserves the unique perspective of its original Greek compiler and the authentic voice of the historical Matthew, apart from the theological and literary expansions of a later age.
Notes
1. Epiphanius of Salamis, Panarion (Adversus Haereses). On the Hebrew Gospel used by the Ebionites and Nazarenes.
2. Pines, Shlomo. The Jewish Christians of the Early Centuries of Christianity According to a New Source. Jerusalem: Israel Academy of Sciences and Humanities, 1966. Pages 21, 23.
3. The Dialogue of Timothy and Aquila (TA), Manuscripts R and O at 17.3ab. Also: Old Syriac (Sinaiticus), Palestinian Syriac witnesses, and Von Soden's critical text.
4. Dodson, Derek S. "Dreams, the Ancient Novels, and the Gospel of Matthew: An Intertextual Study." Perspectives in Religious Studies 29 (Spring 2002): 46-47, 51.
5. Computational stylometry analysis (2026). PROIEL Treebank data, bootstrap validation (2,000 iterations).
6. Davies, W. D. and Dale C. Allison Jr. A Critical and Exegetical Commentary on the Gospel according to Saint Matthew. ICC. 3 vols. Edinburgh: T&T Clark, 1988–1997. Vol. 1, pp. 121–127.
7. The earlier characterization of the infancy narrative's style as matching "the scholia on Pindar" (Abel, Scholia recentia in Pindari epinicia, 1891) identified a real stylistic anomaly but misdiagnosed the mechanism. Computational classification shows the syntax is Semitic (Translation Greek), not classical. The anomaly is Septuagintal imitation: the editor wrote in a deliberately archaizing biblical Greek, not secular literary Greek.
8. CBGM substrate analysis (2026). Three-layer detection algorithm applied to INTF coherence matrices covering 222 Greek manuscripts of Matthew. The method uses no linguistic data, only variation unit counts, syntactic fragmentation ratios, and genealogical orphan rates computed via the CBGM's find_relatives function. The same algorithm applied to John (215 manuscripts) found 47 substrate sites, confirming the method's sensitivity. Full results cover all 13 NT books with 30+ CBGM witnesses.