In defense of more and better AI in the production and review of research

Jana Thiel, Associate Professor

In this article, Jana Thiel, Associate Professor at EDHEC, proposes alternative readings of a recent publication on how generative AI is reshaping the academic publishing process. In a nutshell: AI isn't the root of every problem in academia. Used well, it could even help dismantle long-standing barriers and bring genuinely new ideas to light faster.

19 May 2026

The editorial board of Organization Science, a leading international research journal, has recently published an analysis of how generative AI is reshaping the academic publishing process (1). Drawing on nearly 7,000 manuscripts and over 10,000 peer review reports, their findings document a voluntary peer review system that is headed for crisis: rising submission volumes, declining writing quality, and a drift toward more rather than better research.

 

Jana Thiel, Associate Professor at EDHEC Business School, while sharing the board's concerns and endorsing their core conclusions, argues that some of their data invites a closer reading. In this piece, she offers three alternative interpretations: on what the data actually says about non-native speaker disadvantage, on the ambiguous middle band of AI use, and on the role AI already plays in the review process itself. She then proposes three concrete responses—a journal-trained AI co-editor to provide authors with targeted feedback; an invitation-based submission model using AI pre-screening to reduce cultural and linguistic bias; and an AI sparring partner for reviewers, to free human judgment for the assessments that matter most. 

 

The argument, in brief, is that AI is not the source of all problems; it is part of the solution. Used skillfully, it could help resolve long-standing exclusions and accelerate the publishing of genuinely novel insights.

Recently, I have been the unfortunate reviewer who had to spend time on a largely AI-generated paper that also plagiarized prior work. Spending time on this was not only aggravating but also soul-crushing. Surely this should not even have reached me.

 

Editors and reviewers at all journals get swamped with paper submissions. A recent publication by the editorial board of Organization Science shows the dynamics in impressive numbers (1). They analyzed five years of paper submissions and peer review reports (close to 7,000 manuscripts and more than 10,000 reviews). Using Pangram, a leading detection tool for AI-generated text, they tracked how generative AI has changed the journal’s pipeline since ChatGPT entered our collective consciousness in late 2022.

Here is what they found:

  • Submissions rose 42% since 2022, driven almost entirely by AI-assisted writing.
  • The AI-assisted writing scores worse on the Flesch Reading Ease metric.
  • Higher AI-written content (>30% AI score) is desk-rejected at drastically higher rates.
  • Reviewer reports are also increasingly AI-assisted, featuring lower readability and a narrower focus. AI reviews tend to gravitate toward theory rather than empirics.
  • Schools with strong publication incentives show the sharpest rise in AI-heavy submissions.

 

Based on this, the editorial team concluded that AI is pushing the field toward more rather than better research. The current volume is unsustainable for the volunteer review system. They recommend that universities should reward quality over quantity, and that journals need better triage tools, for example, submission fees.

A dire outlook. I can feel the exasperation of the editors and their trusted reviewers. I fully share the editors’ concern about a surge of papers that include substantial amounts of low-quality, AI-assisted work, which burdens the review system.

However, the system has always run on thin capacity (2). AI is just the accelerant. To that extent, I would have loved for the editors to look at some of their impressive data through a slightly different lens. I would like to offer some alternative readings here.

 

Re-reading 1: the non-native speaker analysis

In their analysis, the editors construct the “All Non-Native English Authors” variable and find that it predicts desk rejection, independent of the AI score. That is a key finding! The editors acknowledge that this might reflect “topic relevance, theoretical framing, methods, differences in writing style, among others,” but they do not explore this dimension in depth. I think this finding warrants a deeper look, especially in the context of AI writing quality.

 

Choosing the right metrics

The key metric used for writing style—the Flesch Reading Ease—is a measure of Anglo-American readability preferences: short sentences, linear argument structures, common words, and active voice. It is not a measure of argumentative depth, theoretical sophistication, or conceptual precision (3).

  • A researcher trained in the German philosophical tradition, or in French grandes écoles argumentation, will routinely produce text that scores poorly on Flesch but possibly quite brilliantly on intellectual substance.
  • Reading ease is not a metric for assessing intellectual contribution. It is a metric that can predict how easily an intellectual contribution may be received and how far it may travel in an audience with Anglo-American reading preferences.
  • The funny thing is that the editorial inadvertently proves this point: the analysis finds that AI writing scores higher on specificity and lower on hedging and passive voice, all quite desirable outcomes to some degree. Yet it also scores lower on Flesch.

To make any claim toward the case that all these additional AI-assisted manuscripts are of substantially worse quality, one would need to analyze not only the Flesch Reading Ease but also the intellectual contributions put forward.
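To make concrete what the metric actually rewards, here is a minimal sketch of the standard Flesch Reading Ease formula: 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word). The syllable counter is a crude vowel-group heuristic, and the two sample texts are invented for illustration; how the editorial computed its own scores is not specified in the source.

```python
import re

def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per group of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    """Standard Flesch Reading Ease; higher means 'easier' to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word

# Invented sample texts, not taken from any manuscript.
plain = "We test one idea. The data are new. The result is clear."
dense = (
    "Notwithstanding considerable methodological heterogeneity, the "
    "institutional embeddedness of organizational actors fundamentally "
    "conditions interpretive possibilities."
)

plain_score = flesch_reading_ease(plain)
dense_score = flesch_reading_ease(dense)
print(f"plain: {plain_score:.1f}, dense: {dense_score:.1f}")
```

Only sentence length and word length enter the score. Nothing in the formula looks at the argument itself, which is why a dense but substantive sentence can score far below a trivial one.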

 

The term writing can be code for something else

Many non-native authors are familiar with writing critique from reviewers. This is curious. Many of us may not truly have a writing problem in the fundamental sense that we don’t know how to craft arguments or write proper English. Instead, we had, and continue to have, a cultural reproduction problem (4).

Non-native writers may produce new knowledge in thought patterns that structure complex arguments differently than the community’s preferred register. Using AI to then translate this thinking into English prose may not help if it does not hit the Anglo-American scholarly register, which itself is a cultural artifact, not a universal quality standard (5).

With that in mind, one must be careful with judgment. As a non-native speaker, using AI may be a rational strategy to wrestle with the ever-present critique of our writing. We may just be using the AI editing wrongly. The analysis would need to distinguish this from low-quality AI substitution.

 

Re-reading 2: the 30-70% band of AI use

Table 1 in the editorial article is revealing. By their own judgment, the editors sort the papers into four buckets of AI use: 0-15%, 15-30%, 30-70%, and above 70%. About 1,200 papers (ca. 25% of the post-2022 sample) fall into the latter two. The editorial asserts that above the 30% threshold writers are giving AI considerably more control, moving increasingly away from human-led work.

 

The editorial’s headline

Above 30% AI use, rejection rates start climbing. AI use appears to significantly reduce the already low R&R rates of 11-13% for non- and low-AI papers down to 3-5% for higher AI use. 

Both the thresholds and associated outcomes need deeper examination:

  • One can retain primary intellectual authorship while crossing the 30% AI writing threshold in a paper. Anyone who has been seriously working with the frontier tools as a draft-producing aid, not as a substitute for their own thinking, understands that.
  • Just because a text is heavily AI-written does not immediately mean that its arguments were not human-led. The assumption that, after the 30% mark, you begin to give up intellectual ownership lacks a clear foundation. It may be true, but it may also simply mean that the author let the AI editor type out the prose.

The increase in rejection rate may correlate with, but not be caused by, AI-assisted writing.

 

The finding that could have been a headline

Something else seems to be going on when we look at the increasing rejection rates in the 30-70% bucket. The finding that could have been an alternative headline is buried in footnote 8. When comparing the writings by the same author submitted with high vs. low AI scores, the effect of AI use on rejection rates disappears entirely. 

This means: the AI effect on rejection is entirely about who uses AI heavily, not about AI use in and of itself. If AI use itself isn’t the problem, then any recommendations aimed at reducing AI use may not be targeting the primary driver of the issue.
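A minimal sketch of how such a pattern can arise, using made-up toy counts (the authors, paper counts, and outcomes below are hypothetical and are not the editorial's data or its actual statistical specification): within each author the rejection rate is identical for heavy-AI and low-AI papers, yet the pooled comparison shows a gap because the author with the higher baseline rejection rate uses AI more often.

```python
def rejection_rate(rows):
    """Share of rejected papers among (author, heavy_ai, rejected) rows."""
    rows = list(rows)
    return sum(r[2] for r in rows) / len(rows)

def make(author, heavy_ai, n_papers, n_rejected):
    """Build n_papers toy records, the first n_rejected of them rejected."""
    return [(author, heavy_ai, i < n_rejected) for i in range(n_papers)]

# Hypothetical toy data, invented purely for illustration.
records = (
    make("A", False, 4, 2) + make("A", True, 2, 1)    # author A: 50% either way
    + make("B", False, 2, 2) + make("B", True, 4, 4)  # author B: 100% either way
)

# Pooled comparison: heavy-AI papers vs. low-AI papers across all authors.
pooled_gap = (
    rejection_rate(r for r in records if r[1])
    - rejection_rate(r for r in records if not r[1])
)

# Within-author comparison: the same gap, computed separately per author.
within_gaps = {}
for author in sorted({r[0] for r in records}):
    own = [r for r in records if r[0] == author]
    within_gaps[author] = (
        rejection_rate(r for r in own if r[1])
        - rejection_rate(r for r in own if not r[1])
    )

print(f"pooled gap: {pooled_gap:.3f}")
print(f"within-author gaps: {within_gaps}")
```

In this toy setup the pooled gap is positive while every within-author gap is zero, which is the shape of result that points to who uses AI heavily rather than to AI use itself.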

 

The 30-70% zone: a vast, underexplored land of AI use

In essence, this 30–70% zone seems messy, with lots of stuff happening. It is an awfully broad bandwidth that would have been so rich to explore deeper:

  • The described “sharp divide” could be partly an artifact of where the editors chose to draw the line. It is not empirically discovered in a strong sense.
  • My conjecture is that the 30-70% zone contains a non-negligible number of papers of genuine production. In particular, non-native speakers may turn to AI to render complex thinking into English faster than they could without it.
  • For some portion of these papers, the functioning principles of the current AI tools can explain the perceived lack of writing quality. Our new friends adapt well to their main “owners.” They simply write in a style that matches their users' cognitive profile (6). That may be too dense and complex for the Flesch-trained academic, while the non-native user considers it perfectly fine English. And it is, just densely so.

 

So, the lower R&R rate in this band may, to some degree, be a function of the author's profile and their way of expressing new ideas, rather than AI use per se, bringing us back to the beginning. This deserves a proper analysis, which the editors do not provide.

 

Re-reading 3: the review analysis

The current standard of reviewing is to not use AI. Against the data, this means: Many authors use AI to write up their manuscripts, but the reviewer is supposed to use unaided human judgment to evaluate it. It is not obvious to me how this is a superior arrangement.

 

Establishing symmetry between production & review

If authors have already committed their intellectual work to the AI systems, a reviewer who refuses AI assistance is not necessarily protecting a pristine human process. Rather, the reviewer uses exhausted human judgment to evaluate something the author may not have treated with the same respect for privacy.

If journals required author consent for AI-assisted review—which most authors would probably grant, given their own usage patterns—you’d be creating a more symmetrical evaluation environment, not a less rigorous one.

 

The case for an AI co-editor

A key objection to AI-based reviewing is the lack of deep domain expertise required for good peer review. Maybe. But the counterevidence comes from the paper itself: the fact that now over 30% of the reviews at Organization Science show some degree of AI assistance is worth highlighting more. We should assume that reviewers do purposefully sign off on these reviews, retaining full final judgment, deeming their reviews adequate.

Also worth highlighting is the editorial team's finding that AI-written reviews have no significant effect on rejection decisions. The editors interpret this charitably: they claim that editors are compensating with their own judgment. But it still means AI-assisted reviews are not systematically distorting outcomes.

In my view, this is a mild argument for structured AI review: if uncontrolled, opaque AI review is already neutral in its effects, a deliberate, transparent, journal-trained AI review system might do considerably better. The real counterfactual is not “AI review versus expert human review”; it is “AI versus a tired, overloaded human operating on compressed time.”

To summarize my prior arguments: We have a proportion of non-native authors who write with poor Flesch scores and follow different argumentative conventions. They may still produce interesting new ideas. Such writing lands on the desk of a reviewer who is statistically likely to be from a native English-speaking institution or otherwise steeped in a specific prose register. Most likely, this reviewer is reading their fourth manuscript of the week. The bias operating here is not malicious; it is cognitive and cultural: dense prose signals intellectual seriousness in some traditions and poor writing in others (7).

An AI co-editor, specifically trained on the journal’s published corpus and the editorial team's strategic direction, could be tasked with separating prose register from intellectual substance. This would be doing something genuinely different. It could flag: “the contribution is novel, and the empirical identification is sound, but the argument is organized in a way that will be unfamiliar to this journal’s readership,” which is more useful to the author and editor than a rejection based on reading ease.

 

The caveat

Of course, intellectual content is not always separable from writing quality. Sometimes, the inability to write clearly does reflect a genuine lack of clarity in thought. The “writing is thinking” argument of the editors speaks to this issue.

A journal-trained AI reviewer would need to distinguish “complex thought rendered in unfamiliar prose” from “vague thought hidden behind verbose AI prose.” These can look similar on surface metrics. But this is an argument for a more sophisticated AI review architecture—not against AI review in principle. In fact, this seems a harder problem for an exhausted human reviewer than for a system specifically trained to separate these signals.

 

Reframing the debate

The issue raised by the Org Science editorial team is real and severe: our system of knowledge production is under stress. We do need new ideas for restructuring our processes for the age of AI. The old system is not going to accommodate a growing academic community. The Academy of Management envisions expanding from currently 21k to 30k members by 2030 (8). Many of these new members will want to take part in the publishing process, irrespective of the incentive systems of their schools.

To this end, the editorial piece was somewhat cautious, suggesting AI detection as triage support (9) and possibly introducing submission fees to channel the inbound traffic in the near term. I fear this may just reproduce the cultural bottleneck that has always shaped what counts as publishable knowledge.

I believe, as a scholarly community, we can be bolder in reimagining what might be possible if we zero in on AI and use it strategically to help mitigate the systematic exclusions—of non-native scholars, of heterodox thought, and of researchers outside the core institutional network. 

Let me sketch here a few alternative ideas around a more AI-assisted funnel:

  • How about training an AI co-editor specifically in the journal’s style and the editorial tastes and strategies? If each journal could get its own private AI system, such a co-editor could be a tremendous support for the editorial team. The co-editor could communicate patiently, clearly, and very precisely to authors where their manuscripts may fall short of the intellectual frontier and quality expectations. I would not mind having a certain “token” budget available to have a deeper conversation with the journal’s AI co-editor about my submission. Even if rejected, this may help me better understand and learn.
  • How about asking authors to pre-register papers and/or pass a conversation with the AI co-editor? This could help human editors to decide whether—in principle—the work would fit the journal strategy and then issue an invitation for full paper submission (10). Using an invitation model may make the review process less culturally loaded. Today, some reviews can read as if a paper was a personal insult to the reviewer. An invited paper model may shift presumptions and may change how reviewers engage with manuscripts, especially from non-native authors.
  • How about making the AI co-editor available to the human reviewer as a sparring partner? The co-editor may help on literature and theory issues should the reviewer need support. This could free up human capacity to carefully review the methods, paper integrity, and special kinks the AI cannot capture. I do sometimes get review requests where I would not mind having a conversation partner who has a cited paper ready for me to inspect before I pass judgment.

 

In sum, I believe AI, properly deployed, is a promising tool for addressing the bottlenecks of the review process. By using AI more skillfully, we may, in fact, be better equipped to resolve current exclusions and finally speed up not just the production but also the publishing of interesting, novel, and useful insights. I welcome more and better AI.

References

(1) Gartenberg, C., Hasan, S., Murray, A., & Pierce, L. (2026). More Versus Better: Artificial Intelligence, Incentives, and the Emerging Crisis in Peer Review. Organization Science - https://doi.org/10.1287/orsc.2026.ed.v37.n3

(2) Bottlenecks of the peer review process, including reviewer overload and fatigue, have long been examined and critically discussed as a structural problem, e.g.: Lindebaum, D. & Jordan, P.J. (2023). Publishing more than reviewing? Organization, 30(2), 396–406 - https://doi.org/10.1177/13505084211051047

Squazzoni, F., Bravo, G. & Takács, K. (2013). Does incentive provision increase the quality of peer review? Research Policy, 42(1), 287–294 - https://doi.org/10.1016/j.respol.2012.04.014

(3) In this seminal paper, Kaplan (1966) describes different cultural writing conventions and the impact on paragraph and argument structuring: Kaplan, R.B. (1966). Cultural Thought Patterns in Intercultural Education. Language Learning, 16: 1–20 - https://doi.org/10.1111/j.1467-1770.1966.tb00804.x

(4) Management scholarship has long been aware of the cultural dominance of English language publishing: Boussebaa, M. & Tienari, J. (2021). Englishization and the politics of knowledge production. Journal of Management Inquiry - https://doi.org/10.1177/1056492619835314

Tietze, S. & Dick, P. (2013). The victorious English language: hegemonic practices in the management academy. Journal of Management Inquiry, 22(1), 122–134 - https://doi.org/10.1177/1056492612444316

(5) Jammulamadaka, N. & Dick, P. (2026). Decolonizing academic publishing. Human Relations, 79(4), 411–434 - https://doi.org/10.1177/00187267261433753

(6) For evidence how the writing complexity of LLM output adapts to the perceived knowledge level of a user (communication complexity about a domain), see: Thakkar, J. et al. (2024). Evaluating the adaptability of large language models for knowledge-aware question and answering. International Journal on Smart Sensing and Intelligent Systems, 17(1) - https://doi.org/10.2478/ijssis-2024-0021

(7) Pudelko & Tenzer (2019) demonstrate that language barriers, not competence gaps, constrain non-native scholars’ careers: Pudelko, M. & Tenzer, H. (2019). Boundaryless Careers or Career Boundaries? The Impact of Language Barriers on Academic Careers in International Business Schools. Academy of Management Learning & Education, 18(2), 213–240 - https://doi.org/10.5465/amle.2017.0236

(8) The growth outlook is stated as part of the Vision 2030 - https://www.aom.org/about-aom/history/

(9) Using AI detection as triage comes with its own complexity. Current research (Liang et al., 2023; 2025) indicates that non-native writers are more likely to be flagged as a false positive for AI-based writing: Liang, W. et al. (2023). GPT detectors are biased against non-native English writers. Patterns - https://doi.org/10.1016/j.patter.2023.100779; Liang, W. et al. (2025). Quantifying large language model usage in scientific papers. Nature Human Behaviour, 9, 2599–2609 - https://doi.org/10.1038/s41562-025-02273-8

(10) Using an editorial pre-approval as upstream triage is currently in use, for example, at Management and Organization Review; their processes may serve to translate into a more broadly re-imagined funnel for other management journals as well - https://www.cambridge.org/core/journals/management-and-organization-review/information/author-instructions/submitting-your-materials

 

 
