RE: The AI Hype Trap – Why Actual Productivity Gains Are Still Elusive


Setting-specific factors
We caution readers against overgeneralizing on the basis of our results.
The slowdown we observe does not imply that current AI tools do not often improve developers' productivity—we find evidence that the high developer familiarity with repositories and the size and maturity of the repositories both contribute to the observed slowdown, and these factors do not apply in many software development settings. For example, our results are consistent with small greenfield projects or development in unfamiliar codebases seeing substantial speedup from AI assistance.

AI-specific factors
We expect that AI systems that have higher fundamental reliability, lower latency, and/or are better elicited (e.g. via more inference compute/tokens, more skilled prompting/scaffolding, or explicit fine-tuning on repositories) could speed up developers in our setting (i.e. experienced open-source developers on large repositories).

Agents can make meaningful progress on issues
We have preliminary evidence (forthcoming) that fully autonomous AI agents using Claude 3.7 Sonnet can often correctly implement the core functionality of issues on several repositories that are included in our study, although they fail to fully satisfy all requirements (typically leaving out important documentation, failing linting/styling rules, and leaving out key unit or integration tests). This represents immense progress relative to the state of AI just 1-2 years ago, and if progress continues apace (which is a priori at least plausible, although not guaranteed), we may soon see significant speedup in this setting.

Those caveats are from that same paper: you are doing exactly what the authors warn against.
