
Conversation

@pierotofy (Member) commented Jan 26, 2026

The current policy is not sensible; see e.g. OpenDroneMap/WebODM#1820.

These tools are changing the field and we're turning away good contributions.

I propose to change the stance on AI usage.

@smathermather (Contributor)

I'm definitely interested in a process that helps us leverage the intent of such pull requests without introducing the liabilities of prompt engineering, e.g. technical debt, copyright dilution, infringement, and digital sovereignty concerns, among others. But the process cannot be simply to revise our policy to permissively allow pull requests that introduce the above liabilities.

None of my questions and concerns from the prior pull request have been addressed in discussion. For clarity, I have copied them from pull request 3 below.


Practical questions:

  • How do we address the ethical issues with LLMs as part of the project? (Happy to provide links to some of the ethics issues with LLMs as needed.)
  • How do we discern (and how does the "author" discern) possible infringement?
  • How do we address questions of copyright dilution through the use of machine generated content?
  • How do we ensure long-term maintainability and address the bug challenges associated with machine-generated code in the absence of a code author?
  • Who is accountable for the code when machine generated?

> Tools like Copilot are becoming popular among developers and they are not going away. In time they will become more common, not less. From a practical standpoint, banning them would be like mandating that code written with IDEs is not allowed and only code written in Vim is (I understand the comparison is not exact from a tooling point of view).

It is quite possible that in 6 months or 5 years' time, we'll be removing LLMs from all our assistive tools because of the bugs they introduce. Or assistive tools will be significantly constrained / modified / replaced by tools that aren't LLM based, but may be branded the same. Or it could be that on GitHub or in other corporate environments, Copilot will become a pay-only trap for those who have become dependent on it. It's hard to predict the future.

But inevitability is one of the myths / propaganda points of AI/LLMs. I am not attached to any expectation that LLM-assisted coding is inevitable over the long term. Do I think it will go away? No. Do I think it has to be inevitable for our work? No.

> Aside, if we impose a complete ban, people will probably just lie and say they didn't use any AI, which is worse, because then we will never know.
>
> The copyright / licensing part is important too. A pencil can be used to infringe on copyright, but we don't ban pencils for that reason. It's the artist's job to make sure no work is being infringed.

LLMs aren't just a tool like a pencil: they are owned and curated by particular corporations and trained on a corpus of work under a variety of licenses. When I use a pencil, I generally know if I'm infringing. How does one verify infringement of code that one has "written" with an AI?

But the core point I'm making isn't infringement. Infringement is a portion of the question; independent of infringement, losing the ability to apply copyright, and therefore to defend copyleft, is a long-term liability for projects like ours.

To date in the US, the Copyright Office draws a distinction between "assistive tools," which enhance human creativity, and content generated from prompts alone. This interpretation is a useful framing: https://perkinscoie.com/insights/update/copyright-office-solidifies-stance-copyrightability-ai-generated-works

EU law also draws a distinction between machine-generated content and human creative output, with copyrightability attaching to the latter and not the former. So even if US courts decide that "prompt engineering" and other heavily assisted code development strategies result in copyrightable content, the EU will likely decide differently.

Having "interesting" legal problems to deal with isn't an outcome I want to encourage.

> I think knowing how much of a contribution is AI generated can be used as a metric to evaluate risk of infringement.

Agreed that knowing the proportion of machine-generated contributions is useful for a variety of reasons, from possible infringement to whether the project is copyrightable. I'm not sure of the best approach to measuring this, but I agree with Brett that explicitly endorsing LLM-generated code isn't something I want to do in policy or practice.

@pierotofy (Member, Author) commented Jan 27, 2026

Valid points.

> How do we address the ethical issues with LLMs as part of the project? (Happy to provide links to some of the ethics issues with LLMs as needed.)

Yes, I think the specific topics should be highlighted. Is this about the risk of license laundering? Lack of attribution?

> How do we discern (and how does the "author" discern) possible infringement?

I think that, since LLMs are trained on public data, a Google or GitHub search for relevant samples of a piece of work could quickly provide indications of possible infringement. It's also worth noting that not all code is equal, e.g. boilerplate code for a user interface is different from an algorithm performing a very domain-specific task. I would scrutinize the latter more than the former.
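
As a loose sketch of that kind of check (not an OpenDroneMap process, just an illustration): take a distinctive line from a contribution and ask GitHub's code search API whether it appears verbatim in public repositories. This assumes a personal access token in GITHUB_TOKEN with code-search access and the `requests` package; the sample snippet is hypothetical, a hit is only an indicator worth reviewing, and an empty result proves nothing.

```python
# Rough provenance check: does a distinctive snippet from a contribution
# appear verbatim in publicly indexed GitHub code?
# Assumes a personal access token in GITHUB_TOKEN and the `requests` package.
import os

import requests


def search_snippet(snippet: str) -> list[str]:
    """Return names of repositories whose indexed code contains `snippet`."""
    resp = requests.get(
        "https://api.github.com/search/code",
        params={"q": f'"{snippet}"'},  # quoted for an exact-phrase match
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        },
        timeout=30,
    )
    resp.raise_for_status()
    return sorted({item["repository"]["full_name"] for item in resp.json().get("items", [])})


if __name__ == "__main__":
    # Hypothetical distinctive line lifted from the pull request under review.
    hits = search_snippet("def refine_georeferencing(self, anchor_points):")
    print(f"{len(hits)} public repositories contain this exact line:")
    for name in hits:
        print("  " + name)
```

Boilerplate will match everywhere, so this is only worth running against the domain-specific parts of a change, per the distinction above.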

> How do we address questions of copyright dilution through the use of machine generated content?

By labeling and disclosing the parts that are machine generated (which is addressed in the proposed language change). Git and GitHub can be used to track changes. If machine-generated code is not copyrightable, then code like OpenDroneMap/WebODM#1820 can be used by others, but the sum of machine-generated code + AGPLv3 code remains bound by the AGPLv3, just like when we vendor MIT or BSD code into an AGPLv3 project.
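
One lightweight way to do that labeling and tracking, sketched here rather than proposed as policy: have contributors add a free-form git trailer to commits that contain machine-generated code (the `AI-assisted:` name below is an illustration, not an existing convention), e.g. `git commit -m "Fix reprojection bug" -m "AI-assisted: GitHub Copilot"`, and then compute what share of the history carries it.

```python
# Measure what fraction of commits carry a (hypothetical) "AI-assisted:" trailer.
import subprocess

TRAILER = "AI-assisted:"  # illustrative trailer name, not an established convention


def ai_assisted_share(repo_path: str = ".") -> float:
    """Fraction of commits on the current branch whose message carries the trailer."""
    # %H = hash, %B = raw message body, %x1e = a record separator between commits.
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%H%n%B%x1e"],
        capture_output=True, text=True, check=True,
    ).stdout
    commits = [c for c in out.split("\x1e") if c.strip()]
    flagged = sum(
        1 for c in commits
        if any(line.strip().startswith(TRAILER) for line in c.splitlines())
    )
    return flagged / len(commits) if commits else 0.0


if __name__ == "__main__":
    print(f"{ai_assisted_share():.1%} of commits are labeled as AI-assisted")
```

This counts commits rather than lines (diff stats or git blame could refine it), and it depends on honest disclosure, which circles back to the point above about outright bans encouraging people not to disclose at all.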

> How do we ensure long-term maintainability and address the bug challenges associated with machine-generated code in the absence of a code author?

We don't accept bot generated pull requests without a human in the loop.

> Who is accountable for the code when machine generated?

The author/developer, obviously.

