Artificial intelligence models are trained using vast datasets of text, images and other content – much of which may be protected by copyright.
If the material is used without a license, does the training process itself infringe copyright?
Courts overseas have recently begun answering that question. While Australian courts have not yet ruled on the issue, these decisions are likely to influence how the legality of AI training is approached here.
GEMA v OpenAI (Germany)
AI training can constitute “reproduction” under copyright law.
The Munich Regional Court recently considered a claim brought by GEMA, Germany’s music collecting society, against OpenAI, which operates ChatGPT.
The case concerned the use of lyrics from nine famous German songs to train the AI model.
GEMA argued that these lyrics had been ‘memorised’ in the ChatGPT model and could be reproduced almost verbatim in response to a user’s prompt, which GEMA claimed amounted to an infringement under German copyright law.
The Court largely upheld GEMA’s claims. A systematic comparison of the training data against outputs showed full reproduction. The Court granted an injunction and damages. It was the first decision in Europe to address such a question and is considered a landmark judgment.
Notably, the Court found that AI training constitutes a “reproduction” under German copyright law. It rejected suggestions that training AI models on copyright-protected works was an ordinary or expected use of those works to which the copyright owners could be said to have implicitly consented. It found the operator (OpenAI), rather than end-users, liable.
OpenAI has announced plans to appeal the decision.
Getty v Stability AI (UK)
Territorial location of training matters, and models are not automatically “infringing copies”.
On 4 November 2025, the English High Court delivered its first AI-related copyright decision. Getty, a global visual content creator and marketplace, sued Stability AI, a generative AI company, over its use of Getty-owned images when training its AI model Stable Diffusion.
The Court accepted that Getty’s images were used in the training process without Getty’s permission and that this training involved copying. However, that training took place outside the UK (in the USA), so there was no territorial basis for a case of primary infringement. Getty had to narrow its claims to “secondary infringement”, effectively a claim that importing infringing articles into the UK infringes copyright.
Getty argued that Stable Diffusion contained copies of Getty’s images, and that making it available for download in the United Kingdom constituted importation of infringing copies, and as a result, infringed its copyright.
Stability AI argued that its model could not be an infringing “article” because it is not a physical object. This was rejected by the judge, who found that both tangible and intangible assets could be “articles” for this purpose. (An Australian court is likely to find similarly.)
The next issue the Court had to consider was whether the AI model was “an infringing copy”. The judge found that, to be an infringing copy, the model would need to physically contain a reproduction of Getty’s images. This was not the case here – the model learned from training data but did not actually store any copies. As such, Getty lost on its claim of secondary infringement.
The Court did find for Getty on its claim of trade mark infringement, because Getty’s watermarks appeared in outputs, but this was only a minor victory.
The judgment establishes that the geographical location of the training is critical, and this is likely to be followed in Australia. The decision leaves some key questions unanswered, such as whether training on copyright material within the UK would have amounted to infringement.
Bartz et al v Anthropic PBC (US)
“Fair use” may protect training on lawfully acquired works – but piracy is fatal.
Anthropic, a leading developer of large language models, recently reached a settlement in a class action brought in California by authors and publishers who alleged that Anthropic’s training process infringed copyright in around 500,000 books (which, incidentally, included works of Australian authors).
The copyright content in question included not only books that Anthropic had lawfully purchased and then scanned, but also around seven million unauthorised digital copies of books it found on pirate sites.
Anthropic argued that its use was ‘fair use’ and therefore did not infringe US copyright. In June 2025 the judge held on summary judgment that the use of lawfully purchased books did qualify as ‘fair use’. However, the judge held that the fair use defence did not extend to the use of pirated copies of works, saying that piracy of copyright works is “inherently, irredeemably infringing” regardless of whether those works were used for training. Anthropic found itself facing damages in the tens of billions of dollars (based on US$3,000 per pirated work).
In August 2025, the dispute settled for the amount of US$1.5 billion, the largest copyright settlement in US history. In addition to the payout, the settlement included terms that Anthropic destroy the pirated works and any derivative copies originating from those sources.
It is thought that the US$3,000 per-work figure may serve as a benchmark in future cases (including outside California) and that the settlement will be influential when pirated works are the subject of a training dispute.
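The figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only the approximate numbers given in this article (around 500,000 works in the class, around seven million pirated copies, and US$3,000 per work). The variable names are illustrative, and the results are rough orders of magnitude rather than court findings.

```python
# Approximate figures as quoted in this article (not exact court findings).
PER_WORK_USD = 3_000          # per-work figure
WORKS_IN_CLASS = 500_000      # books covered by the class action (approx.)
PIRATED_COPIES = 7_000_000    # unauthorised digital copies (approx.)

# Per-work figure applied to the works in the class.
settlement_estimate = PER_WORK_USD * WORKS_IN_CLASS

# Per-work figure applied to every pirated copy.
exposure_estimate = PER_WORK_USD * PIRATED_COPIES

print(f"Per-work figure x works in class: US${settlement_estimate:,}")
print(f"Per-work figure x pirated copies: US${exposure_estimate:,}")
```

On these numbers, the per-work figure applied to roughly 500,000 works gives US$1.5 billion, which matches the settlement amount, while the same figure applied to roughly seven million copies gives about US$21 billion.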
Some Australian developments
At the time of writing, no Australian court has ruled on any AI copyright infringement dispute.
Australia’s legal position under the Copyright Act stands in contrast to the position in the USA, where there is a broad concept of permissible “fair use”. Fair use is a flexible doctrine. The Australian Copyright Act, by comparison, provides a “fair dealing” defence only for specified, narrow uses of copyright works, such as reporting the news, parody or satire, or criticism and review.
A text and data mining carve-out from infringement for AI training was proposed in the Productivity Commission’s August 2025 report, but it encountered a backlash from the creative sectors and was emphatically rejected by the Government. The Government is unlikely to include training as a “fair dealing” or implement a blanket exception to copyright infringement enabling AI businesses to train their models on Australian creative work without permission.
The Attorney-General, Michelle Rowland, when announcing the rejection of the AI training exception, indicated that additional measures to protect copyright owners could be considered. These include a collective or voluntary licensing framework to ensure that creatives are remunerated for the use of their works, increased clarity around what is and is not permitted under copyright law, and a proposal to establish a copyright small claims forum to adjudicate small-scale copyright disputes.
What this means in practice
Until Australian courts provide clarity, risk management is critical.
For AI developers:
- Use licensed materials for training AI models wherever possible.
- Never train using pirated materials.
- Maintain detailed records of training datasets, model architecture and data provenance in case an allegation of copyright infringement is made.
- Consider jurisdictional exposure when determining where training occurs.
- Implement safeguards (such as prompt filtering and monitoring) to reduce risk of infringing outputs.
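As an illustration of the record-keeping point above, a training-data register can be as simple as one structured entry per work. The following is a minimal sketch only: the `ProvenanceRecord` class, its fields, the `make_record` helper and the sample values are all hypothetical, and a real register would be shaped by legal advice and the developer's own data pipeline.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class ProvenanceRecord:
    """One entry in a hypothetical training-data register."""
    source: str             # where the work was obtained
    licence: str            # licence or permission relied on
    acquired_on: str        # ISO date the work was acquired
    training_location: str  # jurisdiction where training took place
    sha256: str             # fingerprint of the exact file used


def make_record(content: bytes, source: str, licence: str,
                acquired_on: date, training_location: str) -> ProvenanceRecord:
    """Build a register entry, hashing the content so the file can be identified later."""
    return ProvenanceRecord(
        source=source,
        licence=licence,
        acquired_on=acquired_on.isoformat(),
        training_location=training_location,
        sha256=hashlib.sha256(content).hexdigest(),
    )


# Hypothetical sample entry.
record = make_record(
    content=b"example licensed text",
    source="Example Publisher Pty Ltd",
    licence="Licence agreement EX-001 (hypothetical)",
    acquired_on=date(2025, 1, 15),
    training_location="AU",
)
print(json.dumps(asdict(record), indent=2))
```

Recording the training location alongside the licence reflects the territorial point from the Getty decision, and the content hash ties each entry to the exact file that was used.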
For copyright owners:
- Clearly state your position on AI training in website terms.
- Use watermarks and technological measures.
- Deploy bot detection and blocking tools.
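One common way for a site owner to signal its position to crawlers is a robots.txt file. The sketch below assumes the owner wants to exclude OpenAI's GPTBot crawler; other AI crawlers use different user-agent names, and compliance with robots.txt is voluntary, so this is a signal rather than a technical barrier. It uses Python's standard `urllib.robotparser` module to show how a compliant crawler would read the rules. The URL and the `ExampleBrowser` agent are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that blocks one AI crawler from the whole site.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

page = "https://example.com/lyrics/song-1"  # placeholder URL

# How a crawler that honours robots.txt would interpret the rules.
ai_crawler_allowed = parser.can_fetch("GPTBot", page)
other_agent_allowed = parser.can_fetch("ExampleBrowser", page)

print(f"GPTBot allowed: {ai_crawler_allowed}")
print(f"ExampleBrowser allowed: {other_agent_allowed}")
```

Because the rules name only one user agent, other visitors are unaffected. Blocking crawlers that ignore robots.txt requires server-side detection tools.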
For users of AI services:
- Review terms of use carefully.
- Seek robust contractual protections (including warranties and indemnities against allegations of copyright infringement) from AI providers.
- Obtain advice on the levels of protection and risk.
Further judicial guidance (including from Australian courts) is inevitable. Until then, caution and documentation remain the safest course.
Businesses engaging with AI should consider seeking tailored advice to understand how existing copyright frameworks apply to their particular circumstances and to ensure appropriate safeguards are in place.
Our commercial copyright team is closely monitoring these developments and regularly advises clients on managing AI-related copyright risk.
Contact: Daniel Kovacs, dkovacs@kcllaw.com.au, (03) 8600 8859
