AI training despite a reservation of use

Can you download photos from the internet to train AI models? When is a reservation of use against AI training effective? And why do classic clauses in general terms and conditions (GTC) fail at the hurdle of machine readability?

AI training with photos

The starting point was a photo taken by a professional photographer and marketed online via a well-known photo agency. The image was publicly available on the agency’s website – but only as a watermarked preview version, flanked by terms of use that prohibited automated access and scraping.

In 2021, a registered association dedicated to the research and development of artificial intelligence accessed this photo as part of an automated process. The aim was not to publish or reuse the image itself, but to incorporate it into a huge data set with billions of image-text pairs. For this purpose, the photo was downloaded briefly, automatically compared with an existing image description and then discarded rather than stored. Only metadata and a reference to the original location on the internet were included in the data set.

The photographer saw this as an unauthorized use of his work to create AI training data and demanded an injunction. After the Hamburg Regional Court dismissed the claim, he pursued his request on appeal, forcing the Hanseatic Higher Regional Court to fundamentally address the question of when a reservation of use against text and data mining is actually effective.

Opt-out only if machines understand it

German copyright law generally permits text and data mining without a license, provided the work is legally accessible. However, this legal permission can be excluded. The prerequisite is a reservation of use. A decisive restriction applies to works that are accessible online: the reservation is only effective if it is in machine-readable form.

This formulation may sound technical, but it is legally highly charged. This is because it shifts the effectiveness of a copyright prohibition from the traditional world of contractual texts to the sphere of automated systems. This is precisely where the ruling comes in.

In the case in question, the photo agency through which the photographer distributed his work had included an unambiguous passage in its terms of use. Automated access, scraping, indexing: all of this was prohibited. That was clear to human readers. But was it clear to machines too?

What the OLG Hamburg understands by machine readability

In its judgment of 10 December 2025 (Ref. 5 U 104/24), the Higher Regional Court of Hamburg (OLG Hamburg) makes it clear that machine readability is not the same as a text merely being readable by a machine. It is not sufficient for a text to be technically readable. Rather, the decisive factor is whether an automated system can find and interpret the reservation and reliably take it into account when deciding on its use.

The court is thus shifting the benchmark away from the human perspective. The question is not: “Can the ban be read?”, but rather: “Can an automated process clearly implement it?”

This distinction is key. It explains why a GTC text is not sufficient even if it is clearly formulated. Machines must not only recognize words, but also correctly classify their legal meaning – without interpretation, contextual knowledge or legal subsumption.

The OLG does not spell out this line in technical terms, but it is doctrinally precise. Machine readability is to be understood functionally. It only exists if the reservation is suitable for actually controlling automated processes.

Scale for machine readability

Which point in time is decisive for machine readability? The Higher Regional Court of Hamburg clarifies: what matters is not today's state of the art, but the state of the art at the time of the use complained of. The photo was downloaded in the second half of 2021.

The photographer argued that modern tools or AI systems are perfectly capable of evaluating GTC texts. The OLG did not accept this argument. The decisive factor was whether such capabilities reliably existed at the time, and the photographer did not substantiate that they did.

The court thus clearly assigns the risk. Anyone who invokes a reservation of use must be able to demonstrate that it was machine-readable during the relevant period. Subsequent technological progress does not remedy this defect.

The rights holder must demonstrate that the reservation of use chosen by him or in any case attributable to him was machine-readable at the time of the challenged act of use.

This temporal fixation gives the judgment additional sharpness. It shows that machine readability is not an abstract ideal, but a concrete fact that requires proof.

GTC, source code, metadata – not enough

It is noteworthy that the OLG does not assume an effective reservation even if the text was not only contained in the GTC, but also in the source code of the website. Even this is not sufficient as long as it is not apparent that automated systems could actually recognize the reservation as a legally relevant signal.

The Senate thus makes it clear that machine readability is not a question of storage location, but of structure.

From the court’s perspective, a text formulated in natural language remains open to interpretation. It is precisely this need for interpretation that contradicts the purpose of the regulation. This is because text and data mining should be possible fully automatically – and should also be prevented fully automatically if there is an effective opt-out.
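The difference between prose in the source code and a structured signal can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a meta tag named "tdm-reservation" with the value "1", a convention modelled on the draft TDM Reservation Protocol. The judgment does not name this or any other format, and the function and variable names are hypothetical. The sketch only shows that a fixed key and value can be evaluated without interpretation, whereas a prohibition written in natural language cannot. It says nothing about whether a court would accept such a tag as sufficient.

```python
from html.parser import HTMLParser


class ReservationFinder(HTMLParser):
    """Looks for a structured opt-out flag in <meta> tags and ignores prose."""

    def __init__(self):
        super().__init__()
        self.reserved = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Assumed convention (not taken from the judgment):
        # name="tdm-reservation" with content="1" signals an opt-out.
        if (attributes.get("name") == "tdm-reservation"
                and attributes.get("content") == "1"):
            self.reserved = True


def has_structured_reservation(html: str) -> bool:
    """Return True only if the assumed structured flag is present."""
    finder = ReservationFinder()
    finder.feed(html)
    return finder.reserved


# A prohibition in natural language, present in the page and its source code.
PROSE_ONLY = """<html><head><!-- Automated access and scraping are prohibited. --></head>
<body><p>Terms of use: automated access, scraping and indexing are prohibited.</p></body></html>"""

# The same page type with the assumed structured flag.
STRUCTURED = """<html><head><meta name="tdm-reservation" content="1"></head>
<body><p>Photo preview</p></body></html>"""

print(has_structured_reservation(PROSE_ONLY))   # False
print(has_structured_reservation(STRUCTURED))   # True
```

The first page contains the prohibition twice, yet the automated check finds nothing it can act on. In the second page, the check needs no understanding of language at all.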

AI training allowed

The Senate begins with a self-evident fact: the automated downloading of a photo constitutes a reproduction. The fact that the association only kept the image in the system for seconds and then only transferred the metadata to the data set does not make this act any less relevant under copyright law.

At the same time, the OLG considers the use to be covered by the text and data mining limitation. The purpose of the download was clear: the image was to be automatically compared with the existing text description. The activity therefore qualified as text and data mining and was permitted, because the work was lawfully accessible and no machine-readable reservation of use stood in the way.

Almost incidentally, the OLG confirmed that the defendant association can also invoke the research limitation under Section 60d UrhG. The validation and quality assurance of data sets is a genuinely scientific process, even if the results are later used in commercial AI systems. The OLG rejected the occasionally expressed concern that Section 60d does not apply to private research institutions.

Significance for practice

If you want to prevent content from being used for text and data mining – and thus indirectly for AI training – you must ensure that the reservation of use is machine-readable and be able to prove this in case of doubt.

Conclusion

Even after this decision, the question of what a machine-readable reservation of use against AI training must look like remains unclear. The court makes it clear that the answer depends on the time of the use in question. Given the rapid development of artificial intelligence, the same facts could be assessed differently today. Moreover, not every AI system has the same capabilities.

As long as there is no uniform standard, it is also unclear whether means such as robots.txt files, structured metadata or specific TDM opt-out protocols are sufficient.
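How a crawler might evaluate one of these signals can be sketched with Python's standard library module urllib.robotparser. The robots.txt content, the bot name "ExampleDatasetBot" and the URLs are assumptions made up for this illustration. Note that robots.txt expresses crawling permissions and not a copyright reservation as such, so the sketch shows only the technical side; whether such a file is legally sufficient remains open, as stated above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that excludes one named bot from the image folder.
ROBOTS_TXT = """\
User-agent: ExampleDatasetBot
Disallow: /images/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Evaluate robots.txt rules for the given bot and URL, without network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The image path is blocked for the named bot, other paths are not.
print(may_fetch(ROBOTS_TXT, "ExampleDatasetBot", "https://example.com/images/photo.jpg"))  # False
print(may_fetch(ROBOTS_TXT, "ExampleDatasetBot", "https://example.com/about.html"))        # True
```

The decision follows from fixed rules and requires no interpretation, which is the functional quality the OLG asks of a machine-readable reservation.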

Binding standards would therefore be important to give rights holders and operators of AI systems more legal certainty.
