An ongoing trend in copyright disputes is suing the aggregators, like Google, that scan books and then share some information from them. The latest is a suit against OpenAI and Meta, based upon the scans of hundreds of thousands of novels. The aim is to stop derivative works, as well as use of the material beyond what Fair Use allows. This will need to be worked out in the courts, as neither side is likely to back down, and there is a potential payday for whoever wins. I lean towards the side of the copyright owner, but I also recognize that behind-the-scenes sampling of texts will improve AI and help it formulate answers that are less derivative.
AI companies are being bombarded by legal challenges that will decide the legality of the way large language models are trained.