
A court has ruled in favor of Bright Data over Meta. For now, we have some clarity on what can and can’t be scraped from the web.

The court ruled in favor of Bright Data, a web scraping company that Meta had used and later sued. Given how much web scraping is happening these days and how important it is for AI, this ruling is interesting. One thing that was clear from the ruling is that web scraping is permissible if you don’t have to log in to a site. For sites where you have to log in and the terms of service say that you cannot scrape, I believe you are in clear violation of those terms. All that being said, web scraping is happening everywhere, at a great many companies.

When I worked in the hedge fund industry, we had many ‘rules of the road’ for our internal teams doing web scraping and for the dozens of outsourced web scraping companies we partnered with. A few of the clearest rules were the following:

You can’t web scrape to avoid paying for data. If the company or website you are scraping offers a paid API, then you must engage and pay the company. - Pretty fair and straightforward.
If the company requests via its robots.txt page that you not scrape a specific page of its website, you must obey. - I have heard many firms don’t obey robots.txt, but I think it is fair to follow the website’s request. (For those unfamiliar, add /robots.txt after any website’s address to see what it wants web crawlers to scrape or not scrape. For example: https://www.ibm.com/robots.txt)
If a company requires you to log in or has a click-thru agreement, then you cannot web scrape. - This is pretty similar to this court ruling.
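
For readers who want to see what a robots.txt check looks like in practice, here is a minimal sketch using Python's standard urllib.robotparser module. The sample robots.txt text, the example.com URLs, the "example-bot" user agent, and the helper names may_fetch and scraping_permitted are all made up for illustration. The second helper is only one possible way to encode the three rules above, not a legal test and not how any particular firm does it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def scraping_permitted(has_paid_api: bool, requires_login: bool,
                       robots_allows: bool) -> bool:
    """Illustrative encoding of the three rules of the road listed above."""
    if has_paid_api:      # Rule 1: don't scrape to avoid paying for data.
        return False
    if not robots_allows:  # Rule 2: obey robots.txt.
        return False
    if requires_login:    # Rule 3: no scraping behind a login or click-thru.
        return False
    return True


if __name__ == "__main__":
    url = "https://example.com/private/report"
    allowed = may_fetch(SAMPLE_ROBOTS_TXT, "example-bot", url)
    print(url, "allowed by robots.txt:", allowed)
    print("scraping permitted:", scraping_permitted(False, False, allowed))
```

To check a live site instead of a pasted string, RobotFileParser also offers set_url() and read(), which download the robots.txt file before you call can_fetch().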

With the value of data going up every day and the number of data-hungry AI models being trained, the web scraping rules are going to come up more and more. This is the first of many lawsuits and the first of many discussions of robots.txt, and I think we will see a lot of laws come into question in the coming years. At the end of the day, I think the clearest rule people should follow is this: if the company or website sells data, then you shouldn’t scrape to get around compensating them.


Web Scraping - Court Ruling
