Getting 403s on downloads with wget

Broken link? Feature request? Anything related to the Spectrum Computing website here.
NickH
Drutt
Posts: 18
Joined: Mon Aug 10, 2020 10:04 pm

Getting 403s on downloads with wget

Post by NickH »

Hi there,

Recently, over the past couple of days or so, I've been getting 403s when I do requests such as:

$ wget https://spectrumcomputing.co.uk/pub/sinclair/screens/load/j/gif/JetSetWilly.gif
--2023-11-26 12:46:04--  https://spectrumcomputing.co.uk/pub/sinclair/screens/load/j/gif/JetSetWilly.gif
Resolving spectrumcomputing.co.uk (spectrumcomputing.co.uk)... 104.21.93.28, 172.67.203.126, 2606:4700:3032::ac43:cb7e, ...
Connecting to spectrumcomputing.co.uk (spectrumcomputing.co.uk)|104.21.93.28|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2023-11-26 12:46:04 ERROR 403: Forbidden.

I use ZXDB data to build content for the ZXVintage Twitter and Bluesky feeds - is this deep-linking restriction intentional? If so, what is the current approved way to retrieve files from the SC archive automatically?

Many thanks,

Nick
NickH
Drutt
Posts: 18
Joined: Mon Aug 10, 2020 10:04 pm

Re: Getting 403s on downloads with wget

Post by NickH »

Looks like a Cloudflare thing - entering that URL into a private browser window brings up a "spectrumcomputing.co.uk needs to review the security of your connection before proceeding." window.
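
For anyone wanting a script to tell this kind of challenge apart from an ordinary 403, here is a minimal sketch. It assumes the response carries the `cf-mitigated: challenge` header that Cloudflare documents for challenge pages; whether this site's responses included it at the time is not confirmed anywhere in this thread. The function name is hypothetical.

```python
from typing import Mapping


def looks_like_cloudflare_challenge(status: int, headers: Mapping[str, str]) -> bool:
    """Return True if a response looks like a Cloudflare challenge page.

    Assumption: a challenge shows up as HTTP 403 together with a
    `cf-mitigated: challenge` response header.
    """
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    marker = lowered.get("cf-mitigated", "").strip().lower()
    return status == 403 and marker == "challenge"
```

A script that sees this result would probably do better to stop and retry much later than to keep requesting the same URL.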
NickH
Drutt
Posts: 18
Joined: Mon Aug 10, 2020 10:04 pm

Re: Getting 403s on downloads with wget

Post by NickH »

Looking at my Twitter feed, this is the first post that failed:

It's only a single download, so I'm wondering if there's some sort of daily download limit?
PeterJ
Site Admin
Posts: 6879
Joined: Thu Nov 09, 2017 7:19 pm
Location: Surrey, UK

Re: Getting 403s on downloads with wget

Post by PeterJ »

Hi,

I've set Cloudflare to medium risk as we had a lot of unexpected traffic over the weekend. I will put it back to normal at the appropriate time.

We prefer users access files via the front end of the website. We are not a download repository. The only limit is one download every 10 seconds. If you use wget a lot then you may get blocked temporarily.

Edited to add the word temporarily.
NickH
Drutt
Posts: 18
Joined: Mon Aug 10, 2020 10:04 pm

Re: Getting 403s on downloads with wget

Post by NickH »

OK, thanks for letting me know, Peter.
PeterJ
Site Admin
Posts: 6879
Joined: Thu Nov 09, 2017 7:19 pm
Location: Surrey, UK

Re: Getting 403s on downloads with wget

Post by PeterJ »

Just to explain the reason for the extra Cloudflare checks.

5 million visits from the USA in the last 24 hours!

NickH
Drutt
Posts: 18
Joined: Mon Aug 10, 2020 10:04 pm

Re: Getting 403s on downloads with wget

Post by NickH »

Yeah, I'd say that's suspicious :D

Out of interest, what's considered the main Speccy download repository these days?
PeterJ
Site Admin
Posts: 6879
Joined: Thu Nov 09, 2017 7:19 pm
Location: Surrey, UK

Re: Getting 403s on downloads with wget

Post by PeterJ »

If you want lots of titles at one time look for @Lady Eklipse on archive.org.

You are welcome to download as much as you like from here, just preferably without using external tools.
Seven.FFF
Manic Miner
Posts: 744
Joined: Sat Nov 25, 2017 10:50 pm
Location: USA

Re: Getting 403s on downloads with wget

Post by Seven.FFF »

I'm not intending to speak for Peter or Einar here, but from my perspective, the real jewel in the SC crown is the ZXDB database of titles and metadata, which has been made available to use for free by anybody. Hosting and downloading of the indexed game files is a separate thing, and usage goes straight onto the SC hosting contract charges. So it's pretty normal with these kinds of things to keep the files you host for use by your own site, otherwise people can start to abuse it and cause you to incur extra costs that aren't necessarily funded. Particularly so with external bulk downloads (which you don't appear to be doing here).

If you wanted to make your own automated service to post game details and downloads to Twitter, it could be pretty easy to use a local copy of ZXDB, or the ZXInfo API which has also been made available to provide ZXDB data via a JSON API, and marry that up with your own locally hosted copies of the files which you sourced elsewhere. That way you're using the part which has been willingly given, without taking the part that has not.
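
As a rough illustration of the "marry that up" step, the sketch below maps an archive-style file path (such as the `/pub/sinclair/...` path in the first post) onto a locally hosted mirror directory. The function name `local_copy`, the `mirror_root` parameter, and the idea that the local mirror keeps the same directory layout are all assumptions made for the example, not something ZXDB or the ZXInfo API prescribes.

```python
from pathlib import Path, PurePosixPath
from typing import Optional


def local_copy(archive_path: str, mirror_root: Path) -> Optional[Path]:
    """Map an archive-style path onto a local mirror; None if no copy exists."""
    # Drop the leading slash so the path is joined underneath mirror_root.
    relative = PurePosixPath(archive_path.lstrip("/"))
    # Refuse paths that try to climb out of the mirror directory.
    if ".." in relative.parts:
        return None
    candidate = mirror_root.joinpath(*relative.parts)
    return candidate if candidate.is_file() else None
```

A bot could call `local_copy(path, Path("/srv/zx-mirror"))` (a made-up location) for each file path it reads from its ZXDB copy, and only treat a `None` result as a file it still needs to source elsewhere.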
Robin Verhagen-Guest
SevenFFF / Threetwosevensixseven / colonel32
NickH
Drutt
Posts: 18
Joined: Mon Aug 10, 2020 10:04 pm

Re: Getting 403s on downloads with wget

Post by NickH »

Yeah, the ZXVintage bots I've written use a locally-stored ZXDB database, and local copies of everything apart from inlays and screenshots, so it's VERY light-touch when it comes to downloads. I think I tripped over the ten-second limit, so I've put in a delay after each download. If it causes any more problems, I'll sort out a Plan B. I'm curious what all the apps which use ZXDB do when it comes to getting and displaying files within them.
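
A delay like that could be sketched roughly as follows. This is not the actual ZXVintage code: `DownloadThrottle` is a hypothetical helper, and the only figure taken from the thread is the ten-second gap between downloads. The clock and sleep functions are injectable so the logic can be checked without really waiting.

```python
import time
from typing import Callable, Optional


class DownloadThrottle:
    """Enforce a minimum gap between successive downloads."""

    def __init__(
        self,
        min_interval: float = 10.0,  # seconds; the limit quoted in this thread
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous download

    def wait(self) -> float:
        """Sleep until min_interval has passed since the last call; return seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
        if delay > 0:
            self._sleep(delay)
        self._last = self._clock()
        return delay
```

Calling `throttle.wait()` immediately before each download means the first request goes out at once and later ones are held back only as long as needed, which keeps a bot under the limit without adding a fixed pause when downloads are already far apart.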
SkoolKid
Manic Miner
Posts: 407
Joined: Wed Nov 15, 2017 3:07 pm

Re: Getting 403s on downloads with wget

Post by SkoolKid »

PeterJ wrote: Sun Nov 26, 2023 5:06 pm You are welcome to download as much as you like from here, just preferably without using external tools.
If there are any acceptable external tools, hopefully SkoolKit's tap2sna.py is one of them. Of the 13093 t2s files currently in the t2sfiles repository, 3690 download from this site. My policy is to use worldofspectrum.net, worldofspectrum.org or tzxvault.org if the desired tape file is available there, and this site as a last resort, but it turns out that a lot of tapes (the ones for newer titles especially) can only be found here.
SkoolKit - disassemble a game today
Pyskool - a remake of Skool Daze and Back to Skool
PeterJ
Site Admin
Posts: 6879
Joined: Thu Nov 09, 2017 7:19 pm
Location: Surrey, UK

Re: Getting 403s on downloads with wget

Post by PeterJ »

Hi @SkoolKid,

Sorry, maybe I didn't explain very well. Any tool is fine in moderation, but what I'm trying to discourage is bulk downloading, which I believe is often referred to as leeching.
PeterJ
Site Admin
Posts: 6879
Joined: Thu Nov 09, 2017 7:19 pm
Location: Surrey, UK

Re: Getting 403s on downloads with wget

Post by PeterJ »

NickH wrote: Sun Nov 26, 2023 6:27 pm I'm curious what all the apps which use ZXDB do when it comes to getting and displaying files within them.
One of the big users of ZXDB, zxinfo.dk, doesn't offer any downloads. worldofspectrum.net gets most of its files from archive.org and a few (with agreement) from us.