The ZX Spectrum Explored by Tim Hartnell
Re: The ZX Spectrum Explored by Tim Hartnell
Yes, I have another book by Tim Hartnell and tried this program https://www.naps2.com with a flatbed scanner. It is a time-consuming process and quite hard on the book spine. I would like to know some tips to keep all pages the same size, remove yellow backgrounds and letters from the other side of a page that show through, etc. Is there a quick way to scan books?
Edit:
I use this kind of glue to repair books https://www.mundoceys.com/producto/48/c ... nca-rapida
white glue for wood, fabric, cardboard and paper which remains elastic
Re: The ZX Spectrum Explored by Tim Hartnell
there are services like this http://overnight-scanning.eu/book/order/
scanning a book with 270 pages may cost like 35 pounds or less if you allow them to destroy it in the process ouch
Re: The ZX Spectrum Explored by Tim Hartnell
Another option would be using a scanner app. Never tried any of these https://www.makeuseof.com/tag/mobile-do ... nner-apps/
Re: The ZX Spectrum Explored by Tim Hartnell
Good idea, that might be an ideal solution if the camera is good enough. It would probably require a bit more work to get all the pictures though.
Re: The ZX Spectrum Explored by Tim Hartnell
If you're on a budget then use a proper flatbed book scanner like the Plustek Opticbook 3800 or 3900. These have a very small distance between the edge of the scanner and the start of the scanning plate, approximately 5mm, which allows you to get in really tight to the gutter without having to break the spine. The Opticbook has a fast scan mechanism as well: twelve seconds to scan, process and save a colour A4 page at 300dpi, and a smaller greyscale book can be processed a lot faster. All-in-one printers are basically useless for any kind of book scanning, as are most of the flatbeds that you can buy in the shops. If you do use an Opticbook then try to get hold of a small program that came with older versions of the driver software called Scanner Utility; it allows you to micro-shift the scan head alignment up to and beyond the edge limit of the scanner.
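To put those figures in perspective, here is a rough back-of-envelope sketch in Python. It assumes one scan per page, the twelve seconds per colour A4 page quoted above, and the 270-page book mentioned earlier in the thread as a hypothetical length; the function names are made up for illustration.

```python
MM_PER_INCH = 25.4


def pixels_at_dpi(width_mm, height_mm, dpi):
    """Return the (width, height) in pixels of a page scanned at the given dpi."""
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


def total_scan_minutes(pages, seconds_per_page):
    """Return the pure scanning time in minutes, ignoring page turning."""
    return pages * seconds_per_page / 60


# A4 is 210 x 297 mm; 300 dpi is the resolution quoted in the post.
a4_pixels = pixels_at_dpi(210, 297, 300)

# Hypothetical 270-page book at 12 seconds per page (assumed worst case).
minutes = total_scan_minutes(270, 12)

print(a4_pixels, minutes)
```

The time is only the scanner's share of the work; turning and aligning each page by hand comes on top of it.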
You don't usually have to break the spine for use on an Opticbook, the angle of page opening for scanning on thinner books comes in at around 95 degrees which won't harm the book at all. BUT the glue on cheap perfect-bound books dries out over time and will crack leading to the pages coming loose if you open it more than about 45 degrees so you should test it first. If it can make an L-Shape with the book open without problems then it should be OK for the Opticbook. If not and you want to keep the book intact then you'll have to try a non-destructive method like a multiple camera\glass plate-based system (good results but expensive) or cheaper photo-scanning hardware. These always look great in the ads and demos but they're usually using a magazine or book that can open somewhere close to 180 degrees, anything shallower introduces a lot of shadows and warping of the image.
Scan Tailor (free software) is the best way to process the scanned pages; it can automatically batch split, deskew, select content, add margins and dewarp pages. This is how you make sure all the pages end up being the same size. There are some rudimentary despeckling and equalization functions, but the latter can sometimes be too heavy on photographs, causing them to white-out in some cases. Illumination Equalization should get rid of some text bleeding through from a backing page and remove some of the problems associated with foxing, but it's not perfect.
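The idea behind "select content, add margins" giving same-sized pages can be sketched as below. This is only a conceptual illustration in Python, not Scan Tailor's actual algorithm: it assumes each page has already been reduced to a content box of (width, height) in pixels, and the sample numbers are invented.

```python
def uniform_page_size(content_boxes):
    """Return a (width, height) large enough to hold every content box."""
    return (
        max(w for w, _ in content_boxes),
        max(h for _, h in content_boxes),
    )


def centring_margins(content_box, page_size):
    """Return (left, top, right, bottom) margins that centre the box on the page."""
    w, h = content_box
    page_w, page_h = page_size
    extra_w, extra_h = page_w - w, page_h - h
    left, top = extra_w // 2, extra_h // 2
    # Any odd leftover pixel goes to the right/bottom margin.
    return (left, top, extra_w - left, extra_h - top)


# Hypothetical content boxes detected on three scanned pages.
boxes = [(1400, 2100), (1380, 2150), (1420, 2080)]

page = uniform_page_size(boxes)
margins = [centring_margins(box, page) for box in boxes]

print(page, margins)
```

Every page ends up with the same outer dimensions, and only the margins differ from page to page.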
You should feed the results from Scan Tailor into a proper batch imaging program, ImBatch is both free and pretty good at cleaning up old scans but you will have to experiment to get the best results. A typical filter flow for old, yellowed paperbacks (you can remove the yellowing by saving as greyscale in Scan Tailor) would be: Adjust the white and black curves to remove shadowing and text bleeding, apply Gaussian blur to soften up the lettering, adjust Brightness and Contrast for fine tuning, Sharpen to bring the text back into focus. Sometimes running the filters in reverse order can bring better results, especially with crisper source material, but it's all down to trial and error in the end if you want the best results.
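The first step in that filter flow, adjusting the black and white points, can be illustrated with a few lines of plain Python. This is a minimal sketch of a linear levels adjustment on 8-bit greyscale values, not ImBatch's implementation; the black point of 50 and white point of 220 are assumed values you would tune by eye.

```python
def apply_levels(pixels, black_point, white_point):
    """Stretch greyscale values so black_point maps to 0 and white_point to 255."""
    if not 0 <= black_point < white_point <= 255:
        raise ValueError("need 0 <= black_point < white_point <= 255")
    scale = 255 / (white_point - black_point)
    result = []
    for value in pixels:
        stretched = round((value - black_point) * scale)
        # Clamp: anything darker than black_point becomes pure black,
        # anything lighter than white_point becomes pure white.
        result.append(max(0, min(255, stretched)))
    return result


# One hypothetical row of pixels: dark ink, mid greys, faint bleed-through, paper.
row = [30, 60, 128, 200, 235, 250]
cleaned = apply_levels(row, black_point=50, white_point=220)

print(cleaned)
```

Faint show-through from the backing page usually sits just below the paper's brightness, so pulling the white point down pushes it to pure white along with the paper itself.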
PDF creation all depends on whether you want to make the text searchable or not: OCR it if you do, or use a simple JPG to PDF creator if you don't. Or convert it to the faster displaying .CBR or .CBZ format if you're not going to OCR it.
"He made eloquent speeches to an audience consisting of a few depressed daffodil roots, and sometimes the cat from next door."
Re: The ZX Spectrum Explored by Tim Hartnell
I've not tried this Android application, but it gets good reviews and a friend of mine scanned an MSX book with it.
https://www.camscanner.com
There is also Microsoft Office Lens (again not tried it personally) which is free.
https://www.microsoft.com/en-gb/p/offic ... verviewtab
- PROSM
- Manic Miner
- Posts: 473
- Joined: Fri Nov 17, 2017 7:18 pm
- Location: Sunderland, England
Re: The ZX Spectrum Explored by Tim Hartnell
Thanks for all the advice! I'm probably going to have to do some experimenting with these techniques on a few different books first before I try this with the Tim Hartnell book. I do have a USB flatbed scanner, but I'll need to see which method gives the best results. If that means removing the binding, then so be it, but not before I've had the chance to read it a bit! That Android app looks interesting, though I wonder if my phone camera would give a clear enough image.
All software to-date
Working on something, as always.
- Einar Saukas
- Bugaboo
- Posts: 3070
- Joined: Wed Nov 15, 2017 2:48 pm
Re: The ZX Spectrum Explored by Tim Hartnell
A .CBR is just a set of images (usually .JPG) compressed in a RAR file. And a .CBZ is just a set of images compressed in a ZIP file. They just change the file extension to indicate "intended to be viewed like a book".
For ZXDB, we prefer a set of book scan images stored as PDF instead of CBR/CBZ. The PDF is a more standard format, thus making it much easier for people to read it online or offline. Unless someone can recommend a free good online reader for CBR/CBZ that websites based on ZXDB could use?
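Since a .CBZ is just a ZIP of page images, building one takes only Python's standard library. The following is a minimal sketch: the function name `make_cbz`, the list of accepted suffixes, and the assumption that the page files sort correctly by name (e.g. zero-padded like page_001.jpg) are all illustrative choices rather than part of any standard.

```python
import zipfile
from pathlib import Path

# Image types to include; an assumption, extend as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def make_cbz(image_dir, cbz_path):
    """Pack the images in image_dir into a .cbz and return the page names in order."""
    pages = sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    # ZIP_STORED skips recompression, since JPEG/PNG data is already compressed.
    with zipfile.ZipFile(cbz_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for page in pages:
            # Store only the file name so pages sit at the archive root.
            archive.write(page, arcname=page.name)
    return [page.name for page in pages]
```

Calling `make_cbz("scans", "book.cbz")` on a folder of page images would produce an archive that comic readers open directly, and renaming it to .zip gives back an ordinary ZIP file.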
Re: The ZX Spectrum Explored by Tim Hartnell
Einar Saukas wrote: Sun Apr 28, 2019 5:49 pm
A .CBR is just a set of images (usually .JPG) compressed in a RAR file. And a .CBZ is just a set of images compressed in a ZIP file. They just change the file extension to indicate "intended to be viewed like a book".
For ZXDB, we prefer a set of book scan images stored as PDF instead of CBR/CBZ. The PDF is a more standard format, thus making it much easier for people to read it online or offline. Unless someone can recommend a free good online reader for CBR/CBZ that websites based on ZXDB could use?
Having had a lot of experience with creating .CBR files I find them much easier to work with than the comparable PDFs, i.e. image files in a PDF wrapper. They load faster, display faster, navigate faster and most CBR viewing software comes with a load of image enhancement filters to get the best out of the pages. PDFs are designed to look the same everywhere; they're the McDonald's of the document encoding world and it's up to people to decide if that's a good or bad thing.
Until there is a decent online CBR reader to consider then stick with PDF; as you say, it's a standard format, especially when those PDFs are more than just wrapped bitmaps. But some of us just like to do things a bit differently.
Re: The ZX Spectrum Explored by Tim Hartnell
A couple of open source CBZ/CBR readers. There might be more.
https://github.com/codedread/kthoom
https://afzafri.github.io/Web-Comic-Reader/
Re: The ZX Spectrum Explored by Tim Hartnell
I was most upset when WOS changed from individual page images to PDF files - rather than navigate or bookmark a pertinent page, I'd have to download the damned PDF every time I wanted to refer to it. Most annoying, and as the books I'd submitted were intended to be single images I was a little miffed that some idiot had converted them to PDF.
I use CBR/CBZ with Perfect Viewer on a Samsung Galaxy Tab S4 these days - nothing else installed, just that and my books/comics/magazines. It's brilliant, and can handle PDF files for those ridiculous now-impossible-to-separate Speccy books/mags.
I mean, what's the issue with individual png/jpegs? It's not like WOS didn't have a damned reader built in, is it?
Re: The ZX Spectrum Explored by Tim Hartnell
It may be as easy as uploading books in CBR/CBZ format to archive.org, or using their viewer from your own host, assuming the script is free to use.
see this magazine for example https://archive.org/details/Scream.Quee ... BR-GREASY/
Re: The ZX Spectrum Explored by Tim Hartnell
I used to love the WoS built-in reader too. We have something similar which lets you go forward and back, but it's not as feature rich as that was. You can also bookmark pages, as the URL has the magazine, issue and page number. For example:
https://spectrumcomputing.co.uk/mag.php ... 54&page=31
I'm with the PDF camp for general use. Especially when the files have been OCR enabled. Browsers like Chrome display PDFs without any additional software needed which just suits my needs.
Re: The ZX Spectrum Explored by Tim Hartnell
Hi Peter. Let me insist, as I believe Archive.org runs OCR on any zipped folder of images that you upload. Besides, you can also upload other files like PDF. Have you tried the magazine above? The search function works really well. Perhaps you just have to upload a zipped folder with images inside and they generate the PDF for you as well.
Re: The ZX Spectrum Explored by Tim Hartnell
I'm not sure I follow, hikoki, sorry. But we do offer PDFs of magazines to download too, as well as the JPEGs. All are hosted on archive.org.
Re: The ZX Spectrum Explored by Tim Hartnell
Do you provide books? Do you upload pdf files? Please see these instructions
https://help.archive.org/hc/en-us/artic ... ake-a-book
All you need is to upload a zipped folder (which is actually a CBZ file) and their system generates OCR, PDF, torrent, ebook, etc.
For some books it may be worth the effort to take existing PDFs and make CBZ files with ImageMagick or any other software, then let Archive.org handle all the OCR and PDF generation.
Re: The ZX Spectrum Explored by Tim Hartnell
Seems like a lot of effort where PDFs exist, but if you want to do some I'm very happy to host them here for you.
Re: The ZX Spectrum Explored by Tim Hartnell
A couple of points.
Contributors have no control over the creation of PDFs on archive.org. The PDF derived by their software for the Scream Queen magazine link you gave above is barely 1.3MB and looks absolutely awful.
Ripping images out of PDFs only works properly if the PDFs are simple bitmap-wrapped files. A lot of PDF creation software uses methods to reduce the final size of the file, which causes results that aren't always obvious. Using the link above again, if you rip the bitmaps out of it you'll find them a blobby mess, because archive.org uses Omnipage as the OCR engine and makes use of the image optimization features it has. It looks OK through a PDF viewer but awful when you've pulled the images out.
Re: The ZX Spectrum Explored by Tim Hartnell
Good points, Bizzley.
I did try downloading PDFs of Popular Computing Weekly and Home Computing Weekly some time back and the quality was indeed terrible.
https://archive.org/details/home-computing-weekly
https://archive.org/details/popular-computing-weekly
Re: The ZX Spectrum Explored by Tim Hartnell
OK fellows. I'll try to upload a book later to see if there is some way to get decent results. Maybe the PDF will be better if the images have a high DPI.
Re: The ZX Spectrum Explored by Tim Hartnell
Done. I took this pdf book that Peter shared in this post viewtopic.php?f=6&t=564#p7787
I opened it with PDF-XChange Viewer and went to File > Export > Export to Image, with these settings: save as PNG; export mode: save each page to a separate single-page image file; resolution: 600 dpi.
This resulted in a 20 GB folder which takes up 670 MB after being zipped. Then I changed the extension from .zip to .cbz and uploaded it to archive.org.
The file is being processed at the moment: https://archive.org/details/CrackingCodeZXSpectrum
The comic viewer, OCR-powered search, PDF and EPUB should appear later. Let's see the quality of the automatically generated PDF.
I've read the images within the zip folder are converted to jpeg2000 format so that's why the pdf may be of poor quality. Anyway your cbz file remains in place to download and it seems possible to edit files according to this page, https://help.archive.org/hc/en-us/artic ... ting-item-
so I could just add the original pdf file.
Most importantly you have a nice comic viewer and their pdf is small and good enough to read online.
Re: The ZX Spectrum Explored by Tim Hartnell
Another alternative could be to upload pdf books to openlibrary.org
For instance see these z80 assembler books, https://openlibrary.org/search?q=Z80+as ... ltext=true
You could just provide such "read" links which open pdf books with a nice reader.
For PDF books in the WoS mirror that don't have OCR, you can add OCR with PDF-XChange Viewer, which is free, or maybe it is not necessary as openlibrary.org may add OCR when you upload PDF files, I don't know.
Re: The ZX Spectrum Explored by Tim Hartnell
There you have the Internet Archive BookReader https://openlibrary.org/dev/docs/bookreader
One can download the sources and just edit links within the javascript.
I think I'll stop here : ) thanks for your time
Re: The ZX Spectrum Explored by Tim Hartnell
The search function in the above book is ready! It took the text three days to get indexed, which was expected when reading the FAQ.
Search inside is not working for a book?
Books are not always indexed quickly and sometimes, due to factors such as size, cannot be indexed at all. Sometimes it is a bug and we would appreciate being notified.
Anyways forget the cbz comic hassle. Just upload the original pdf and that's it, all the files including the image viewer and OCR will be generated by the system. Don't forget to add OCR to the original file with pdf-xchange viewer so users can search after downloading it.
If you want the book to be listed on openlibrary.org, you'll have to create an entry in there and associate it with the book link in archive.org
- PROSM
- Manic Miner
- Posts: 473
- Joined: Fri Nov 17, 2017 7:18 pm
- Location: Sunderland, England
Re: The ZX Spectrum Explored by Tim Hartnell
I've received the book. Right now, I'm starting to type in the programs for the archive. Once those are done, I'll begin to scan the book itself.