I have a multilingual manual for a new tractor-mounted mower. It's almost unreadable because the 'multilingual' is done by having five columns of text (one for each language) down the pages.
So I thought it would be easy to scan just the English columns and paste them together, five columns to a page, to make an English-only version of the manual.
It's not at all easy!
There are very few programs that will 'merge' PDF pages and those that do seem to assume one is just trying to pack two or four complete PDF pages onto one sheet to save paper.
So can anyone suggest how I can do this? I can very easily scan the English bits as required, what I need is a handy tool for putting a number of scanned images onto a page. I was trying to do this using PDF but it may well be that other file types would be easier, I can scan to JPG, TIFF, PNM, etc. easily enough.
On 28/02/2019 22:21, Chris Green wrote:
I have a multilingual manual for a new tractor-mounted mower. It's almost unreadable because the 'multilingual' is done by having five columns of text (one for each language) down the pages.
So I thought it would be easy to scan just the English columns and paste them together, five columns to a page, to make an English-only version of the manual.
It's not at all easy!
There are very few programs that will 'merge' PDF pages and those that do seem to assume one is just trying to pack two or four complete PDF pages onto one sheet to save paper.
So can anyone suggest how I can do this? I can very easily scan the English bits as required, what I need is a handy tool for putting a number of scanned images onto a page. I was trying to do this using PDF but it may well be that other file types would be easier, I can scan to JPG, TIFF, PNM, etc. easily enough.
Would I be getting the wrong end of the stick by suggesting opening it with Okular and copying and pasting the text you want into another document?
Bev.
On Fri, Mar 01, 2019 at 09:12:28AM +0000, Bev Nicolson wrote:
Would I be getting the wrong end of the stick by suggesting opening it with Okular and copying and pasting the text you want into another document?
Not entirely (the wrong end of the stick, that is). I don't have Okular because I don't run KDE, but I could install it, though I guess there are alternatives in Gnome/XFCE.
However, the destination is the issue, isn't it? Can I copy several bits and then paste them into one 'page' which can be converted back to PDF? I want to end up with a multi-page document which I can print double-sided, so PDF would seem to be a good output format.
On 01/03/2019 10:15, Chris Green wrote:
[SNIP]
Not entirely (the wrong end of the stick, that is). I don't have Okular because I don't run KDE, but I could install it, though I guess there are alternatives in Gnome/XFCE.
I use LMC as my daily driver, and Okular runs fine, from the repositories.
Cheers, Laurie.
On Thu, 28 Feb 2019 at 23:36, Chris Green cl@isbd.net wrote:
So can anyone suggest how I can do this? I can very easily scan the English bits as required, what I need is a handy tool for putting a number of scanned images onto a page.
pdfjam might do what you want.
If you look through the ALUG archives, it came up before when I needed to reformat a Royal Mail price list. From what I can recall, it'll probably have the tools you need, if I understand your goal correctly.
Although as a one-off I'd probably end up looking at Bev's suggestion and just combine JPEGs using GIMP or similar.
On Fri, Mar 01, 2019 at 10:06:45AM +0000, Mark Rogers wrote:
pdfjam might do what you want.
If you look through the ALUG archives, it came up before when I needed to reformat a Royal Mail price list. From what I can recall, it'll probably have the tools you need, if I understand your goal correctly.
It hits the same problem as the other PDF manipulation programs I've tried: if you save the resulting PDF 'page', the size is all wrong for printing. PDFs seem to carry their page sizing with them somehow.
Although as a one-off I'd probably end up looking at Bev's suggestion and just combine JPEGs using GIMP or similar.
The best I've managed so far is to scan the columns of text as JPEG images, then use ImageMagick 'convert +append ....' to concatenate them into a single image, then ImageMagick again to convert the JPEG image to PDF. This results in a single page PDF file which I can then add to my resulting multi-page PDF document.
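On the page-size problem mentioned above: a PDF page has a fixed physical size, so a scanned image only prints at the intended size if its pixel dimensions and its resolution agree with that page. The arithmetic is sketched below in Python. The function names are made up for illustration, and a resolution mismatch is only one possible cause of the wrong-size output described here, not a confirmed diagnosis.

```python
# Sketch: how pixel dimensions and resolution (dpi) relate to printed size.
# Function names are illustrative only; they do not come from any tool.

MM_PER_INCH = 25.4


def printed_size_mm(width_px, height_px, dpi):
    """Physical size, in millimetres, of an image printed at `dpi`."""
    return (width_px / dpi * MM_PER_INCH, height_px / dpi * MM_PER_INCH)


def pixels_for_page(width_mm, height_mm, dpi):
    """Pixel dimensions needed to fill a page of the given size at `dpi`."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


if __name__ == "__main__":
    # Pixels needed for an A4 page (210 x 297 mm) at 300 dpi.
    print(pixels_for_page(210, 297, 300))
    # The same pixels interpreted at 72 dpi give a far larger page.
    print(printed_size_mm(2480, 3508, 72))
```

If a tool reads the image at a different resolution from the one it was scanned at, the resulting page is scaled by the ratio of the two, which would be consistent with a page that is "all wrong for printing".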
On 28/02/2019 22:21, Chris Green wrote:
So can anyone suggest how I can do this? I can very easily scan the English bits as required, what I need is a handy tool for putting a number of scanned images onto a page. I was trying to do this using PDF but it may well be that other file types would be easier, I can scan to JPG, TIFF, PNM, etc. easily enough.
If you can get it into a regular image format the ImageMagick "montage" command line tool is good for stitching images together.
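A minimal sketch of how such a montage call might be assembled, written in Python so the argument list is easy to inspect before running it. The file names and the helper name `montage_command` are hypothetical; `-tile` and `-geometry` are standard montage options, but the values chosen here are assumptions about what suits five scanned columns.

```python
import shlex


def montage_command(images, output, columns=5):
    """Build a montage command that places `images` side by side in one row."""
    return [
        "montage", *images,
        "-tile", f"{columns}x1",   # columns x rows: one row of `columns` tiles
        "-geometry", "+0+0",       # no resizing and no gap between tiles
        output,
    ]


if __name__ == "__main__":
    # Hypothetical file names for five scanned columns.
    scans = [f"col{i}.jpg" for i in range(1, 6)]
    print(shlex.join(montage_command(scans, "page01.jpg")))
    # To actually run it: subprocess.run(montage_command(...), check=True)
```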
On Fri, Mar 01, 2019 at 10:45:35AM +0000, Bill Hill wrote:
If you can get it into a regular image format the ImageMagick "montage" command line tool is good for stitching images together.
Yes, I think that's the way to go: scan to JPEG (or another image format), stitch together into a single image using ImageMagick and then convert to a PDF page (also using ImageMagick).
Thanks all.
On Thu, 2019-02-28 at 22:21 +0000, Chris Green wrote:
I have a multilingual manual for a new tractor-mounted mower. It's almost unreadable because the 'multilingual' is done by having five columns of text (one for each language) down the pages.
So I thought it would be easy to scan just the English columns and paste them together, five columns to a page, to make an English-only version of the manual.
It's not at all easy!
There are very few programs that will 'merge' PDF pages and those that do seem to assume one is just trying to pack two or four complete PDF pages onto one sheet to save paper.
So can anyone suggest how I can do this? I can very easily scan the English bits as required, what I need is a handy tool for putting a number of scanned images onto a page. I was trying to do this using PDF but it may well be that other file types would be easier, I can scan to JPG, TIFF, PNM, etc. easily enough.
I'd be inclined to feed the scanned images into Tesseract and make them into actual text.
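A rough sketch of what a Tesseract call for one scanned column could look like, again built as a Python argument list. The helper name and file names are hypothetical. `-l eng` selects the English language data, and the trailing `pdf` asks Tesseract for a searchable PDF instead of plain text; how well the OCR copes with a given scan is not something this sketch can promise.

```python
import shlex


def tesseract_command(image, output_base, lang="eng", searchable_pdf=False):
    """Build a tesseract command; output goes to output_base.txt or .pdf."""
    cmd = ["tesseract", image, output_base, "-l", lang]
    if searchable_pdf:
        cmd.append("pdf")   # write a PDF with a text layer, not a .txt file
    return cmd


if __name__ == "__main__":
    # Hypothetical scan of one English column.
    print(shlex.join(tesseract_command("col1.jpg", "col1")))
    print(shlex.join(tesseract_command("col1.jpg", "col1", searchable_pdf=True)))
    # To actually run it: subprocess.run(tesseract_command(...), check=True)
```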
Hi,
First of all, can I check that you have tried to find this user manual online and it's not available? :)
If so, why not scan the pages, extract clean text with the tools mentioned before, and then paste it all into LibreOffice, which will let you save the file as a document? PDF is just an envelope in which you can have selectable text, images, etc. So you could even take the scanned columns of text, align them on an A4 page in Writer, and then save as PDF. ImageMagick seems to be a sledgehammer a bit too big for what you need :)
Cheers, Bart
On Saturday, 2 March 2019, Huge huge@huge.org.uk wrote:
I'd be inclined to feed the scanned images into Tesseract and make them into actual text.
main@lists.alug.org.uk http://www.alug.org.uk/ https://lists.alug.org.uk/mailman/listinfo/main Unsubscribe? See message headers or the web site above!
On Sun, Mar 03, 2019 at 11:18:27PM +0000, B D wrote:
Hi, First of all, can I check that you have tried to find this user manual online and it's not available? :)
Yes, the provider actually told me explicitly that they don't provide (or even have) PDF/on-line versions.
If so, why not scan the pages, extract clean text with the tools mentioned before, and then paste it all into LibreOffice, which will let you save the file as a document? PDF is just an envelope in which you can have selectable text, images, etc. So you could even take the scanned columns of text, align them on an A4 page in Writer, and then save as PDF.
I did consider going this route but it would have been more laborious than what I actually did.
ImageMagick seems to be a sledgehammer a bit too big for what you need :)
Possibly, but the sequence needed to make my all English printable PDF wasn't all that difficult:-
1. Scan enough bits (often simply five individual columns of text, but sometimes fewer columns plus some images) of the original to JPEG images. My scanner allows very simple selection of areas to scan using a 'marquee' so this was very easy.
2. Use ImageMagick's convert utility's 'append' option to combine the scanned JPEG images into a single image. Once I'd worked out the required options this was just repeating the same command line each time.
3. Use ImageMagick's convert utility to convert back to a single PDF page.
4. When all done combine all the PDF pages into a single PDF document.
OK, it's not searchable text but apart from that I have exactly what I want.
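For anyone wanting to repeat the sequence above, here is a minimal Python sketch of the last three steps (the scanning itself is done by hand). It is not the exact set of command lines used in this thread, which were not posted. The `convert ... +append` call matches the option mentioned earlier; the file naming scheme and the helper names are hypothetical, and `pdfunite` (from poppler-utils) is an assumption for the final merge, since the thread does not say which tool did that step. On ImageMagick 7 the command may be `magick` rather than `convert`.

```python
import subprocess


def append_command(columns, page_image):
    """Join the column scans left to right into one page image."""
    # +append joins horizontally; -append would stack them vertically.
    return ["convert", *columns, "+append", page_image]


def to_pdf_command(page_image, page_pdf):
    """Convert one page image into a single-page PDF."""
    return ["convert", page_image, page_pdf]


def combine_command(page_pdfs, manual_pdf):
    """Merge the single-page PDFs (pdfunite is an assumed choice of tool)."""
    return ["pdfunite", *page_pdfs, manual_pdf]


def build_manual(pages, manual_pdf, run=subprocess.run):
    """`pages` is a list of lists of scanned image files, one list per page."""
    page_pdfs = []
    for number, columns in enumerate(pages, start=1):
        image = f"page{number:02d}.jpg"   # hypothetical naming scheme
        pdf = f"page{number:02d}.pdf"
        run(append_command(columns, image), check=True)
        run(to_pdf_command(image, pdf), check=True)
        page_pdfs.append(pdf)
    run(combine_command(page_pdfs, manual_pdf), check=True)
    return page_pdfs


# Example call (needs the scans and the tools to be present):
# build_manual([["p1c1.jpg", "p1c2.jpg"], ["p2c1.jpg", "p2c2.jpg"]], "manual.pdf")
```

Passing `run` as a parameter makes it possible to print or log the commands first instead of executing them, which is handy while working out the right options.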