Database nerds look away now.
I have two loads of data in .ods or .cvs. Basically two versions of a membership list, one with >500 rows, the other >700. I want a list of approx 200 who are on the big list but not the smaller. 'Compare' in LO just gives me a list of thousands of cells that are different, that's no help.
Can anyone help? Lots on t'internet about merging and tracking changes but nothing about my case I can find.
On Tue, Mar 29, 2022 at 11:39:34AM +0100, Phil Thane wrote:
I have two loads of data in .ods or .cvs. Basically two versions of a membership list, one with >500 rows, the other >700. I want a list of approx 200 who are on the big list but not the smaller. 'Compare' in LO just gives me a list of thousands of cells that are different, that's no help.
Can you normalise the list of names to some kind of a standard format from the 2 lists? If you can, just export them as plain text, merge the two lists and run them through sort and uniq.
Adam --
On Tue, Mar 29, 2022 at 11:39:34AM +0100, Phil Thane wrote:
Database nerds look away now.
I have two loads of data in .ods or .cvs. Basically two versions of a membership list, one with >500 rows, the other >700. I want a list of approx 200 who are on the big list but not the smaller. 'Compare' in LO just gives me a list of thousands of cells that are different, that's no help.
Can anyone help? Lots on t'internet about merging and tracking changes but nothing about my case I can find.
I hate to say it but the obvious answer is that the data shouldn't be in a spreadsheet! :-)
Can you export them (via CVS) to two tables in a database, it would then be trivial to get the answer you want.
On 29/03/2022 14:42, Chris Green wrote:
On Tue, Mar 29, 2022 at 11:39:34AM +0100, Phil Thane wrote:
Database nerds look away now.
I hate to say it but the obvious answer is that the data shouldn't be in a spreadsheet! :-)
If Phil ever needs to share it though, Calc will be easier to work with because Base remains incompatible with MS Access. Otherwise, yes.
Bev.
On Tue, Mar 29, 2022 at 03:32:57PM +0100, Bev Nicolson wrote:
On 29/03/2022 14:42, Chris Green wrote:
On Tue, Mar 29, 2022 at 11:39:34AM +0100, Phil Thane wrote:
Database nerds look away now.
I hate to say it but the obvious answer is that the data shouldn't be in a spreadsheet! :-)
If Phil ever needs to share it though, Calc will be easier to work with because Base remains incompatible with MS Access. Otherwise, yes.
Don't use either Access or Base, the GUI just adds complication and complexity.
Export to CSV, import into a sqlite3 database and write a one-liner in sqlite3 CLI to get the result you want. It might be worth using something like sqliteman to do the CSV import but otherwise just use command line.
The whole thing is an arse ache. Great Ouse Boating Assoc has had two terrible years with ill/dying committee members, data lost, muddled and bank accounts suspended. The latest cock up was the Direct Debit list was sent to the bank to draw down members subs. A copy of the list went to me (ex-membership sec but helping out as both my replacements have resigned in hurry due to ill health) and the Chair. We have been updating the membership record showing who has paid based on that list. THEN we were told actually 178 were declined (many members having given up, retired, died or figured out how to pay online!) But not which were declined, just a new list of those paid. So the task was to subtract one list from the other to discover who had not. Now I have undo many updated records of people who are not really members, or who paid by other means. Then I'm resigning again!
Thanks for trying to help all of you. I tried to learn SQL in my 50s. 70 in 3 weeks and can't be arsed to try again.
Phil Thane www.pthane.co.uk Tweet @pthane 01767 449759 07582 750607
On 29/03/2022 16:17, Chris Green wrote:
On Tue, Mar 29, 2022 at 03:32:57PM +0100, Bev Nicolson wrote:
On 29/03/2022 14:42, Chris Green wrote:
On Tue, Mar 29, 2022 at 11:39:34AM +0100, Phil Thane wrote:
Database nerds look away now.
I hate to say it but the obvious answer is that the data shouldn't be in a spreadsheet! :-)
If Phil ever needs to share it though, Calc will be easier to work with because Base remains incompatible with MS Access. Otherwise, yes.
Don't use either Access or Base, the GUI just adds complication and complexity.
Export to CSV, import into a sqlite3 database and write a one-liner in sqlite3 CLI to get the result you want. It might be worth using something like sqliteman to do the CSV import but otherwise just use command line.
On 29/03/2022 11:39, Phil Thane wrote:
Database nerds look away now.
I have two loads of data in .ods or .cvs. Basically two versions of a membership list, one with >500 rows, the other >700. I want a list of approx 200 who are on the big list but not the smaller. 'Compare' in LO just gives me a list of thousands of cells that are different, that's no help.
Can anyone help? Lots on t'internet about merging and tracking changes but nothing about my case I can find.
OK,
[Insert the usual comments about not using spreadsheets, it's the wrong tool, use a database etc etc etc.]
A quick and dirty solution that you may be able to make work, which I have used very successfully in Excel, and I bet it'll work in Calc.
1) Make a backup of everything, just in case.
New spreadsheet, copy the names from one spreadsheet into 1st Column. Copy names from the other spreadsheet into 2nd column. Select the whole of the two columns. Use Custom Formatting. Look for the Highlight Duplicate Values section, and colourise them.
Then BINGO! All your duplicates are highlighted. Then you can see what matches, and what doesn't.
That may get you part of the way to where you want to be.
Here's an writeup for Excel. There are others: https://theexcelclub.com/ultimate-guide-compare-two-lists-or-datasets-in-exc...
Hope that helps.
Steve
On Tue, 29 Mar 2022 at 22:20, steve-ALUG@hst.me.uk wrote:
A quick and dirty solution that you may be able to make work, which I have used very successfully in Excel, and I bet it'll work in Calc.
Another way would be to have the shorter list in column A, the longer list in column B, and a third column that does a VLOOKUP - C1 does a VLOOKUP of B1 in column A, C2 does a VLOOKUP of B2 in col A, etc. That would give you a value against each col B record telling you if it existed in col A, from which you could sort/filter out the ones which didn't have a match.
Mark
On Wed, 30 Mar 2022 at 08:55, Mark Rogers mark@more-solutions.co.uk wrote:
Another way would be to have the shorter list in column A, the longer list in column B, and a third column that does a VLOOKUP
Example (LibreOffice Calc file): https://drive.google.com/file/d/1MSVEh8FRAlx8Pv7RyBD8Dz5WVHHNm24U/view?usp=s...
Mark
Sorry if this is a bit late, but another way to do it is to use the CSV class in Ruby.
Script and two example CSVs attached.
$ ruby delta_csv.rb big_list.csv little_list.csv
Should print out the records in the big_list that aren't in the little_list.